From: Maaartin on 28 Apr 2010 15:51

On Apr 28, 8:08 pm, bmearns <mearn...(a)gmail.com> wrote:
> > I've got an idea: When dealing, you must go through all the cards.
> > When searching for an instance of a card (or both instances), you go
> > through many of them. So we could try to join the steps in, e.g.,
> > something like this:
> >
> > 1. Take 2 (variant: or 3) cards from the top, look at them
> > 2. If the searched card is not there, put them to the bottom (variant:
> > to another deck) and goto 1
> > 3. Otherwise, start another deck. This way you could perform something
> > like the double or the triple cut, too.
>
> I like the idea, but I'm a little fuzzy on the last part. What are you
> doing when you find the searched card?

I don't know yet. Maybe forget about the current target deck and start
another one, thus assigning the cards to two (or three, in case of using
two instances) decks. Maybe use more decks; I'll think about it.

> Oh, sorry, FDR refers to Franklin Delano Roosevelt. He was president
> in the US in the 1930's and 1940's. He's famous for an economic
> revitalization program called "the new deal".

My fault, I just had to UTFG. I knew about "the new deal".

> You mean there are multiple initial deck arrangements that would lead
> to the same output? I'm not seeing how that would be the case.

But it must be. Before the step, you know nothing; each state is
possible. Afterwards you know that somewhere at the bottom there's a
card with a value corresponding to its position (either a 1 is the last
card, or a 2 is the last but one, etc.). Any state without such a card
is unreachable. The probability that a state is unreachable is about
(12./13) ** 13 = 0.35, so you lose about
-log(1 - (12./13) ** 13) / log(2) = 0.6 bit.

> > It can be done differently using another transformation to numbers:
> > clubs=1, diamonds=2, hearts=3, spades=4. This leads to destroying the
> > sequence more.
>
> Nice spotting, that could improve things considerably.
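The loss can be checked in a couple of lines of Python:

```python
import math

# Probability that no card at the bottom matches its position,
# i.e. that a given state is unreachable after the step.
p_unreachable = (12 / 13) ** 13      # about 0.35

# Entropy lost by ruling out the unreachable states.
bits_lost = -math.log(1 - p_unreachable) / math.log(2)   # about 0.63 bits
```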
And this is the first use of the card color.

> > > So complementary cards are no longer equivalent, but we can still use
> > > the triple cut which I think is better than a single cut.
> > But is it also better than two single cuts?
> I suppose it depends on how you do those two single cuts...

So do I.

> I don't think our deck loses any information, per se; I just meant to
> point out that the deck is quite vulnerable at this point: were an
> attacker to get their hands on it at this point in the algorithm, it
> would not be difficult for them to reverse it, at least to one of just
> a handful of possibilities. More of just an observation than a
> practical concern. Something to keep in mind if, for instance, we had
> the idea to apply this in the final step for some reason.

But as I already wrote, there's a condition which holds after the step
and not before. So some information must have been lost.

> You're right of course. It loops far more easily than I could have
> imagined if I hadn't seen it.
> > > but if it
> > > stops looping it's obviously worth it, and we should still be able to
> > > get a good enough password from it: 26 permute 12 is still 52 bits.
> I don't understand this.
> > The output would be a permutation actually, just not of the whole set.

Not with my output routine. The card I throw away is not the one sent to
the output.

> If you remove each card from the deck as you pick it, that means you
> can't pick that card again. That means each sequential card chosen is
> selected from a smaller set, so it encodes fewer bits.

Not really; there are only log(N!)/log(2) bits of information there, and
you get it all if you output the cards from top to bottom, right? What
changes if you throw the output card away? Nothing, right?

> Even without dropping the card every time, we don't quite get 26^N
> possible outcomes, because when a card is moved to the end, we won't
> be able to select it in the next round.

This is impossible because of 26!
< 26**26; you need no more reasons. But I agree that this could lead to
a fast distinguisher. Using 52 cards helps a lot here.

> We may be able to select it in
> the following round, but it's still unlikely, and so on for a few more
> rounds. But it will still be no less than if we do drop the card.

I don't drop the output card, so repetitions are quite possible. The
dropped cards could be used later again, e.g., you could take them all
back into your deck when it gets too small for the current operation.

> That's moot, however, because the looping is inexcusable.

Sure, looping is the worst. One version of my output routine looped very
often even with perfectly mixed decks.
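For the record, the bit counts mentioned above, checked in Python:

```python
import math

# A full permutation of 26 cards carries log2(26!) bits.
full_bits = math.log2(math.factorial(26))                       # about 88.4

# An ordered pick of 12 cards out of 26 ("26 permute 12") carries
# log2(26 * 25 * ... * 15) bits -- still about 52.
partial_bits = math.log2(math.factorial(26) // math.factorial(14))
```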
From: J.D. on 28 Apr 2010 16:55

I have some preliminary speed data for my algorithm and Maaartin's
algorithm. I also have a few 'practicality' observations from doing this
stuff with real cards. If you have settled on an algorithm, bmearns, let
me know what it is so I can speed-test it too.

I decided to split up the hashing process for both of our algorithms
into three phases:
1) Translate the entire message to be hashed into cards.
2) Absorb all of the letters into the hash.
3) Finalization (doing any last steps that only occur after all letters
are absorbed, then writing down or otherwise using the resulting deck
as the digest).

Since all of the various proposals would require translating the message
into cards, and all of them would require writing down or using the
resulting digest, all proposals are on the same playing field for those
operations, so I decided to focus on the difference in speeds for step 2.
Maaartin's algorithm also has three buries in step 3, but those are so
quick (about 20-30 seconds for all three) that I am not counting them
for the purposes of speed.

As stated before, I also decided not to test accuracy at this stage --
that would require re-sorting the deck into the proper starting order
after each run, which takes too long for my limited patience.
The sample text I decided to use for hashing is as follows (omitting
all spaces, numbers and non-letter characters):

DIFFERENTIALCRYPTANALYSISOFDESLIKE
CRYPTOSYSTEMSELIBIHAMADISHAMIRTHE
DATAENCRYPTIONSTANDARDDESISTHEBEST
KNOWNANDMOSTWIDELYUSEDCRYPTOSYSTEM
FORCIVILIANAPPLICATIONSITWASDEVELOPED
ATIBMANDADOPTEDBYTHENATIONALBURAEU
OFSTANDARDSINTHEMIDSEVENTIESANDHAS
SUCCESSFULLYWITHSTOODALLTHEATTACKS
PUBLISHEDSOFARINTHEOPENLITERATUREIN
THISPAPERWEDEVELOPANEWTYPEOF
CRYPTANALYTICATTACKWHICHCANBREAKDES
WITHUPTOEIGHTROUNDSINAFEWMINUTESON
APCANDCANBREAKDESWITHUPTOFIFTEEN
ROUNDSFASTERTHANANEXHAUSTIVESEARCH
THENEWATTACKCANBEAPPLIEDTOAVARIETY
OFDESLIKESUBSTITUTIONPERMUTATION
CRYPTOSYSTEMSANDDEMONSTRATESTHE
CRUCIALROLEOFTHEUNPUBLIHEDDESIGNRULES

(you may recognize this as the title, authors and abstract of a famous
paper in cryptanalysis)

I started with my algorithm. I hashed the first 47 letters of the sample
text all in one go, and took 36 minutes to do so, at a rate of 1.31
letters per minute. Then I hashed the next 17 letters in 12 minutes, at
a rate of 1.42 letters per minute. The next 28 letters took only 18
minutes, at a rate of 1.56 letters per minute. I did not start out
intending to do different-size groups -- those were just when I wanted
to take a break from hashing (my hands would cramp up after a while).
However, I noticed that the rate was increasing as I had more practice,
and I decided this was probably something I would want to measure more
reliably.

I switched to Maaartin's algorithm, and this time hashed four segments
of 15 letters apiece. The results were as follows:
First 15 letters - 14 minutes (rate 1.07 letters/minute)
Second 15 - 13 minutes (rate 1.15 letters/minute)
Third 15 - 12 minutes (rate 1.25 letters/minute)
Fourth 15 - 11 minutes (rate 1.36 letters/minute)

Note for clarity: as stated above, none of these times include the time
necessary to translate the letters into card-instances, or to perform
any finalization steps.
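The per-segment rates quoted above are just letters divided by minutes:

```python
# (letters, minutes) for the three segments hashed with J.D.'s algorithm
segments = [(47, 36), (17, 12), (28, 18)]
rates = [round(letters / minutes, 2) for letters, minutes in segments]
# rates == [1.31, 1.42, 1.56]
```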
I will try again with both algorithms (and bmearns's) to see if we can't
get more data, and thus a better approximation. Right now, though, I
would guess that neither algorithm can do more than 2 letters per minute
(even with a lot of practice), but both algorithms can comfortably do
more than 1 letter per minute.

Specific observations on the algorithms:

There are a couple of things that seem to be slowing down Maaartin's
algorithm:
- There are two shuffle-cuts, divided by a bury, and this means that for
each letter I often had to search _twice_ through most of the deck in
order to find the first instance of the letter and perform the shuffle
cuts (though sometimes I could remember exactly where it was from doing
the first shuffle-cut). I also sometimes forgot what the two instances
were for the second shuffle-cut (the bury can distract your short-term
memory), and had to glance back at my translation sheet.
- The simple deal operation can be reasonably quick, once you get the
hang of it. However, I think you might increase the speed if the
increments were larger and were not dissimilar in amount (e.g. 3 and 3
instead of 1 and 2). You would have to do statistical testing to see if
that kind of deal function does as well as your (1,2) deal function.

For my algorithm, the variable deal function seems slow and is kind of a
pain, and is probably the area where mistakes are most likely. However,
I noticed that counting by threes increased the speed -- i.e. in order
to pull off 11 cards from the top, pull three into your hand at once,
then another three, then another three, and then two more. I think this
is because the brain can quickly distinguish a triplet of items -- more
quickly than a group of four or more items. You can also check whether
you've screwed up at the end of the variable deal function -- if you're
pulling off increments of five and you end up with 3 cards remaining,
then you know you've messed up somewhere (because 52 mod 5 = 2).
Keeping each increment-pile slightly separate (with one edge resting on
the pile below, so you can quickly scoop them all together at the end)
should enable you to go back through and redo the deal if your check
reveals a mistake.

Both Maaartin's bury function and my variable deal function benefit
significantly from using the face numbers (with ace = 14, jack = 11,
queen = 12 and king = 13) to determine the amounts. That is a very easy
rule to remember, and quick to use.

Well, that's all for now. I will continue to do more tests and collect
more data, and I invite you to do the same -- more data can only help.
For now I have some real (i.e. non-cryptography) work to do (I know,
shocking!). So I'll have to continue the tests tomorrow.
From: Maaartin on 28 Apr 2010 20:07

An idea (still very incomplete) for dealing_find:

1a. Take the uppermost 3 cards.
1b. Put the first 1 of them on deck1 and the others on deck2.
1c. If none of them corresponds with the input letter, goto 1a.
1d. Do something according to where exactly the corresponding card is.

2a. Take the uppermost 4 cards.
2b. Put the first 1 of them on deck3 and the others on deck4.
2c. If none of them corresponds with the input letter, goto 2a.
2d. Do something according to where exactly the corresponding card is.

3a. Take the uppermost 5 cards.
3b. Put the first 2 of them on deck5 and the others on deck6.
3c. If there are cards left, goto 3a.

4a. Combine the decks somehow into a single one.

Of course, Brian won't like the 6 decks, but let's see how it can work,
and then maybe do the deck combining in between. The number of cards
taken was chosen to be different at each stage; there's nothing more
behind it.

On Apr 28, 10:55 pm, "J.D." <degolyer...(a)yahoo.com> wrote:
> Since all of the various proposals would require translating the
> message into cards, and all the various proposals would require
> writing down or using the resulting digest, and thus all proposals
> would be on the same playing field for those operations, I decided to
> just focus on the difference of speeds for step 2. Maaartin also has
> three buries in step 3, but those are so quick (about 20 - 30 seconds
> for all three) that I am not counting those for the purposes of speed.

20 seconds is more than I expected, but not too much. I see, my
algorithm is slightly slower than yours; that's sad.

> Right now though I would guess that neither algorithm
> can do more than 2 letters per minute (even with a lot of practice),
> but also both algorithms can comfortably do more than 1 letter per
> minute.

That's very slow, I'm afraid.
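One stage of the dealing_find idea above might look like this in Python.
What to "do" when the card is found is still open, so this sketch only
records the two piles and reports the position within the group (the
function name and return shape are placeholders):

```python
def find_stage(deck, take, keep, target):
    """Repeatedly take `take` cards off the top, putting the first `keep`
    of them on pile_a and the rest on pile_b, until a taken group
    contains the target card (or the deck runs out)."""
    pile_a, pile_b = [], []
    while deck:
        group, deck = deck[:take], deck[take:]
        pile_a.extend(group[:keep])
        pile_b.extend(group[keep:])
        if target in group:
            # report where exactly the corresponding card is (step 1d/2d)
            return deck, pile_a, pile_b, group.index(target)
    return deck, pile_a, pile_b, None
```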
> There are a couple of things that seem to be slowing down Maaartin's
> algorithm;
> - There are two shuffle-cuts, divided by a bury, and this means that
> for each letter I often had to search _twice_ through most of the deck
> in order to find the first instance of the letter and perform the
> shuffle cuts

That shouldn't be. But it's true: I computed that my program on average
goes through about 3/4 of the stack for each card. It seems the first
instance lands at the bottom and the other has a fair chance to get
buried. Doing the suit-based bury could help here, since it moves fewer
cards most of the time.

> (though sometimes I could remember exactly where it was
> from doing the first shuffle-cut).

You can start from the bottom (skipping over the first instance found)
in case the other instance gets buried. This way you need to go through
only up to about 15 cards.

> I also sometimes forgot what the
> two instances were for the second shuffle-cut (the bury can distract
> your short term memory), and had to glance back at my translation
> sheet.

This is bad, as it ideally should work without writing anything down.
I'll think about using a triple cut.

> - The simple deal operation can be reasonably quick, once you get the
> hang of it. However I think you might increase the speed if the
> increments were larger and were not dissimilar in amount (e.g. 3 and 3
> instead of 1 and 2). However you would have to do statistical testing
> to see if that kind of deal function does as well as your (1,2) deal
> function.

The (1,2) deal is very good at breaking pairs; it breaks 2 out of 3
pairs. But we don't need that much, since it gets executed many times.
So it's too good and can be changed. I'd say a (3,4) deal is good
enough (but let's make the stats). Using different amounts may be
important, or not; we need the stats.

> For my algorithm, the variable deal function seems slow and is kind of
> a pain, and is probably the area where mistakes are most likely.
> However I noticed that counting by threes increased the speed -- i.e. > in order to pull off 11 cards from the top, pull three into your hand > at once, then another three, then another three, and then two more. I > think this is because the brain can quickly distinguish a triplet of > items -- more quickly than a group of four or more items. You can > also check whether you've screwed up at the end of the variable deal > function -- if you're pulling off increments of five, and you end up > with 3 cards remaining, then you know you've messed up somewhere > (because 52 mod 5 = 2). Keeping each increment-pile slightly separate > (with one edge resting on the pile below so you can quickly scoop them > all together at the end) should enable you to go back through and redo > the deal if your check reveals a mistake. Agreed. > Both Maaartin's bury function and my variable deal function benefit > significantly from using the face numbers (with ace = 14, jack = 11, > queen = 12 and king = 13) to determine the amounts. That is a very > easy rule to remember, and quick to use. > > Well, that's all for now. I will continue to do more tests and > collect more data, and I invite you to do the same -- more data can > only help. For now I have some real (i.e. non-cryptography) work to > do (I know, shocking!). So I'll have to continue the tests tomorrow. I'll do some tests later, now I need to do different things, too. In the meantime the algorithms may change using the information gained, so don't hurry to do more hand tests; what you did is enough for now.
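The 2-out-of-3 pair-breaking figure for the (1,2) deal can be checked
with a small simulation. This assumes one particular reading of the
deal: groups of 1, then 2, cards are moved from the top of the deck onto
a single output pile, each new group landing on top:

```python
def deal(deck, pattern=(1, 2)):
    # Move groups of pattern[0], pattern[1], pattern[0], ... cards from
    # the top of the deck onto an output pile; each new group lands on
    # top, so group order reverses while order inside a group is kept.
    out, i, k = [], 0, 0
    while i < len(deck):
        n = pattern[k % len(pattern)]
        out = deck[i:i + n] + out
        i += n
        k += 1
    return out

deck = list(range(52))
result = deal(deck)
# adjacent pairs of the original deck that survive the deal
preserved = sum(result[i] + 1 == result[i + 1] for i in range(len(result) - 1))
# 17 of the 51 pairs survive, i.e. about 2 out of 3 are broken
```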
From: J.D. on 29 Apr 2010 00:30

On Apr 28, 8:07 pm, Maaartin <grajc...(a)seznam.cz> wrote:
> I see, my algorithm is slightly slower than yours; that's sad.

I don't think we have enough data to draw that conclusion. Technique
can make a critical difference. For example, I did not think of
starting the search for the second shuffle-cut from the bottom (and
then skipping the first instance from the bottom). That might boost the
rate somewhat. With more practice, or data from other people, the two
rates might converge, or yours could even overtake mine. The only
conclusion I am comfortable drawing from the data we have now is that
both of our algorithms probably hash at a rate somewhere between 1 and
2 letters per minute.

> > Right now though I would guess that neither algorithm
> > can do more than 2 letters per minute (even with a lot of practice),
> > but also both algorithms can comfortably do more than 1 letter per
> > minute.
>
> That's very slow, I'm afraid.

Experience with doing ciphers by hand tells me that anything that is
both cryptographically non-trivial and that can process characters at a
rate faster than 1 per minute is pretty good. Solitaire, for example,
is quite slow.

> In the meantime the algorithms may change using the information
> gained, so don't hurry to do more hand tests; what you did is enough
> for now.

I am only too glad to stop for now...
From: MrD on 29 Apr 2010 02:48
J.D. wrote:
> The sample text I decided to use for hashing is as follows (omitting
> all spaces, numbers and non-letter characters):
> [...]
> ATIBMANDADOPTEDBYTHENATIONALBURAEU
> [...]
> (you may recognize this as the title, authors and abstract of a famous
> paper in cryptanalysis)

"Bureau" is mis-spelt (this might conceivably affect your results).

--
MrD.