Call for review: Hashing by hand algorithm [Cryptography]

Prev: Dynamic Hill cipher
Next: PE Scrambler

From: Maaartin on 26 Apr 2010 19:43

On Apr 27, 1:07 am, "J.D." <degolyer...(a)yahoo.com> wrote:
> On Apr 26, 6:57 pm, Maaartin <grajc...(a)seznam.cz> wrote:
>
> > On Apr 26, 11:56 pm, "J.D." <degolyer...(a)yahoo.com> wrote:
>
> > > Not all hash functions use extensive final processing between the last
> > > input and outputting the digest. For many (e.g. MD4 and MD5) the
> > > digest is simply the last chaining variable.
>
> > You're right. So maybe all modern hash functions do?
>
> Well, most hashes based on the Merkle Damgard construction seem to
> have the digest = the last chaining variable. This includes several
> of the SHA-3 hash function candidates (if I remember correctly, I can
> try to look up a cite if you want).

I did already - only 19 out of 51 first round candidates do, so I was
wrong again:
http://csrc.nist.gov/groups/ST/hash/sha-3/Round1/Feb2009/documents/sha3_classification_forler.pdf

> I like both of those ideas. They are simpler and perhaps more
> effective than my suggestion...

They are simple, and I hope, Brian will be able to evaluate it
somehow, and I hope they'll perform well. Proving anything about
security is too hard, but for hand/card ciphers/hashes the statistical
tests could be appropriate.

From: J.D. on 26 Apr 2010 20:38

On Apr 26, 7:35 pm, "J.D." <degolyer...(a)yahoo.com> wrote:
> On Apr 26, 7:22 pm, "J.D." <degolyer...(a)yahoo.com> wrote:
>
>
>
> > On Apr 26, 6:59 pm, "J.D." <degolyer...(a)yahoo.com> wrote:
>
> > > On Apr 26, 5:56 pm, "J.D." <degolyer...(a)yahoo.com> wrote:
>
> > > > Indeed. Perhaps you (brian) could improve the mix rate by doing
> > > > something like the following:
>
> > > > Split the deck into a red and a black decks by suits. Each deck would
> > > > have 26 cards, corresponding to each letter in the English alphabet..
> > > > 1) Using the red deck first, use your algorithm (in the link a few
> > > > posts above) adjusted for a deck of length 26 to feed in the first
> > > > letter of the message.
> > > > 2) Take the resulting reordered red deck, and spread them out in a
> > > > line.
> > > > 3) Then place a black card under each red card according to some
> > > > formula: For example;
> > > > i) interpret each red card as a number, (e.g. ace of hearts = 1),
> > > > ii) then apply an affine function (e.g. f = (7x)+5 mod 26) to the
> > > > number of the given red card,
> > > > iii) and then (interpreting each black card as a number in the same
> > > > range) place the black card corresponding to the output of the
> > > > function under the red card (e.g. f(1) = 12, so under the ace of
> > > > hearts place the queen of clubs, assuming the queen of clubs = 12).
> > > > 4) Then, scoop up the black deck, keeping the cards in the order they
> > > > are placed below the red cards, and use the black deck to absorb the
> > > > second letter of the message (using your algorithm).
> > > > 5) Repeat steps 2-4, except using the black deck along the top and
> > > > placing the red cards below according to the formula.
> > > > ...and continue repeating until all letters of the message are
> > > > absorbed.
>
> > > Some python code (which would tack on to your code in the link):
> > > def convert(deck):
> > > a = len(deck)
> > > length = a-1
> > > tempdeck = list()
> > > for i in range(length):
> > > tempdeck.append(((deck[i]*7)+5)%length)
> > > return tempdeck
>
> > > So, hashing a message "abc":
> > > #deck = [0....25]
> > > deck = shuffle(deck, 0) #a = 0
> > > deck = convert(deck)
> > > deck = shuffle(deck, 1) #b=1
> > > deck = convert(deck)
> > > deck = shuffle(deck, 2) #c=2
> > > deck = convert(deck)
> > > #and the digest:
> > > print(deck)
>
> > > (Note for testing with shorter decks: while the above 'convert'
> > > function will work for any length of deck, the chosen affine function
> > > (y = 7x+5 mod no. of cards in deck) is intended for decks of 26 cards
> > > -- it might not be appropriate for other deck lengths)
>
> > Oops, function should read:
>
> > def convert(deck):
> > length = len(deck)
> > tempdeck = list()
> > for i in range(length):
> > tempdeck.append(((deck[i]*7)+5)%length)
> > return tempdeck
>
> After some testing, this does not substantially increase the avalanche
> rate, and we still end up with digests for short messages with too
> much order. For instance, hash of message 3,9,15:
> [16, 21, 0, 5, 10, 15, 20, 14, 25, 4, 9, 19, 13, 24, 3, 8, 18, 6, 23,
> 2, 7, 12, 17, 22, 1, 11]
> And hash of message 3,8,15:
> [21, 6, 0, 5, 10, 15, 20, 14, 25, 4, 9, 19, 16, 24, 3, 8, 13, 18, 23,
> 2, 7, 12, 17, 22, 1, 11]
> That is clearly unacceptable...
>
> Perhaps my idea can still work, but (as described in the addendum to
> my post) by using one out of 26 different affine functions for the
> conversion step, as selected by the latest letter absorbed. I am
> still learning python (I started yesterday), so I am not sure how to
> do that. Also, it may be too complicated to efficiently do by hand.
> Maaartin's idea(s) may be more workable...

All right, so I've figured out (I think) how to do this in python.
Though it's not exactly elegant:

def superconvert(deck):
length = len(deck)
tempdeck = list()
if message[0] == 0:
for i in range(length):
tempdeck.append(((deck[i]*5)+1)%length)
return tempdeck
elif message[0] == 1:
for i in range(length):
tempdeck.append(((deck[i]*5)+9)%length)
return tempdeck
elif message[0] == 2:
for i in range(length):
tempdeck.append(((deck[i]*5)+10)%length)
return tempdeck
elif message[0] == 3:
for i in range(length):
tempdeck.append(((deck[i]*5)+23)%length)
return tempdeck
elif message[0] == 4:
for i in range(length):
tempdeck.append(((deck[i]*5)+3)%length)
return tempdeck
elif message[0] == 5:
for i in range(length):
tempdeck.append(((deck[i]*7)+5)%length)
return tempdeck
elif message[0] == 6:
for i in range(length):
tempdeck.append(((deck[i]*7)+11)%length)
return tempdeck
elif message[0] == 7:
for i in range(length):
tempdeck.append(((deck[i]*7)+24)%length)
return tempdeck
elif message[0] == 8:
for i in range(length):
tempdeck.append(((deck[i]*7)+18)%length)
return tempdeck
elif message[0] == 9:
for i in range(length):
tempdeck.append(((deck[i]*11)+2)%length)
return tempdeck
elif message[0] == 10:
for i in range(length):
tempdeck.append(((deck[i]*11)+6)%length)
return tempdeck
elif message[0] == 11:
for i in range(length):
tempdeck.append(((deck[i]*11)+13)%length)
return tempdeck
elif message[0] == 12:
for i in range(length):
tempdeck.append(((deck[i]*11)+18)%length)
return tempdeck
elif message[0] == 13:
for i in range(length):
tempdeck.append(((deck[i]*11)+23)%length)
return tempdeck
elif message[0] == 14:
for i in range(length):
tempdeck.append(((deck[i]*7)+1)%length)
return tempdeck
elif message[0] == 15:
for i in range(length):
tempdeck.append(((deck[i]*7)+14)%length)
return tempdeck
elif message[0] == 16:
for i in range(length):
tempdeck.append(((deck[i]*23)+9)%length)
return tempdeck
elif message[0] == 17:
for i in range(length):
tempdeck.append(((deck[i]*23)+16)%length)
return tempdeck
elif message[0] == 18:
for i in range(length):
tempdeck.append(((deck[i]*23)+19)%length)
return tempdeck
elif message[0] == 19:
for i in range(length):
tempdeck.append(((deck[i]*23)+25)%length)
return tempdeck
elif message[0] == 20:
for i in range(length):
tempdeck.append(((deck[i]*17)+3)%length)
return tempdeck
elif message[0] == 21:
for i in range(length):
tempdeck.append(((deck[i]*17)+7)%length)
return tempdeck
elif message[0] == 22:
for i in range(length):
tempdeck.append(((deck[i]*17)+11)%length)
return tempdeck
elif message[0] == 23:
for i in range(length):
tempdeck.append(((deck[i]*17)+20)%length)
return tempdeck
elif message[0] == 24:
for i in range(length):
tempdeck.append(((deck[i]*17)+15)%length)
return tempdeck
else:
for i in range(length):
tempdeck.append(((deck[i]*23)+1)%length)
return tempdeck

#So the main hash function is:
message = [3,9,15]
deck = [0...25]
for i in range(len(message)):
deck = shuffle(deck, message[0])
deck = superconvert(deck)
del message[0]

Some results....
Digest of message 3,9,15:
[16, 5, 0, 21, 11, 19, 6, 1, 22, 17, 12, 7, 2, 23, 18, 13, 8, 3, 24,
14, 15, 9, 4, 25, 20, 10]
Digest of message 3,8,15:
[12, 22, 7, 18, 3, 14, 25, 10, 21, 6, 17, 2, 13, 24, 9, 20, 5, 16, 1,
23, 15, 8, 0, 19, 4, 11]
Digest of message 3,9,17:
[16, 17, 7, 24, 15, 6, 23, 14, 4, 5, 22, 13, 21, 25, 12, 3, 20, 11, 2,
19, 10, 1, 18, 9, 0, 8]

So this seems to produce a good mixing. On the other hand, it appears
that the selection of affine functions (which I did very quickly and
randomly) may need to be done more carefully, as the interaction
between poorly chosen functions can produce some odd results. e.g.:

Digest of 3,9,16:
[11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 25, 24, 23, 22, 20, 13, 19, 15,
18, 17, 16, 14, 21, 12]

Given this oddity (which seems to indicate a deeper vulnerability),
and the difficulty that all this would entail if done by hand, I don't
think this is a viable addition to your hash-by-hand algorithm. Try
Maaartin's approach(es) instead.

From: Maaartin on 26 Apr 2010 21:58

On Apr 27, 2:38 am, "J.D." <degolyer...(a)yahoo.com> wrote:
> Given this oddity (which seems to indicate a deeper vulnerability),
> and the difficulty that all this would entail if done by hand, I don't
> think this is a viable addition to your hash-by-hand algorithm. Try
> Maaartin's approach(es) instead.

It looks like my ideas don't perform well. Especially the "idea for
the data extraction" seems to make no good - it loves to produce short
cycles. I think there's a remedy.

Below my program but I'm a very beginner in python, too:

#creates a deck of n cards in each of 4 shapes
def new_deck(n):
deck = [];
for col in ["c", "d", "h", "s"]:
for n in range(n):
deck = deck + [col + str(n)];
return deck;

#return the numerical value of a card, i.e., a number from 0..12
def card_to_number(card):
return int(card[1:]);

#return true iff the card is black, i.e., for clubs and spades
def card_is_black(card):
c = card[0]
return (c=='c') | (c=='s')

def card_to_letter(card):
c = card[0]
high = 13 if (c=='h') | (c=='s') else 0;
low = card_to_number(card)
return chr(ord('a') + high + low);

def two_color_shuffle(deck):
i = 0
col = card_is_black(deck[0])
while card_is_black(deck[i])==col: i += 1
while card_is_black(deck[i])!=col: i += 1
return deck[i:] + deck[:i]

def four_shapes_shuffle(deck):
i = 0
map = {"c" : False, "d" : False, "h" : False, "s" : False}
found = 0;
while True:
c = deck[i][0]
i += 1
#print i, c, map[c], found
if not map[c]:
map[c] = True
found += 1
if found==4: break
return deck[i:] + deck[:i]

#return a shuffled deck according to "idea for the data extraction";
the output is the last card
#sometimes it suffers from short periods!!!
def output_shuffle(deck):
first = deck[0];
result = deck[1:];
count = card_to_number(deck[0]) + 1 # assuming stack is long enough
#print "<" + str(count) + ">";
#print "=>" + deck[count-1];
return deck[count:] + deck[0:count];

#nearly the original "http://brianpmearns.com/bpm/shufflehash"
def shuffle(deck, letter):
for i in range(len(deck)-1):
if card_to_letter(deck[i]) == letter:
d = deck[i+1]
a = deck[0:i]
b = deck[i+2:]
return b + [deck[i]] + a + [d]
#Edge case, the card we want is the last one.
d = deck[0]
a = deck[1:-1]
return [deck[len(deck)-1]] + a + [d]

### DEMO

def print_deck(prefix, deck):
result = ""
for card in deck: result += " " + card
print prefix + result[1:]

deck = new_deck(13)
print_deck("init ", deck)

print "\n", "input_shuffle";
for letter in "aeiouy":
deck = shuffle(deck, letter)
print_deck(letter + " => ", deck)

print "\n", "output_shuffle"
for n in range(8):
deck = output_shuffle(deck)
print_deck(deck[len(deck)-1] + " <= ", deck)

print "\n", "two_color_shuffle"
for n in range(2):
deck = two_color_shuffle(deck)
print_deck("TC ", deck)

print "\n", "four_shapes_shuffle"
for n in range(2):
deck = four_shapes_shuffle(deck)
print_deck("FS ", deck)

From: J.D. on 26 Apr 2010 22:13

On Apr 26, 9:58 pm, Maaartin <grajc...(a)seznam.cz> wrote:
> On Apr 27, 2:38 am, "J.D." <degolyer...(a)yahoo.com> wrote:
>
> > Given this oddity (which seems to indicate a deeper vulnerability),
> > and the difficulty that all this would entail if done by hand, I don't
> > think this is a viable addition to your hash-by-hand algorithm. Try
> > Maaartin's approach(es) instead.
>
> It looks like my ideas don't perform well. Especially the "idea for
> the data extraction" seems to make no good - it loves to produce short
> cycles. I think there's a remedy.
>
> Below my program but I'm a very beginner in python, too:
>
> #creates a deck of n cards in each of 4 shapes
> def new_deck(n):
> deck = [];
> for col in ["c", "d", "h", "s"]:
> for n in range(n):
> deck = deck + [col + str(n)];
> return deck;
>
> #return the numerical value of a card, i.e., a number from 0..12
> def card_to_number(card):
> return int(card[1:]);
>
> #return true iff the card is black, i.e., for clubs and spades
> def card_is_black(card):
> c = card[0]
> return (c=='c') | (c=='s')
>
> def card_to_letter(card):
> c = card[0]
> high = 13 if (c=='h') | (c=='s') else 0;
> low = card_to_number(card)
> return chr(ord('a') + high + low);
>
> def two_color_shuffle(deck):
> i = 0
> col = card_is_black(deck[0])
> while card_is_black(deck[i])==col: i += 1
> while card_is_black(deck[i])!=col: i += 1
> return deck[i:] + deck[:i]
>
> def four_shapes_shuffle(deck):
> i = 0
> map = {"c" : False, "d" : False, "h" : False, "s" : False}
> found = 0;
> while True:
> c = deck[i][0]
> i += 1
> #print i, c, map[c], found
> if not map[c]:
> map[c] = True
> found += 1
> if found==4: break
> return deck[i:] + deck[:i]
>
> #return a shuffled deck according to "idea for the data extraction";
> the output is the last card
> #sometimes it suffers from short periods!!!
> def output_shuffle(deck):
> first = deck[0];
> result = deck[1:];
> count = card_to_number(deck[0]) + 1 # assuming stack is long enough
> #print "<" + str(count) + ">";
> #print "=>" + deck[count-1];
> return deck[count:] + deck[0:count];
>
> #nearly the original "http://brianpmearns.com/bpm/shufflehash"
> def shuffle(deck, letter):
> for i in range(len(deck)-1):
> if card_to_letter(deck[i]) == letter:
> d = deck[i+1]
> a = deck[0:i]
> b = deck[i+2:]
> return b + [deck[i]] + a + [d]
> #Edge case, the card we want is the last one.
> d = deck[0]
> a = deck[1:-1]
> return [deck[len(deck)-1]] + a + [d]
>
> ### DEMO
>
> def print_deck(prefix, deck):
> result = ""
> for card in deck: result += " " + card
> print prefix + result[1:]
>
> deck = new_deck(13)
> print_deck("init ", deck)
>
> print "\n", "input_shuffle";
> for letter in "aeiouy":
> deck = shuffle(deck, letter)
> print_deck(letter + " => ", deck)
>
> print "\n", "output_shuffle"
> for n in range(8):
> deck = output_shuffle(deck)
> print_deck(deck[len(deck)-1] + " <= ", deck)
>
> print "\n", "two_color_shuffle"
> for n in range(2):
> deck = two_color_shuffle(deck)
> print_deck("TC ", deck)
>
> print "\n", "four_shapes_shuffle"
> for n in range(2):
> deck = four_shapes_shuffle(deck)
> print_deck("FS ", deck)

Jesus, Maaartin. Isn't it like 4 AM where you are? Go to sleep!

From: bmearns on 26 Apr 2010 22:39

Maaartin and J.D., thanks so much for all the great work on this. I
really appreciate all the effort and feedback. It's getting a bit late
here, as well, so I'll take another look through your suggestions
tomorrow.

More responses below. =)

On Apr 26, 5:56 pm, "J.D." <degolyer...(a)yahoo.com> wrote:
> On Apr 26, 4:39 pm, Maaartin <grajc...(a)seznam.cz> wrote:
>
> > On Apr 26, 7:41 pm, bmearns <mearn...(a)gmail.com> wrote:
> > What I liked on your former scheme (and miss now) was the use of card
> > color - this is quite easy and could lead to good mixing. Would you
> > like something like this?
>
> Indeed. Perhaps you (brian) could improve the mix rate by doing
> something like the following:
>
> Split the deck into a red and a black decks by suits. Each deck would
> have 26 cards, corresponding to each letter in the English alphabet.
> 1) Using the red deck first, use your algorithm (in the link a few
> posts above) adjusted for a deck of length 26 to feed in the first
> letter of the message.
> 2) Take the resulting reordered red deck, and spread them out in a
> line.
> 3) Then place a black card under each red card according to some
> formula: For example;
> i) interpret each red card as a number, (e.g. ace of hearts = 1),
> ii) then apply an affine function (e.g. f = (7x)+5 mod 26) to the
> number of the given red card,
> iii) and then (interpreting each black card as a number in the same
> range) place the black card corresponding to the output of the
> function under the red card (e.g. f(1) = 12, so under the ace of
> hearts place the queen of clubs, assuming the queen of clubs = 12).
> 4) Then, scoop up the black deck, keeping the cards in the order they
> are placed below the red cards, and use the black deck to absorb the
> second letter of the message (using your algorithm).
> 5) Repeat steps 2-4, except using the black deck along the top and
> placing the red cards below according to the formula.
> ...and continue repeating until all letters of the message are
> absorbed.

I had considered something vaguely (/vaguely/) similar, especially
with my first algorithm (with the two rows). Essentially, I was
thinking of using the top row as the key for a transposition (sorry,
I'm showing my affinity for classic ciphers, but I guess that's sort
of appropriate in this case). In other words, I would reorder the
bottom row by virtually sorting the top row.
E.g.,
T: 3 5 1 2 4 0
B: 5 3 2 0 1 4
So the bottom would be reordered: 4 2 0 5 1 3

However, after the rather disastrous results of actually trying to
perform my first algorithm, I'm very weary of over complicating it. I
have no doubt that your suggestions would greatly improve the security
of the algorithm, but it doesn't seem very feasible to perform by
hand. Multiplying and taking mod 26 is going to slow things way down,
and is going to be pretty error prone I think.

But I think I did come up with a reasonable way to significantly
improve the security. The algorithm is exactly the same, but after
feeding in the entire input, I re-hash the resulting deck. For
instance, I cut a deck in two (Hearts and Clubs, Spades and Diamonds),
feed my input into one of them. Then, I feed all 26 cards from that
deck into the second deck, and I can take my output from that. I /
think/ that will significantly obscure any sequences, even for fairly
short inputs. And it's pretty reasonable to do since the algorithm is
so quick. So this is really just applying the concept of iterations.

I'm thinking of some tests that I'll hopefully implement and run
tomorrow. I'm going to step through each character in the resulting
deck and see how many of them make a two-card sequence with the next
one. I /think/ that a random permutation will average 1 two-card
sequence, but I'll include this in my testing to see if it's true.
With this test, I can see how many inputs are required for the average
number of two-card sequences per permutation to be the same as for
random permutations. I'll try it with just a straight application, to
see if a reasonable number of inputs can still reach the random-like
threshold, then I'll try it with my above idea (a second iteration) to
see if it brings the minimum number of inputs down at all.

[snip]
>
> Addendum -- if this isn't too complicated already, you could have 26
> _different_ affine functions for step 3, one per letter, and use the
> function A when the message letter just absorbed by the hash was 'a'
> and function B when the letter was 'b', and so on.

I'm afraid it might be too complicated already =).

Thanks, looking forward to more feedback!
-Brian

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Prev: Dynamic Hill cipher
Next: PE Scrambler