Prev: Dynamic Hill cipher
Next: PE Scrambler
From: J.D. on 27 Apr 2010 00:09 On Apr 26, 10:39 pm, bmearns <mearn...(a)gmail.com> wrote: > However, after the rather disastrous results of actually trying to > perform my first algorithm, I'm very weary of over complicating it. I > have no doubt that your suggestions would greatly improve the security > of the algorithm, but it doesn't seem very feasible to perform by > hand. Multiplying and taking mod 26 is going to slow things way down, > and is going to be pretty error prone I think. Yeah, that's the conclusion I came to as well. This is partially rectifiable by using written tables (so you aren't doing math in your head), but that defeats the simplicity of the "deck of cards" approach, and takes time to draw up the tables. > > But I think I did come up with a reasonable way to significantly > improve the security. The algorithm is exactly the same, but after > feeding in the entire input, I re-hash the resulting deck. For > instance, I cut a deck in two (Hearts and Clubs, Spades and Diamonds), > feed my input into one of them. Then, I feed all 26 cards from that > deck into the second deck, and I can take my output from that. I / > think/ that will significantly obscure any sequences, even for fairly > short inputs. And it's pretty reasonable to do since the algorithm is > so quick. So this is really just applying the concept of iterations. That's also sounds like a good idea. I would go with the full deck though. One thing I didn't mention with my above approach (because it already had enough going against it) was that it effectively cuts the internal state down to 26 cards, which is roughly equivalent to 88 bits. Brute force collision attacks generally require about 2^n/2 evaluations of a hash function (where n = the size in bits of the chaining variable, which I think is analogous to the internal state of the shuffle hash). So brute forcing the shuffle hash function (with a 26 card deck) would take only 2^44 evaluations -- which doesn't take very long using a computer. If you are going to go the above route, I would recommend going full out and using all 52 cards. This increases the internal state to approximately 225 bits (and makes brute force collision attacks require approximately 2^112 evaluations). That is to say: i) feed the message into the (52 card) deck via your algorithm, ii) let the new message = the 52 card digest from step (i), or some relatively large subset of that digest (if you can't stomach hashing a 52-letter-long message), iii) reset the deck and feed the new message into the deck via your algorithm, iv) output the final digest. One complication that you might consider: I really like Maaartin's idea of using the colors of the cards to increase the mixing rate, but I kind of want to do something else with that notion. OK, let's say that each letter of the alphabet is represented by two cards -- one red and one black. Let's call the two cards that stand for a letter the "instances" of the letter. e.g. 'A' is represented by the Ace of Hearts and the Ace of Clubs -- and both cards are "instances" of the letter A. The first instance you encounter while moving through the deck from the first card to the last is called the "first instance", and the second instance encountered is the "second instance". To feed 'A' into the hash: i) find the first instance of A, ii) find the second instance of A, iii) take every card between the first and second instances of A (which may be zero) and set them aside for the moment, iv) take the first instance of A and move it to the front of the deck (if it's not already there), v) take all cards from the second instance of A to the end and move them all to the front of the deck (i.e. swap front and back 'halves'), vi) take the set aside cards that originally were between the two instances and move them to the front of the deck (there may have been none, in which case do nothing). To illustrate: Steps i and ii) <---10 cards--><Ace of Clubs><--5 cards--><Ace of Hearts><--Rest of deck--> Step iii) <---10 cards--><Ace of Clubs><Ace of Hearts><--Rest of deck-- > .....<--5 cards--> Step iv) <Ace of Clubs><---10 cards--><Ace of Hearts><--Rest of deck-- > .....<--5 cards--> Step v) <Ace of Hearts><--Rest of deck--><Ace of Clubs><---10 cards-- > .....<--5 cards--> Step vi) <--5 cards--><Ace of Hearts><--Rest of deck--><Ace of Clubs><---10 cards-->
From: Maaartin on 27 Apr 2010 09:42 On Apr 27, 6:09 am, "J.D." <degolyer...(a)yahoo.com> wrote: > On Apr 26, 10:39 pm, bmearns <mearn...(a)gmail.com> wrote: > That's also sounds like a good idea. I would go with the full deck > though. One thing I didn't mention with my above approach (because it > already had enough going against it) was that it effectively cuts the > internal state down to 26 cards, which is roughly equivalent to 88 > bits. Brute force collision attacks generally require about 2^n/2 > evaluations of a hash function (where n = the size in bits of the > chaining variable, which I think is analogous to the internal state of > the shuffle hash). So brute forcing the shuffle hash function (with a > 26 card deck) would take only 2^44 evaluations -- which doesn't take > very long using a computer. Right, but collisions are probably not very important, especially when it gets used for generating passwords. Nonetheless, having a large internal state seems to be a good idea, and it cost about nothing, since the work involved is nearly independent of the stack size. The hardest and most error-prone part is IMHO converting the letters into cards and vice versa, at least in the simple version I'm heading towards. > If you are going to go the above route, I would recommend going full > out and using all 52 cards. This increases the internal state to > approximately 225 bits (and makes brute force collision attacks > require approximately 2^112 evaluations). That is to say: > i) feed the message into the (52 card) deck via your algorithm, > ii) let the new message = the 52 card digest from step (i), or some > relatively large subset of that digest (if you can't stomach hashing a > 52-letter-long message), > iii) reset the deck and feed the new message into the deck via your > algorithm, > iv) output the final digest. This means either writing the cards down or use two stacks, right? > One complication that you might consider: I really like Maaartin's > idea of using the colors of the cards to increase the mixing rate, but > I kind of want to do something else with that notion. The two operation I proposed (two_color_shuffle and four_shapes_shuffle) are not good. I still hope we could do something similar, but I like your proposal (below), too. > OK, let's say > that each letter of the alphabet is represented by two cards -- one > red and one black. Let's call the two cards that stand for a letter > the "instances" of the letter. e.g. 'A' is represented by the Ace of > Hearts and the Ace of Clubs -- and both cards are "instances" of the > letter A. The first instance you encounter while moving through the > deck from the first card to the last is called the "first instance", > and the second instance encountered is the "second instance". > > To feed 'A' into the hash: > > i) find the first instance of A, > ii) find the second instance of A, > iii) take every card between the first and second instances of A > (which may be zero) and set them aside for the moment, > iv) take the first instance of A and move it to the front of the deck > (if it's not already there), > v) take all cards from the second instance of A to the end and move > them all to the front of the deck (i.e. swap front and back 'halves'), > vi) take the set aside cards that originally were between the two > instances and move them to the front of the deck (there may have been > none, in which case do nothing). The implementation is shorter than the description, see double_shuffle below. I changed the representation to letters, I use both lowercase and uppercase, but they're treated exactly the same. I also added a new optional operation: deal(deck, n), which is very simple: Take n cards for the top, put them on the table, repeat until there are no more cards left (in the last step there may be less than n cards). This is like dealing the cards to one player, it reverses the deck and for n>1 shuffles it a bit. The function double_shuffle(deck, letter, shouldDeal) looks for the first and the last (i.e., second) occurrence of letter, and does what you described. With shouldDeal=True it also uses the function deal for the head and tail parts with n=2 and n=2, respectively. The program is very simple: ######### def new_deck(n=26, occurrences=2): deck = ""; for i in range(occurrences): for j in range(n): deck = deck + chr(ord("aA"[i&1]) + j) return deck; def deal(deck, n): result = "" while len(deck)>n: result = deck[:n] + result deck = deck[n:] result = deck + result return result; def double_shuffle(deck, letter, shouldDeal): for i in range(len(deck)): if deck[i].lower() == letter: break for j in range(len(deck)-1, 0, -1): if deck[j].lower() == letter: break head = deck[0:i] middle = deck[i+1:j] tail = deck[j+1:] if shouldDeal: head = deal(head, 2) tail = deal(tail, 3) return middle + deck[j] + tail + deck[i] + head; def demo(shouldDeal): deck = new_deck(); print "with" if shouldDeal else "without", "deal" print " ", deck for x in "qwerty": deck = double_shuffle(deck, x, shouldDeal); print x, " => ", deck demo(False) demo(True) ######### Here are the results, it looks like deal is doing a good job: ######### without deal abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ q => rstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZqabcdefghijklmnop w => xyzABCDEFGHIJKLMNOPQRSTUVWXYZqabcdefghijklmnopwrstuv e => FGHIJKLMNOPQRSTUVWXYZqabcdefghijklmnopwrstuvExyzABCD r => STUVWXYZqabcdefghijklmnopwrstuvExyzABCDRFGHIJKLMNOPQ t => UVWXYZqabcdefghijklmnopwrstuvExyzABCDRFGHIJKLMNOPQTS y => ZqabcdefghijklmnopwrstuvExyzABCDRFGHIJKLMNOPQTSYUVWX with deal abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ q => rstuvwxyzABCDEFGHIJKLMNOPQXYZUVWRSTqopmnklijghefcdab w => xyzABCDEFGHIJKLMNOPQXYZUVWabfcdghelijmnkqopRSTwvturs e => FGHIJKLMNOPQXYZUVWabfcdgheurswvtRSTqopmnklijEDBCzAxy r => swvtRyzAxDBCijEnklopmSTqruhedgfcabVWZUXYPQNOLMJKHIFG t => RyzAxDBCijEnklopmSTFGKHILMJQNOXYPWZUabVgfchedqrutvsw y => zAxDBCijEnklopmSTFGKHILMJQNOXYswutvdqrcheVgfUabPWZyR #########
From: bmearns on 27 Apr 2010 10:23 On Apr 27, 12:09 am, "J.D." <degolyer...(a)yahoo.com> wrote: > On Apr 26, 10:39 pm, bmearns <mearn...(a)gmail.com> wrote: > > > However, after the rather disastrous results of actually trying to > > perform my first algorithm, I'm very weary of over complicating it. I > > have no doubt that your suggestions would greatly improve the security > > of the algorithm, but it doesn't seem very feasible to perform by > > hand. Multiplying and taking mod 26 is going to slow things way down, > > and is going to be pretty error prone I think. > > Yeah, that's the conclusion I came to as well. This is partially > rectifiable by using written tables (so you aren't doing math in your > head), but that defeats the simplicity of the "deck of cards" > approach, and takes time to draw up the tables. Agreed. There are already lots of good hash algorithms available; if we can't come up with one that is extremely simple to perform by hand, there's really no point. > > > > > But I think I did come up with a reasonable way to significantly > > improve the security. The algorithm is exactly the same, but after > > feeding in the entire input, I re-hash the resulting deck. For > > instance, I cut a deck in two (Hearts and Clubs, Spades and Diamonds), > > feed my input into one of them. Then, I feed all 26 cards from that > > deck into the second deck, and I can take my output from that. I / > > think/ that will significantly obscure any sequences, even for fairly > > short inputs. And it's pretty reasonable to do since the algorithm is > > so quick. So this is really just applying the concept of iterations. > > That's also sounds like a good idea. I would go with the full deck > though. One thing I didn't mention with my above approach (because it > already had enough going against it) was that it effectively cuts the > internal state down to 26 cards, which is roughly equivalent to 88 > bits. Brute force collision attacks generally require about 2^n/2 > evaluations of a hash function (where n = the size in bits of the > chaining variable, which I think is analogous to the internal state of > the shuffle hash). So brute forcing the shuffle hash function (with a > 26 card deck) would take only 2^44 evaluations -- which doesn't take > very long using a computer. > > If you are going to go the above route, I would recommend going full > out and using all 52 cards. This increases the internal state to > approximately 225 bits (and makes brute force collision attacks > require approximately 2^112 evaluations). That is to say: > i) feed the message into the (52 card) deck via your algorithm, > ii) let the new message = the 52 card digest from step (i), or some > relatively large subset of that digest (if you can't stomach hashing a > 52-letter-long message), > iii) reset the deck and feed the new message into the deck via your > algorithm, > iv) output the final digest. Well, the bad new is I just tried out the "iteration" idea, and it didn't help a whole lot. With a 26 card deck, I fed in just a few values and ended up with: 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 0 3 1 2 4 5 Then I fed that whole thing into a fresh deck and got: 9 8 10 12 11 13 15 14 16 18 17 19 21 20 22 24 23 6 0 25 3 2 1 5 4 7 Those sequences are harder to get rid of than I expected. A 52 card deck definitely brings the bit count way up, but it also means more inputs are needed to destroy the sequences. Also remember that for this application, collision attacks really aren't an issue. There are essentially two security concerns: 1) It should be difficult for an attacker to guess the output of the hash without knowing the input. 2) Given the output, it should be difficult for an attacker to figure out the input. The first point basically just requires that the output not be biased, I think. The second point is what I've been calling irreversibility. I suppose this is sort of like a first preimage attack, except that it is not sufficient for an attacker to just find any old message that hashes to the correct value, they need to find the specific value that was used in order for it to be useful. That being the case, I think it's more important to eliminate those tell-tale sequences than it is to increase the maximum number of states. A 26-card deck has about 88 bits, as you said. For a password, that's pretty damn good, so as long as the outputs are close to uniform, I think that satisfies the first point. An attacker would need to try, on average, 2^87 different passwords before getting the right one, which I think is probably secure for most people's purposes? I still need to do the sequence testing I mentioned, but based on a few tests, I think 14 or 15 inputs will generally suffice in a 26- card deck to destroy the sequences, and as few as 10 inputs seems to be working well if I perform the second iteration. I think that's a fairly reasonable number of letters to remember as a password. It may also be effective to just feed the password in twice...but that might also somehow reveal information (like when the Germans encoded the ring settings twice in Enigma messages). > > One complication that you might consider: I really like Maaartin's > idea of using the colors of the cards to increase the mixing rate, but > I kind of want to do something else with that notion. OK, let's say > that each letter of the alphabet is represented by two cards -- one > red and one black. Let's call the two cards that stand for a letter > the "instances" of the letter. e.g. 'A' is represented by the Ace of > Hearts and the Ace of Clubs -- and both cards are "instances" of the > letter A. The first instance you encounter while moving through the > deck from the first card to the last is called the "first instance", > and the second instance encountered is the "second instance". > > To feed 'A' into the hash: > > i) find the first instance of A, > ii) find the second instance of A, > iii) take every card between the first and second instances of A > (which may be zero) and set them aside for the moment, > iv) take the first instance of A and move it to the front of the deck > (if it's not already there), > v) take all cards from the second instance of A to the end and move > them all to the front of the deck (i.e. swap front and back 'halves'), > vi) take the set aside cards that originally were between the two > instances and move them to the front of the deck (there may have been > none, in which case do nothing). > > To illustrate: > Steps i and ii) > <---10 cards--><Ace of Clubs><--5 cards--><Ace of Hearts><--Rest of > deck--> > > Step iii) > <---10 cards--><Ace of Clubs><Ace of Hearts><--Rest of deck-- > > > .....<--5 cards--> > > Step iv) > <Ace of Clubs><---10 cards--><Ace of Hearts><--Rest of deck-- > > > .....<--5 cards--> > > Step v) > <Ace of Hearts><--Rest of deck--><Ace of Clubs><---10 cards-- > > > .....<--5 cards--> > > Step vi) > <--5 cards--><Ace of Hearts><--Rest of deck--><Ace of Clubs><---10 > cards--> I like the idea but take a look at what you did: [ x ] A [ y ] B [ z ] turned into [ y ] B [ z ] A [ x ] The [ y ] B [ z ] end up together again, in order. All you actually did was swap the cards before and after the first instance: [ e ] A [ f ] --> [ f ] A [ e ] But, I do like this idea, and I think we can work with it.I really want to get those sequence the hell out of there. Using your same 52- card deck, and idea of first and second instances of each letter, what if we take all the cards starting with the first and going up to, but not including, the second instance, then alternately move them to the top and bottom of the top part. This is essentially the same as what I originally had, except there I only did it for two cards. Now we're doing it for however many cards are between the two instances. As a final step, we can cut after the second instance. For example (A and B are the instances): Start: [ x ] A e f g h i j k B [ y ] Then: j h f A [ x ] e g i k B [ y ] Finally: [ y ] j h f A [ x ] e g i k B I think it should be easy to do without putting down any cards, writing anything down, referring to any tables, or anything like that, and it should be pretty quick, too, though I'll need to test it out when it's more appropriate for me to be playing with cards. I like this because it forcibly removes some sequences (not irreversibly, of course, but hopefully after a few feeds they'll scatter more). It also changes the distance between the two instances by an amount based on the position of the first card (i.e., the length of [x]). Even feeding the same value twice in a row just scrambles it more. And if the instances are adjacent, it looks like this: Start: [ x ] A B [ y ] Then: A [ x ] B [ y ] Finally: [ y ] A [ x ] B I'll implement this and run some simple tests to see how it looks, then I'll start some more extensive testing. Cheers, -Brian
From: J.D. on 27 Apr 2010 12:03 On Apr 27, 9:42 am, Maaartin <grajc...(a)seznam.cz> wrote: > > Right, but collisions are probably not very important, especially when > it gets used for generating passwords. That's true. > This means either writing the cards down or use two stacks, right? Good point. That is a complication it would be best to avoid, if at all possible. > > I changed the representation to letters, I use both lowercase and > uppercase, but they're treated exactly the same. I also added a new > optional operation: deal(deck, n), which is very simple: Take n cards > for the top, put them on the table, repeat until there are no more > cards left (in the last step there may be less than n cards). This is > like dealing the cards to one player, it reverses the deck and for n>1 > shuffles it a bit. Let me see if I understand you: You are taking some set of cards and then partially re-ordering them by, let's say, pairs, such that the first pair becomes the last, the second pair becomes the second from last, and so on. e.g. (0,1)(2,3)(4,5) --> (4,5)(2,3)(0,1) Is that correct? > > The function double_shuffle(deck, letter, shouldDeal) looks for the > first and the last (i.e., second) occurrence of letter, and does what > you described. With shouldDeal=True it also uses the function deal for > the head and tail parts with n=2 and n=2, respectively. > > The program is very simple: Thank you for implementing it. I can say with certainty that if you are new to python you are still much further along than I. Unfortunately, with some testing using your code, it appears my idea does not really increase the mix rate as much as I had hoped (even with your deal function to boost it) -- there is still far too much retention of the original card order after 10, 15 and even 20 message letters are absorbed. Maybe bmearn's improvement above will do better. Alternatively, maybe if the 'deal' amount is defined by the latest letter to be absorbed (assuming I understand your 'deal' function at all). e.g. to absorb 'c', double shuffle with instances of 'c', and then deal-sort the deck by triplets (as 'c' = 3); to absorb 'd', double shuffle with instances of 'd', then deal-sort the deck by quadruplets...and so on. This would be less effective for 'a' and 'z' if 'a' = 1 and 'z' = 26, but perhaps we could assign every letter a number in the range 2 to 14 (with 'a' and 'n' both = Aces = 14).
From: J.D. on 27 Apr 2010 12:30
On Apr 27, 10:23 am, bmearns <mearn...(a)gmail.com> wrote: > Well, the bad new is I just tried out the "iteration" idea, and it > didn't help a whole lot. With a 26 card deck, I fed in just a few > values and ended up with: > 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 0 3 1 > 2 4 5 > Then I fed that whole thing into a fresh deck and got: > 9 8 10 12 11 13 15 14 16 18 17 19 21 20 22 24 23 6 0 25 3 2 1 > 5 4 7 > > Those sequences are harder to get rid of than I expected. A 52 card > deck definitely brings the bit count way up, but it also means more > inputs are needed to destroy the sequences. Yeah, the persistence of those sequences is kind of annoying. A note of caution: the goal is avalanche, not merely to get rid of the sequences. The persistence of the sequences is an indication of the lack of avalanche -- but the absence of the sequences does not prove that avalanche has occurred. So we might come up with a trick that appears to thoroughly scramble the sequences, but if different inputs don't cause large changes in the output (e.g. significantly change the relative ordering of the cards) then the hash will probably still be vulnerable. > > Also remember that for this application, collision attacks really > aren't an issue. There are essentially two security concerns: > 1) It should be difficult for an attacker to guess the output of the > hash without knowing the input. > 2) Given the output, it should be difficult for an attacker to figure > out the input. > > The first point basically just requires that the output not be biased, > I think. The second point is what I've been calling irreversibility. I > suppose this is sort of like a first preimage attack, except that it > is not sufficient for an attacker to just find any old message that > hashes to the correct value, they need to find the specific value that > was used in order for it to be useful. > > That being the case, I think it's more important to eliminate those > tell-tale sequences than it is to increase the maximum number of > states. A 26-card deck has about 88 bits, as you said. For a password, > that's pretty damn good, so as long as the outputs are close to > uniform, I think that satisfies the first point. An attacker would > need to try, on average, 2^87 different passwords before getting the > right one, which I think is probably secure for most people's > purposes? Well, if you are only using this as a key derivation function then the attacker would focus on the thing with less entropy. So if the password has less entropy than the internal state of the hash function (which is highly likely given that people are lazy) then the attacker will focus on the password and just try to guess that. > > I still need to do the sequence testing I mentioned, but based on a > few tests, I think 14 or 15 inputs will generally suffice in a 26- > card deck to destroy the sequences, and as few as 10 inputs seems to > be working well if I perform the second iteration. I think that's a > fairly reasonable number of letters to remember as a password. It may > also be effective to just feed the password in twice...but that might > also somehow reveal information (like when the Germans encoded the > ring settings twice in Enigma messages). What's the avalanche rate after 14 or 15 inputs? > I like the idea but take a look at what you did: > [ x ] A [ y ] B [ z ] > turned into > [ y ] B [ z ] A [ x ] > > The [ y ] B [ z ] end up together again, in order. All you actually > did was swap the cards before and after the first instance: > [ e ] A [ f ] --> [ f ] A [ e ] Yeah, testing with Maaartin's implementation has convinced me that this is not enough... > > But, I do like this idea, and I think we can work with it.I really > want to get those sequence the hell out of there. Using your same 52- > card deck, and idea of first and second instances of each letter, what > if we take all the cards starting with the first and going up to, but > not including, the second instance, then alternately move them to the > top and bottom of the top part. This is essentially the same as what I > originally had, except there I only did it for two cards. Now we're > doing it for however many cards are between the two instances. As a > final step, we can cut after the second instance. > For example (A and B are the instances): > Start: [ x ] A e f g h i j k B [ y ] > Then: j h f A [ x ] e g i k B [ y ] > Finally: [ y ] j h f A [ x ] e g i k B Hmm, you mean take every other card of the interval (the cards between the two instances), starting with the first instance, and move them to the front, and then move the cards after the second instance to the front? Yeah, that could work. Though I am concerned that the 'every other card" part might be a bit fiddly and error prone (I would have to do testing with an actual deck of cards to be sure). I am also not sure about how well it would increase avalanche. Testing is required. |