From: Maaartin on 29 Apr 2010 13:34

On Apr 29, 6:34 pm, bmearns <mearn...(a)gmail.com> wrote:
> Alright, I've got a new routine that I'm calling "raking" the deck
> (for no particular reason). I think that it can be done pretty quickly

It seems to me that you get rid of 2 cards in each step, so you need
to do quite a lot of work. Or am I wrong?

Currently, I don't have the time for our "card play", but I'll read it
later carefully; it looks interesting. I've also got some new ideas to
be tried out.

> To bury a card, X, take N to be the rank of X (a number in [1,13]).

Interpreting cards as numbers in 1..13 is different from what we (at
least I and J.D.) did before. We interpreted Aces as 14, which is not
a good idea for the following reasons:
- Ace looks like 1, so you can easily be misled.
- Counting usually starts with 1 instead of 2 (ok, 0 is better, but
not here).
- We lay out the initial deck like A2345..., so A=1.

> So like I said if we add in the output function, this might work out
> really well. I'll start running some statistical tests, and if those
> are promising, I'll start implementing some more thorough tests.

These tests are important; I'm going to evaluate some new ideas of
mine before I start to describe and specify them. Could you post the
test code here?

I've got a proposal for a general test function. The interface could
look like

evaluate(input_function, finalizing_function, output_function, effort,
seed)

where

input_function takes two arguments (deck, letter) and returns the new
deck after absorbing letter

finalizing_function takes a single argument (deck) and returns the new
deck after postprocessing (like my 3 buries)

output_function takes a single argument (deck) and returns [deck,
output_letter]

effort is a number determining the running time

seed is the seed for a deterministic prng used for the tests

The result of evaluate could be a list of numbers, which should be
equal to 0 for a perfect algorithm; the further from 0 they are, the
worse. What do you think about it?
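To make the proposal concrete, here is a minimal, untested sketch of
how such an evaluate() could be wired up. The statistic computed here
(per-letter deviation of output counts from uniform) and the message
length are just placeholders; the real battery of tests is still to be
decided:

###############################################
import random

def evaluate(input_function, finalizing_function, output_function,
             effort, seed):
    rng = random.Random(seed)   # deterministic prng, per the proposal
    counts = [0] * 26
    trials = 100 * effort       # effort scales the running time
    for _ in range(trials):
        deck = list(range(52))
        # absorb a short random message, letter by letter
        for _ in range(10):
            deck = input_function(deck, rng.randrange(26))
        deck = finalizing_function(deck)
        deck, letter = output_function(deck)
        counts[letter] += 1
    expected = trials / 26.0
    # one number per letter; all 0 for a perfectly uniform output
    return [(c - expected) / expected for c in counts]
###############################################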
From: bmearns on 29 Apr 2010 13:56

On Apr 29, 1:25 pm, "J.D." <degolyer...(a)yahoo.com> wrote:
> On Apr 29, 12:34 pm, bmearns <mearn...(a)gmail.com> wrote:
>
> > So that's the rake operation, and the overall algorithm is simply
> > feeding in input cards using the old shuffle_1 routine,
>
> Which one was the old shuffle_1 routine? I've lost track of what was
> what...

shuffle_1 was the very simple one: split the deck into top and bottom
parts, above and below the input card. Move the input card to the
front of the top part, then move the next card after it to the back of
the top part. Then put the top part behind the bottom part. It's
almost just a simple cut, but you move the input card up to the top
before cutting.
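In Python, that description comes out roughly as follows. This is one
reading of it; in particular, "the next card after it" is taken here
to mean the card that followed the input card in the original deck:

###############################################
def shuffle_1(deck, x):
    i = deck.index(x)
    top = [x] + deck[:i]    # input card to the front of the top part
    bottom = deck[i+1:]
    if bottom:
        # move the next card to the back of the top part
        top.append(bottom.pop(0))
    return bottom + top     # top part goes behind the bottom part
###############################################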
So far I've just been running collision tests on this latest routine
of mine, without even using a final output function. I've tried 250
thousand unique random inputs of 10 cards each and gotten 250 thousand
unique outputs. That's definitely a good start, but we know collision
resistance isn't everything. I'm also going to need to spend several
hours otherwise engaged, but I'll plan to continue running collision
tests in the background, and then work on the more useful tests when
I'm free.

-Brian

From: bmearns on 29 Apr 2010 14:25

On Apr 29, 1:34 pm, Maaartin <grajc...(a)seznam.cz> wrote:
> On Apr 29, 6:34 pm, bmearns <mearn...(a)gmail.com> wrote:
>
> > Alright, I've got a new routine that I'm calling "raking" the deck
> > (for no particular reason). I think that it can be done pretty quickly
>
> It seems to me that you get rid of 2 cards in each step, so you need
> to do quite a lot of work. Or am I wrong?

No, it only gets rid of 1 at a time, but I may not have described it
very well. Here's the python:

###############################################
def color(card, deckSize):
    half = deckSize >> 1
    if card < half:
        return 0
    else:
        return 1

def bury(deck, x, suitSize):
    """
    Buries card x into the given deck. Interprets x as a number under
    modulo of the specified suit size, then adds one (so it's just the
    rank of the card), counts that many spaces down the deck, removes
    the /next/ card, and replaces it with x. Returns a two-tuple
    (output, deck) where output is the removed card, and deck is the
    resulting deck with the output card replaced by x.
    """
    n = (x % suitSize) + 1
    deckSize = len(deck)
    if n >= deckSize:
        n = deckSize - 1
    head = deck[0:n]
    output = deck[n]
    tail = deck[n+1:]
    deck = head + [x] + tail
    return (output, deck)

def rake(deck, suits):
    """
    Arguments:
    deck - A permutation of numerical cards. The red cards are the
        lowest numbers, the black cards are the rest. Goes from 0 up
        to N-1.
    suits - The number of suits in the deck. For a real deck, this is
        the number of cards (the length of 'deck') divided by 13. But
        you can set it however you want. Dividing this into the length
        of the deck determines how big each suit is, and this in turn
        is used to determine the rank of each card (by taking the
        numerical value modulo the size-of-a-suit).
    """
    deckSize = len(deck)
    suitSize = deckSize // suits    # integer division
    result = []
    s = deck[0]
    scol = color(s, deckSize)
    deck = deck[1:]
    while len(deck) > 1:
        t = deck[0]
        tcol = color(t, deckSize)
        # Same color: bury t.
        if tcol == scol:
            # Count down by #t, take the next card as output,
            # and replace it with t.
            output, deck = bury(deck, t, suitSize)
            # An artifact of doing it in code: now that we've got t
            # buried, we still have it at the front of the deck too,
            # so remove it.
            deck = deck[1:]
            # Append output.
            result += [output]
        # Different color: bury s, then pick up t as the new s.
        else:
            # Count down by #s, take the next card after that as
            # output, and replace it with s.
            output, deck = bury(deck, s, suitSize)
            # But the deck hasn't shrunk at all: we removed a card
            # (output), but we replaced it with the selected card. So
            # now pick up the first card as the new selected card.
            s = deck[0]
            scol = color(s, deckSize)
            deck = deck[1:]
            # Append output.
            result += [output]
    # When there's only one card left in the deck (and one in our
    # hand), no matter what they are, we're going to end up outputting
    # the one in the deck, so just do that, and then tack on the
    # selected card.
    return result + deck + [s]
###############################################
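As a quick sanity check (assuming a standard 52-card deck with 4
suits), rake should always output a permutation of its input:

###############################################
deck = list(range(52))
raked = rake(deck, 4)
assert len(raked) == 52
assert sorted(raked) == deck    # same cards, new order
###############################################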
> Currently, I don't have the time for our "card play", but I'll read
> it later carefully; it looks interesting. I've also got some new
> ideas to be tried out.

Yeh, I'll have to take a few hours off as well. Hopefully I'll get
back to it sometime tonight.

> > To bury a card, X, take N to be the rank of X (a number in [1,13]).
>
> Interpreting cards as numbers in 1..13 is different from what we (at
> least I and J.D.) did before. We interpreted Aces as 14, which is not
> a good idea for the following reasons:
[snip]

Yeh, I've been taking ace as 1, but I think I noticed you were using
14. To a certain extent, using 14 makes sense because it gives a
bigger range. For me, it's more natural to associate ace with 1, as
you pointed out, and it's more natural to start counting at 1 than at
2. Likewise, converting to letters is more direct if ace is 1. All in
all, I don't think it makes a big difference; it's mostly just a
matter of convention. As long as you make a choice and stick with it
every time you run the algorithm, it shouldn't really matter much.
I'll stick with ace=1 for the reasons you cited.

>
> > So like I said if we add in the output function, this might work out
> > really well. I'll start running some statistical tests, and if those
> > are promising, I'll start implementing some more thorough tests.
>
> These tests are important; I'm going to evaluate some new ideas of
> mine before I start to describe and specify them.

In addition to the "direction" and "2-card sequence" tests I described
previously, I've come up with the following tests that I haven't
implemented yet (a sketch of the first one follows this list):

* See where each card ends up in the final output. Should be uniformly
distributed.
* For each card as first input, see where the card ends up with
different numbers of inputs. At a certain minimum number of inputs, it
should be uniform.
* For each card as last input, see where the card ends up with
different numbers of inputs. At a certain minimum number of inputs, it
should be uniform.
* See how far every card moves between two digests when different
numbers of additional cards are fed in.

If anyone can think of other tests to do, or can comment on these,
that'd be great.
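For the first test, something like the following should do. This is
only a sketch: hash_deck is a hypothetical stand-in for whatever
absorb/finalize routine is under test, taking a message and returning
the final 52-card deck:

###############################################
import random

def position_distribution(hash_deck, trials=10000, seed=1):
    rng = random.Random(seed)
    # counts[card][position]: how often each card lands where
    counts = [[0] * 52 for _ in range(52)]
    for _ in range(trials):
        msg = [rng.randrange(26) for _ in range(10)]
        digest = hash_deck(msg)
        for pos, card in enumerate(digest):
            counts[card][pos] += 1
    expected = trials / 52.0
    # per-cell deviation from uniform; all 0 is ideal
    return [[(c - expected) / expected for c in row] for row in counts]
###############################################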
> Could you post the test code here?

I will, but not yet. It's kind of a shambles right now; in addition to
just general cleaning up, I want to implement a logical and standard
interface like what you've suggested below. I'm also working on using
a file-based database to prevent the system memory from filling up too
much when running large numbers of tests and collecting large numbers
of statistics. (Fortunately, python has this functionality built in.)

> I've got a proposal for a general test function. The interface could
> look like
>
> evaluate(input_function, finalizing_function, output_function,
> effort, seed)
>
[snip]
>
> The result of evaluate could be a list of numbers, which should be
> equal to 0 for a perfect algorithm; the further from 0 they are, the
> worse.
>
> What do you think about it?

That looks pretty good, but do you mean there would be a different one
of these functions to test every metric? I think we can set it up a
little more efficiently, so that we can test a number of different
metrics on a single deck. Then, for instance, my crappy statistical
testing code outputs data in CSV format, which can be read and plotted
in Excel to compare different things (though I'll probably move to
using GNUPlot instead).

I've got a Subversion server running at home. I'll probably start
putting my code in there, so you can always get the latest from there
if you want. If you're not familiar with Subversion, it has a simple
web interface so you can always just download it. If you are familiar
with Subversion, I can give you write access as well, so we can all
contribute to the same code. But that will have to wait till later
tonight. In the meantime, just let me know if you want write access,
and if you do, give me a username you want to use and an email address
I should send your login info to. Actually, if you use PGP, you can
encrypt a username and password to me with key-id 0x3aa70848. Just
don't use an important password; I don't have Subversion set up to be
very secure with your passwords.

-Brian
From: J.D. on 29 Apr 2010 23:08

I have some analysis for the two algorithms that I tested by hand
(mine and Maaartin's). Specifically, I have been thinking about
reversibility, and about how resistant each algorithm is to a preimage
attack. For the moment I am going to focus just on Maaartin's
algorithm (this post is long enough already). Hopefully this will give
us some useful tools for thinking about other potential algorithms.

Maaartin's Algorithm and Preimage Attacks:

The first thing to note is that both the "bury" and the fixed (1,2)
deal operations are easily undoable. Here is how to undo them:

"Disinter" (undoes the "bury" step)
i) Look at the last card, interpret it as a value, "N" (which during
my tests was in a range from 2 to 14), and set it aside,
ii) Take the bottom N cards from the end of the deck and move them in
a block to the front,
iii) Take the former last card and put it on the top of the deck.

"Break Deal" (undoes the fixed (1,2) deal step)
i) Take the first 18 cards off the top of the deck -- this is the
"short stack", the remaining 34 cards are the "tall stack", and there
is a "result stack" that is currently empty,
ii) Take one card off the top of the short stack and put it on the
result stack,
iii) Take two cards off the top of the tall stack and put them on the
result stack,
iv) Repeat steps ii and iii in that order until all the cards are in
the result stack -- the result is the deck as it existed before the
deal step.
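In Python, the two undo operations might look like this. This is a
sketch: decks are lists with index 0 as the top, the result stack is
built top-first (so the physical pile-reversal convention is ignored),
and value() is an assumed helper mapping a card to its face value in
2..14:

###############################################
def value(card):
    # assumed encoding: 13 ranks per suit, ace high
    return (card % 13) + 2

def disinter(deck):
    """Undo one "bury" step."""
    n = value(deck[-1])     # (i) last card's value; set the card aside
    rest = deck[:-1]
    # (ii) bottom n cards to the front, (iii) former last card on top
    return [deck[-1]] + rest[-n:] + rest[:-n]

def break_deal(deck):
    """Undo the fixed (1,2) deal step for a 52-card deck."""
    short, tall = list(deck[:18]), list(deck[18:])
    result = []
    while short or tall:
        if short:
            result.append(short.pop(0))  # (ii) one from the short stack
        result.extend(tall[:2])          # (iii) two from the tall stack
        del tall[:2]
    return result
###############################################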
So, let's say that we are given the digest, consisting of the full
deck after all letters have been absorbed and then three buries
performed during finalization. We don't know how long the message was
(though we can assume it was relatively short, since this was done by
hand and the message was an easily memorable password). We do know the
starting condition of the deck before any of the letters were
absorbed.

The first step in reversing the hash is to undo those three buries,
which can be done by calling "Disinter" three times. Now we call
"Break Deal" once, and right away we can determine with certainty the
last letter of the hashed message. That is because the last step not
yet undone was a shuffle-cut, and that would have moved an instance of
that last letter to the very end of the deck. So all we have to do is
look at the last card and interpret it as a letter, and that gives us
the last letter of the hashed message. Generally you don't want to let
the attacker easily determine even one of the letters of the message
with 100% certainty, so that is an area where Maaartin's algorithm
needs improvement.

However, I think we can go further than just finding the last letter.
To go any further and recover more letters of the message requires
that we be able to undo the combination of two shuffle-cuts and a bury
step. The difficulty with reversing the "shuffle-cut + bury +
shuffle-cut" steps is that while each shuffle-cut moves the first
instance to the end, if we don't know what the deck looked like before
the shuffle-cut then it is somewhat hard to determine precisely how
many cards were in the deck ahead of that first instance (and hence
how many cards were moved to the end along with the first instance by
the shuffle-cut). But we can narrow down the possibilities...

Consider what happens to the deck when a letter is absorbed. First,
the hasher translates the letter into a pair of cards (the two
instances of that letter, X and Y). Then he searches through the deck
from the top for the first of those two instances to appear, and once
he finds it he moves it and all the cards above that first instance to
the end. To visualize:

Before first shuffle-cut:
[first unknown no. of cards] X [second unknown no. of cards] Y [third
unknown no. of cards]

After first shuffle-cut:
[second unknown no. of cards] Y [third unknown no. of cards] [first
unknown no. of cards] X

Now he performs a bury. To do this, he looks at the new first card,
interprets it as a value, "N", and moves it and the next N cards to
the end, like so:

Before bury: "N" [N cards] [?] Y [?] X
After bury: [?] Y [?] X [N cards] "N"

(I assume for the moment that the next instance, Y, was not "N" or
among the N cards moved -- later we'll look at what happens when it
is.)

Then the second shuffle-cut:
[?] Y [?] X [N cards] "N" ==> [?] X [N cards] "N" [?] Y

That bury step in the middle is the weak point, and it will enable us
to (with a certain amount of guessing) undo all three steps. Here is
how to start undoing them, given the output of the three steps, and
still assuming that the second instance was not moved to the bottom by
the bury step:

1) Look at the last card -- this will be an instance of the absorbed
letter (Y);
2) Find the other instance of that letter in the deck (X); "N" will be
between them;
3) To find "N", look for all of the self-justifying cards between X
and Y.

General Definition: A self-justifying card is a card with a face value
of "N" such that there are N cards between it and some "anchor point".
The relevant anchor point, given our assumption, is the current
highest instance (X). To illustrate:

A self-justifying card where X is the anchor point:
[?] X [N cards] "N" [...] Y

Any card that satisfies the above illustrated condition is a potential
candidate for the determinant of the bury step (the card that
determined how many cards below it to bury). In the simplest case,
let's say there is only one such candidate (there will always be at
least one). In that case, reversing the last shuffle-cut and then the
bury is easy:

[?] X [N cards] "N" [...] Y
Reverse last shuffle-cut: [...] Y [?] X [N cards] "N"
Reverse bury: "N" [N cards] [...] Y [?] X

So what if our assumption was wrong, and the other instance was moved
by the bury step? If that is the case, then we can still reverse the
two steps by looking for self-justifying cards. For example, let's say
that the other instance was among the N cards specified by "N". In
that case, after the bury, the deck will look like this:

Bury: "N" {[a] Y [b]} [?] X ==> [?] X {[a] Y [b]} "N"
(where a + b + 1 = N)

Second shuffle-cut: [?] X {[a] Y [b]} "N" ==> {[a] Y [b]} "N" [?] X

Note that "N" is self-justifying when the anchor point is the top of
the deck (i.e. a card with a face value of "N" is self-justifying if
there are exactly N cards above it in the deck). This holds even if we
assume that the other instance was also the bury determinant:

1st shuffle-cut: [?] X Y [z cards] ==> Y [z] [?] X
Bury: [z-Y] [?] X [Y cards] Y
2nd shuffle-cut: [Y cards] Y [z-Y] [?] X

i.e. "Y" is Y cards away from the top of the deck.
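A sketch of the search for self-justifying cards, covering both anchor
points (value is the assumed face-value helper from the earlier
sketch; x_pos is the index of the other instance X, and Y is assumed
to sit at the end of the deck):

###############################################
def find_self_justifying(deck, x_pos, value):
    candidates = []
    # from X (inclusive) up to Y (exclusive)
    for i in range(x_pos, len(deck) - 1):
        n = value(deck[i])
        anchored_on_x = (i - x_pos - 1 == n)  # N cards between it and X
        anchored_on_top = (i == n)            # exactly N cards above it
        if anchored_on_x or anchored_on_top:
            candidates.append(i)
    return candidates
###############################################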
Putting this all together, let's return to our procedure and specify
it more completely:

"Seek Justification" (undoes the second shuffle-cut and bury steps):
i) Look at the last card -- this will be an instance of the absorbed
letter (Y);
ii) Find the other instance of that letter in the deck (X) -- the
actual bury determinant, "N", will either be between the two instances
or will be X itself;
iii) To find "N", look for all of the self-justifying cards from X
(including X) up to Y (not including Y);
A card is self-justifying iff:
- the card has a face value of "N" and there are N cards between it
and X, OR
- the card has a face value of "N" and there are only N cards above
it.
iv) If there is more than one self-justifying card, then we will need
to branch our analysis -- i.e. pursue multiple possible paths to the
answer. For each and every self-justifying card, "J":
a) Take all the cards below "J" (including Y) and move them as a block
to the top,
b) Put "J" aside, and take J many cards from the bottom and move them
to the top,
c) Then put "J" on the top,
d) Record the current condition of the deck as 'Condition of Deck for
candidate "J"', then return the deck to the condition it was in before
(a), and repeat steps (a)-(d) for the next self-justifying card until
all the possible candidates for the actual bury determinant have been
processed and recorded.

The actual condition of the deck before the bury step and the second
shuffle-cut will be on that list. But since we don't know which one it
was (unless there is only one self-justifying card), we have to
investigate all of them until we find one that leads to a valid
preimage. We do that by picking one of the possible "J"s, assuming it
was in fact the correct bury determinant, and continuing with the next
phase of the reversal analysis under that assumption. Then repeat for
each and every "J".
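Steps (a)-(c) for a single candidate "J", as a sketch (same list
conventions and assumed value() helper as before):

###############################################
def undo_cut_and_bury(deck, j_pos, value):
    """Assuming deck[j_pos] was the bury determinant, undo the
    second shuffle-cut and then the bury."""
    j = deck[j_pos]
    # (a) move the block below J to the top, setting J aside
    rest = deck[j_pos+1:] + deck[:j_pos]
    n = value(j)
    # (b) take J many cards from the bottom and move them to the top
    rest = rest[-n:] + rest[:-n]
    # (c) put J back on the top
    return [j] + rest
###############################################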
Reversing the First Shuffle-Cut:

The next phase in the analysis is reversing the first shuffle-cut, and
it will probably require even more guessing and more branching of our
analysis. After the second shuffle-cut and bury steps are reversed, we
can say with certainty that the original first card (the top of the
deck before the first shuffle-cut) is either an element of the
interval between the two instances, or is X itself:

"N" [N cards] [...] Y <<<[interval] X>>>

The original top card is within <<< >>>, which I will call "the Zone
of possible top cards". The number of cards in the Zone determines how
difficult reversing the first shuffle-cut will be (the larger this
number is, the harder it is to undo). If there is only one card in the
Zone (i.e. only X), then we don't need to guess or branch the analysis
at all; we just move that card to the top, and the first shuffle-cut
is undone with certainty. But if there is more than one card in the
Zone, then we need to test each one and see if it meets certain
criteria. Those that do not meet the criteria can be ignored (as they
will not be possible top cards), but all those that do meet the
criteria have to be pursued as possible paths to a valid preimage. So
what criteria should we use, and how do we evaluate the candidates?
The procedure I have come up with is as follows:

"Seek More Justification" (undoes the first shuffle-cut)
For each and every card, "C", in the Zone:
i) Assume that "C" was in fact the top card before the first
shuffle-cut, and reverse that first shuffle-cut by taking "C" and
every card below "C" and moving them as a block to the top;
ii) Criterion One: Compare the deck as it is now with the starting
condition of the deck before any letters are absorbed -- if the deck
is identical to the starting condition, then we can stop all of our
analysis, because with a high degree of probability we have found the
point where the hashing began. The message can be extracted by looking
at the various reverse transformations of the digest that we took in
order to reach this starting condition.
iii) Criterion Two: If the deck under the assumption that "C" was the
top card fails Criterion One, then there are other tests we can
perform on it to see if it is nonetheless on the right path to a valid
preimage:
a) Perform "Break Deal" on the deck,
b) Perform "Seek Justification" on the resulting deck,
c) If "Seek Justification" does NOT produce any self-justifying cards,
then the assumption that "C" was the top card cannot have been
correct, and we do not need to pursue this branch any further. (If it
isn't clear to you why that is, tell me and I will explain it.) On the
other hand, if "Seek Justification" does find one or more
self-justifying cards, then we must consider this a "fruitful" branch
of our analysis, and record "C" as a true possibility.
iv) Return the deck to the condition it was in before (i) and continue
steps (i) through (iii) with the other possible cards in the Zone.

The card that was in fact the top card before the first shuffle-cut
will be on the list of 'true possibilities'. But since we can't be
sure which one it is (unless the list has only one element), we need
to pursue each true possibility (each "fruitful" branch of the
analysis) until we find a valid preimage, the same way that we had to
follow up on every self-justifying card in the previous phase.

The next phase is to repeat our cycle of "Break Deal", then "Seek
Justification", and then "Seek More Justification" for each of the
fruitful branches, in order to peel off the next letter of the
message. We keep repeating that cycle, peeling off more and more
letters (and branching more and more), until either we run out of
memory (because we have too many fruitful branches to pursue) or we
find a valid preimage (one that meets Criterion One in the "Seek More
Justification" phase).
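The core move in step (i) is simple to express in code (a sketch, with
the same conventions as above; c_pos is the index of the Zone card
currently assumed to have been the original top card):

###############################################
def reverse_first_shuffle_cut(deck, c_pos):
    # move the assumed top card and everything below it to the top
    return deck[c_pos:] + deck[:c_pos]

def zone_candidates(deck, zone_start, zone_end):
    # one candidate deck per card in the Zone (bounds inclusive)
    return [reverse_first_shuffle_cut(deck, i)
            for i in range(zone_start, zone_end + 1)]
###############################################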
Attack Complexity:

The complexity of this attack is a function of the average number of
fruitful branches that occur in order to peel off a single letter. The
attack will be faster than a brute force attack if and only if the
average number of branches is below a certain threshold. As I
understand it, this threshold is 26 -- i.e. so long as the average
number of branches that arise during our attempt to peel off a letter
is less than 26, our algorithm (on average) will require fewer
operations to find a valid preimage than a brute force preimage
attack.

Why 26? Well, if the brute force attacker doesn't know the length of
the message, but does know it is likelier to be short rather than long
(for the reasons described at the top of this post), then the attacker
would probably first search all possible messages of length 1, then
all possible messages of length 2, and so on until he finds a message
that produces the hash in question. The number of possible messages of
length N is 26^N, and to systematically search all possible messages
up to length N in that fashion requires SUM_(i=1..N) 26^i evaluations
of the hash function.

The reversal attack described in this post also has exponential
complexity (it requires exponentially more operations to find valid
preimages the longer they are). Like the brute force attack, the
exponent is the length L of the message in question, and the algorithm
searches all shorter preimages before moving on to longer and longer
ones. But for the reversal attack, the base is the average number of
fruitful branches required to peel off a letter. So long as the base
is less than 26, it will be a faster way to find a valid preimage of
length L than the brute force attack.

So what is the average number of fruitful branches? I don't know --
that is an empirical question that must be determined by testing.
However, I can say more precisely what we need to look for while
testing. Recall that branches occur in two places while peeling off a
single letter -- at the "Seek Justification" stage and at the "Seek
More Justification" stage. The average number of fruitful branches at
the first stage, "P", is identical to the average number of
self-justifying cards found during the "Seek Justification" step. The
average number of fruitful branches at the second stage, "Q", is the
product of two numbers: (a) the average number of cards in the Zone;
and (b) the average proportion of Zone cards that cannot be ignored
because they might possibly lead to a valid preimage (expressed as a
fraction of the total number of Zone cards). The average number of
branches that arise when peeling off a letter is thus P*Q. So long as
P*Q < 26, the attack described in this post will break Maaartin's
algorithm (though it may not be a practical break for very long
messages).

I have a similar analysis for my own algorithm, but for now let's make
sure that the above analysis is a) clear, and b) correct. Is there any
part that you guys need clarified or expanded upon? Do you spot any
errors in the analysis?
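To put rough numbers on the threshold argument (the message length and
branch factor here are purely illustrative):

###############################################
def search_cost(base, max_len):
    # total evaluations to search all messages up to max_len,
    # i.e. SUM_(i=1..max_len) base^i
    return sum(base ** i for i in range(1, max_len + 1))

print(search_cost(26, 8))   # brute force: about 2.2 * 10^11
print(search_cost(10, 8))   # reversal attack with P*Q = 10: about 1.1 * 10^8
###############################################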
From: J.D. on 29 Apr 2010 23:44
Brian, I'll take a closer look at your new "raking" idea tomorrow. For
now I am kinda wiped out...