Prev: ANN: Leo 4.7 final released
Next: AKKA vs Python
From: Paul Rubin on 23 Feb 2010 21:39 Steven D'Aprano <steven(a)REMOVE.THIS.cybersource.com.au> writes: > Paul, if you were anyone else, I'd be sneering uncontrollably about now, > but you're not clueless about cryptography, so what have I missed? Why is > reducing the number of distinct letters by more than 50% anything but a > disaster? This makes the task of brute-forcing the password exponentially > easier. Reducing the number of distinct letters by 50% decreases the entropy per character by 1 bit. That stuff about mixing letters and digits and funny symbols just makes the password a worse nuisance to remember and type, for a small gain in entropy that you can compute and make up for. The main thing you have to make sure is that the min-entropy is sufficient for your purposes, and it's generally more convenient to do that by making the password a little bit longer than by imposing contortions on the person typing it. Ross Anderson's "Security Engineering" chapter about passwords is worth reading too: http://www.cl.cam.ac.uk/~rja14/Papers/SE-03.pdf When I mentioned entropy loss to the OP though, I mostly meant loss from getting rid of the letter z. The (binary) Shannon entropy of the uniform probability distribution on 26 letters is 4.7004397 bits; on 25 letters, it's 4.6438561 bits. The difference isn't enough to give an attacker that much advantage. I like the diceware approach to passphrase generation and I've been using it for years. www.diceware.com explains it in detail and the docs there are quite well-thought-out and informative. Keep in mind that the entropy needed for an online password (attacker has to make a server query for every guess, and hopefully gets locked out after n wrong tries) and an offline one (attacker has something like a hash of the password and can run a completely offline search) are different. Here is a program that I use sometimes: from math import log dictfile = '/usr/share/dict/words' def genrandom(nbytes): with open('/dev/urandom') as f: return int(f.read(nbytes).encode('hex'), 16) def main(): wordlist = list(x.strip() for x in open(dictfile) if len(x) < 7) nwords = len(wordlist) print "%d words, entropy=%.3f bits/word"% ( nwords, log(nwords, 2)) print '-'.join(wordlist[genrandom(10)%nwords] for i in xrange(6)) main()
From: Paul Rubin on 23 Feb 2010 21:41 Steven D'Aprano <steven(a)REMOVE.THIS.cybersource.com.au> writes: > Putting aside the philosophical question of what "truly random" means, I > presume you mean that the letters are uniformly distributed. The answer > to that is, they don't like uniformly distributed. That is a good point, the way those letters are generated (through the decimal conversion stuff), they won't be all that uniform.
From: Paul Rubin on 23 Feb 2010 21:50 mk <mrkafk(a)gmail.com> writes: >> You might look at the sitewww.diceware.comfor an approach to this, >> which you can implement with a program. The docs there are pretty >> thoughtful and may help you understand the relevant issues. > > Thanks. But I would also be grateful for indicating what is wrong/ugly > in my code. The stuff about converting 4 random bytes to a decimal string and then peeling off 2 digits at a time is pretty awful, and notice that since 2**32 is 4294967296, in the cases where you get 10 digits, the first 2-digit pair is never higher than 42. There are also some effects on the lower digits. The total entropy loss probably isn't fatal but as described, it's ugly. I'd write your code something like this: nletters = 5 def randomword(n): with open('/dev/urandom') as f: return ''.join([chr(ord('a')+ord(c)%26) for c in f.read(n)]) print randomword(nletters) I wouldn't rely on a 5 letter combination for a high security application, but it might be ok for some low security purposes. Two random 5-letter combinations separated by a hyphen will be much better, and is probably easier to type than a solid block of 10 letters.
From: Robert Kern on 23 Feb 2010 22:09 On 2010-02-23 20:43 , Steven D'Aprano wrote: > On Wed, 24 Feb 2010 02:40:13 +0000, Steven D'Aprano wrote: > >> On Tue, 23 Feb 2010 15:36:02 +0100, mk wrote: >> >>> The question is: is this secure? That is, can the string generated this >>> way be considered truly random? >> >> Putting aside the philosophical question of what "truly random" means, I >> presume you mean that the letters are uniformly distributed. The answer >> to that is, they don't like uniformly distributed. > > Er, they don't *look* uniformly distributed. > > (Of course, being random, perhaps they are and I just got unlucky.) You'd have to be very, *very* unlucky to get a sample of that size so far from uniformly distributed if the generating process actually were uniform. Of course, uniformity isn't really necessary. You just need enough entropy in the distribution (amongst other things like protection of the seed from being known or guessed). A skewed distribution of characters is perfectly fine provided that you had enough characters in the password to meet the desired entropy requirement. A skewed distribution does require more characters to meet a specified entropy requirement than a uniform distribution, of course. That said, for a naive strategy like "pick an independent random character, repeat", you should just use a uniform distribution. It makes the analysis easier. Worthwhile generators that give skewed distributions usually do so for a good reason, like generating pronounceable passwords. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
From: Lie Ryan on 23 Feb 2010 22:41
On 02/24/10 14:09, Robert Kern wrote: > On 2010-02-23 20:43 , Steven D'Aprano wrote: >> On Wed, 24 Feb 2010 02:40:13 +0000, Steven D'Aprano wrote: >> >>> On Tue, 23 Feb 2010 15:36:02 +0100, mk wrote: >>> >>>> The question is: is this secure? That is, can the string generated this >>>> way be considered truly random? >>> >>> Putting aside the philosophical question of what "truly random" means, I >>> presume you mean that the letters are uniformly distributed. The answer >>> to that is, they don't like uniformly distributed. >> >> Er, they don't *look* uniformly distributed. >> >> (Of course, being random, perhaps they are and I just got unlucky.) > > You'd have to be very, *very* unlucky to get a sample of that size so > far from uniformly distributed if the generating process actually were > uniform. > > Of course, uniformity isn't really necessary. You just need enough > entropy in the distribution (amongst other things like protection of the > seed from being known or guessed). A skewed distribution of characters > is perfectly fine provided that you had enough characters in the > password to meet the desired entropy requirement. A skewed distribution > does require more characters to meet a specified entropy requirement > than a uniform distribution, of course. > > That said, for a naive strategy like "pick an independent random > character, repeat", you should just use a uniform distribution. It makes > the analysis easier. Worthwhile generators that give skewed > distributions usually do so for a good reason, like generating > pronounceable passwords. If an attacker knows the that the random number generator have an extreme skew and he knows the distribution of the letters, how much advantage would it give the attacker? My initial guess is that the more skewed the letters are, the better the advantage, since an attacker using brute-force can write his program to prefer the most likely letters? |