Is this secure? [Python]

Prev: ANN: Leo 4.7 final released
Next: AKKA vs Python

From: Paul Rubin on 23 Feb 2010 21:39

Steven D'Aprano <steven(a)REMOVE.THIS.cybersource.com.au> writes:
> Paul, if you were anyone else, I'd be sneering uncontrollably about now,
> but you're not clueless about cryptography, so what have I missed? Why is
> reducing the number of distinct letters by more than 50% anything but a
> disaster? This makes the task of brute-forcing the password exponentially
> easier.

Reducing the number of distinct letters by 50% decreases the entropy per
character by 1 bit. That stuff about mixing letters and digits and
funny symbols just makes the password a worse nuisance to remember and
type, for a small gain in entropy that you can compute and make up for.
The main thing you have to make sure is that the min-entropy is
sufficient for your purposes, and it's generally more convenient to do
that by making the password a little bit longer than by imposing
contortions on the person typing it. Ross Anderson's "Security
Engineering" chapter about passwords is worth reading too:

http://www.cl.cam.ac.uk/~rja14/Papers/SE-03.pdf

When I mentioned entropy loss to the OP though, I mostly meant loss from
getting rid of the letter z. The (binary) Shannon entropy of the
uniform probability distribution on 26 letters is 4.7004397 bits; on 25
letters, it's 4.6438561 bits. The difference isn't enough to give an
attacker that much advantage.

I like the diceware approach to passphrase generation and I've been
using it for years. www.diceware.com explains it in detail and the docs
there are quite well-thought-out and informative. Keep in mind that the
entropy needed for an online password (attacker has to make a server
query for every guess, and hopefully gets locked out after n wrong
tries) and an offline one (attacker has something like a hash of the
password and can run a completely offline search) are different.

Here is a program that I use sometimes:

from math import log
dictfile = '/usr/share/dict/words'

def genrandom(nbytes):
with open('/dev/urandom') as f:
return int(f.read(nbytes).encode('hex'), 16)

def main():
wordlist = list(x.strip() for x in open(dictfile) if len(x) < 7)
nwords = len(wordlist)
print "%d words, entropy=%.3f bits/word"% (
nwords, log(nwords, 2))
print '-'.join(wordlist[genrandom(10)%nwords] for i in xrange(6))

main()

From: Paul Rubin on 23 Feb 2010 21:41

Steven D'Aprano <steven(a)REMOVE.THIS.cybersource.com.au> writes:
> Putting aside the philosophical question of what "truly random" means, I
> presume you mean that the letters are uniformly distributed. The answer
> to that is, they don't like uniformly distributed.

That is a good point, the way those letters are generated (through the
decimal conversion stuff), they won't be all that uniform.

From: Paul Rubin on 23 Feb 2010 21:50

mk <mrkafk(a)gmail.com> writes:
>> You might look at the sitewww.diceware.comfor an approach to this,
>> which you can implement with a program. The docs there are pretty
>> thoughtful and may help you understand the relevant issues.
>
> Thanks. But I would also be grateful for indicating what is wrong/ugly
> in my code.

The stuff about converting 4 random bytes to a decimal string and then
peeling off 2 digits at a time is pretty awful, and notice that since
2**32 is 4294967296, in the cases where you get 10 digits, the first
2-digit pair is never higher than 42. There are also some effects on
the lower digits. The total entropy loss probably isn't fatal but as
described, it's ugly.

I'd write your code something like this:

nletters = 5

def randomword(n):
with open('/dev/urandom') as f:
return ''.join([chr(ord('a')+ord(c)%26) for c in f.read(n)])

print randomword(nletters)

I wouldn't rely on a 5 letter combination for a high security
application, but it might be ok for some low security purposes. Two
random 5-letter combinations separated by a hyphen will be much better,
and is probably easier to type than a solid block of 10 letters.

From: Robert Kern on 23 Feb 2010 22:09

On 2010-02-23 20:43 , Steven D'Aprano wrote:
> On Wed, 24 Feb 2010 02:40:13 +0000, Steven D'Aprano wrote:
>
>> On Tue, 23 Feb 2010 15:36:02 +0100, mk wrote:
>>
>>> The question is: is this secure? That is, can the string generated this
>>> way be considered truly random?
>>
>> Putting aside the philosophical question of what "truly random" means, I
>> presume you mean that the letters are uniformly distributed. The answer
>> to that is, they don't like uniformly distributed.
>
> Er, they don't *look* uniformly distributed.
>
> (Of course, being random, perhaps they are and I just got unlucky.)

You'd have to be very, *very* unlucky to get a sample of that size so far from
uniformly distributed if the generating process actually were uniform.

Of course, uniformity isn't really necessary. You just need enough entropy in
the distribution (amongst other things like protection of the seed from being
known or guessed). A skewed distribution of characters is perfectly fine
provided that you had enough characters in the password to meet the desired
entropy requirement. A skewed distribution does require more characters to meet
a specified entropy requirement than a uniform distribution, of course.

That said, for a naive strategy like "pick an independent random character,
repeat", you should just use a uniform distribution. It makes the analysis
easier. Worthwhile generators that give skewed distributions usually do so for a
good reason, like generating pronounceable passwords.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

From: Lie Ryan on 23 Feb 2010 22:41

On 02/24/10 14:09, Robert Kern wrote:
> On 2010-02-23 20:43 , Steven D'Aprano wrote:
>> On Wed, 24 Feb 2010 02:40:13 +0000, Steven D'Aprano wrote:
>>
>>> On Tue, 23 Feb 2010 15:36:02 +0100, mk wrote:
>>>
>>>> The question is: is this secure? That is, can the string generated this
>>>> way be considered truly random?
>>>
>>> Putting aside the philosophical question of what "truly random" means, I
>>> presume you mean that the letters are uniformly distributed. The answer
>>> to that is, they don't like uniformly distributed.
>>
>> Er, they don't *look* uniformly distributed.
>>
>> (Of course, being random, perhaps they are and I just got unlucky.)
>
> You'd have to be very, *very* unlucky to get a sample of that size so
> far from uniformly distributed if the generating process actually were
> uniform.
>
> Of course, uniformity isn't really necessary. You just need enough
> entropy in the distribution (amongst other things like protection of the
> seed from being known or guessed). A skewed distribution of characters
> is perfectly fine provided that you had enough characters in the
> password to meet the desired entropy requirement. A skewed distribution
> does require more characters to meet a specified entropy requirement
> than a uniform distribution, of course.
>
> That said, for a naive strategy like "pick an independent random
> character, repeat", you should just use a uniform distribution. It makes
> the analysis easier. Worthwhile generators that give skewed
> distributions usually do so for a good reason, like generating
> pronounceable passwords.

If an attacker knows the that the random number generator have an
extreme skew and he knows the distribution of the letters, how much
advantage would it give the attacker? My initial guess is that the more
skewed the letters are, the better the advantage, since an attacker
using brute-force can write his program to prefer the most likely letters?

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10 11
Prev: ANN: Leo 4.7 final released
Next: AKKA vs Python