From: mk on 24 Feb 2010 14:09

On 2010-02-24 20:01, Robert Kern wrote:
> I will repeat my advice to just use random.SystemRandom.choice() instead
> of trying to interpret the bytes from /dev/urandom directly.

Oh I hear you -- for production use I would (will) certainly consider
this. However, now I'm interested in the problem itself: why is the damn
distribution not uniform?

Regards,
mk
From: Robert Kern on 24 Feb 2010 14:19

On 2010-02-24 13:09 PM, mk wrote:
> On 2010-02-24 20:01, Robert Kern wrote:
>> I will repeat my advice to just use random.SystemRandom.choice() instead
>> of trying to interpret the bytes from /dev/urandom directly.
>
> Oh I hear you -- for production use I would (will) certainly consider
> this. However, now I'm interested in the problem itself: why is the damn
> distribution not uniform?

You want "< 234", not "< 235". (234 % 26 == 0), so you get some extra 'a's.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco
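The off-by-one can be checked without any randomness by counting how many byte values map to each letter. This is a minimal sketch, assuming the original generator (not shown in this part of the thread) kept only bytes below a cutoff and indexed the alphabet with `byte % 26`. The name `letter_weights` is hypothetical, and the code is written for Python 3.

```python
import string
from collections import Counter

ALPHABET = string.ascii_lowercase  # 26 letters


def letter_weights(cutoff):
    """Count how many accepted byte values (0 <= b < cutoff) map to each letter."""
    return Counter(ALPHABET[b % len(ALPHABET)] for b in range(cutoff))


biased = letter_weights(235)   # accepts 0..234, and 234 % 26 == 0 lands on 'a'
uniform = letter_weights(234)  # accepts 0..233, and 234 == 9 * 26

print(biased['a'], biased['b'])  # 10 9
print(set(uniform.values()))     # {9}
```

With the "< 235" cutoff, 'a' is reachable from 10 of the 235 accepted byte values while every other letter is reachable from 9, so 'a' shows up about 11% more often than any other letter.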
From: mk on 24 Feb 2010 14:16

On 2010-02-24 20:01, Robert Kern wrote:
> I will repeat my advice to just use random.SystemRandom.choice() instead
> of trying to interpret the bytes from /dev/urandom directly.

Out of curiosity:

def gen_rand_string(length):
    prng = random.SystemRandom()
    chars = []
    for i in range(length):
        chars.append(prng.choice('abcdefghijklmnopqrstuvwxyz'))
    return ''.join(chars)

if __name__ == "__main__":
    chardict = {}
    for i in range(10000):
        ## w = gen_rand_word(10)
        w = gen_rand_string(10)
        count_chars(chardict, w)
    counts = list(chardict.items())
    counts.sort(key = operator.itemgetter(1), reverse = True)
    for char, count in counts:
        print char, count


s 3966
d 3912
g 3909
h 3905
a 3901
u 3900
q 3891
m 3888
k 3884
b 3878
x 3875
v 3867
w 3864
y 3851
l 3825
z 3821
c 3819
e 3819
r 3816
n 3808
o 3797
f 3795
t 3784
p 3765
j 3730
i 3704

Better, although still not perfect.

Regards,
mk
From: Paul Rubin on 24 Feb 2010 14:09

Robert Kern <robert.kern(a)gmail.com> writes:
> I will repeat my advice to just use random.SystemRandom.choice()
> instead of trying to interpret the bytes from /dev/urandom directly.

SystemRandom is something pretty new so I wasn't aware of it. But
yeah, if I were thinking more clearly I would have suggested os.urandom
instead of opening /dev/urandom.
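For comparison, here is a minimal sketch of drawing the bytes through `os.urandom` rather than opening `/dev/urandom` by hand. The function name `gen_rand_string_urandom` is hypothetical, and the rejection cutoff of 234 is an assumption taken from the `234 % 26 == 0` remark in this thread. The code targets Python 3, where iterating over a `bytes` object yields integers.

```python
import os
import string

ALPHABET = string.ascii_lowercase
# Largest multiple of 26 that fits in one byte: 256 - (256 % 26) == 234.
CUTOFF = 256 - (256 % len(ALPHABET))


def gen_rand_string_urandom(length):
    """Build a random lowercase string from os.urandom bytes via rejection sampling."""
    chars = []
    while len(chars) < length:
        # Fetch a batch of bytes; values >= CUTOFF are discarded, so keep
        # looping until enough letters have been collected.
        for b in os.urandom(length):
            if b < CUTOFF and len(chars) < length:
                chars.append(ALPHABET[b % len(ALPHABET)])
    return ''.join(chars)


print(gen_rand_string_urandom(10))
```

Discarding bytes at or above the cutoff costs a few extra reads but leaves every letter with exactly 9 byte values behind it. `random.SystemRandom().choice()` hides this bookkeeping, which is presumably why it was the recommendation.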
From: Robert Kern on 24 Feb 2010 14:36
On 2010-02-24 13:16 PM, mk wrote:
> [code and per-letter counts snipped]
>
> Better, although still not perfect.

This distribution is well within expectations.

-- 
Robert Kern
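One way to put a number on "well within expectations" is a chi-square goodness-of-fit check against a uniform distribution. The sketch below is an illustration and not part of the original post. It uses the counts mk posted and takes the observed total as the sample size, since the posted counts sum to 99,974 rather than 100,000. The threshold 37.65 is the 5% critical value for 25 degrees of freedom from a standard chi-square table. Written for Python 3.

```python
# Per-letter counts as posted by mk, in the order s, d, g, h, a, ...
counts = [
    3966, 3912, 3909, 3905, 3901, 3900, 3891, 3888, 3884, 3878, 3875,
    3867, 3864, 3851, 3825, 3821, 3819, 3819, 3816, 3808, 3797, 3795,
    3784, 3765, 3730, 3704,
]

total = sum(counts)
expected = total / len(counts)  # count per letter under a uniform distribution

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
chi2 = sum((c - expected) ** 2 / expected for c in counts)

# 5% critical value for 26 - 1 = 25 degrees of freedom (table lookup).
CRITICAL_5PCT_DF25 = 37.65

print(round(chi2, 1), chi2 < CRITICAL_5PCT_DF25)
```

The statistic comes out at roughly 24, close to the value of 25 expected on average for 25 degrees of freedom and well below the critical value, so the posted counts give no reason to doubt uniformity.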