From: dmtr on 7 Aug 2010 04:32

> Looking at your benchmark, random.choice(letters) has probably less overhead
> than letters[random.randint(...)]. You might even try to inline it as

Right... random.choice()... I'm a bit new to Python, always something to learn. But anyway, in that benchmark (from http://bugs.python.org/issue9520 ) the code that generates 'words' takes 90% of the time. And I'm really looking at the deltas between different methods, not the absolute values. I was also using different code to get the benchmarks for my previous message... Here's the code:

#!/usr/bin/python
# -*- coding: utf-8 -*-
import os, time, re, array

start = time.time()
d = dict()
for i in xrange(0, 1000000):
    d[unicode(i).encode('utf-8')] = array.array('i', (i, i+1, i+2, i+3, i+4, i+5, i+6))
dt = time.time() - start
vm = re.findall("(VmPeak.*|VmSize.*)", open('/proc/%d/status' % os.getpid()).read())
print "%d keys, %s, %f seconds, %f keys per second" % (len(d), vm, dt, len(d) / dt)
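[Editor's note: a quick way to check the random.choice() vs. letters[random.randint(...)] overhead claim is timeit. A minimal sketch, assuming Python 2.6+ (where timeit.timeit() exists and accepts a callable); string.ascii_lowercase is a stand-in alphabet, since the thread's actual 'letters' isn't shown:

# Minimal sketch: time random.choice(letters) against explicit
# letters[random.randint(...)] indexing (Python 2.6+, stdlib only).
import random
import string
import timeit

letters = string.ascii_lowercase  # stand-in for the benchmark's alphabet

n = 1000000
t_choice = timeit.timeit(lambda: random.choice(letters), number=n)
t_randint = timeit.timeit(lambda: letters[random.randint(0, len(letters) - 1)], number=n)
print "random.choice: %f s, randint indexing: %f s" % (t_choice, t_randint)
]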
From: dmtr on 7 Aug 2010 04:45
I guess with the actual dataset I'll be able to improve the memory usage a bit, with BioPython::trie. That would probably be enough optimization to continue working with some comfort. On this test code BioPython::trie gives a bit of improvement in terms of memory. Not much, though...

>>> d = dict()
>>> for i in xrange(0, 1000000): d[unicode(i).encode('utf-8')] = array.array('i', (i, i+1, i+2, i+3, i+4, i+5, i+6))
1000000 keys, ['VmPeak:\t 125656 kB', 'VmSize:\t 125656 kB'], 3.525858 seconds, 283618.896034 keys per second

>>> from Bio import trie
>>> d = trie.trie()
>>> for i in xrange(0, 1000000): d[unicode(i).encode('utf-8')] = array.array('i', (i, i+1, i+2, i+3, i+4, i+5, i+6))
1000000 keys, ['VmPeak:\t 108932 kB', 'VmSize:\t 108932 kB'], 4.142797 seconds, 241382.814950 keys per second
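[Editor's note: for repeatable dict-vs-trie comparisons, the /proc measurement from the earlier script can be wrapped in a small harness. A minimal sketch (Python 2, Linux-only); fill() and report_vm() are hypothetical helper names, not from the original post. Since VmPeak never decreases for a running process, each container should be measured in a fresh interpreter rather than back-to-back in one session, as the poster's >>> transcript above otherwise suggests:

# Minimal sketch (Python 2, Linux-only): reuse the same fill/measure code
# for any mapping-like container (dict, Bio.trie.trie(), ...).
# fill() and report_vm() are hypothetical helpers, not from the original post.
import os, re, time, array

def fill(d, n=1000000):
    # Populate the mapping with utf-8 string keys and small int arrays,
    # mirroring the thread's test data; return elapsed seconds.
    start = time.time()
    for i in xrange(0, n):
        d[unicode(i).encode('utf-8')] = array.array('i', (i, i+1, i+2, i+3, i+4, i+5, i+6))
    return time.time() - start

def report_vm(label, nkeys, dt):
    # Read VmPeak/VmSize for this process from /proc (Linux only).
    vm = re.findall("(VmPeak.*|VmSize.*)", open('/proc/%d/status' % os.getpid()).read())
    print "%s: %d keys, %s, %f seconds, %f keys per second" % (label, nkeys, vm, dt, nkeys / dt)

if __name__ == '__main__':
    # One container per process: measuring dict and trie in the same
    # interpreter would skew the second VmPeak reading.
    dt = fill(dict())
    report_vm('dict', 1000000, dt)
]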