From: Georgios Petasis on 1 Dec 2009 18:28

Donald Arseneau wrote:
> On Dec 1, 9:05 am, Alexandre Ferrieux <alexandre.ferri...(a)gmail.com>
> wrote:
>> On Dec 1, 5:33 pm, Donald Arseneau <a...(a)triumf.ca> wrote:
>>
>>> On Nov 30, 1:59 pm, Georgios Petasis <peta...(a)iit.demokritos.gr>
>>> wrote:
>>>> However, I don't know how to serialise and restore such a large
>>>> structure. Just using "array get" needs much more memory, and tcl needs
>>>> more than the 2GB a 32-bit application can use. So, I wrote some code
>>>> that serialises all elements without requiring conversion to strings.
>>> ... array nextelement ...
>> Ahem, the question is about serialization, not iteration, and reuse
>> (sharing) of values. What does the array iterator have to do with
>> that ?
>
> It lets you save the contents of a 1.3GB Tcl array to a file without
> overflowing process memory as [array get] would. I was presuming that
> "serialise and restore" meant "serialize for writing, and restore from
> a file".
>
> I agree that a Tcl array is not ideal for such a big hash table, and
> something more like a database is more appropriate.
>
> Donald Arseneau

Yes, I used array search for storing the hash table, and iteration over the dicts to store them. [array get] needed much more memory than the application could allocate (since it was already using 1.3 GB).

George
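
For readers following along, the approach described above (an array search for the hash table plus per-dict iteration) might look roughly like the sketch below. This is not George's actual code: the proc name dumpMatrix, the file name, and the one-line-per-word output format are assumptions made for illustration, and it assumes words contain no newlines and the dict entries are plain integers.

```tcl
# Hypothetical helper: write each array element as one line of the form
#   word id count id count ...
# without calling [array get] on the whole array.
proc dumpMatrix {arrayName fileName} {
    upvar 1 $arrayName arr
    set out [open $fileName w]
    set search [array startsearch arr]
    while {[array anymore arr $search]} {
        set word [array nextelement arr $search]
        # [list $word] quotes the key so the line stays a valid Tcl list.
        puts -nonewline $out [list $word]
        # Walk the dict pair by pair instead of stringifying it as a whole.
        dict for {id count} $arr($word) {
            puts -nonewline $out " $id $count"
        }
        puts $out ""
    }
    array donesearch arr $search
    close $out
}

# Sample entries taken from the data quoted in this thread.
set word_matrix(tenjin) [dict create 48422 1]
set word_matrix(lightyear) [dict create 4779 1 29113 2 44221 1]
dumpMatrix word_matrix word_matrix.txt
```

Only one key and one id/count pair are turned into text at a time, which is the point Donald makes about not overflowing process memory.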
From: Georgios Petasis on 1 Dec 2009 18:32

Helmut Giese wrote:
> Hi George,
> it could be that MetaKit is your friend. It is not a database (just
> "persistent storage"), but it seems to me that you don't really need
> true DB capabilities (which for me is the possibility to formulate
> complex queries).
> Its performance can be quite astonishing and it probably has less of a
> memory overhead than a database solution.
>
> You already have it installed - it's part of ActiveState's Tcl. I
> haven't used it for a couple of years so I cannot off hand produce an
> example, but if you want to check if it fits your needs, there are
> probably enough knowledgeable people around here to help you get going.
>
> Good luck
> Helmut Giese
>
>> Hi all,
>>
>> I have a large hash table, whose keys are words, and the values are
>> dicts, that contain integer pairs.
>> I am creating this structure in memory, taking care to reuse objects as
>> much as possible, with the result occupying ~ 1.3GB of memory.
>>
>> However, I don't know how to serialise and restore such a large
>> structure. Just using "array get" needs much more memory, and tcl needs
>> more than the 2GB a 32-bit application can use. So, I wrote some code
>> that serialises all elements without requiring conversion to strings.
>> The format I chose was tcl code, to be easy to load back:
>>
>> set dict [dict create]
>> dict set dict 48422 1
>> set word {tenjin}
>> set word_matrix($word) $dict
>> set dict [dict create]
>> dict set dict 4779 1
>> dict set dict 29113 2
>> dict set dict 44221 1
>> set word {lightyear}
>> set word_matrix($word) $dict
>> set dict [dict create]
>> dict set dict 25399 1
>> set word {salary?}
>> set word_matrix($word) $dict
>> set dict [dict create]
>> dict set dict 366 1
>> dict set dict 819 1
>> dict set dict 1154 2
>> dict set dict 2580 1
>> dict set dict 3164 1
>> dict set dict 3244 2
>> dict set dict 3420 2
>> dict set dict 3833 1
>> ... 313 MB of similar data.
>>
>> However, I cannot load back the data from this file. The problem is that
>> a new object is created for every number in the file, which is memory
>> expensive since there is some repetition.
>>
>> I tried to enclose the data in a proc (hoping that tcl will compile the
>> proc into bytecode internally, and end up reusing the same objects for
>> the same integers), but it didn't work (wish terminated around 1.3 GB
>> with a message of not being able to re-alloc a large memory piece).
>>
>> Any ideas?
>>
>> George

Dear Helmut,

Again a proposal I didn't think of :-)
I am not sure I have the courage to test it though, as this would be the 6th implementation from scratch of the same task...
I am currently running something with sqlite. I will check timings and decide...

Regards,

George
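
On the reload problem in the quoted question (a new object for every number in the file), one possible workaround is to pass each token through a small intern table while loading, so that repeated numbers refer to one stored value. The sketch below is only an illustration, not something proposed in the thread: the procs intern and loadMatrix, the global internTable, the demo file name, and the "word id count id count ..." line format are all assumptions, and the memory effect has not been measured here.

```tcl
# Hypothetical helper: return one canonical value per distinct token, so that
# equal tokens read from the file end up referring to the same stored value.
proc intern {token} {
    global internTable
    if {![info exists internTable($token)]} {
        set internTable($token) $token
    }
    return $internTable($token)
}

# Hypothetical loader for lines of the form "word id count id count ...".
proc loadMatrix {arrayName fileName} {
    upvar 1 $arrayName arr
    set in [open $fileName r]
    while {[gets $in line] >= 0} {
        if {$line eq ""} continue
        # First list element is the word, the rest are id/count pairs.
        set pairs [lassign $line word]
        set d [dict create]
        foreach {id count} $pairs {
            dict set d [intern $id] [intern $count]
        }
        set arr($word) $d
    }
    close $in
}

# Small demonstration using two entries quoted in this thread.
set out [open word_matrix_demo.txt w]
puts $out "tenjin 48422 1"
puts $out "lightyear 4779 1 29113 2 44221 1"
close $out

loadMatrix word_matrix word_matrix_demo.txt
puts [dict get $word_matrix(lightyear) 29113]
file delete word_matrix_demo.txt
```

The intern table itself costs one array entry per distinct number, so this only pays off when there is as much repetition as the question suggests.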
From: Donal K. Fellows on 1 Dec 2009 18:42

On 1 Dec, 19:19, drscr...(a)gmail.com wrote:
> This is one of the things about sqlite. It processes the whole query
> and returns it in one chunk.

I think it does it progressively if you give it a script to execute each time round.

Donal.
From: Georgios Petasis on 1 Dec 2009 18:42

Georgios Petasis wrote:
> Will Duquette wrote:
>> On Dec 1, 11:01 am, Georgios Petasis <peta...(a)iit.demokritos.gr>
>> wrote:
>>> $database onecolumn "SELECT id FROM words WHERE word='$word'"
>>>
>>
>> Others have already mentioned adding an index, which you'll definitely
>> want to do. I just wanted to point out that the usual way to write
>> this query is
>>
>> $database onecolumn {SELECT id FROM words WHERE word=$word}
>>
>> SQLite will do the variable interpolation for you, according to SQL
>> rules rather than Tcl rules, which generally speaking is what you
>> want. Among other things, it prevents SQL injection attacks/errors.
>> For example, in your version if $word is
>>
>> some'word
>>
>> you'll get an SQL syntax error.
>
> This is brilliant!! I would never have thought of this, so I was
> converting single quotes to '' by hand before the sql statements!
>
> Many thanks,
>
> George

I estimate that this will give a speed-up of ~50%, I suppose mainly because sqlite can cache the queries and has no need to reparse them (as was the case when I performed the variable substitutions at the Tcl level, so that with every query sqlite saw a new string...).

George
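
A minimal, self-contained sketch of the braced form Will describes, for anyone who wants to try it. The in-memory database, the command name db, the table definition, and the index name words_word are assumptions for illustration; only the table name words and the columns id and word come from the thread.

```tcl
package require sqlite3

# In-memory database with an assumed schema for the "words" table.
sqlite3 db :memory:
db eval {CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT)}
# The index on word that other posters recommended (name is hypothetical).
db eval {CREATE INDEX words_word ON words(word)}

# Because the SQL is in braces, Tcl does not substitute $word; SQLite binds
# the variable's value itself, so the apostrophe needs no manual escaping.
foreach word {tenjin lightyear some'word} {
    db eval {INSERT INTO words (word) VALUES ($word)}
}

set word "some'word"
set id [db onecolumn {SELECT id FROM words WHERE word=$word}]
puts "id of $word: $id"

db close
```

Since the SQL text is the same string on every call, only the bound value changes between queries, which fits George's explanation for the speed-up.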
From: Georgios Petasis on 1 Dec 2009 18:47

Donal K. Fellows wrote:
> On 1 Dec, 19:19, drscr...(a)gmail.com wrote:
>> This is one of the things about sqlite. It processes the whole query
>> and returns it in one chunk.
>
> I think it does it progressively if you give it a script to execute
> each time round.
>
> Donal.

Yes, it does. And it can store a whole row either in a Tcl array, or in variables named after the column names.

Regards,

George
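
The two row-by-row forms mentioned above can be sketched as follows. The schema, the command name db, and the variable names seen and row are assumptions for illustration.

```tcl
package require sqlite3

sqlite3 db :memory:
db eval {CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT)}
db eval {INSERT INTO words (word) VALUES ('tenjin')}
db eval {INSERT INTO words (word) VALUES ('lightyear')}

set seen {}

# Form 1: the script runs once per result row, with each column value
# available in a variable named after the column.
db eval {SELECT id, word FROM words ORDER BY id} {
    lappend seen "$id=$word"
}

# Form 2: name an array before the script and each row is stored in it,
# indexed by column name.
db eval {SELECT id, word FROM words ORDER BY id} row {
    lappend seen "$row(id)=$row(word)"
}

puts $seen
db close
```

In both forms the script body sees one row at a time, so the caller does not have to hold the whole result as a single Tcl list.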