From: blacksqr on 28 Feb 2010 00:05

I am trying to figure out how to deal with what I have been told is Tcl's "high-water-mark memory management" when dealing with very large data sets. Apparently the Tcl interpreter often doesn't release all the memory used by a procedure after the procedure returns, but keeps claim on it and, I suppose, re-uses it when necessary.

My problem is that I have written a procedure that reads a very large file into RAM, processes it, and writes the results to an output file. The net RAM usage should be zero after the procedure returns. I would like to call this procedure several times, but after the first time my computer's RAM is almost depleted, and during the second procedure call there's not enough RAM to read the second file into memory. Tcl, instead of releasing un-needed memory to allow the file read, crashes with a memory allocation error.

At first I thought it was a memory leak, but apparently not, according to feedback I've gotten. I would submit a bug report, but is this even considered a bug? Or am I just out of luck due to how Tcl manages memory? Also, it's hard to attach a 250MB file to a bug ticket. If this is not a bug, is there any way to force Tcl to purge its "high-water-mark memory" so that multiple high-memory-demand procedure calls can be made?

I'm using ActiveTcl 8.5.8 on Ubuntu Intrepid.
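[For concreteness, a minimal Tcl sketch of the kind of procedure being described might look like the following; the file names and the processing step are placeholders, not the poster's actual code.]

proc processFile {inFile outFile} {
    # Pull the whole input file into memory at once.
    set in [open $inFile r]
    set data [read $in]
    close $in

    # Stand-in for the real processing step.
    set result [string toupper $data]

    set out [open $outFile w]
    puts -nonewline $out $result
    close $out

    # $data and $result are local, so Tcl drops its references to them
    # when the procedure returns; the question is whether the allocator
    # then hands that memory back to the operating system.
}

# Called once per large input file, e.g.:
#   processFile input1.dat output1.dat
#   processFile input2.dat output2.dat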
From: hae on 28 Feb 2010 04:05

On 28 Feb., 06:05, blacksqr <stephen.hunt...(a)alum.mit.edu> wrote:
> [...] If this is not a bug, is there any way to force Tcl to purge
> its "high-water-mark memory" so that multiple high-memory-demand
> procedure calls can be made?

Hi,

Tcl doesn't release memory in variables automatically, as far as I know, but I may be wrong here. However, you can use unset and array unset respectively to explicitly clear the variable.

Ruediger
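[A small, hypothetical illustration of that suggestion; the variable names are invented for the example.]

proc demo {} {
    # A large scalar value and an array with a few entries.
    set blob [string repeat x 10000000]
    array set cache {a 1 b 2 c 3}

    # ... work with $blob and the cache() entries ...

    # Release the scalar's storage now instead of waiting for the
    # procedure to return.
    unset blob

    # Remove all entries from the array (the array variable itself
    # survives); plain "unset cache" would delete it entirely.
    array unset cache *
}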
From: Alexandre Ferrieux on 28 Feb 2010 07:18

On Feb 28, 6:05 am, blacksqr <stephen.hunt...(a)alum.mit.edu> wrote:
> [...] At first I thought it was a memory leak, but apparently not,
> according to feedback I've gotten. I would submit a bug report, but
> is this even considered a bug? [...]

The feedback you've gotten in 2960042, from Donal, explicitly says that it is not a memory leak IF running it twice doesn't extend the memory claim, which is what Donal (and everybody else) observe(s) normally. Since your initial report didn't mention this extra, worrying piece of evidence you've just described, Donal set the bug status to Invalid+Pending, meaning you have two weeks to provide counter-evidence before it gets automatically closed (which is reversible too).

So instead of switching to another communication channel (comp.lang.tcl), just deposit this evidence as a comment to the bug. We'll take it seriously there.

-Alex
From: Donal K. Fellows on 28 Feb 2010 12:58

On 28 Feb, 05:05, blacksqr <stephen.hunt...(a)alum.mit.edu> wrote:
> At first I thought it was a memory leak, but apparently not, according
> to feedback I've gotten. I would submit a bug report, but is this
> even considered a bug? Or am I just out of luck due to how Tcl
> manages memory? Also, it's hard to attach a 250MB file to a bug
> ticket. If this is not a bug, is there any way to force Tcl to purge
> its "high-water-mark memory" so that multiple high-memory-demand
> procedure calls can be made?

Well, you have to be very careful with what you complain about. In particular, the script you supplied to describe the problem (in Bug #2960042) did not leak when tried many times over (100k) with somewhat smaller amounts of data. Even a single byte of leakage would have shown, but memory usage was stable and constant. There's nothing in the code that switches from one memory management model to another when moving from a few hundred kB of string to a few hundred GB of string; we just don't bother with that level of sophistication. :-)

Whatever is wrong, you've not exhibited it yet, so the rest of us can't help debug the real problem yet. At a guess, in your real code you're doing something that causes the data to be kept around; that's not a leak, that's just a program that processes a lot of data. :-)

If you anticipate lots of identical strings, you can use techniques like manual splitting and interning to cut memory usage. The [split] command only does that sort of memory consumption reduction when splitting into characters, as that's the only case where it is a predictable major win.

Donal.
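[A rough sketch of the manual splitting and interning idea mentioned above, assuming a comma-separated input file; the file layout, procedure names, and the global table are invented for illustration. Repeated field values are looked up in an array so that equal strings share one stored copy instead of one copy per row.]

proc intern {s} {
    # Return a shared copy of $s: the first occurrence is stored in
    # the table, and later occurrences reuse that stored value.
    global internTable
    if {![info exists internTable($s)]} {
        set internTable($s) $s
    }
    return $internTable($s)
}

proc loadRows {fileName} {
    # Read a comma-separated file into a list of field lists,
    # interning every field value.
    set rows {}
    set f [open $fileName r]
    while {[gets $f line] >= 0} {
        set fields {}
        foreach field [split $line ,] {
            lappend fields [intern $field]
        }
        lappend rows $fields
    }
    close $f
    return $rows
}

[Note that the interning table itself keeps one copy of every distinct value alive for as long as it exists, so it should be cleared, e.g. with array unset, once the data is no longer needed.]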
From: hae on 28 Feb 2010 14:03
On 28 Feb., 10:05, hae <r_haer...(a)gmx.de> wrote:
> [...] However you can use unset and array unset respectively to
> explicitly clear the variable.

By the way: on Linux you will not see the memory consumption reported by ps decrease after all allocated memory has been freed. See this little test. The reason is that on Linux the memory allocator grows the heap for small allocations, but the heap is never shrunk after the memory is freed. Of course, for large allocations the allocator allocates "real" memory, and if that is freed it is given back to the system.

#include <stdlib.h>   /* malloc, free */
#include <unistd.h>   /* sleep */

#ifndef MALLOC_CNT
# define MALLOC_CNT (10 * 1000 * 1000)
#endif
#ifndef MALLOC_SIZE
# define MALLOC_SIZE 32
#endif

void *ptr[MALLOC_CNT];

int main(int argc, char **argv)
{
    int i;

    /* Allocate many small blocks; these are served from the heap,
       pushing it up to its high-water mark. */
    for (i = 0; i < MALLOC_CNT; i++) {
        ptr[i] = malloc(MALLOC_SIZE);
    }

    /* Free everything again; the heap stays at its high-water mark. */
    for (i = 0; i < MALLOC_CNT; i++) {
        free(ptr[i]);
    }

    /* Sleep forever so the resident size can be inspected with ps/top. */
    while (1) {
        sleep(1);
    }
    return 0;
}