From: Tom Lane
Robert Haas <robertmhaas(a)gmail.com> writes:
> On Sat, Jul 31, 2010 at 12:32 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>> So it'd definitely be a lot better than now.  Maybe there'd be some
>> remaining performance gap for non-pathological cases, but I think we
>> would be willing to pay that in order to avoid bad behavior for the
>> pathological cases.  It's difficult to say for sure of course
>> without going to the trouble of coding and testing it.

> Well, it sounds like a reasonable thing to try, then. You going to
> take a crack at it?

Yeah, I'll have at it.

regards, tom lane

From: Tom Lane
Robert Haas <robertmhaas(a)gmail.com> writes:
> On Sat, Jul 31, 2010 at 12:32 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>> So it'd definitely be a lot better than now.  Maybe there'd be some
>> remaining performance gap for non-pathological cases, but I think we
>> would be willing to pay that in order to avoid bad behavior for the
>> pathological cases.  It's difficult to say for sure of course
>> without going to the trouble of coding and testing it.

> Well, it sounds like a reasonable thing to try, then. You going to
> take a crack at it?

This worked out pretty well, so I've applied the attached patch.
I observed the following timings running the index build from
Artur Dobrowski's example:

                                8.4      9.0b4    HEAD

maintenance_work_mem setting    100MB    60MB     100MB
actual peak process image size  118M     126M     112M
elapsed time                    964s     1289s    937s

so the memory bloat problem is definitely fixed and the speed
seems satisfactory.

It might be worth commenting that after I'd rewritten the rbtree code,
I spent a while pulling my hair out because the code *still* blew past
the expected memory usage. It turned out that there were some
significant leaks in ginEntryInsert and subsidiary routines, causing
memory usage to continue to grow even once we realized we'd hit
maintenance_work_mem and started to dump data to disk. These leaks were
there before, but were masked in 8.4 because the old ginGetEntry code
incrementally pfreed itempointer-list storage as it walked the data
structure; this caused storage to be released faster than the leaks
would consume it. The new code doesn't release any of that storage
until the MemoryContextReset after the dump loop, so any leakage in
the dump loop becomes a visible increase in the peak space consumption.
I wasn't able to remove all of the leakage --- there's some fairly ugly
coding associated with index page splits that leaks an indextuple or so
per split, and I'm not sure where the extra tuple could safely be
pfree'd. That seems to be small enough to live with though; it's less
than 0.5% of the total memory usage in the example I'm working with.
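
To make that concrete, here's a rough sketch of the accumulate/dump/reset
pattern I'm describing.  The type and helper names below are invented for
illustration only; this is not the committed GIN code:

/* Hypothetical illustration only; not the real GIN build code. */
#include "postgres.h"
#include "miscadmin.h"			/* maintenance_work_mem */
#include "utils/memutils.h"		/* MemoryContextReset */

typedef struct DemoAccumulator
{
	MemoryContext tmpCtx;		/* all accumulated entries live here */
	Size		allocatedMemory;	/* tracked against maintenance_work_mem */
	/* ... rbtree of pending entries, etc ... */
} DemoAccumulator;

/* invented stand-ins for the real accumulate/walk/insert routines */
extern void demoAccumEntry(DemoAccumulator *accum, Datum entry);
extern bool demoWalkNext(DemoAccumulator *accum, Datum *entry);
extern void demoInsertEntry(Datum entry);	/* may palloc work space */

static void
demoBuildStep(DemoAccumulator *accum, Datum entry)
{
	MemoryContext oldCtx = MemoryContextSwitchTo(accum->tmpCtx);

	demoAccumEntry(accum, entry);

	if (accum->allocatedMemory >= maintenance_work_mem * 1024L)
	{
		Datum		e;

		/*
		 * Dump loop: walk the accumulated entries and insert them into
		 * the index.  Nothing allocated in tmpCtx is released inside
		 * this loop, so anything demoInsertEntry() pallocs and forgets
		 * to pfree hangs around until the reset below; a leak here adds
		 * directly to the peak process size.
		 */
		while (demoWalkNext(accum, &e))
			demoInsertEntry(e);

		/* release everything accumulated (and leaked) in one shot */
		MemoryContextReset(accum->tmpCtx);
		accum->allocatedMemory = 0;
	}

	MemoryContextSwitchTo(oldCtx);
}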

I also cleaned up some other stuff that would've bitten us eventually,
like unnecessary use of recursion in the rbtree iteration code.
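
For the curious, removing the recursion is just the usual trick of walking
the tree through its parent links, roughly like this (node layout and
function names invented here, not the actual rbtree API):

#include <stddef.h>

/* Invented minimal node layout; the real node also carries color, data, etc. */
typedef struct DemoNode
{
	struct DemoNode *left;
	struct DemoNode *right;
	struct DemoNode *parent;
} DemoNode;

/* leftmost node of the subtree rooted at "node", i.e. its smallest element */
static DemoNode *
demo_first(DemoNode *node)
{
	if (node == NULL)
		return NULL;
	while (node->left != NULL)
		node = node->left;
	return node;
}

/*
 * In-order successor of "node", found without recursion: either the
 * leftmost node of the right subtree, or the nearest ancestor we reach
 * by climbing up out of a left subtree.  Stack depth stays constant no
 * matter how deep the tree is.
 */
static DemoNode *
demo_next(DemoNode *node)
{
	if (node->right != NULL)
		return demo_first(node->right);
	while (node->parent != NULL && node == node->parent->right)
		node = node->parent;
	return node->parent;
}

/* usage: for (n = demo_first(root); n != NULL; n = demo_next(n)) ... */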

regards, tom lane