From: Robert Haas on
On Mon, Jul 26, 2010 at 10:31 AM, Alvaro Herrera
<alvherre(a)commandprompt.com> wrote:
> Excerpts from Robert Haas's message of Mon Jul 26 08:52:46 -0400 2010:
>> Here's another idea. Instead of making imessages use an SLRU, how
>> about having it steal pages from shared_buffers? This would require
>> segmenting messages into small enough chunks that they'd fit, but the
>> nice part is that it would avoid the need to have a completely
>> separate shared memory arena. Ideally, we'd make the infrastructure
>> general enough that things like SLRU could use it also; and get rid of
>> or reduce in size some of the special-purpose chunks we're now
>> allocating.
>
> What's the problem you see with "another shared memory arena"? Right
> now we allocate a single large arena, and the lot of shared_buffers,
> SLRU pools, locking objects, etc are all allocated from there. If we
> want another 2 MB for "dynamic shmem", we'd just allocate 2 MB more in
> that large arena and give those to this new code.

But that's not a very flexible design. If you discover that you need
3MB instead of 2MB, you get to restart the entire cluster. If you
discover that you need 1MB instead of 2MB, you get to either restart
the entire cluster, or waste 1MB of shared memory. And since actual
usage will almost certainly fluctuate, you'll almost certainly be
wasting some shared memory that could otherwise be used for other
purposes some of the time. Now, granted, we have this problem already
today, and granted also, 2MB is not an enormous amount of memory on
today's machines. If we really think that 2MB will always be adequate
for every purpose for which we wish to use unicast messaging, then
perhaps it's OK, but I'm not convinced that's true.

It would be nice to think, for example, that this could be used as
infrastructure for parallel query to stream results back from worker
processes to the backend connected to the user. If you're using 16
processors to concurrently scan 16 partitions of an appendrel and
stream those results back to the master, will 128kB/backend be enough
memory to avoid pipeline stalls? What if there's replication going on
at the same time? What if there's other concurrent activity that also
uses imessages? Or even better, what if there's other concurrent
activity that uses the dynamic allocator but NOT imessages? If the
point of having a dynamic allocator is that it's eventually going to
be used by lots of different subsystems, then we had better have a
fairly high degree of confidence that it actually will, but in fact
we've made very little effort to characterize who the other users
might be and whether the stated implementation limitations will be
adequate for them. Frankly, I doubt it. One of the major reasons why
malloc() is so powerful is that you don't have to decide in advance
how much memory you're going to need, as you would if you put the
structure in the data segment. Dynamically allocating out of a 2MB
segment gives up most of that flexibility.
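
To make the arithmetic concrete: the 128kB/backend figure appears to come
from splitting the proposed 2MB segment evenly across 16 workers. A minimal
sketch of that division follows; per_worker_share() is a hypothetical helper
for illustration, not an existing PostgreSQL function.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not PostgreSQL code): bytes available to each
 * worker if a fixed-size segment is split evenly among them.
 */
static size_t
per_worker_share(size_t segment_bytes, size_t nworkers)
{
    if (nworkers == 0)
        return 0;               /* no workers, nothing to hand out */
    return segment_bytes / nworkers;
}
```

With a 2MB segment and 16 workers this yields 131072 bytes, i.e. 128kB each,
and any other user of the same segment shrinks that share further.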

What I think will end up happening here is that you'll always have to
size the segment used by the dynamic allocator considerably larger
than the amount of memory you expect to actually be used, so that
performance doesn't go into the toilet when it fills up. As Markus
pointed out upthread, you'll always need some hard limit on the amount
of space that imessages can use, but you can make that limit much
larger if it's not reserved for a single purpose. If you use the
"temporarily allocated shared buffers" method, then you could set the
default limit to something like "64MB, but not more than 1/8th of
shared buffers". Since the memory won't get used unless it's needed,
you don't really have to care whether a particular installation is
likely to need some, none, or all of that; whereas if you're
allocating nailed-down memory, you're going to want a much smaller
default - a couple of MB, at most. Furthermore, if you do happen to
be running on a 64GB machine with 8GB of shared_buffers and 64MB isn't
adequate, you can easily make it possible to bump that value up by
changing a GUC and hitting reload. With the "nailed-down shared
memory" approach, you're locked into whatever you decide at postmaster
start.
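
As a rough sketch of the default suggested above ("64MB, but not more than
1/8th of shared buffers"), assuming sizes are tracked in bytes:
imessage_limit_bytes() is a hypothetical name used only for illustration and
does not exist in PostgreSQL.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not PostgreSQL code): default cap on the space
 * borrowed from shared buffers, i.e. the smaller of 64MB and one eighth
 * of shared_buffers.
 */
static uint64_t
imessage_limit_bytes(uint64_t shared_buffers_bytes)
{
    const uint64_t hard_cap = UINT64_C(64) * 1024 * 1024;   /* 64MB */
    uint64_t    eighth = shared_buffers_bytes / 8;

    return eighth < hard_cap ? eighth : hard_cap;
}
```

For the 8GB shared_buffers case mentioned above, one eighth is 1GB, so the
64MB cap applies; with 128MB of shared_buffers the limit would drop to 16MB.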

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Markus Wanner on
Hi,

On 07/26/2010 06:33 PM, Robert Haas wrote:
> It would be nice to think, for example, that this could be used as
> infrastructure for parallel query to stream results back from worker
> processes to the backend connected to the user. If you're using 16
> processors to concurrently scan 16 partitions of an appendrel and
> stream those results back to the master

Now, *that* sounds like music to my ears ;-)

Or put another way: yes, I think imessages and the bgworker
infrastructure stuff could enable or at least help that goal.

> Dynamically allocating out of a 2MB
> segment gives up most of that flexibility.

Absolutely, that's why I'd like to see other modules that use the
dynamic allocator. The more the better.

> What I think will end up happening here is that you'll always have to
> size the segment used by the dynamic allocator considerably larger
> than the amount of memory you expect to actually be used, so that
> performance doesn't go into the toilet when it fills up. As Markus
> pointed out upthread, you'll always need some hard limit on the amount
> of space that imessages can use, but you can make that limit much
> larger if it's not reserved for a single purpose. If you use the
> "temporarily allocated shared buffers" method, then you could set the
> default limit to something like "64MB, but not more than 1/8th of
> shared buffers".

I've been thinking about such rules as well. They quickly get more
complex if you begin to take OOM situations and their counter-measures
into account.

In a way, fixing every separate pool to its specific size is just the
very simplest rule set I can think of. The dynamic allocator buys you
more flexibility, but choosing good limits and rules between the
sub-systems is another issue.

Regards

Markus Wanner
