From: Markus Wanner on
Hi,

first of all, thanks for your feedback, I enjoy the discussion.

On 07/21/2010 07:25 PM, Robert Haas wrote:
> Given what you're trying to do, it does sound like you're going to
> need some kind of an algorithm for space management; but you'll be
> managing space within the SLRU rather than within shared_buffers. For
> example, you might end up putting a header on each SLRU page or
> segment and using that to track the available freespace within that
> segment for messages to be read and written. It'll probably be a bit
> more complex than the one for listen (see asyncQueueAddEntries).

But what would that buy us? Also consider that pretty much all available
dynamic allocators use shared memory (either from the OS directly, or
via mmap()'d area).

>> Yes, imessages shouldn't ever be spilled to disk. There naturally must be an
>> upper limit for them. (Be it total available memory, as for threaded things
>> or a given and size-constrained pool, as is the case for dynshmem).
>
> I guess experience has taught me to be wary of things that are wired
> in memory. Under extreme memory pressure, something's got to give, or
> the whole system will croak.

I absolutely agree to that last sentence. However, experience has taught
/me/ to be wary of things that needlessly swap to disk for hours before
reporting any kind of error (AKA swap hell). I prefer systems that
adjust to the OOM condition, instead of just ignoring it and falling
back to disk (which isn't doesn't provide infinite space, so that's just
pushing the limits).

The solution for imessages certainly isn't spilling to disk, which would
consume even more resources. Instead the process(es) for which there are
pending imessages should be allowed to consume them.

That's why upon OOM, IMessageCreate currently simply blocks the process
that wants to create an imessages. And yes, that's not quite perfect
(that process should still consume messages for itself), and it might
not play well with other potential users of dynamically allocated
memory. But it certainly works better than spilling to disk (and yes, I
tested that behavior within Postgres-R).

> Consider also the contrary situation,
> where the imessages stuff is not in use (even for a short period of
> time, like a few minutes). Then we'd really rather not still have
> memory carved out for it.

Huh? That's exactly what dynamic allocation could give you: not having
memory carved out for stuff you currently don't need, but instead being
able to dynamically use memory where most needed. SLRU has memory (not
disk space) carved out for pretty much every sub-system separately, if
I'm reading that code correctly.

> I think what would be even better is to merge the SLRU pools with the
> shared_buffer pool, so that the two can duke it out for who is in most
> need of the limited amount of memory available.

...well, just add the shared_buffer pool to the list of candidates that
could use dynamically allocated shared memory. It would need some
thinking about boundaries (i.e. when to spill to disk, for those modules
that /want/ to spill to disk) and dealing with OOM situations, but
that's about it.

Regards

Markus

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on
On Wed, Jul 21, 2010 at 2:53 PM, Markus Wanner <markus(a)bluegap.ch> wrote:
>> Consider also the contrary situation,
>> where the imessages stuff is not in use (even for a short period of
>> time, like a few minutes). �Then we'd really rather not still have
>> memory carved out for it.
>
> Huh? That's exactly what dynamic allocation could give you: not having
> memory carved out for stuff you currently don't need, but instead being able
> to dynamically use memory where most needed. SLRU has memory (not disk
> space) carved out for pretty much every sub-system separately, if I'm
> reading that code correctly.

Yeah, I think you are right. :-(

>> I think what would be even better is to merge the SLRU pools with the
>> shared_buffer pool, so that the two can duke it out for who is in most
>> need of the limited amount of memory available.
>
> ..well, just add the shared_buffer pool to the list of candidates that could
> use dynamically allocated shared memory. It would need some thinking about
> boundaries (i.e. when to spill to disk, for those modules that /want/ to
> spill to disk) and dealing with OOM situations, but that's about it.

I'm not sure why merging the SLRU pools with shared_buffers would
benefit from dynamically allocated shared memory.

I might be at (or possibly beyond) the limit of my ability to comment
intelligently on this without looking more at what you want to use
these imessages for, but I'm still pretty skeptical about the idea of
storing them directly in shared memory. It's possible, though, that I
am all wet.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Markus Wanner on
Hi,

On 07/22/2010 12:11 AM, Robert Haas wrote:
> I'm not sure why merging the SLRU pools with shared_buffers would
> benefit from dynamically allocated shared memory.

Well, I'm not sure how you'd merge SLRU pools with shared_buffers. IMO
that inherently leads to the problem of allocating memory dynamically.

With such an allocator, I'd say you just port one module after another
to use that, instead of pre-allocated, fixed portions of shared memory.

> I might be at (or possibly beyond) the limit of my ability to comment
> intelligently on this without looking more at what you want to use
> these imessages for, but I'm still pretty skeptical about the idea of
> storing them directly in shared memory. It's possible, though, that I
> am all wet.

Imessages are meant to be a replacement for unix pipes. (To my
knowledge, those don't spill to disk either, but are blocking as soon as
Linux considers the pipe to be 'full'. Whenever that is. Or am I wrong
here?)

The reasons for replacing them were: they consume lots of file
descriptors, they can only be established between the parent and its
child process (at least for anonymous pipes that's the case) and last
but not least, I got told they still aren't fully portable. Another nice
thing about imessages compared to unix pipes is, that it's a zero-copy
approach.

Hope that makes my opinions and decisions clearer. Thank you for sharing
your concerns and for explaining SLRU to me.

Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on
On Thu, Jul 22, 2010 at 3:01 AM, Markus Wanner <markus(a)bluegap.ch> wrote:
> On 07/22/2010 12:11 AM, Robert Haas wrote:
>>
>> I'm not sure why merging the SLRU pools with shared_buffers would
>> benefit from dynamically allocated shared memory.
>
> Well, I'm not sure how you'd merge SLRU pools with shared_buffers. IMO that
> inherently leads to the problem of allocating memory dynamically.
>
> With such an allocator, I'd say you just port one module after another to
> use that, instead of pre-allocated, fixed portions of shared memory.

Well, shared_buffers has to be allocated as one contiguous slab
because we index into it that way. So I don't really see how
dynamically allocating memory could help. What you'd need is a
different system for assigning buffer tags, so that a particular tag
could refer to a buffer with either kind of contents.

>> I might be at (or possibly beyond) the limit of my ability to comment
>> intelligently on this without looking more at what you want to use
>> these imessages for, but I'm still pretty skeptical about the idea of
>> storing them directly in shared memory. �It's possible, though, that I
>> am all wet.
>
> Imessages are meant to be a replacement for unix pipes. (To my knowledge,
> those don't spill to disk either, but are blocking as soon as Linux
> considers the pipe to be 'full'. Whenever that is. Or am I wrong here?)

I think you're right about that.

> The reasons for replacing them were: they consume lots of file descriptors,
> they can only be established between the parent and its child process (at
> least for anonymous pipes that's the case) and last but not least, I got
> told they still aren't fully portable. Another nice thing about imessages
> compared to unix pipes is, that it's a zero-copy approach.

That's sort of approaching the question from the opposite end from
what I was concerned about - I was wondering why you need a unicast
message-passing system.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Markus Wanner on
Hi,

On 07/22/2010 01:04 PM, Robert Haas wrote:
> Well, shared_buffers has to be allocated as one contiguous slab
> because we index into it that way. So I don't really see how
> dynamically allocating memory could help. What you'd need is a
> different system for assigning buffer tags, so that a particular tag
> could refer to a buffer with either kind of contents.

Hm.. okay, then it might not be that easy. Thanks for pointing that out.

> That's sort of approaching the question from the opposite end from
> what I was concerned about - I was wondering why you need a unicast
> message-passing system.

Well, the initial Postgres-R approach, being based on Postgres
6.4.something used unix pipes. I coded imessages as a replacement.

Postgres-R basically uses imessages to pass around change sets and other
information required to keep replicas in sync. The thinking in terms of
message passing seems to originate from the GCS, which in itself is a
message passing system (with some nice extras and varying delivery
guarantees).

In Postgres-R the coordinator process receives messages from the GCS,
does some minor controlling and book-keeping, but basically passes on
the data via imessages to a backrgound worker.

Of course, as mentioned in the bgworker patch, this could be done
differently. Using solely shared memory, or maybe SLRU to store change
sets. However, I certainly like the abstraction and guarantees such a
message passing system provides. It makes things easier to reason about,
IMO.

For another example, see the bgworker patches, steps 1 and 2, where I've
changed the current autovacuum infrastructure to use imessages (between
launcher and worker).

[ And I've heard saying that current multi-core CPU designs tend to like
message passing systems. Not sure how much that applies to imessages
and/or how it's used in bgworkers or Postgres-R, though. ]

That much about why using a unicast message-passing system.

Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers