From: Markus Wanner on 20 Jul 2010 15:14

Hi,

On 07/20/2010 09:05 PM, Alvaro Herrera wrote:
> Hmm, deriving code from a paper published by IBM sounds like bad news --
> who knows what patents they hold on the techniques there?

Yeah, that might be an issue. Note, however, that the lock-based variant differs substantially from what's been published. And I sort of doubt their patents cover much that's not lock-free-ish.

But again, I'd also very much welcome any other allocator. In my opinion, it's the most annoying drawback of the process-based design compared to a threaded variant (from the perspective of a developer).

Regards

Markus Wanner
From: Alvaro Herrera on 20 Jul 2010 17:46

Excerpts from Markus Wanner's message of mar jul 20 14:54:42 -0400 2010:

> > With respect to imessages specifically, what is the motivation for
> > using shared memory rather than something like an SLRU? The new
> > LISTEN implementation uses an SLRU and handles variable-size messages,
> > so it seems like it might be well-suited to this task.
>
> Well, imessages predates the new LISTEN implementation by some moons.
> They are intended to replace (unix-ish) pipes between processes. I fail
> to see the immediate link between (S)LRU and inter-process message
> passing. It might be more useful for multiple LISTENers, but I bet it
> has slightly different semantics than imessages.

I guess what Robert is saying is that you don't need shmem to pass messages around. The new LISTEN implementation was just an example; imessages aren't supposed to use it directly. Rather, the idea is to store the messages in a new SLRU area. Thus you don't need to mess with dynamically allocating shmem at all.

> But to be honest, I don't know too much about the new LISTEN
> implementation. Do you think a lossless
> (single)-process-to-(single)-process message passing system could be
> built on top of it?

I don't think you should build on top of LISTEN but on slru.c. This is probably more similar to multixact (see multixact.c) than to the new LISTEN implementation.

I think it should be rather straightforward. There would be a unique append point; each process desiring to send a new message to another backend would add a new message at that point. There would be one read pointer per backend, and it would be advanced as messages are consumed. Old segments could be trimmed as backends advance their read pointers, similar to how the sinval queue is handled.
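[To make the scheme Alvaro describes concrete, here is a minimal standalone sketch: one shared append point, one read pointer per backend, and a trim point below the slowest reader. Every name (MsgArea, msgq_send, and so on) is a hypothetical illustration, not slru.c API; a flat in-memory array stands in for the SLRU pages and all locking is omitted.]

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define AREA_SIZE    4096
#define MAX_BACKENDS 8

typedef struct MsgHeader
{
    uint32_t recipient;                 /* backend the message is addressed to */
    uint32_t len;                       /* payload length in bytes */
} MsgHeader;

typedef struct MsgArea
{
    uint32_t append_point;              /* the unique append point */
    uint32_t read_point[MAX_BACKENDS];  /* advanced as messages are consumed */
    char     data[AREA_SIZE];           /* flat stand-in for the SLRU pages */
} MsgArea;

/* Append a message at the shared append point (a lock would be held here). */
static int
msgq_send(MsgArea *a, uint32_t recipient, const void *payload, uint32_t len)
{
    MsgHeader hdr = { recipient, len };

    if (a->append_point + sizeof(hdr) + len > AREA_SIZE)
        return -1;                      /* real code would open a new segment */
    memcpy(a->data + a->append_point, &hdr, sizeof(hdr));
    memcpy(a->data + a->append_point + sizeof(hdr), payload, len);
    a->append_point += sizeof(hdr) + len;
    return 0;
}

/*
 * Fetch the next message for 'backend'.  Because all senders share one
 * append point, a reader must step over messages addressed to others.
 */
static const char *
msgq_recv(MsgArea *a, uint32_t backend, uint32_t *len)
{
    while (a->read_point[backend] < a->append_point)
    {
        MsgHeader hdr;
        uint32_t  at = a->read_point[backend];

        memcpy(&hdr, a->data + at, sizeof(hdr));
        a->read_point[backend] = at + sizeof(hdr) + hdr.len;
        if (hdr.recipient == backend)
        {
            *len = hdr.len;
            return a->data + at + sizeof(hdr);
        }
    }
    return NULL;                        /* nothing pending */
}

/* Everything below the slowest reader's position could be trimmed away. */
static uint32_t
msgq_trim_point(const MsgArea *a)
{
    uint32_t min = a->append_point;

    for (int i = 0; i < MAX_BACKENDS; i++)
        if (a->read_point[i] < min)
            min = a->read_point[i];
    return min;
}

int
main(void)
{
    MsgArea  a = { 0 };
    uint32_t len;

    msgq_send(&a, 3, "hello backend 3", 16);
    msgq_send(&a, 1, "hello backend 1", 16);

    /* backend 1 has to skip the message destined for backend 3 */
    printf("backend 1 got: %s\n", msgq_recv(&a, 1, &len));
    printf("trimmable up to offset %u\n", msgq_trim_point(&a));
    return 0;
}

[Note that with a single interleaved stream, readers scan past messages addressed to other backends, which is exactly the cost Robert addresses next.]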
From: Robert Haas on 20 Jul 2010 19:52

On Tue, Jul 20, 2010 at 5:46 PM, Alvaro Herrera <alvherre(a)commandprompt.com> wrote:
> Excerpts from Markus Wanner's message of mar jul 20 14:54:42 -0400 2010:
>
>> > With respect to imessages specifically, what is the motivation for
>> > using shared memory rather than something like an SLRU? The new
>> > LISTEN implementation uses an SLRU and handles variable-size messages,
>> > so it seems like it might be well-suited to this task.
>>
>> Well, imessages predates the new LISTEN implementation by some moons.
>> They are intended to replace (unix-ish) pipes between processes. I fail
>> to see the immediate link between (S)LRU and inter-process message
>> passing. It might be more useful for multiple LISTENers, but I bet it
>> has slightly different semantics than imessages.
>
> I guess what Robert is saying is that you don't need shmem to pass
> messages around. The new LISTEN implementation was just an example;
> imessages aren't supposed to use it directly. Rather, the idea is to
> store the messages in a new SLRU area. Thus you don't need to mess with
> dynamically allocating shmem at all.

Right. I might be full of bull, but that's what I'm saying. :-)

>> But to be honest, I don't know too much about the new LISTEN
>> implementation. Do you think a lossless
>> (single)-process-to-(single)-process message passing system could be
>> built on top of it?
>
> I don't think you should build on top of LISTEN but on slru.c. This is
> probably more similar to multixact (see multixact.c) than to the new
> LISTEN implementation.
>
> I think it should be rather straightforward. There would be a unique
> append point; each process desiring to send a new message to another
> backend would add a new message at that point. There would be one read
> pointer per backend, and it would be advanced as messages are consumed.
> Old segments could be trimmed as backends advance their read pointers,
> similar to how the sinval queue is handled.

If the messages are mostly unicast, it might be nice to contrive a method whereby backends didn't need to explicitly advance over messages destined only for other backends. Like maybe allocate a small, fixed amount of shared memory sufficient for two "pointers" into the SLRU area per backend, and then use the SLRU to store each message with a header indicating where the next message is to be found. For each backend, you store one pointer to the first queued message and one pointer to the last queued message. New messages can be added by making the current last message point to a newly added message and updating the last message pointer for that backend. You'd need to think about the locking and reference counting carefully to make sure you eventually freed up unused pages, but it seems like it might be doable.

Of course, if the messages are mostly multi/anycast, or if the rate of messaging is low enough that the aforementioned complexity is not worth bothering with, then, what you said.

One big advantage of attacking the problem with an SLRU is that there's no fixed upper limit on the amount of data that can be enqueued at any given time. You can spill to disk or whatever as needed (although hopefully you won't normally do so, for performance reasons).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
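[An equally hypothetical sketch of that refinement: a small fixed shared-memory array holds head/tail offsets per backend, and each message header carries the offset of the recipient's next message, so a reader walks only its own chain. Again, all names are invented for illustration, and the locking and page reference counting Robert mentions are left out.]

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define AREA_SIZE      4096
#define MAX_BACKENDS   8
#define INVALID_OFFSET UINT32_MAX

typedef struct ChainedMsg
{
    uint32_t next;          /* offset of this recipient's next message */
    uint32_t len;           /* payload length; payload follows the header */
} ChainedMsg;

typedef struct BackendQueue
{
    uint32_t head;          /* first unconsumed message, or INVALID_OFFSET */
    uint32_t tail;          /* last queued message, or INVALID_OFFSET */
} BackendQueue;

typedef struct ChainArea
{
    uint32_t     append_point;
    BackendQueue queues[MAX_BACKENDS];  /* the two "pointers" per backend */
    char         data[AREA_SIZE];       /* flat stand-in for the SLRU pages */
} ChainArea;

/* Write the message at the append point, then link it from the recipient's
 * tail so that recipient's chain never touches anyone else's messages. */
static int
chain_send(ChainArea *a, uint32_t recipient, const void *payload, uint32_t len)
{
    uint32_t      at = a->append_point;
    ChainedMsg    msg = { INVALID_OFFSET, len };
    BackendQueue *q = &a->queues[recipient];

    if (at + sizeof(msg) + len > AREA_SIZE)
        return -1;
    memcpy(a->data + at, &msg, sizeof(msg));
    memcpy(a->data + at + sizeof(msg), payload, len);
    a->append_point = at + sizeof(msg) + len;

    if (q->tail == INVALID_OFFSET)
        q->head = at;       /* queue was empty */
    else                    /* patch the previous tail's next-offset */
        memcpy(a->data + q->tail + offsetof(ChainedMsg, next), &at, sizeof(at));
    q->tail = at;
    return 0;
}

/* Detach the first message on this backend's own chain. */
static const char *
chain_recv(ChainArea *a, uint32_t backend, uint32_t *len)
{
    BackendQueue *q = &a->queues[backend];
    ChainedMsg    msg;
    const char   *payload;

    if (q->head == INVALID_OFFSET)
        return NULL;
    memcpy(&msg, a->data + q->head, sizeof(msg));
    payload = a->data + q->head + sizeof(msg);
    *len = msg.len;
    q->head = msg.next;
    if (q->head == INVALID_OFFSET)
        q->tail = INVALID_OFFSET;
    return payload;
}

int
main(void)
{
    ChainArea   a;
    uint32_t    len;
    const char *m;

    a.append_point = 0;
    for (int i = 0; i < MAX_BACKENDS; i++)
        a.queues[i].head = a.queues[i].tail = INVALID_OFFSET;

    chain_send(&a, 1, "first", 6);
    chain_send(&a, 2, "for someone else", 17);
    chain_send(&a, 1, "second", 7);

    /* backend 1 sees only its own messages, in order */
    while ((m = chain_recv(&a, 1, &len)) != NULL)
        printf("backend 1 got: %s\n", m);
    return 0;
}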
From: Markus Wanner on 21 Jul 2010 04:33

On 07/21/2010 01:52 AM, Robert Haas wrote:
> On Tue, Jul 20, 2010 at 5:46 PM, Alvaro Herrera
> <alvherre(a)commandprompt.com> wrote:
>> I guess what Robert is saying is that you don't need shmem to pass
>> messages around. The new LISTEN implementation was just an example;
>> imessages aren't supposed to use it directly. Rather, the idea is to
>> store the messages in a new SLRU area. Thus you don't need to mess with
>> dynamically allocating shmem at all.

Okay, so I just need to grok the SLRU stuff. Thanks for clarifying.

Note that I sort of /want/ to mess with shared memory. It's what I know how to deal with. It's how threaded programs work as well. Ya know, locks, condition variables, mutexes, all those nice things that allow you to shoot your foot so terribly nicely... Oh, well...

>> I think it should be rather straightforward. There would be a unique
>> append point;

A unique append point? Sounds like what I had before. That'd be a step backwards, compared to the per-backend queue and an allocator that hopefully scales well with the number of CPU cores.

>> each process desiring to send a new message to another
>> backend would add a new message at that point. There would be one read
>> pointer per backend, and it would be advanced as messages are consumed.
>> Old segments could be trimmed as backends advance their read pointers,
>> similar to how the sinval queue is handled.

That leads to pretty nasty fragmentation. A dynamic allocator should do much better in that regard. (Wamalloc certainly does.)

> If the messages are mostly unicast, it might be nice to contrive a
> method whereby backends didn't need to explicitly advance over
> messages destined only for other backends. Like maybe allocate a
> small, fixed amount of shared memory sufficient for two "pointers"
> into the SLRU area per backend, and then use the SLRU to store each
> message with a header indicating where the next message is to be
> found.

That's pretty much how imessages currently work: a single list of messages queued per backend.

> For each backend, you store one pointer to the first queued
> message and one pointer to the last queued message. New messages can
> be added by making the current last message point to a newly added
> message and updating the last message pointer for that backend. You'd
> need to think about the locking and reference counting carefully to
> make sure you eventually freed up unused pages, but it seems like it
> might be doable.

I've just read through slru.c, but I still don't have a clue how it could replace a dynamic allocator.

At the moment, the creator of an imessage allocates memory, copies the payload there, and then activates the message by appending it to the recipient's queue. Upon getting signaled, the recipient consumes the message by removing it from the queue and is obliged to release the memory the message occupies after having processed it. Simple and straightforward, IMO.

The queue addition and removal is clear. But how would I do the alloc/free part with SLRU? Its blocks are fixed size (BLCKSZ), and the API, with ReadPage and WritePage, is rather unlike a pair of alloc() and free().

> One big advantage of attacking the problem with an SLRU is that
> there's no fixed upper limit on the amount of data that can be
> enqueued at any given time. You can spill to disk or whatever as
> needed (although hopefully you won't normally do so, for performance
> reasons).

Yes, imessages shouldn't ever be spilled to disk. There naturally must be an upper limit for them. (Be it total available memory, as for threaded things, or a given, size-constrained pool, as is the case for dynshmem.)

To me it rather sounds like SLRU is a candidate for using dynamically allocated shared memory underneath, instead of allocating a fixed number of slots in advance. That would allow more efficient use of shared memory. (Given SLRU's ability to spill to disk, it could even be used to 'balance' out anomalies to some extent.)

Regards

Markus Wanner
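[For reference, the lifecycle Markus describes, rendered as a compile-and-run sketch. The function names are illustrative only -- they are not claimed to match the actual imessages patch's API -- and plain malloc/free stands in for the dynamic shared-memory allocator (wamalloc), since the point here is just the create/activate/consume flow.]

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct IMessage
{
    struct IMessage *next;      /* link within the recipient's queue */
    int              sender;    /* backend id of the sender */
    size_t           size;      /* payload size; payload follows the header */
} IMessage;

typedef struct IMessageQueue
{
    IMessage *head;             /* consumed from here */
    IMessage *tail;             /* activated (appended) here */
} IMessageQueue;

/* Creator: allocate the message and copy the payload in. */
static IMessage *
imsg_create(int sender, const void *payload, size_t size)
{
    IMessage *msg = malloc(sizeof(IMessage) + size);  /* wamalloc, in spirit */

    if (msg == NULL)
        return NULL;
    msg->next = NULL;
    msg->sender = sender;
    msg->size = size;
    memcpy(msg + 1, payload, size);
    return msg;
}

/* Creator: activate the message by appending it to the recipient's queue.
 * The real thing would take the queue's lock and signal the recipient. */
static void
imsg_activate(IMessageQueue *recipient, IMessage *msg)
{
    if (recipient->tail)
        recipient->tail->next = msg;
    else
        recipient->head = msg;
    recipient->tail = msg;
}

/* Recipient: detach the first message.  The caller processes it and is
 * then obliged to release it (free() here). */
static IMessage *
imsg_consume(IMessageQueue *q)
{
    IMessage *msg = q->head;

    if (msg != NULL)
    {
        q->head = msg->next;
        if (q->head == NULL)
            q->tail = NULL;
    }
    return msg;
}

int
main(void)
{
    IMessageQueue q = { NULL, NULL };
    IMessage     *msg;

    imsg_activate(&q, imsg_create(42, "payload", 8));
    while ((msg = imsg_consume(&q)) != NULL)
    {
        printf("from backend %d: %s\n", msg->sender, (char *) (msg + 1));
        free(msg);              /* the recipient releases the memory */
    }
    return 0;
}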
From: Robert Haas on 21 Jul 2010 13:25
On Wed, Jul 21, 2010 at 4:33 AM, Markus Wanner <markus(a)bluegap.ch> wrote:
> Okay, so I just need to grok the SLRU stuff. Thanks for clarifying.
>
> Note that I sort of /want/ to mess with shared memory. It's what I know
> how to deal with. It's how threaded programs work as well. Ya know,
> locks, condition variables, mutexes, all those nice things that allow
> you to shoot your foot so terribly nicely... Oh, well...

For what it's worth, I feel your pain. I think the SLRU method is *probably* better, but I feel your pain anyway.

>> For each backend, you store one pointer to the first queued
>> message and one pointer to the last queued message. New messages can
>> be added by making the current last message point to a newly added
>> message and updating the last message pointer for that backend. You'd
>> need to think about the locking and reference counting carefully to
>> make sure you eventually freed up unused pages, but it seems like it
>> might be doable.
>
> I've just read through slru.c, but I still don't have a clue how it
> could replace a dynamic allocator.
>
> At the moment, the creator of an imessage allocates memory, copies the
> payload there, and then activates the message by appending it to the
> recipient's queue. Upon getting signaled, the recipient consumes the
> message by removing it from the queue and is obliged to release the
> memory the message occupies after having processed it. Simple and
> straightforward, IMO.
>
> The queue addition and removal is clear. But how would I do the
> alloc/free part with SLRU? Its blocks are fixed size (BLCKSZ), and the
> API, with ReadPage and WritePage, is rather unlike a pair of alloc()
> and free().

Given what you're trying to do, it does sound like you're going to need some kind of algorithm for space management; but you'll be managing space within the SLRU rather than within shared_buffers. For example, you might end up putting a header on each SLRU page or segment and using that to track the free space within that segment for messages to be read and written. It'll probably be a bit more complex than the one for LISTEN (see asyncQueueAddEntries).

>> One big advantage of attacking the problem with an SLRU is that
>> there's no fixed upper limit on the amount of data that can be
>> enqueued at any given time. You can spill to disk or whatever as
>> needed (although hopefully you won't normally do so, for performance
>> reasons).
>
> Yes, imessages shouldn't ever be spilled to disk. There naturally must
> be an upper limit for them. (Be it total available memory, as for
> threaded things, or a given, size-constrained pool, as is the case for
> dynshmem.)

I guess experience has taught me to be wary of things that are wired in memory. Under extreme memory pressure, something's got to give, or the whole system will croak. Consider also the contrary situation, where the imessages stuff is not in use (even for a short period of time, like a few minutes). Then we'd really rather not still have memory carved out for it.

> To me it rather sounds like SLRU is a candidate for using dynamically
> allocated shared memory underneath, instead of allocating a fixed
> number of slots in advance. That would allow more efficient use of
> shared memory. (Given SLRU's ability to spill to disk, it could even be
> used to 'balance' out anomalies to some extent.)
I think what would be even better is to merge the SLRU pools with the shared_buffers pool, so that the two can duke it out for who is in most need of the limited amount of memory available.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
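[Finally, a sketch of the page-level space management Robert describes in words: each page of the message SLRU carries a small header tracking its free offset and a count of live (unconsumed) messages, so a page can be recycled once everything on it has been released, without a strict ring-buffer reading order. The layout and names are invented for illustration; real code would operate on slru.c pages under the appropriate locks and, as Robert says, be rather more involved than asyncQueueAddEntries.]

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192             /* stand-in for PostgreSQL's block size */

typedef struct MsgPageHeader
{
    uint16_t free_offset;       /* start of the page's free space */
    uint16_t live_count;        /* unconsumed messages still on the page */
} MsgPageHeader;

static void
page_init(char *page)
{
    MsgPageHeader *hdr = (MsgPageHeader *) page;

    hdr->free_offset = sizeof(MsgPageHeader);
    hdr->live_count = 0;
}

/* Try to append a message into this page; false means "advance to the
 * next SLRU page and retry there". */
static bool
page_add_message(char *page, const void *msg, uint16_t len)
{
    MsgPageHeader *hdr = (MsgPageHeader *) page;

    if ((uint32_t) hdr->free_offset + len > BLCKSZ)
        return false;
    memcpy(page + hdr->free_offset, msg, len);
    hdr->free_offset += len;
    hdr->live_count++;
    return true;
}

/* A consumer releases its message; once live_count reaches zero, the
 * trimming logic may recycle the whole page. */
static void
page_release_message(char *page)
{
    MsgPageHeader *hdr = (MsgPageHeader *) page;

    hdr->live_count--;
}

int
main(void)
{
    char page[BLCKSZ];

    page_init(page);
    if (!page_add_message(page, "ping", 5))
        page_init(page);        /* would move to a fresh SLRU page */
    page_release_message(page);
    return 0;
}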