From: Alvaro Herrera on
Excerpts from Markus Wanner's message of Fri Jul 02 19:44:46 -0400 2010:

> Having written a very primitive kind of a dynamic memory allocator for
> imessages [1], I've always wanted a better alternative. So I've
> investigated a bit, refactored step-by-step, and finally came up with
> the attached, lock based dynamic shared memory allocator. Its interface
> is as simple as malloc() and free(). A restart of the postmaster should
> truncate the whole area.
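(The malloc()/free()-style interface described above can be sketched with a deliberately tiny first-fit allocator over a fixed pool. This is an illustration only: the dynshmem_* names and the first-fit scheme are invented here, not taken from the patch, and real shared-memory code would also need locking around every pool operation.)

```c
#include <stddef.h>

#define POOL_SIZE 4096

typedef struct BlockHeader
{
    size_t size;    /* payload size in bytes */
    int    in_use;  /* 0 = free, 1 = allocated */
} BlockHeader;

static _Alignas(BlockHeader) char pool[POOL_SIZE];

static void
dynshmem_init(void)
{
    BlockHeader *first = (BlockHeader *) pool;

    first->size = POOL_SIZE - sizeof(BlockHeader);
    first->in_use = 0;
}

static void *
dynshmem_alloc(size_t size)
{
    char *p = pool;

    size = (size + 7) & ~(size_t) 7;    /* keep headers aligned */

    while (p < pool + POOL_SIZE)
    {
        BlockHeader *h = (BlockHeader *) p;

        if (!h->in_use && h->size >= size)
        {
            /* Split off the remainder if it can hold another block. */
            if (h->size >= size + sizeof(BlockHeader) + 8)
            {
                BlockHeader *rest =
                    (BlockHeader *) (p + sizeof(BlockHeader) + size);

                rest->size = h->size - size - sizeof(BlockHeader);
                rest->in_use = 0;
                h->size = size;
            }
            h->in_use = 1;
            return p + sizeof(BlockHeader);
        }
        p += sizeof(BlockHeader) + h->size;
    }
    return NULL;    /* pool exhausted */
}

static void
dynshmem_free(void *ptr)
{
    BlockHeader *h = (BlockHeader *) ((char *) ptr - sizeof(BlockHeader));

    h->in_use = 0;  /* a real allocator would also coalesce neighbors */
}
```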

Interesting, thanks.

I gave it a skim and found that it badly needs a lot more code comments.

I'm also unconvinced that spinlocks are the best locking primitive here.
Why not lwlocks?

> Being a component which needs to pre-allocate its area in shared memory
> in advance, you need to define a maximum size for the pool of
> dynamically allocatable memory. That's currently defined in shmem.h
> instead of a GUC.

This should be an easy change; I agree that it needs to be configurable.

I'm not sure what kind of resistance you'll see to the idea of a
dynamically allocatable shmem area. Maybe we could use this in other
areas such as allocating space for heavyweight lock objects. Right now
the memory usage for them could grow due to a transitory increase in
lock traffic, leading to out-of-memory conditions later in other
modules. We've seen reports of that problem, so it'd be nice to be able
to fix that with this infrastructure.

I didn't look at the imessages patch (except to notice that I didn't
very much like the handling of out-of-memory, but you already knew that).

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on
On Tue, Jul 20, 2010 at 1:50 PM, Alvaro Herrera
<alvherre(a)commandprompt.com> wrote:
> I'm not sure what kind of resistance you'll see to the idea of a
> dynamically allocatable shmem area. Maybe we could use this in other
> areas such as allocating space for heavyweight lock objects. Right now
> the memory usage for them could grow due to a transitory increase in
> lock traffic, leading to out-of-memory conditions later in other
> modules. We've seen reports of that problem, so it'd be nice to be able
> to fix that with this infrastructure.

Well, you can't really fix that problem with this infrastructure,
because this infrastructure only allows shared memory to be
dynamically allocated from a pool set aside for such allocations in
advance. If a surge in demand can exhaust all the heavyweight lock
space in the system, it can also exhaust the shared pool from which
more heavyweight lock space can be allocated. The failure might
manifest itself in a totally different subsystem though, since the
allocation that failed wouldn't necessarily be a heavyweight lock
allocation, but some other allocation that failed as a result of space
used by the heavyweight locks.

It would be more interesting if you could expand (or contract) the
size of shared memory as a whole while the system is up and running.
Then, perhaps, max_locks_per_transaction and other, similar GUCs could
be made PGC_SIGHUP, which would give you a way out of such situations
that didn't involve taking down the entire cluster. I'm not too sure
how to do that, though.

With respect to imessages specifically, what is the motivation for
using shared memory rather than something like an SLRU? The new
LISTEN implementation uses an SLRU and handles variable-size messages,
so it seems like it might be well-suited to this task.

Incidentally, the link for the imessages patch on the CommitFest page
points to http://archives.postgresql.org/message-id/ab0cd52a64e788f4ecb4515d1e6e4691(a)localhost
- which is the dynamic shmem patch. So I'm not sure where to find the
latest imessages patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Markus Wanner on
Hello Alvaro,

thank you for looking through this code.

On 07/20/2010 07:50 PM, Alvaro Herrera wrote:
> Interesting, thanks.
>
> I gave it a skim and found that it badly needs a lot more code comments.

Hm.. yeah, the dynshmem stuff could probably use more comments. (The
bgworker stuff is probably a better example.)

> I'm also unconvinced that spinlocks are the best locking primitive here.
> Why not lwlocks?

It's derived from a completely lock-free algorithm, as proposed by Maged
M. Michael in: Scalable Lock-Free Dynamic Memory Allocator. I dropped
all of the CAS primitives with their surrounding retry loops and made
further simplifications. Spinlocks simply looked like the simplest
primitive to fall back to. But yeah, splitting into read and write
accesses and using lwlocks might be a win. Or it might not. I honestly
don't know. And it's probably not the best-performing allocator ever.
But it's certainly better than nothing.
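(The simplification described here, replacing CAS retry loops with a plain lock, can be contrasted schematically. In this sketch, C11 atomics stand in for the lock-free side and a pthread mutex stands in for a PostgreSQL spinlock; all identifiers are invented for illustration.)

```c
#include <stdatomic.h>
#include <pthread.h>
#include <stddef.h>

typedef struct FreeNode
{
    struct FreeNode *next;
} FreeNode;

/* Lock-free variant: retry the CAS until no concurrent push interfered. */
static _Atomic(FreeNode *) lf_head = NULL;

static void
lf_push(FreeNode *node)
{
    FreeNode *old = atomic_load(&lf_head);

    do
    {
        node->next = old;
    } while (!atomic_compare_exchange_weak(&lf_head, &old, node));
}

/* Lock-based variant: the retry loop disappears; a short critical
 * section protects the list instead. */
static FreeNode *lk_head = NULL;
static pthread_mutex_t lk_lock = PTHREAD_MUTEX_INITIALIZER;

static void
lk_push(FreeNode *node)
{
    pthread_mutex_lock(&lk_lock);
    node->next = lk_head;
    lk_head = node;
    pthread_mutex_unlock(&lk_lock);
}
```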

I did recently release the lock-free variant as well as a lock based
one, see http://www.bluegap.ch/projects/wamalloc/ for more information.

> I'm not sure what kind of resistance you'll see to the idea of a
> dynamically allocatable shmem area.

So far neither resistance nor applause. I'd love to hear more of an
echo. Even if it's resistance.

> Maybe we could use this in other
> areas

...which is why I've published this separately from Postgres-R.

> such as allocating space for heavyweight lock objects. Right now
> the memory usage for them could grow due to a transitory increase in
> lock traffic, leading to out-of-memory conditions later in other
> modules. We've seen reports of that problem, so it'd be nice to be able
> to fix that with this infrastructure.

Maybe, yes. Sounds like a nice idea.

> I didn't look at the imessages patch (except to notice that I didn't
> very much like the handling of out-of-memory, but you already knew that).

Now that all of the allocation logic has been ripped out, the imessages
patch got quite a bit smaller. imsg.c now consists of only around 370
lines of code.

The handling of out-of-(shared)-memory situations could certainly be
improved, yes. Note that I've already separated out an
IMessageCreateInternal() method, which simply returns NULL in that case.
Is that the API you'd prefer?
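(An allocation API that returns NULL on out-of-shared-memory pushes the policy decision to the caller, who can back off and retry instead of erroring out. In this sketch only the name IMessageCreateInternal() is from the mail; the stub body and everything around it are invented for illustration.)

```c
#include <stddef.h>

typedef struct IMessage
{
    size_t payload_size;
} IMessage;

/* Stub standing in for the real allocator: fail above a threshold,
 * simulating an exhausted shared memory pool. */
static IMessage *
IMessageCreateInternal(size_t payload_size)
{
    static IMessage msg;

    if (payload_size > 1024)
        return NULL;            /* simulate pool exhaustion */
    msg.payload_size = payload_size;
    return &msg;
}

/* One possible caller policy: report failure and let the caller retry
 * later, instead of raising an error. */
static int
try_send(size_t payload_size)
{
    IMessage *msg = IMessageCreateInternal(payload_size);

    if (msg == NULL)
        return 0;               /* out of shared memory: back off */

    /* ... fill in payload and queue the message ... */
    return 1;
}
```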

Getting back to the dynshmem stuff: I don't mind much *which* allocator
to use. I also looked at jemalloc, but haven't been able to integrate it
into Postgres. So I've extended my experiment with wamalloc and turned
it into something usable for Postgres.

Regards

Markus Wanner


From: Markus Wanner on
Hi,

On 07/20/2010 08:23 PM, Robert Haas wrote:
> Well, you can't really fix that problem with this infrastructure,

No, but it would allow you to make better use of the existing amount of
shared memory, possibly avoiding the problem in certain scenarios.

> The failure might
> manifest itself in a totally different subsystem though, since the
> allocation that failed wouldn't necessarily be a heavyweight lock
> allocation, but some other allocation that failed as a result of space
> used by the heavyweight locks.

Yeah, that's a valid concern. Maybe it could be addressed by keeping
track of dynshmem usage per module, and somehow informing the user about
the usage pattern in case of OOM.
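(Per-module accounting could be as simple as tagging each allocation with a module id and keeping byte counters to report on OOM. All names in this sketch are invented for illustration; a real implementation would keep the counters in shared memory and update them under the allocator's lock.)

```c
#include <stddef.h>
#include <stdio.h>

typedef enum DynShmemModule
{
    MOD_IMESSAGES,
    MOD_HEAVYWEIGHT_LOCKS,
    MOD_COUNT
} DynShmemModule;

static size_t module_bytes[MOD_COUNT];

static void
account_alloc(DynShmemModule mod, size_t size)
{
    module_bytes[mod] += size;
}

static void
account_free(DynShmemModule mod, size_t size)
{
    module_bytes[mod] -= size;
}

/* On out-of-memory, report which modules hold the pool's space. */
static void
report_usage(void)
{
    static const char *names[MOD_COUNT] = {"imessages", "heavyweight locks"};

    for (int i = 0; i < MOD_COUNT; i++)
        fprintf(stderr, "dynshmem: %s holds %zu bytes\n",
                names[i], module_bytes[i]);
}
```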

> It would be more interesting

Sure, but then you'd definitely need a dynamic allocator, no?

> With respect to imessages specifically, what is the motivation for
> using shared memory rather than something like an SLRU? The new
> LISTEN implementation uses an SLRU and handles variable-size messages,
> so it seems like it might be well-suited to this task.

Well, imessages predates the new LISTEN implementation by some moons.
They are intended to replace (unix-ish) pipes between processes. I fail
to see the immediate link between (S)LRU and inter-process message
passing. It might be more useful for multiple LISTENers, but I bet it
has slightly different semantics than imessages.

But to be honest, I don't know too much about the new LISTEN
implementation. Do you think a lossless
(single)-process-to-(single)-process message passing system could be
built on top of it?

> Incidentally, the link for the imessages patch on the CommitFest page
> points to http://archives.postgresql.org/message-id/ab0cd52a64e788f4ecb4515d1e6e4691(a)localhost
> - which is the dynamic shmem patch. So I'm not sure where to find the
> latest imessages patch.

The archive doesn't display attachments very well, but the imessages
patch is part of that mail. Maybe you can still find it in your local
mailbox?

In the archive view, it starts at the line that says:
*** src/backend/storage/ipc/imsg.c dc149eef487eafb43409a78b8a33c70e7d3c2bfa

(and, well, the dynshmem stuff ends just before that line. There were
two .diff files attached, IIRC).

Regards

Markus Wanner


From: Alvaro Herrera on
Excerpts from Markus Wanner's message of Tue Jul 20 14:36:55 -0400 2010:

> > I'm also unconvinced that spinlocks are the best locking primitive here.
> > Why not lwlocks?
>
> It's derived from a completely lock-free algorithm, as proposed by Maged
> M. Michael in: Scalable Lock-Free Dynamic Memory Allocator.

Hmm, deriving code from a paper published by IBM sounds like bad news --
who knows what patents they hold on the techniques there?
