global temporary tables [PgSql]

Prev: [HACKERS] global temporary tables
Next: [HACKERS] CIText and pattern_ops

From: Tom Lane on 23 Apr 2010 23:11

Robert Haas <robertmhaas(a)gmail.com> writes:
> A couple of recent threads made got me thinking again about the idea
> of global temporary tables. There seem to be two principal issues:

> 1. What is a global temporary table?

> 2. How could we implement that?

> Despite rereading the "idea: global temp tables" thread from April
> 2009 in some detail, I was not able to get a clear understanding of
> (1).

I believe that the spec's distinction between global and local temp
tables has to do with whether they are visible across module
boundaries. Since we haven't implemented modules, that distinction
is meaningless to us. In the spec, *both* types of temp tables have
the property that the definition (schema) of the table is global
across all sessions, and only the content of the table is session-local.

This arrangement clearly is useful for some applications, but so is our
current definition wherein different sessions can have different schemas
for the same temp table name. So eventually it'd be good to support
both. But the GLOBAL/LOCAL TEMP TABLE distinction is something entirely
different. PG's behavior does not correspond to either of those.

Your idea of using the relmapper layer to instantiate copies of temp
tables is an interesting one. It's only a small piece of the puzzle
though. In particular, what you described would result in the table
having the same OID in all sessions, even though the relfilenodes are
different --- amd since locking is done on the basis of OID, that's
probably *not* what we want. It would be much better for performance
if the different sessions' versions of the table were independently
lockable.

I also kind of wonder what is supposed to happen if someone DROPs or
ALTERs the temp table definition ...

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on 23 Apr 2010 23:18

On Fri, Apr 23, 2010 at 11:11 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas(a)gmail.com> writes:
>> A couple of recent threads made got me thinking again about the idea
>> of global temporary tables. There seem to be two principal issues:
>
>> 1. What is a global temporary table?
>
>> 2. How could we implement that?
>
>> Despite rereading the "idea: global temp tables" thread from April
>> 2009 in some detail, I was not able to get a clear understanding of
>> (1).
>
> I believe that the spec's distinction between global and local temp
> tables has to do with whether they are visible across module
> boundaries. Since we haven't implemented modules, that distinction
> is meaningless to us. In the spec, *both* types of temp tables have
> the property that the definition (schema) of the table is global
> across all sessions, and only the content of the table is session-local.
>
> This arrangement clearly is useful for some applications, but so is our
> current definition wherein different sessions can have different schemas
> for the same temp table name. So eventually it'd be good to support
> both. But the GLOBAL/LOCAL TEMP TABLE distinction is something entirely
> different. PG's behavior does not correspond to either of those.

I don't really care what we call it, although I find the GLOBAL name
convenient and descriptive.

> Your idea of using the relmapper layer to instantiate copies of temp
> tables is an interesting one. It's only a small piece of the puzzle
> though. In particular, what you described would result in the table
> having the same OID in all sessions, even though the relfilenodes are
> different --- amd since locking is done on the basis of OID, that's
> probably *not* what we want. It would be much better for performance
> if the different sessions' versions of the table were independently
> lockable.

Well, it depends on what operation we're talking about. For
operations that involve only the table contents, yeah, we'd like to
lock the versions independently. But for this sort of thing:

> I also kind of wonder what is supposed to happen if someone DROPs or
> ALTERs the temp table definition ...

....not so much. Here you REALLY want a DROP attempt to acquire an
AccessExclusiveLock that will conflict with any outstanding
AccessShareLocks. Similarly, you're only going to be able to modify
the schema for the relation if it's not otherwise in use.

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 23 Apr 2010 23:28

Robert Haas <robertmhaas(a)gmail.com> writes:
> On Fri, Apr 23, 2010 at 11:11 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>> I also kind of wonder what is supposed to happen if someone DROPs or
>> ALTERs the temp table definition ...

> ...not so much. Here you REALLY want a DROP attempt to acquire an
> AccessExclusiveLock that will conflict with any outstanding
> AccessShareLocks. Similarly, you're only going to be able to modify
> the schema for the relation if it's not otherwise in use.

I think you're presuming the answer to the question. We could also view
the desired behavior as being that each session clones the temp table
definition at some instant (eg, first use). The approach that you're
assuming seems fraught with large downsides: in particular, implementing
ALTER TABLE would be a mess. The would-be alterer would need access to
the physical copies of all sessions, which throws out not only the
assumption that the relmapper entries can be private data, but all of
the access optimizations we currently have in the local buffer manager.
Not to mention the coding mess of having to repeat the ALTER operation
for each of N copies, some of which might disappear while we're trying
to do it (or if they don't, we're blocking backends from exiting).
I don't even know how you'd do the ALTER over again N times if you
only have one set of catalog entries describing the N copies.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on 23 Apr 2010 23:46

On Fri, Apr 23, 2010 at 11:28 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas(a)gmail.com> writes:
>> On Fri, Apr 23, 2010 at 11:11 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>>> I also kind of wonder what is supposed to happen if someone DROPs or
>>> ALTERs the temp table definition ...
>
>> ...not so much. Here you REALLY want a DROP attempt to acquire an
>> AccessExclusiveLock that will conflict with any outstanding
>> AccessShareLocks. Similarly, you're only going to be able to modify
>> the schema for the relation if it's not otherwise in use.
>
> I think you're presuming the answer to the question. We could also view
> the desired behavior as being that each session clones the temp table
> definition at some instant (eg, first use). The approach that you're
> assuming seems fraught with large downsides: in particular, implementing
> ALTER TABLE would be a mess. The would-be alterer would need access to
> the physical copies of all sessions, which throws out not only the
> assumption that the relmapper entries can be private data, but all of
> the access optimizations we currently have in the local buffer manager.

I agree, that would be pretty unfortunate, althogh maybe it's the only
way to make it work. It's not what I had in mind. I was thinking
that the would-be ALTERER could just take an AccessExclusiveLock, but
now that I think about it that doesn't work, since a backend could
have the table unlocked between transactions but still have private
contents in it. :-(

> Not to mention the coding mess of having to repeat the ALTER operation
> for each of N copies, some of which might disappear while we're trying
> to do it (or if they don't, we're blocking backends from exiting).
> I don't even know how you'd do the ALTER over again N times if you
> only have one set of catalog entries describing the N copies.

Well, if you clone the table, that just pushes the problem around.
When I run ALTER TABLE on one of these thingamabobs, does it modify my
clone? The original? Both? If it modifies my clone, how do we
modify the original? If it modifies the original, won't I be rather
surprised to find my clone unaffected? If it modifies both, how do we
avoid complete havoc if the original has since been modified (perhaps
incompatibly, perhaps not) by some other backend doing its own ALTER
TABLE?

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Pavel Stehule on 24 Apr 2010 02:42

2010/4/24 Robert Haas <robertmhaas(a)gmail.com>:
> A couple of recent threads made got me thinking again about the idea
> of global temporary tables. Â There seem to be two principal issues:
>
> 1. What is a global temporary table?
>
> 2. How could we implement that?
>
> Despite rereading the "idea: global temp tables" thread from April
> 2009 in some detail, I was not able to get a clear understanding of
> (1). Â What I *think* it is supposed to mean is that the table is a
> permanent object which is "globally" visible - that is, it's part of
> some non-temp schema like public or $user and it's column definitions
> etc. are visible to all backends - and it's not automatically removed
> on commit, backend exit, etc. - but the *contents* of the table are
> temporary and backend-local, so that each new backend initially sees
> it as empty and can then insert, update, and delete data independently
> of what any other backend does.
>
> As to (2), my thought is that perhaps we could implement this by
> instantiating a separate relfilenode for the relation for each backend
> which accesses it. Â relfilenode would be 0 in pg_class, as it is for
> "mapped" relations, but every time a backend touched the rel, we'd
> allocate a relfilenode and associated the oid of the temp table to it
> using some kind of backend-local storage - actually similar to what
> the relmapper code does, except without the complexity of ever
> actually having to persist the value; and perhaps using a hash table
> rather than an array, since the number of mapped rels that a backend
> can need to deal with is rather more limited than the number of temp
> tables it might want to use.

it is good idea.

I missing some ideas about statistics, about indexes.

Regards
Pavel Stehule
>
> Thoughts?
>
> ...Robert
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

| Next | Last
Pages: 1 2 3 4 5 6 7 8 9
Prev: [HACKERS] global temporary tables
Next: [HACKERS] CIText and pattern_ops