global temporary tables [PgSql]

Prev: [HACKERS] global temporary tables
Next: [HACKERS] CIText and pattern_ops

From: "Greg Sabino Mullane" on 24 Apr 2010 12:02

-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

>...
> surprised to find my clone unaffected? If it modifies both, how do we
> avoid complete havoc if the original has since been modified (perhaps
> incompatibly, perhaps not) by some other backend doing its own ALTER
> TABLE?

Since this is such a thorny problem, and this is a temporary table, why
not just disallow ALTER completely for the first pass?

- --
Greg Sabino Mullane greg(a)turnstep.com
PGP Key: 0x14964AC8 201004241201
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkvTFfsACgkQvJuQZxSWSsjrVACePmmNglGi6KoZgYZ7zjUm4gPm
o2wAoNYYuiZl1HZXsgiAOQkJzNUmaORm
=IijV
-----END PGP SIGNATURE-----

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 24 Apr 2010 12:11

"Greg Sabino Mullane" <greg(a)turnstep.com> writes:
>> surprised to find my clone unaffected? If it modifies both, how do we
>> avoid complete havoc if the original has since been modified (perhaps
>> incompatibly, perhaps not) by some other backend doing its own ALTER
>> TABLE?

> Since this is such a thorny problem, and this is a temporary table, why
> not just disallow ALTER completely for the first pass?

Usually the way we approach these kinds of problems is that we want
to see some plausible outline for how they might be fixed before we
move forward with the base feature. IOW, I wouldn't object to not
having ALTER in the first release, but if we have no idea how to do
ALTER at all I'd be too worried that we were painting ourselves into
a corner.

Or maybe you can make a case that there's no need to allow ALTER at
all, ever. But surely DROP needs to be possible, and that seems to
already introduce some of the same issues.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on 24 Apr 2010 13:16

On Sat, Apr 24, 2010 at 12:11 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> "Greg Sabino Mullane" <greg(a)turnstep.com> writes:
>>> surprised to find my clone unaffected? If it modifies both, how do we
>>> avoid complete havoc if the original has since been modified (perhaps
>>> incompatibly, perhaps not) by some other backend doing its own ALTER
>>> TABLE?
>
>> Since this is such a thorny problem, and this is a temporary table, why
>> not just disallow ALTER completely for the first pass?
>
> Usually the way we approach these kinds of problems is that we want
> to see some plausible outline for how they might be fixed before we
> move forward with the base feature. IOW, I wouldn't object to not
> having ALTER in the first release, but if we have no idea how to do
> ALTER at all I'd be too worried that we were painting ourselves into
> a corner.
>
> Or maybe you can make a case that there's no need to allow ALTER at
> all, ever. But surely DROP needs to be possible, and that seems to
> already introduce some of the same issues.

I had the same thought as GSM this morning. More specifically, it
seems to me that the problematic cases are precisely those in which
you might feel an urge to touch somebody else's local buffers, so I
think we should disallow, approximately, those ALTER TABLE cases which
require a full-table rewrite. I don't see the problem with DROP.
Under the proposed design, it's approximately equivalent to dropping a
table that someone else has truncated. You just wait for the
necessary lock and then do it.

At least AIUI, the use case for this feature is that you want to avoid
creating "the same" temporary table over and over again. The schema
is fixed and doesn't change much, but you're creating it lots and lots
of times in lots and lots of different backends, leading to both
management and performance difficulties. If you want to be able to
change the schema frequently or in a backend-local way, use the
existing temporary table feature.

Now, there is ONE problem with DROP, which is that you might orphan
some heaps. Of course, that can also happen due to a backend crash.
Currently, autovacuum arranges to drop any orphaned temp tables that
have passed the wraparound threshold, but even if we were happy with
waiting 2 billion transactions to get things cleaned up, the mechanism
can't work here because it relies on being able to examine the
pg_class row and determine which backend owns it, and where the
storage is located.

We could possibly set things up so that a running backend will notice
if a global temporary table for which it's created a private
relfilenode gets dropped, and blow away the backing file. But that
doesn't protect against crashes, so I think we're going to need some
other garbage collection mechanism, either instead of in addition to
asking backends to clean up after themselves. I'm not quite sure what
the design of that should look like, though.

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 24 Apr 2010 13:31

Robert Haas <robertmhaas(a)gmail.com> writes:
> At least AIUI, the use case for this feature is that you want to avoid
> creating "the same" temporary table over and over again.

The context that I've seen it come up in is that people don't want to
clutter their functions with create-it-if-it-doesn't-exist logic,
which you have to have given the current behavior of temp tables.
Any performance gain from reduced catalog churn would be gravy.

Aside from the DROP problem, I think this implementation proposal
has one other big shortcoming: what are you going to do about
table statistics? In many cases, you really *have* to do an ANALYZE
once you've populated a temp table, if you want to get decent plans
for it. Where will you put those stats?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 24 Apr 2010 13:38

[ forgot to respond to this part ]

Robert Haas <robertmhaas(a)gmail.com> writes:
> ... I don't see the problem with DROP.
> Under the proposed design, it's approximately equivalent to dropping a
> table that someone else has truncated. You just wait for the
> necessary lock and then do it.

And do *what*? You can remove the catalog entries, but how are you
going to make the physical storage of other backends' versions go away?
(To say nothing of making them flush their local buffers for it.)
If you do remove the catalog entries, won't you be cutting the knees
out from under whatever end-of-session cleanup processing might exist
in those other backends?

The idea of the global table as a template that individual sessions
clone working tables from would avoid most of these problems. You
rejected it on the grounds that ALTER would be too hard; but if you're
blowing off ALTER anyway, that argument seems pretty unimpressive.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8 9
Prev: [HACKERS] global temporary tables
Next: [HACKERS] CIText and pattern_ops