[PATCH 4/4] Add tests to dblink covering use of COPY TO FUNCTION [PgSql]

Prev: [HACKERS] [PATCH 1/4] Add "COPY ... TO FUNCTION ..." support
Next: Backup history file should be replicated in Streaming Replication?

From: Robert Haas on 30 Dec 2009 00:00

On Tue, Dec 29, 2009 at 11:44 PM, Jeff Davis <pgsql(a)j-davis.com> wrote:
> On Tue, 2009-12-29 at 23:11 -0500, Robert Haas wrote:
>> I fear that to make this really useful we would need to define some
>> new SQL syntax, like:
>>
>> CREATE [OR REPLACE] COPY TARGET name (STARTUP function_name, STREAM
>> function_name, SHUTDOWN function_name);
>> DROP COPY TARGET name;
>> GRANT USAGE ON COPY TARGET TO ...;
>>
>> COPY ... TO/FROM TARGET name (generic_option_list) WITH (options);
>
> Similar ideas were already suggested:
>
> http://archives.postgresql.org/pgsql-hackers/2009-11/msg01601.php
> http://archives.postgresql.org/pgsql-hackers/2009-11/msg01610.php

Sorry, it's been a while since I've read through this thread and I am
not as up on it as perhaps I should be. I generally agree with those
ideas, although I think that trying to make the existing aggregate
interface serve this purpose will probably turn out to be trying to
make a square peg fit a round hole.

> Regardless, I think there needs to be a way to pass arguments to the
> functions (at least the startup one). The obvious use case is to pass
> the destination table name, so that you don't have to define a separate
> target for each destination table.

Agreed, note that I suggested a way to do that.

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on 30 Dec 2009 00:11

On Tue, Dec 29, 2009 at 11:47 PM, Daniel Farina <drfarina(a)acm.org> wrote:
> On Tue, Dec 29, 2009 at 8:11 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
>> It might be possible to do this without introducing a notion of an
>> explicit object, but there are a couple of problems with that. First,
>> it requires the user to specify a lot of details on every COPY
>> invocation, which is awkward. Second, there's a security issue to
>> think about here. If we were just copying to a UDF that took a string
>> as an argument, that would be fine, but it's not safe to let
>> unprivileged users to directly invoke functions that take a
>> type-internal argument. Introducing an explicit object type would
>> allow the creation of copy targets to be restricted to super-users but
>> then granted out to whom the super-user chooses.
>>
>> Thoughts?
>
> Agree on the type internal and superuser access -- indeed, if one were
> to refactor the two existing COPY modes into external functions, the
> stdout behavior would be marked with SECURITY DEFINER and the to-file
> functions would only be superuser-accessible. (Interesting note: that
> means copy.c could theoretically lose the special check for superuser
> privilege for this mode, relying on standard function permissions...).
>
> I was mostly satisfied with a byzantine but otherwise semantically
> simple interface until the idea matures some more -- perhaps in
> practice -- to inform the most useful kind of convenience to support
> it. I don't see a strong reason personally to rush into defining such
> an interface just yet, although an important interface we would have
> to define is contract a user would have to follow to enable their very
> own fully featured COPY output mode.

I think we should try to put an interface layer in place in the first
iteration. I don't really see this as having much value without that.
If we implement this strictly as a series of COPY options, we're
going to be committed to supporting that interface forever whether it
turns out to be good for anything or not. Since any such interface
would pretty much have to be superuser-only, I'm inclined to think
that there is not enough bang for the buck to justify the ongoing
maintenance effort.

One way to figure out whether we've define the COPY TARGET interface
correctly is to put together a few sample implementations and see
whether they seem functional and useful, without too many lacunas. I
think it should be possible to assess from a few such implementations
whether we have basically the right design.

> Still, the patch as-submitted is quite far from achieving one of my
> main litmus test goals: subsumption of existing COPY behavior.
> Particularly thorny was how to support the copying-to-a-file
> semantics, but I believe that the copy-options patch provide a nice
> avenue to solve this problem, as one can call a function in the
> options list and somehow pass the return value of that initializer --
> which may include a file handle -- to the byte-handling function.

It may be useful to consider whether the current behavior could be
re-implemented using some proposed extension mechanism, but I would
not put much emphasis on trying to really subsume the current behavior
into such an implementation. Certainly, we will need to continue
support the existing syntax forever, so the existing functionality
will need to be special-cased at least to that extent.

> Finally, I think a valid point was made that the patch is much more
> powerful to end users if it supports byte arrays, and there are some
> open questions as to whether this should be the only/primary supported
> mode. I personally like the INTERNAL-type interface, as dealing with
> the StringInfo buffer used by current COPY code is very convenient
> from C and the current implementation is not very complicated yet
> avoids unnecessary copying/value coercions, but I agree there is
> definitely value in enabling the use of more normal data types.

I agree that needs some further thought. Again, a few sample
implementations might help provide clarity here. I agree that the
StringInfo representation is a bit random, but OTOH I mostly see
someone installing a few COPY TARGET implementations into their DB and
then using them over and over again. Creating a new COPY TARGET
should be relatively rare, so if they have to be written in C, we may
not care that much. On the flip side we should be looking for some
concrete gain from putting that restriction into the mix.

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Daniel Farina on 30 Dec 2009 00:26

On Tue, Dec 29, 2009 at 9:11 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
> I think we should try to put an interface layer in place in the first
> iteration. I don't really see this as having much value without that.
> If we implement this strictly as a series of COPY options, we're
> going to be committed to supporting that interface forever whether it
> turns out to be good for anything or not. Since any such interface
> would pretty much have to be superuser-only, I'm inclined to think
> that there is not enough bang for the buck to justify the ongoing
> maintenance effort.

Interestingly, I came to the opposite conclusion you did: I thought
supporting new top-level syntax might be the more invasive and
support-heavy change as opposed to just changing the option handling
grammar to support funcall-looking things on the right-hand-side.

True, the semantics and interfaces between symbols would need to be
somewhat frozen and documented in either case. I do value in
supporting a notion of "this constellation of functions is OK, even if
one of them does take type INTERNAL", whereas my proposal has no
mechanism to address such a constellation: everything is independent,
which does not capture all the information necessary to determine if
it's safe to execute a particular combination of functions to perform
a COPY.

I think you may be right about this, so we could basically move most
things from the COPY option list into a new DDL option list...I am
more ambivalent, but it seems that would require a catalog, and so on,
which is why I did not leap to do that initially for the very
support-reasons you mentioned.

> One way to figure out whether we've define the COPY TARGET interface
> correctly is to put together a few sample implementations and see
> whether they seem functional and useful, without too many lacunas. I
> think it should be possible to assess from a few such implementations
> whether we have basically the right design.

Okay, I was simply less optimistic than that (that we could get it
right so easily), and was content to put out something that had
reasonable design but perhaps byzantine (but well-defined) interfaces
and let complaints/wishes drive completion.

But given that the main difference between your proposal and mine is
to move things out of COPY's option list and into a toplevel DDL
option list, under the covers things are pretty samey, except I would
imagine your proposal requires committing to a new catalog(?) which
also enables addressing the combination of functions as a unit and a
new top-level DDL (but avoids committing to many new COPY options).

> I agree that needs some further thought. Again, a few sample
> implementations might help provide clarity here. I agree that the
> StringInfo representation is a bit random, but OTOH I mostly see
> someone installing a few COPY TARGET implementations into their DB and
> then using them over and over again. Creating a new COPY TARGET
> should be relatively rare, so if they have to be written in C, we may
> not care that much. On the flip side we should be looking for some
> concrete gain from putting that restriction into the mix.

I was originally inclined to agree, but I believe there are others who
the most interesting and use applications are only captured if
userland types are supported, and I can see good reason for that
belief...

Even something as simple as authoring "tee" for COPY will require
writing C here, whereas when supporting userland types people can hack
out a PL function that calls some other existing C-written functions
(we can provide a BYTEA veneer on top of the INTERNAL-version of a
function rather trivially express for this purpose...)

fdr

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on 30 Dec 2009 00:46

On Wed, Dec 30, 2009 at 12:26 AM, Daniel Farina <drfarina(a)acm.org> wrote:
> I think you may be right about this, so we could basically move most
> things from the COPY option list into a new DDL option list...I am
> more ambivalent, but it seems that would require a catalog, and so on,
> which is why I did not leap to do that initially for the very
> support-reasons you mentioned.

Adding a catalog doesn't bother me a bit. That doesn't really cost
anything. What I'm concerned about in terms of maintenance is that
anything we support in 8.5 will likely have to be supported in 8.6,
8.7, 8.8, 8.9, and 8.10, and probably beyond that. If we do something
that turns out to be CLEARLY a bad idea, then we MIGHT eventually be
able to get rid of it. But we probably won't be that (un)lucky.
We're more likely to do something that's sorta-useful, but makes
future enhancements that we care about more difficult. And then we'll
be stuck with it. That's really what I care about avoiding.

In other words, it's OK if we add something now that may need to be
further extended in the future, but it's BAD if we add something now
that we want to change incompatibly or take out in the future.

> But given that the main difference between your proposal and mine is
> to move things out of COPY's option list and into a toplevel DDL
> option list, under the covers things are pretty samey, except I would
> imagine your proposal requires committing to a new catalog(?) which
> also enables addressing the combination of functions as a unit and a
> new top-level DDL (but avoids committing to many new COPY options).

I agree with that summary. I don't think the catalog is a problem
(though we need to make sure to get the dependency handling correct).
The additional DDL syntax is more likely to draw objections, although
perhaps unsurprisingly I don't have a problem with it personally.

>> I agree that needs some further thought. Again, a few sample
>> implementations might help provide clarity here. I agree that the
>> StringInfo representation is a bit random, but OTOH I mostly see
>> someone installing a few COPY TARGET implementations into their DB and
>> then using them over and over again. Creating a new COPY TARGET
>> should be relatively rare, so if they have to be written in C, we may
>> not care that much. On the flip side we should be looking for some
>> concrete gain from putting that restriction into the mix.
>
> I was originally inclined to agree, but I believe there are others who
> the most interesting and use applications are only captured if
> userland types are supported, and I can see good reason for that
> belief...
>
> Even something as simple as authoring "tee" for COPY will require
> writing C here, whereas when supporting userland types people can hack
> out a PL function that calls some other existing C-written functions
> (we can provide a BYTEA veneer on top of the INTERNAL-version of a
> function rather trivially express for this purpose...)

Sure. If you think you can make it work using bytea, that seems
clearly better than using an internal type, all things being equal.

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Daniel Farina on 30 Dec 2009 00:56

On Tue, Dec 29, 2009 at 9:46 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
> Sure. If you think you can make it work using bytea, that seems
> clearly better than using an internal type, all things being equal.

I think both are perfectly viable and can be supported simultaneously,
actually...I simply like INTERNAL because the mechanism and passed
data structure for when you *do* want to write a C function is just
really crisp and simple, and is not going to be a source of overhead.

fdr

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10 11
Prev: [HACKERS] [PATCH 1/4] Add "COPY ... TO FUNCTION ..." support
Next: Backup history file should be replicated in Streaming Replication?