From: Pavel Stehule on 24 Nov 2009 08:39

2009/11/24 Daniel Farina <drfarina(a)gmail.com>:
> On Tue, Nov 24, 2009 at 4:37 AM, Pavel Stehule <pavel.stehule(a)gmail.com> wrote:
>> 2009/11/24 Daniel Farina <drfarina(a)gmail.com>:
>>> On Tue, Nov 24, 2009 at 2:10 AM, Pavel Stehule <pavel.stehule(a)gmail.com> wrote:
>>>> Hello
>>>>
>>>> I think this patch may be a good idea, but I am missing a better function
>>>> specification. Specification by name isn't enough - we can have overloaded
>>>> functions. This syntax doesn't allow an explicit cast, and - from my
>>>> personal view - the syntax is ugly; with a type specification we don't
>>>> need the keyword FUNCTION.
>>>
>>> As long as things continue to support the INTERNAL-type behavior for
>>> extremely low overhead bulk transfers I am open to suggestions about
>>> how to enrich things...but how would I do so under this proposal?
>>
>> Using an INTERNAL type is wrong. It breaks the design of these functions
>> for ordinary PLs, and I don't see any reason why it's necessary.
>>
>>> I am especially fishing for suggestions in the direction of managing
>>> state for the function between rows though...I don't like how the
>>> current design seems to scream "use a global variable."
>>>
>>>> We have a fast COPY statement - ok; we have a fast function - ok; but
>>>> inside the function we have to call a "slow" SQL query. What is the
>>>> advantage?
>>>
>>> The implementation here uses the type 'internal' for performance. It
>>> doesn't even recompute the fcinfo because of the very particular
>>> circumstances of how the function is called. It doesn't do a memory
>>> copy of the argument buffer either, to the best of my knowledge. In
>>> the dblink patches you basically stream directly from the disk, format
>>> the COPY bytes, and shove it into a waiting COPY on another postgres
>>> node...there's almost no additional work in between. All utilized
>>> time would be some combination of the normal COPY byte stream
>>> generation and libpq.
>>
>> I understand, and I dislike it. This design isn't general - or at least
>> it is far from being a function. It doesn't use the complete FUNCAPI
>> interface. I think you need different semantics: you are not using a
>> function, you are using something like a "stream object". This stream
>> object can have an input function, an output function, and parameters -
>> internal or standard (I don't think internal buys any significant
>> performance here). The syntax should be similar to CREATE AGGREGATE.
>
> I think you might be right about this. At the time I was too shy to
> add a DDL command for this hack, though. But what I did want is a
> form of currying, and that's not easily accomplished in SQL without
> extension...

COPY is already a PostgreSQL extension; if there are other related extensions - why not? PostgreSQL has lots of database objects beyond the SQL standard - see the fulltext implementation. I am not sure STREAM is a good keyword, though; it could collide with STREAM from streaming databases.

>> then the syntax should be:
>>
>> COPY table TO streamname(parameters)
>>
>> COPY table TO filestream('/tmp/foo.dta') ...
>> COPY table TO dblinkstream(connectionstring) ...
>
> I like this one quite a bit...it's a bit like an aggregate, except the
> initial condition can be set in a rather function-callish way.
>
> But that does seem to require making a DDL command, which leaves a
> nice green field. In particular, we could then make as many hooks,
> flags, and options as we wanted, but sometimes there is a paradox of
> choice...I just did not want to anticipate on Postgres being friendly
> to a new DDL command when writing this the first time.

Sure - nobody likes too many changes in gram.y. But a well-designed general feature with the related SQL enhancements is more acceptable than a quick hack. Don't be in a hurry. This idea is good, but it needs:

a) a well-designed C API, something like:

   initialise_function(fcinfo)     -- standard fcinfo
   consumer_process_tuple(fcinfo)  -- receives a standard row: Datum dvalues[] + row description
   producer_process_tuple(fcinfo)  -- returns a standard row: Datum dvalues[] + row description
                                      (look at the SRF API)
   terminate_function(fcinfo)

   I am sure this could be similar to the AGGREGATE API, plus some samples in contrib.

b) well-designed PL/Perlu and PL/Pythonu interfaces, plus some samples in the documentation.

Regards
Pavel Stehule
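Pavel's proposed callbacks above (initialise_function, consumer_process_tuple, producer_process_tuple, terminate_function) do not exist anywhere yet; the closest existing machinery he points to is CREATE AGGREGATE, where a transition function, a state type, and an initial condition already play the roles of per-row callback, carried state, and setup. As a minimal, runnable illustration of that existing pattern (the names sum_text_len and total_length are invented for this sketch):

    -- transition function: old state + one input value -> new state
    CREATE FUNCTION sum_text_len(bigint, text) RETURNS bigint
        LANGUAGE sql AS 'SELECT $1 + coalesce(length($2), 0)';

    CREATE AGGREGATE total_length(text) (
        SFUNC    = sum_text_len,  -- called once per row, like the proposed consumer_process_tuple
        STYPE    = bigint,        -- per-call state, instead of "use a global variable"
        INITCOND = '0'            -- initial state, like the proposed initialise_function
    );

    SELECT total_length(relname::text) FROM pg_class;

A stream object defined the same way would bind its setup parameters at COPY time (COPY table TO dblinkstream('host=... dbname=...')), which is the currying Daniel is asking for, without any global state.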
From: Robert Haas on 24 Nov 2009 09:50

On Mon, Nov 23, 2009 at 8:46 PM, Greg Smith <greg(a)2ndquadrant.com> wrote:
> You know how people complain about how new contributors are treated here?
> Throwing out comments like this, that come off as belittling to other
> people's work, doesn't help. All I was suggesting was that Dan wasn't
> developing this in complete isolation from the hackers community, as Robert
> had feared, as will be obvious when we get to:

I still think it's better to have discussion on the mailing list than
elsewhere. But we're doing that now, so, good.

> As far as other past discussion here that might be relevant, this patch
> includes a direct change to gram.y to support the new syntax. You've
> already suggested before that it might be time to update COPY the same way
> EXPLAIN and now VACUUM have been overhauled to provide a more flexible
> options interface:
> http://archives.postgresql.org/pgsql-hackers/2009-09/msg00616.php
> This patch might be more fuel for that idea.

FWIW, Tom already committed a patch by Emmanuel and myself that did this.

...Robert
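The committed patch Robert refers to replaces COPY's fixed keyword list with the same parenthesized, extensible option list that EXPLAIN and VACUUM gained, which is where additional options could later be slotted in. A before/after example (the table name is just a placeholder):

    -- traditional fixed-keyword form
    COPY pgbench_accounts TO '/tmp/accounts.csv' CSV HEADER;

    -- generic option-list form introduced by that patch
    COPY pgbench_accounts TO '/tmp/accounts.csv' WITH (FORMAT csv, HEADER true);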
From: Pavel Stehule on 24 Nov 2009 09:52

2009/11/24 Hannu Krosing <hannu(a)2ndquadrant.com>:
> On Tue, 2009-11-24 at 05:00 -0800, Daniel Farina wrote:
>> On Tue, Nov 24, 2009 at 4:37 AM, Pavel Stehule <pavel.stehule(a)gmail.com> wrote:
>
>>> then the syntax should be:
>>>
>>> COPY table TO streamname(parameters)
>>>
>>> COPY table TO filestream('/tmp/foo.dta') ...
>>> COPY table TO dblinkstream(connectionstring) ...
>
> You probably meant
>
> COPY table TO dblinkstream(connectionstring, table)
>
> ?
>
>> I like this one quite a bit...it's a bit like an aggregate, except the
>> initial condition can be set in a rather function-callish way.
>>
>> But that does seem to require making a DDL command, which leaves a
>> nice green field.
>
> not necessarily DDL, maybe just a "copystream" type and a set of
> functions creating objects of that type.
>
> if you make it a proper type with input and output functions, then you
> can probably use it in statements like this
>
> COPY table TO (select stream::copystream from streams where id = 7);
>
> COPY table TO 'file:/tmp/outfile'::copystream;
>
> COPY table TO 'dblink::<connectstring>'::copystream;

That is interesting - but you still need DDL to declare the stream. It is analogous to a function:

CREATE FUNCTION ....
SELECT 'foo'::regprocedure

Still, the syntax COPY table TO copystream is a good idea. I like it.

>> In particular, we could then make as many hooks,
>> flags, and options as we wanted, but sometimes there is a paradox of
>> choice...I just did not want to anticipate on Postgres being friendly
>> to a new DDL command when writing this the first time.
>
> fulltext lived for quite some time as a set of types and functions before
> it was glorified with its own DDL syntax.

What is DDL? A wrapper around an insert into the system catalog. So we can have a table pg_catalog.copystream, and for initial testing a function

CREATE OR REPLACE FUNCTION register_copystream(regproc, regproc, regproc ....)

If we are happy with it, it is then about a day of work to support the statement

CREATE COPYSTREAM ( ...

Regards
Pavel Stehule

> It may be good to have the same approach here - do it as a set of types
> and functions first, think about adding DDL once it has stabilised
> enough
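Hannu's "proper type with input and output functions" follows a pattern PostgreSQL already supports without new DDL: declare a shell type, create its I/O functions, then fill in the type definition, the same way contrib types are built. A sketch of what that could look like for the hypothetical copystream (the C functions copystream_in and copystream_out do not exist and would have to be written; MODULE_PATHNAME assumes a contrib-style build):

    CREATE TYPE copystream;      -- shell type so the I/O functions can reference it

    CREATE FUNCTION copystream_in(cstring) RETURNS copystream
        AS 'MODULE_PATHNAME', 'copystream_in' LANGUAGE C STRICT;

    CREATE FUNCTION copystream_out(copystream) RETURNS cstring
        AS 'MODULE_PATHNAME', 'copystream_out' LANGUAGE C STRICT;

    CREATE TYPE copystream (
        INPUT          = copystream_in,   -- would parse 'file:...' or 'dblink::...' strings
        OUTPUT         = copystream_out,
        INTERNALLENGTH = VARIABLE
    );

With that in place, Hannu's literals such as 'file:/tmp/outfile'::copystream become ordinary typed values, and Pavel's register_copystream() (or an eventual CREATE COPYSTREAM statement) would only be needed to tie stream names to their handler functions in a catalog.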
From: Robert Haas on 24 Nov 2009 09:56

On Mon, Nov 23, 2009 at 9:37 PM, Andrew Dunstan <andrew(a)dunslane.net> wrote:
>
> Greg Smith wrote:
>> I haven't heard anything from Andrew about ragged CSV import either. I
>> think that ultimately those features are useful, but just exceed what the
>> existing code could be hacked to handle cleanly.
>
> The patch is attached for your edification/amusement. I have backpatched it
> to 8.4 for the client that needed it, and it's working just fine. I didn't
> pursue it when it was clear that it was not going to be accepted. COPY
> returning text[] would allow us to achieve the same thing, a bit more
> verbosely, but it would be a lot more work to develop.

FWIW, I've somewhat come around to this idea. But I might be the only one.

...Robert
From: Tom Lane on 24 Nov 2009 23:44
Jeff Davis <pgsql(a)j-davis.com> writes:
> Don't you still need the functions to accept an argument of type
> internal? Otherwise, we lose the ability to copy a buffer to the dblink
> connection, which was the original motivation.

If you do that, then there is no possibility of ever using this feature
except with C-coded functions, which seems to me to remove most of
whatever use-case there was.

			regards, tom lane
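The restriction Tom describes is concrete: internal is a pseudo-type that only C-language functions may accept, so a stream handler declared with an internal argument can never be written in PL/pgSQL, PL/Perl, or PL/Python, nor invoked from plain SQL. A quick illustration (the function name is made up; exact error wording may vary across versions):

    CREATE FUNCTION stream_sink(internal) RETURNS void
        LANGUAGE plpgsql AS $$ BEGIN NULL; END; $$;
    -- ERROR:  PL/pgSQL functions cannot accept type internal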