From: Pavel Stehule on 24 Nov 2009 08:39

2009/11/24 Daniel Farina <drfarina(a)gmail.com>:
> On Tue, Nov 24, 2009 at 4:37 AM, Pavel Stehule <pavel.stehule(a)gmail.com> wrote:
>> 2009/11/24 Daniel Farina <drfarina(a)gmail.com>:
>>> On Tue, Nov 24, 2009 at 2:10 AM, Pavel Stehule <pavel.stehule(a)gmail.com> wrote:
>>>> Hello
>>>>
>>>> I think this patch may be a good idea, but I am missing a better function
>>>> specification. Specification by name isn't enough - we can have overloaded
>>>> functions. This syntax doesn't allow an explicit cast, and - from my
>>>> personal view - the syntax is ugly; with a type specification we don't
>>>> need the keyword FUNCTION.
>>>
>>> As long as things continue to support the INTERNAL-type behavior for
>>> extremely low overhead bulk transfers I am open to suggestions about
>>> how to enrich things...but how would I do so under this proposal?
>>
>> Using an INTERNAL type is wrong. It breaks the design of these functions
>> for ordinary PLs, and I don't see any reason why it's necessary.
>>
>>> I am especially fishing for suggestions in the direction of managing
>>> state for the function between rows though...I don't like how the
>>> current design seems to scream "use a global variable."
>>>
>>>> We have a fast COPY statement - ok; we have a fast function - ok; but
>>>> inside the function we have to call a "slow" SQL query. What is the
>>>> advantage?
>>>
>>> The implementation here uses the type 'internal' for performance. It
>>> doesn't even recompute the fcinfo because of the very particular
>>> circumstances of how the function is called. It doesn't do a memory
>>> copy of the argument buffer either, to the best of my knowledge. In
>>> the dblink patches you basically stream directly from the disk, format
>>> the COPY bytes, and shove it into a waiting COPY on another postgres
>>> node...there's almost no additional work in between. All utilized
>>> time would be some combination of the normal COPY byte stream
>>> generation and libpq.
>>
>> I understand, and I dislike it. This design isn't general - or at least
>> it is far from being a function. It doesn't use the complete FUNCAPI
>> interface. I think you need different semantics: you are not using a
>> function, you are using something like a "stream object". This stream
>> object can have an input function, an output function, and parameters -
>> internal or standard (I don't think internal buys any significant
>> performance here). The syntax should be similar to CREATE AGGREGATE.
>
> I think you might be right about this. At the time I was too shy to
> add a DDL command for this hack, though. But what I did want is a
> form of currying, and that's not easily accomplished in SQL without
> extension...

COPY is already a PostgreSQL extension; if there are other related extensions - why not? PostgreSQL has lots of database objects beyond the SQL standard - see the fulltext implementation. I am not sure STREAM is a good keyword, though; it could collide with STREAM from streaming databases.

>> then the syntax should be:
>>
>> COPY table TO streamname(parameters)
>>
>> COPY table TO filestream('/tmp/foo.dta') ...
>> COPY table TO dblinkstream(connectionstring) ...
>
> I like this one quite a bit...it's a bit like an aggregate, except the
> initial condition can be set in a rather function-callish way.
>
> But that does seem to require making a DDL command, which leaves a
> nice green field. In particular, we could then make as many hooks,
> flags, and options as we wanted, but sometimes there is a paradox of
> choice...I just did not want to anticipate on Postgres being friendly
> to a new DDL command when writing this the first time.

Sure - nobody likes too many changes in gram.y. But a well-designed general feature with the related SQL enhancements is more acceptable than a quick hack. Don't be in a hurry. This idea is good, but it needs:

a) a well-designed C API, something like:

   initialise_function(fcinfo)     -- standard fcinfo
   consumer_process_tuple(fcinfo)  -- receives a standard row: Datum dvalues[] + row description
   producer_process_tuple(fcinfo)  -- returns a standard row: Datum dvalues[] + row description
                                      (look at the SRF API)
   terminate_function(fcinfo)

   I am sure this could be similar to the AGGREGATE API, plus some samples in contrib.

b) well-designed PL/Perlu and PL/Pythonu interfaces, plus some samples in the documentation.

Regards
Pavel Stehule
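Pavel's proposed callbacks above (initialise_function, consumer_process_tuple, producer_process_tuple, terminate_function) do not exist anywhere yet; the closest existing machinery he points to is CREATE AGGREGATE, where a transition function, a state type, and an initial condition already play the roles of per-row callback, carried state, and setup. As a minimal, runnable illustration of that existing pattern (the names sum_text_len and total_length are invented for this sketch):

    -- transition function: old state + one input value -> new state
    CREATE FUNCTION sum_text_len(bigint, text) RETURNS bigint
        LANGUAGE sql AS 'SELECT $1 + coalesce(length($2), 0)';

    CREATE AGGREGATE total_length(text) (
        SFUNC    = sum_text_len,  -- called once per row, like the proposed consumer_process_tuple
        STYPE    = bigint,        -- per-call state, instead of "use a global variable"
        INITCOND = '0'            -- initial state, like the proposed initialise_function
    );

    SELECT total_length(relname::text) FROM pg_class;

A stream object defined the same way would bind its setup parameters at COPY time (COPY table TO dblinkstream('host=... dbname=...')), which is the currying Daniel is asking for, without any global state.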
From: Robert Haas on 24 Nov 2009 09:50

On Mon, Nov 23, 2009 at 8:46 PM, Greg Smith <greg(a)2ndquadrant.com> wrote:
> You know how people complain about how new contributors are treated here?
> Throwing out comments like this, that come off as belittling to other
> people's work, doesn't help. All I was suggesting was that Dan wasn't
> developing this in complete isolation from the hackers community, as Robert
> had feared, as will be obvious when we get to:

I still think it's better to have discussion on the mailing list than
elsewhere. But we're doing that now, so, good.

> As far as other past discussion here that might be relevant, this patch
> includes a direct change to gram.y to support the new syntax. You've
> already suggested before that it might be time to update COPY the same way
> EXPLAIN and now VACUUM have been overhauled to provide a more flexible
> options interface:
> http://archives.postgresql.org/pgsql-hackers/2009-09/msg00616.php
> This patch might be more fuel for that idea.

FWIW, Tom already committed a patch by Emmanuel and myself that did this.

...Robert
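The committed patch Robert refers to replaces COPY's fixed keyword list with the same parenthesized, extensible option list that EXPLAIN and VACUUM gained, which is where additional options could later be slotted in. A before/after example (the table name is just a placeholder):

    -- traditional fixed-keyword form
    COPY pgbench_accounts TO '/tmp/accounts.csv' CSV HEADER;

    -- generic option-list form introduced by that patch
    COPY pgbench_accounts TO '/tmp/accounts.csv' WITH (FORMAT csv, HEADER true);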
From: Pavel Stehule on 24 Nov 2009 09:52

2009/11/24 Hannu Krosing <hannu(a)2ndquadrant.com>:
> On Tue, 2009-11-24 at 05:00 -0800, Daniel Farina wrote:
>> On Tue, Nov 24, 2009 at 4:37 AM, Pavel Stehule <pavel.stehule(a)gmail.com> wrote:
>
>>> then the syntax should be:
>>>
>>> COPY table TO streamname(parameters)
>>>
>>> COPY table TO filestream('/tmp/foo.dta') ...
>>> COPY table TO dblinkstream(connectionstring) ...
>
> You probably meant
>
> COPY table TO dblinkstream(connectionstring, table)
>
> ?
>
>> I like this one quite a bit...it's a bit like an aggregate, except the
>> initial condition can be set in a rather function-callish way.
>>
>> But that does seem to require making a DDL command, which leaves a
>> nice green field.
>
> not necessarily DDL, maybe just a "copystream" type and a set of
> functions creating objects of that type.
>
> if you make it a proper type with input and output functions, then you
> can probably use it in statements like this
>
> COPY table TO (select stream::copystream from streams where id = 7);
>
> COPY table TO 'file:/tmp/outfile'::copystream;
>
> COPY table TO 'dblink::<connectstring>'::copystream;

That is interesting - but you still need DDL to declare the stream. It is analogous to a function:

CREATE FUNCTION ....
SELECT 'foo'::regprocedure

Still, the syntax COPY table TO copystream is a good idea. I like it.

>> In particular, we could then make as many hooks,
>> flags, and options as we wanted, but sometimes there is a paradox of
>> choice...I just did not want to anticipate on Postgres being friendly
>> to a new DDL command when writing this the first time.
>
> fulltext lived for quite some time as a set of types and functions before
> it was glorified with its own DDL syntax.

What is DDL? A wrapper around an insert into the system catalog. So we can have a table pg_catalog.copystream, and for initial testing a function

CREATE OR REPLACE FUNCTION register_copystream(regproc, regproc, regproc ....)

If we are happy with it, it is then about a day of work to support the statement

CREATE COPYSTREAM ( ...

Regards
Pavel Stehule

> It may be good to have the same approach here - do it as a set of types
> and functions first, think about adding DDL once it has stabilised
> enough
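Hannu's "proper type with input and output functions" follows a pattern PostgreSQL already supports without new DDL: declare a shell type, create its I/O functions, then fill in the type definition, the same way contrib types are built. A sketch of what that could look like for the hypothetical copystream (the C functions copystream_in and copystream_out do not exist and would have to be written; MODULE_PATHNAME assumes a contrib-style build):

    CREATE TYPE copystream;      -- shell type so the I/O functions can reference it

    CREATE FUNCTION copystream_in(cstring) RETURNS copystream
        AS 'MODULE_PATHNAME', 'copystream_in' LANGUAGE C STRICT;

    CREATE FUNCTION copystream_out(copystream) RETURNS cstring
        AS 'MODULE_PATHNAME', 'copystream_out' LANGUAGE C STRICT;

    CREATE TYPE copystream (
        INPUT          = copystream_in,   -- would parse 'file:...' or 'dblink::...' strings
        OUTPUT         = copystream_out,
        INTERNALLENGTH = VARIABLE
    );

With that in place, Hannu's literals such as 'file:/tmp/outfile'::copystream become ordinary typed values, and Pavel's register_copystream() (or an eventual CREATE COPYSTREAM statement) would only be needed to tie stream names to their handler functions in a catalog.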
From: Robert Haas on 24 Nov 2009 09:56

On Mon, Nov 23, 2009 at 9:37 PM, Andrew Dunstan <andrew(a)dunslane.net> wrote:
>
> Greg Smith wrote:
>> I haven't heard anything from Andrew about ragged CSV import either. I
>> think that ultimately those features are useful, but just exceed what the
>> existing code could be hacked to handle cleanly.
>
> The patch is attached for your edification/amusement. I have backpatched it
> to 8.4 for the client that needed it, and it's working just fine. I didn't
> pursue it when it was clear that it was not going to be accepted. COPY
> returning text[] would allow us to achieve the same thing, a bit more
> verbosely, but it would be a lot more work to develop.

FWIW, I've somewhat come around to this idea. But I might be the only one.

...Robert
From: Tom Lane on 24 Nov 2009 23:44
Jeff Davis <pgsql(a)j-davis.com> writes:
> Don't you still need the functions to accept an argument of type
> internal? Otherwise, we lose the ability to copy a buffer to the dblink
> connection, which was the original motivation.

If you do that, then there is no possibility of ever using this feature
except with C-coded functions, which seems to me to remove most of
whatever use-case there was.

			regards, tom lane
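The restriction Tom describes is concrete: internal is a pseudo-type that only C-language functions may accept, so a stream handler declared with an internal argument can never be written in PL/pgSQL, PL/Perl, or PL/Python, nor invoked from plain SQL. A quick illustration (the function name is made up; exact error wording may vary across versions):

    CREATE FUNCTION stream_sink(internal) RETURNS void
        LANGUAGE plpgsql AS $$ BEGIN NULL; END; $$;
    -- ERROR:  PL/pgSQL functions cannot accept type internal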