From: Daniel Farina on
On Mon, Nov 23, 2009 at 4:20 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> pgsql-hackers had some preliminary discussions a couple months back
> on refactoring COPY to allow things like this --- see the thread
> starting here:
> http://archives.postgresql.org/pgsql-hackers/2009-09/msg00486.php
> While I don't think we arrived at any final decisions, I would like
> to know how this design fits in with what was discussed then.

This seems to be about importing/ingress, whereas this patch is about
exporting/egress...it is an interesting question on how much parsing
to do before on the ingress side before handing a row to a function
though, should we try to make these kinds of operations a bit more
symmetrical.

I did consider refactoring COPY, but since it's never clear when we
start a feature whether it is going to manifest itself as a good
upstream candidate we default to trying to make future merges with
Postgres tractable I did not take on such a large and artistic task.

fdr

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Daniel Farina on
On Tue, Nov 24, 2009 at 12:29 AM, Hannu Krosing <hannu(a)krosing.net> wrote:
> COPY stdin TO udf();

If stdin becomes (is?) a legitimate source of records, then this patch
will Just Work.

The patch is already quite useful in the COPY (SELECT ...) TO FUNCTION
.... scenario.

> COPY udf() FROM stdin;

This is unaddressed, but I think it would be a good idea to consider
enabling this kind of thing prior to application.

fdr

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Daniel Farina on
On Tue, Nov 24, 2009 at 12:38 AM, Hannu Krosing <hannu(a)2ndquadrant.com> wrote:
> COPY func(rowtype) FROM stdin;

I didn't consider rowtype...I did consider a type list, such as:

COPY func(typea, typeb, typec) FROM ...

Which would then operate just like a table, but be useless for the
data-cleaning case, and would not allow type overloading to be the
mechanism of selecting behavior.

It was just an idea...I don't really know the use cases well enough,
my anticipated used case was updating eagerly materialized quantities
with the UDF.

fdr

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Pavel Stehule on
Hello

I thing, so this patch is maybe good idea. I am missing better
function specification. Specification by name isn't enough - we can
have a overloaded functions. This syntax doesn't allow to use explicit
cast - from my personal view, the syntax is ugly - with type
specification we don't need to keyword FUNCTION

what about::

var a) by type specification

COPY table TO foo(varchar, int)

var b) by value expression - it allows some changes in order, casting, constants

COPY table TO foo($3, $1::date, $2::text, CURRENT_DATE, true);

One question:
We have a fast copy statement - ok., we have a fast function ok, but
inside a function we have to call "slow" sql query. Personally What is
advantage?

We need pipes like

like COPY table TO foo(..) TO table

foo() should be a transformation function, or real pipe function

Regards
Pavel Stehule

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Daniel Farina on
On Tue, Nov 24, 2009 at 2:10 AM, Pavel Stehule <pavel.stehule(a)gmail.com> wrote:
> Hello
>
> I thing, so this patch is maybe good idea. I am missing better
> function specification. Specification by name isn't enough - we can
> have a overloaded functions. This syntax doesn't allow to use explicit
> cast - from my personal view, the syntax is ugly - with type
> specification we don't need to keyword FUNCTION

As long as things continue to support the INTERNAL-type behavior for
extremely low overhead bulk transfers I am open to suggestions about
how to enrich things...but how would I do so under this proposal?

I am especially fishing for suggestions in the direction of managing
state for the function between rows though...I don't like how the
current design seems to scream "use a global variable."

> We have a fast copy statement - ok., we have a fast function ok, but
> inside a function we have to call "slow" sql query. Personally What is
> advantage?

The implementation here uses a type 'internal' for performance. It
doesn't even recompute the fcinfo because of the very particular
circumstances of how the function is called. It doesn't do a memory
copy of the argument buffer either, to the best of my knowledge. In
the dblink patches you basically stream directly from the disk, format
the COPY bytes, and shove it into a waiting COPY on another postgres
node...there's almost no additional work in-between. All utilized
time would be some combination of the normal COPY byte stream
generation and libpq.

This, of course, presumes that everyone who is interested in building
on this is going to use some UDFs written in C...

>
> We need pipes like
>
> like COPY table TO foo(..) TO table
>
> foo() should be a transformation function, or real pipe function

I've actually considered this pipe thing with a colleague while
driving home from work...it occurred to us that it would be nice to
have both pipes and tees (basically composition vs. mapping
application of functions over the input) in some form. Not sure what
an elegant way to express that is or how to control it. Since you can
work around this by composing or applying functions on your own in
another function, I'm not sure if that's as high priority for me
personally.

fdr

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers