From: Hitoshi Harada on
2009/11/15 Andrew Gierth <andrew(a)tao11.riddles.org.uk>:
> My thinking is that the executor definitely shouldn't be relying on it
> being a specific node type, but should just ExecEvalExpr it on the
> first call and store the result; then you don't have to know whether
> it's a Const or Param node or a more complex expression.

Yeah, so that we allow something like ROWS BETWEEN 1 + $1 PRECEDING
AND ... And to support RANGE BETWEEN n PRECEDING ..., we should make
datum to add or to subtract from current value on initial call anyway.

Regards,


--
Hitoshi Harada

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Andrew Gierth on
Here's the rest of the review, as far as I've taken it given the
problems with the code.

The patch applied cleanly and includes regression tests but not docs.

Small nitpicks: there are some comments not updated (e.g. the
big one at the start of eval_windowaggregates). A couple of lines are
commented-out using C++ comments.

The overall approach seems ok, and the parser stuff seems fine to me.

These are the issues I've found that make it not committable in its
present form (including the ones I mentioned before):

- missing _readWindowFrameDef function (all nodes that are output
from parse analysis must have both _read and _out functions,
otherwise views can't work)

- the A_Const nodes should probably be transformed to Const nodes in
parse analysis, since A_Const has no _read/_out functions, which
means changing the corresponding code in the executor.

- ruleutils.c not updated to deparse the newly added window options

- leaks memory like it's going out of style

The memory leakage is caused by not resetting any memory contexts when
throwing away all the aggregate state when advancing the start of the
window frame. This looks like it will require a rethink of the memory
management being used; it's not enough just to pfree copies of the
transition values (which you don't appear to be doing), you have to
reset the memory context that was exposed to the transition functions
via context->wincontext. So the current setup of a single long-lived
context won't work; you'll need a long-lived one, plus an additional
one that you can reset any time the aggregates need to be
re-initialized. (And if you're not going to break existing aggregate
functions, WindowAggState.wincontext needs to be the one that gets
reset.)

Tests for memory leaks:

-- tests for failure to free by-ref transition values
select count(*)
from (select i,max(repeat(i::text,100)) over (order by i rows between 1 preceding and current row)
from generate_series(1,1000000) i) s;

-- tests for failure to reset memory context on window advance
select count(*)
from (select i,array_agg(i) over (order by i rows between 1 preceding and current row)
from generate_series(1,1000000) i) s;

--
Andrew (irc:RhodiumToad)

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Hitoshi Harada on
2009/11/19 Andrew Gierth <andrew(a)tao11.riddles.org.uk>:
> Here's the rest of the review, as far as I've taken it given the
> problems with the code.
>
> The patch applied cleanly and includes regression tests but not docs.
>
> Small nitpicks: there are some comments not updated (e.g. the
> big one at the start of eval_windowaggregates). A couple of lines are
> commented-out using C++ comments.

OK. It's tough for me to rewrite that big part of comment but I'll try it.

> The overall approach seems ok, and the parser stuff seems fine to me.
>
> These are the issues I've found that make it not committable in its
> present form (including the ones I mentioned before):
>
>  - missing _readWindowFrameDef function (all nodes that are output
>   from parse analysis must have both _read and _out functions,
>   otherwise views can't work)
>
>  - the A_Const nodes should probably be transformed to Const nodes in
>   parse analysis, since A_Const has no _read/_out functions, which
>   means changing the corresponding code in the executor.

A_Const/Const will be replace by Expr, to cover any expression without
local Var.

>
>  - ruleutils.c not updated to deparse the newly added window options
>
>  - leaks memory like it's going out of style
>
> The memory leakage is caused by not resetting any memory contexts when
> throwing away all the aggregate state when advancing the start of the
> window frame. This looks like it will require a rethink of the memory
> management being used; it's not enough just to pfree copies of the
> transition values (which you don't appear to be doing), you have to
> reset the memory context that was exposed to the transition functions
> via context->wincontext. So the current setup of a single long-lived
> context won't work; you'll need a long-lived one, plus an additional
> one that you can reset any time the aggregates need to be
> re-initialized. (And if you're not going to break existing aggregate
> functions, WindowAggState.wincontext needs to be the one that gets
> reset.)

Hmm, good point. Though I doubt we need two contexts for this because
we have not so far (and we already have tmpcontext for that purpose),
memory leakage probably seems to happen. I'll check it out.

Thanks for your elaborate review anyway. All I was worried about is
now clear. It will be lucky if I can update my patch until next week.
So please keep it "Waiting on Author".

Regards,

--
Hitoshi Harada

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Andrew Gierth on
>>>>> "Hitoshi" == Hitoshi Harada <umi.tanuki(a)gmail.com> writes:

Hitoshi> As earlier mail, I added aggcontext to WindowAggState.

This issue (as detailed in this post):
http://archives.postgresql.org/pgsql-hackers/2009-11/msg01871.php

is currently the only significant outstanding issue in my review of this
patch. I think we need to see more feedback on whether it is acceptable
to change the aggregate function API again (and if so, what to do with it)
before I can post a final review on this and mark it ready for committer
(or not).

--
Andrew.

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Andrew Gierth on
Functionally this patch looks excellent; correct format, applies
cleanly, passes regression, and I've been unable to find any issues
with the code itself. But I still have a concern over the interface
change, so I'm setting this back to "waiting on author" for now even
though it's really a matter for general discussion on -hackers.

To take the outstanding issues in descending order of importance:

1) Memory context handling for aggregate calls

Having thought about this carefully, I think the solution used in this
patch has to be rejected for one specific reason: if you compile
existing code (i.e. aggregates that use WindowAggState.wincontext)
written for 8.4 against the patched server, the code compiles
successfully and appears to run, but leaks memory at runtime. (And I've
confirmed that there do exist external modules that would be affected.)

If we're going to change the interface in this way, there should, IMO,
be enough of a change that old code fails to compile; e.g. by renaming
wincontext to partition_context or some equivalent change.

But in my opinion we should go further than this: since there's obviously
a need for aggregates to be able to get at a suitable memory context, I
think we should formalize this and actually define some interface functions
for them to use, so that something like

if (fcinfo->context && IsA(fcinfo->context, AggState))
aggcontext = ((AggState *) fcinfo->context)->aggcontext;
else if (fcinfo->context && IsA(fcinfo->context, WindowAggState))
aggcontext = ((WindowAggState *) fcinfo->context)->aggcontext;
else
ereport(...);

becomes something like

aggcontext = AggGetMemoryContext(fcinfo->context);
if (!aggcontext)
ereport(...);

For completeness, there should be two other functions: one to simply
return whether we are in fact being called as an aggregate, and another
one to return whether it's safe to scribble on the first argument
(while it's currently the case that these two are equivalent, it would
be good not to assume that).

Comments? This is the most significant issue as I see it.

(Also, a function in contrib/tsearch2 that accesses wincontext wasn't
updated by the patch.)

2) Keywords

I didn't find any discussion of this previously; did I miss it? The
patch changes BETWEEN from TYPE_FUNC_NAME_KEYWORD to COL_NAME_KEYWORD,
so it's now allowed as a column name (which it previously wasn't),
but now not allowed as a function. I get why it can't be a function now,
but if it didn't work as a column name before, why does it now?

(UNBOUNDED remains an unreserved word, and seems to behave in a
reasonable fashion even if used as an upper-level var in the query;
the interpretation of a bare UNBOUNDED has the standard behaviour
as far as I can see.)

3) Regression tests

Testing that views work is OK as far as it goes, but I think that view
definition should be left in place rather than dropped (possibly with
even more variants) so that the deparse code gets properly tested too.
(see the "rules" test)

4) Deparse output

The code is forcing explicit casting on the offset expressions, i.e.
the deparsed code looks like

... ROWS BETWEEN 1::bigint PRECEDING AND 1::bigint FOLLOWING ...

This looks a bit ugly; is it avoidable? At least for ROWS it should
be, I think, since the type is known; even for RANGE, the type would
be determined by the sort column.

5) Documentation issues

The entry for BETWEEN in the keywords appendix isn't updated.
(Wouldn't it make more sense for this to be generated from the keyword
list somehow?)

Spelling:
- current row. In <literal>ROWS</> mode this value means phisical row
+ current row. In <literal>ROWS</> mode this value means physical row
(this error appears twice)

The doc could probably do with some wordsmithing but this probably
shouldn't block the patch on its own; that's something that could be
handled separately I guess.

--
Andrew.

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers