From: Hitoshi Harada on 15 Nov 2009 02:43 2009/11/15 Andrew Gierth <andrew(a)tao11.riddles.org.uk>: > My thinking is that the executor definitely shouldn't be relying on it > being a specific node type, but should just ExecEvalExpr it on the > first call and store the result; then you don't have to know whether > it's a Const or Param node or a more complex expression. Yeah, so that we allow something like ROWS BETWEEN 1 + $1 PRECEDING AND ... And to support RANGE BETWEEN n PRECEDING ..., we should make datum to add or to subtract from current value on initial call anyway. Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Andrew Gierth on 19 Nov 2009 03:29 Here's the rest of the review, as far as I've taken it given the problems with the code. The patch applied cleanly and includes regression tests but not docs. Small nitpicks: there are some comments not updated (e.g. the big one at the start of eval_windowaggregates). A couple of lines are commented-out using C++ comments. The overall approach seems ok, and the parser stuff seems fine to me. These are the issues I've found that make it not committable in its present form (including the ones I mentioned before): - missing _readWindowFrameDef function (all nodes that are output from parse analysis must have both _read and _out functions, otherwise views can't work) - the A_Const nodes should probably be transformed to Const nodes in parse analysis, since A_Const has no _read/_out functions, which means changing the corresponding code in the executor. - ruleutils.c not updated to deparse the newly added window options - leaks memory like it's going out of style The memory leakage is caused by not resetting any memory contexts when throwing away all the aggregate state when advancing the start of the window frame. This looks like it will require a rethink of the memory management being used; it's not enough just to pfree copies of the transition values (which you don't appear to be doing), you have to reset the memory context that was exposed to the transition functions via context->wincontext. So the current setup of a single long-lived context won't work; you'll need a long-lived one, plus an additional one that you can reset any time the aggregates need to be re-initialized. (And if you're not going to break existing aggregate functions, WindowAggState.wincontext needs to be the one that gets reset.) Tests for memory leaks: -- tests for failure to free by-ref transition values select count(*) from (select i,max(repeat(i::text,100)) over (order by i rows between 1 preceding and current row) from generate_series(1,1000000) i) s; -- tests for failure to reset memory context on window advance select count(*) from (select i,array_agg(i) over (order by i rows between 1 preceding and current row) from generate_series(1,1000000) i) s; -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Hitoshi Harada on 19 Nov 2009 08:43 2009/11/19 Andrew Gierth <andrew(a)tao11.riddles.org.uk>: > Here's the rest of the review, as far as I've taken it given the > problems with the code. > > The patch applied cleanly and includes regression tests but not docs. > > Small nitpicks: there are some comments not updated (e.g. the > big one at the start of eval_windowaggregates). A couple of lines are > commented-out using C++ comments. OK. It's tough for me to rewrite that big part of comment but I'll try it. > The overall approach seems ok, and the parser stuff seems fine to me. > > These are the issues I've found that make it not committable in its > present form (including the ones I mentioned before): > > - missing _readWindowFrameDef function (all nodes that are output > from parse analysis must have both _read and _out functions, > otherwise views can't work) > > - the A_Const nodes should probably be transformed to Const nodes in > parse analysis, since A_Const has no _read/_out functions, which > means changing the corresponding code in the executor. A_Const/Const will be replace by Expr, to cover any expression without local Var. > > - ruleutils.c not updated to deparse the newly added window options > > - leaks memory like it's going out of style > > The memory leakage is caused by not resetting any memory contexts when > throwing away all the aggregate state when advancing the start of the > window frame. This looks like it will require a rethink of the memory > management being used; it's not enough just to pfree copies of the > transition values (which you don't appear to be doing), you have to > reset the memory context that was exposed to the transition functions > via context->wincontext. So the current setup of a single long-lived > context won't work; you'll need a long-lived one, plus an additional > one that you can reset any time the aggregates need to be > re-initialized. (And if you're not going to break existing aggregate > functions, WindowAggState.wincontext needs to be the one that gets > reset.) Hmm, good point. Though I doubt we need two contexts for this because we have not so far (and we already have tmpcontext for that purpose), memory leakage probably seems to happen. I'll check it out. Thanks for your elaborate review anyway. All I was worried about is now clear. It will be lucky if I can update my patch until next week. So please keep it "Waiting on Author". Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Andrew Gierth on 2 Dec 2009 03:16 >>>>> "Hitoshi" == Hitoshi Harada <umi.tanuki(a)gmail.com> writes: Hitoshi> As earlier mail, I added aggcontext to WindowAggState. This issue (as detailed in this post): http://archives.postgresql.org/pgsql-hackers/2009-11/msg01871.php is currently the only significant outstanding issue in my review of this patch. I think we need to see more feedback on whether it is acceptable to change the aggregate function API again (and if so, what to do with it) before I can post a final review on this and mark it ready for committer (or not). -- Andrew. -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Andrew Gierth on 4 Dec 2009 16:15 Functionally this patch looks excellent; correct format, applies cleanly, passes regression, and I've been unable to find any issues with the code itself. But I still have a concern over the interface change, so I'm setting this back to "waiting on author" for now even though it's really a matter for general discussion on -hackers. To take the outstanding issues in descending order of importance: 1) Memory context handling for aggregate calls Having thought about this carefully, I think the solution used in this patch has to be rejected for one specific reason: if you compile existing code (i.e. aggregates that use WindowAggState.wincontext) written for 8.4 against the patched server, the code compiles successfully and appears to run, but leaks memory at runtime. (And I've confirmed that there do exist external modules that would be affected.) If we're going to change the interface in this way, there should, IMO, be enough of a change that old code fails to compile; e.g. by renaming wincontext to partition_context or some equivalent change. But in my opinion we should go further than this: since there's obviously a need for aggregates to be able to get at a suitable memory context, I think we should formalize this and actually define some interface functions for them to use, so that something like if (fcinfo->context && IsA(fcinfo->context, AggState)) aggcontext = ((AggState *) fcinfo->context)->aggcontext; else if (fcinfo->context && IsA(fcinfo->context, WindowAggState)) aggcontext = ((WindowAggState *) fcinfo->context)->aggcontext; else ereport(...); becomes something like aggcontext = AggGetMemoryContext(fcinfo->context); if (!aggcontext) ereport(...); For completeness, there should be two other functions: one to simply return whether we are in fact being called as an aggregate, and another one to return whether it's safe to scribble on the first argument (while it's currently the case that these two are equivalent, it would be good not to assume that). Comments? This is the most significant issue as I see it. (Also, a function in contrib/tsearch2 that accesses wincontext wasn't updated by the patch.) 2) Keywords I didn't find any discussion of this previously; did I miss it? The patch changes BETWEEN from TYPE_FUNC_NAME_KEYWORD to COL_NAME_KEYWORD, so it's now allowed as a column name (which it previously wasn't), but now not allowed as a function. I get why it can't be a function now, but if it didn't work as a column name before, why does it now? (UNBOUNDED remains an unreserved word, and seems to behave in a reasonable fashion even if used as an upper-level var in the query; the interpretation of a bare UNBOUNDED has the standard behaviour as far as I can see.) 3) Regression tests Testing that views work is OK as far as it goes, but I think that view definition should be left in place rather than dropped (possibly with even more variants) so that the deparse code gets properly tested too. (see the "rules" test) 4) Deparse output The code is forcing explicit casting on the offset expressions, i.e. the deparsed code looks like ... ROWS BETWEEN 1::bigint PRECEDING AND 1::bigint FOLLOWING ... This looks a bit ugly; is it avoidable? At least for ROWS it should be, I think, since the type is known; even for RANGE, the type would be determined by the sort column. 5) Documentation issues The entry for BETWEEN in the keywords appendix isn't updated. (Wouldn't it make more sense for this to be generated from the keyword list somehow?) Spelling: - current row. In <literal>ROWS</> mode this value means phisical row + current row. In <literal>ROWS</> mode this value means physical row (this error appears twice) The doc could probably do with some wordsmithing but this probably shouldn't block the patch on its own; that's something that could be handled separately I guess. -- Andrew. -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
First
|
Prev
|
Next
|
Last
Pages: 1 2 3 4 Prev: patch - per-tablespace random_page_cost/seq_page_cost Next: named parameters in SQL functions |