From: Hitoshi Harada on 15 Nov 2009 20:40 2009/11/16 Andrew Gierth <andrew(a)tao11.riddles.org.uk>: >>>>>> "Hitoshi" == Hitoshi Harada <umi.tanuki(a)gmail.com> writes: > > Hitoshi> Questions here: > Hitoshi> - agglevelsup? > What case exactly would you consider an error? When an order by > expression references a lower (more deeply nested) query level than > any of the actual arguments? It's only that I felt not intuitive. To me, arguments are regarded as aggregate's "member" while ORDER BY clause expressions didn't hit me. If it's only me, no problem. Maybe additional document in #syntax-aggregates will do. > Hitoshi> - order by 1? > > Hitoshi> Normal ORDER BY clause accepts constant integer as > Hitoshi> TargetEntry's resno. The patch seems not to support it. > Hitoshi> Shouldn't it be the same as normal ORDER BY? > > Specifically documented. The SQL spec doesn't allow ordinal positions > in ORDER BY any more (those are a holdover from SQL92) and we don't > support them in, for example, window ORDER BY clauses. Clear. > Hitoshi> Coding, almost all sane. Since its syntax and semantics are > Hitoshi> similar to existing DISTINCT and ORDER BY features, parsing > Hitoshi> and transformation code are derived from those area. The > Hitoshi> executor has few issues: > > Hitoshi> - #include in nodeAgg.c > Hitoshi> executor/tuptable.h is added in the patch but required really? > Hitoshi> I removed that line but still build without any warnings. > > The code is making explicit use of various Slot calls declared in > tuptable.h. The only reason why it builds without error when you > remove that is that utils/tuplesort.h happens to include tuptable.h > indirectly. > > The code is making explicit use of various Slot calls declared in > tuptable.h. The only reason why it builds without error when you > remove that is that utils/tuplesort.h happens to include tuptable.h > indirectly. My C skill is not so good to determine if that #include is needed. Simply we'd not needed it, it seems to me that it isn't needed still. > Hitoshi> - process_ordered_aggregate_(single|multi) > Hitoshi> It seems that the patch left process_sorted_aggregate() > Hitoshi> function as process_ordered_aggregate_single() and added > Hitoshi> process_ordered_aggregate_multi() for more than one input > Hitoshi> arguments (actually target entries) case. Why have those > Hitoshi> two? Could we combine them? Or I couldn't find convincing > Hitoshi> reason in comments. > > Performance. > > tuplesort_getdatum etc. seems to be substantially faster than > tuplesort_gettupleslot especially for the case where you're sorting a > pass-by-value datum such as an integer (since the datum is then stored > only in the sort tuple header and doesn't require a separate space > allocation for itself). Using a slot in all cases would have slowed > down some common cases like count(distinct id) by a measurable amount. > > Cases like array_agg(x order by x) benefit from the faster code path > too. > > The memory management between the two cases is sufficiently different > that combining them into one function while still maintaining the > slot vs. datum distinction would be ugly and probably error-prone. > The relatively minor duplication of logic seemed much clearer to me. I see the reason. But if we allow this, shouldn't we apply the same logic to other sort depend parts? Or maybe refactor tuplesort in the future? > > Hitoshi> - SortGroupClause.implicit > Hitoshi> "implicit" member was added in SortGroupClause. I didn't > Hitoshi> find clear reason to add this. Could you show a case to > Hitoshi> clarify this? > > Without that flag or something like it, when you do > > create view foo as select count(distinct x) from table; > > and then display the view definition, you would get back the query as > "select count(distinct x order by x) from table" which would be > confusing and unnecessarily backwards- and forwards-incompatible. > > So the code sets "implicit" for any SortGroupClause that is added for > some internal reason rather than being present in the original query, > and the deparse code in ruleutils skips such clauses. I don't have much experiences in VIEW systems, but isn't it enough to let "order by x" omitted? "select count(distinct x order by x) from table" means the same as "select count(distinct x) from table" currently. ruleutils can handle it if distinct expressions are equal to order by expressions. Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Tom Lane on 15 Nov 2009 20:52 Hitoshi Harada <umi.tanuki(a)gmail.com> writes: > 2009/11/16 Andrew Gierth <andrew(a)tao11.riddles.org.uk>: > "Hitoshi" == Hitoshi Harada <umi.tanuki(a)gmail.com> writes: >>> �Hitoshi> �- SortGroupClause.implicit >>> �Hitoshi> "implicit" member was added in SortGroupClause. I didn't >>> �Hitoshi> find clear reason to add this. Could you show a case to >>> �Hitoshi> clarify this? >> >> So the code sets "implicit" for any SortGroupClause that is added for >> some internal reason rather than being present in the original query, >> and the deparse code in ruleutils skips such clauses. > I don't have much experiences in VIEW systems, but isn't it enough to > let "order by x" omitted? I agree with Hitoshi that this seems to be a hack to deal with the consequences of a bad design decision. We just recently cleaned up an ancient mistake of the same sort (having the parser add clauses to ORDER BY/DISTINCT that the user didn't write) and I don't care to repeat that error in aggregates. If it's necessary to decorate the aggregate with additional clauses in order to make the executor work, the proper place to do that is the planner. The reason this division of labor is worth preserving is that if there are alternative implementations that might reasonably be chosen, the planner is the place to make that choice. Nailing down sort order in the parser is pre-judging something the planner might wish to change. (As an example, the reason we had to fix the ORDER BY/DISTINCT mistake was that it was constraining the planner's choices about substituting hash aggregation for sort/unique.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Andrew Gierth on 15 Nov 2009 21:29 >>>>> "Hitoshi" == Hitoshi Harada <umi.tanuki(a)gmail.com> writes: >> What case exactly would you consider an error? When an order by >> expression references a lower (more deeply nested) query level >> than any of the actual arguments? Hitoshi> It's only that I felt not intuitive. To me, arguments are Hitoshi> regarded as aggregate's "member" while ORDER BY clause Hitoshi> expressions didn't hit me. If it's only me, no Hitoshi> problem. Maybe additional document in #syntax-aggregates Hitoshi> will do. How about: ... But an exception occurs if the aggregate's arguments (including any <literal>ORDER BY</> clause) contain only outer-level variables: the aggregate then belongs to the nearest such outer level, ... >> Without that flag or something like it, when you do >> >> create view foo as select count(distinct x) from table; >> >> and then display the view definition, you would get back the query as >> "select count(distinct x order by x) from table" which would be >> confusing and unnecessarily backwards- and forwards-incompatible. >> >> So the code sets "implicit" for any SortGroupClause that is added for >> some internal reason rather than being present in the original query, >> and the deparse code in ruleutils skips such clauses. Hitoshi> I don't have much experiences in VIEW systems, but isn't it Hitoshi> enough to let "order by x" omitted? "select count(distinct x Hitoshi> order by x) from table" means the same as "select Hitoshi> count(distinct x) from table" currently. ruleutils can Hitoshi> handle it if distinct expressions are equal to order by Hitoshi> expressions. That doesn't work in more complex cases. For example, the user might specify aggfunc(distinct x,y order by x) (not caring about the relative order of y) but the code will still turn that internally into aggfunc(distinct x,y order by x,y). It's necessary to be able to recover what the user originally entered, which means needing to be able to distinguish both of those cases from aggfunc(distinct x,y). Other ways this could have been done (but which I rejected) were: 1) separate lists of SortGroupClauses for "orderings the user explicitly required" and "ordering we're going to use" 2) single list of SortGroupClauses, but a count of how many of the entries are original, rather than added 3) deferring the addition of extra ordering elements required for distinctness until planning time I didn't consider (3) because I wasn't really touching the planner for this patch, and the information needed was already available in parse analysis as part of the error checking. -- Andrew. -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Andrew Gierth on 15 Nov 2009 22:25 >>>>> "Tom" == Tom Lane <tgl(a)sss.pgh.pa.us> writes: Tom> I agree with Hitoshi that this seems to be a hack to deal with the [snip] So copying the way that SELECT DISTINCT works would be the way to go? i.e. keeping separate lists in the parse node for sorting and distinct? What about error handling? If the user specifies agg(distinct x) where x is not sortable, do we leave it to the planner to detect that (which means not reporting the error position?) -- Andrew. -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Tom Lane on 15 Nov 2009 22:31
Andrew Gierth <andrew(a)tao11.riddles.org.uk> writes: > What about error handling? If the user specifies agg(distinct x) where > x is not sortable, do we leave it to the planner to detect that (which > means not reporting the error position?) Well, at the moment there's only going to be a sort-based implementation, so I don't object to throwing an error for that as soon as possible. OTOH I wouldn't recommend expending a lot of code to do it there. I would hope that most of the parser's work for this can be shared with the existing support for query-level ORDER BY/DISTINCT. If that means that we don't complain immediately about cases where there is hash but not sort support, that seems all right to me, because there are very few such datatypes anyway. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |