From: Tom Lane on 14 Dec 2009 13:06 Andrew Gierth <andrew(a)tao11.riddles.org.uk> writes: > Updated version of the aggregate order by patch. I'm starting to look at this now. I find it rather bizarre to merge both the actual arguments of an aggregate and the optional ORDER BY expressions into a single targetlist. It doesn't seem like that would be an especially convenient representation to work with, and I would also expect there to be a nonzero performance hit from the extra TargetEntry expression nodes, even when the feature is not in use. Why didn't you use separate lists? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Andrew Gierth on 14 Dec 2009 13:45 >>>>> "Tom" == Tom Lane <tgl(a)sss.pgh.pa.us> writes: >> Updated version of the aggregate order by patch. Tom> I'm starting to look at this now. I find it rather bizarre to Tom> merge both the actual arguments of an aggregate and the optional Tom> ORDER BY expressions into a single targetlist. It doesn't seem Tom> like that would be an especially convenient representation to Tom> work with, It's extremely convenient, since you need the arguments and the ordering expressions together in a slot in order to feed them to tuplesort (in the general case where there is more than one expression); you need a tupledesc to pass to tuplesort; and there are existing functions to construct all of these things from the tlist. Also, you want to merge the argument expressions and ordering expressions where possible, and this is exactly what the existing transformSortClause stuff expects. Tom> and I would also expect there to be a nonzero performance hit Tom> from the extra TargetEntry expression nodes, even when the Tom> feature is not in use. I tested for that and couldn't reliably detect any (certainly <2% on count(i) on generated data, and under the circumstances I was testing that's not necessarily outside the error margin). -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Tom Lane on 14 Dec 2009 15:08 Andrew Gierth <andrew(a)tao11.riddles.org.uk> writes: > "Tom" == Tom Lane <tgl(a)sss.pgh.pa.us> writes: > Tom> and I would also expect there to be a nonzero performance hit > Tom> from the extra TargetEntry expression nodes, even when the > Tom> feature is not in use. > I tested for that and couldn't reliably detect any (certainly <2% > on count(i) on generated data, and under the circumstances I was > testing that's not necessarily outside the error margin). Hm. Now that I look closer I see we already have a hack to avoid executing an extra expression node when a targetlist item is evaluated, so the extra cost should indeed be insignificant. There would be nonzero associated costs during planning and executor startup, but we seldom bother optimizing for that. So nevermind... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Tom Lane on 15 Dec 2009 13:07 Andrew Gierth <andrew(a)tao11.riddles.org.uk> writes: > Updated version of the aggregate order by patch. Applied with some editorialization. The main change I made was to get rid of all the ad-hoc DISTINCT handling in parse_agg.c and use transformDistinctClause() instead. This exposed what I believe to be a bug in the submitted patch: it accepted cases like agg(DISTINCT x ORDER BY x,y) We do not allow that in ordinary query-level DISTINCT because it's ambiguous --- there might be multiple y values for any particular value of x, so the ordering is uncertain. The only way it's not uncertain is if the sort by x fully determines the order, in which case listing y is merely useless. I think this is something that was changed not too long ago, so maybe you were trying to emulate the old behavior, but in any case it's better to not have extra code that doesn't behave quite like the normal case. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Andrew Gierth on 15 Dec 2009 15:31
>>>>> "Tom" == Tom Lane <tgl(a)sss.pgh.pa.us> writes: Tom> Andrew Gierth <andrew(a)tao11.riddles.org.uk> writes: >> Updated version of the aggregate order by patch. Tom> Applied with some editorialization. The main change I made was Tom> to get rid of all the ad-hoc DISTINCT handling in parse_agg.c Tom> and use transformDistinctClause() instead. I'll review that; I avoided that code intentionally because the semantics of query-level DISTINCT are different enough. Tom> This exposed what I believe to be a bug in the submitted patch: Tom> it accepted cases like Tom> agg(DISTINCT x ORDER BY x,y) This is not a bug, it was done intentionally (as you might have guessed from the fact that there was a regression test for it). The additional ORDER BY column in this case is always safe (since DISTINCT adopts the equality operator from the sort, it's not possible for additional sort columns to break the DISTINCT). I allowed the case since there was therefore no good reason to forbid it. There is at least one case where this makes a visible difference in query output: if the aggregate can distinguish values of x which are considered equal by the sort operator used, then the value of y affects which value of x is seen. It is probably relatively easy to generate examples of this using the box type and array_agg. Tom> We do not allow that in ordinary query-level DISTINCT Note that ordinary query-level DISTINCT has the reverse semantics; the DISTINCT operation is (per spec) logically prior to the order by, the fact that they are planned in the reverse order is an implementation detail. Query-level DISTINCT shouldn't allow columns in the order by that aren't in the select list because those columns _do not exist_ at the point that ordering logically takes place (even though in the implementation, they might). This isn't the case for aggregate order by. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |