Prev: question (or feature-request): over ( partition by ...order by LIMIT N)
From: Robert Haas on 25 Mar 2010 18:50

On Thu, Mar 25, 2010 at 5:17 PM, David Fetter <david(a)fetter.org> wrote:
> On Wed, Mar 24, 2010 at 06:31:59PM +0100, A. Kretschmer wrote:
>> Hello @all,
>>
>> I know, i can do:
>>
>> select * from (select ... row_number() over (...) ...) foo where
>> row_number < N
>>
>> to limit the rows per group, but the inner select has to retrieve
>> the whole set of records and in the outer select most of them
>> discarded.
>
> That sounds like the optimizer's falling down on the job. Would this
> be difficult to fix?

I may not be the best person to offer an opinion on this topic, but it
sounds tricky to me. I think it would need some kind of extremely
specific special-case logic. The planner would have to recognize
row_number() < n, row_number() <= n, and row_number = n as special
cases indicating that n-1, n, and n records respectively should be
expected to be fetched from the partition. And you might also worry
about n > row_number(), and n >= row_number().

It might be worth doing because I suspect that is actually going to be
a fairly common type of query, but some thought needs to be given to
how to do it without resorting to abject kludgery.

...Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
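As a rough illustration of the special cases Robert lists, the mapping from a filter on row_number() to the number of rows that need to be fetched per partition might be sketched like this. It is only a toy in Python, not planner code; the function name rows_needed and its parameters are made up for this example.

```python
# Hypothetical helper (not part of PostgreSQL): given a filter comparing
# row_number() with a constant n, return how many leading rows of each
# partition would have to be fetched to evaluate it.

# Mirror image of each operator, used to normalize "n > row_number()"
# into "row_number() < n".
FLIPPED = {"<": ">", "<=": ">=", ">": "<", ">=": "<=", "=": "="}


def rows_needed(op: str, n: int, row_number_on_left: bool = True) -> int:
    if op not in FLIPPED:
        raise ValueError(f"unsupported operator: {op!r}")
    if not row_number_on_left:
        op = FLIPPED[op]
    if op == "<":
        # row_number() < n: rows 1 .. n-1 qualify
        return max(n - 1, 0)
    if op in ("<=", "="):
        # row_number() <= n: rows 1 .. n qualify
        # row_number() = n: only row n qualifies, but n rows must be read to reach it
        return max(n, 0)
    # row_number() > n or >= n puts no upper bound on the partition
    raise ValueError("filter does not bound the number of rows per partition")


if __name__ == "__main__":
    print(rows_needed("<", 5))                             # row_number() < 5
    print(rows_needed("<=", 5))                            # row_number() <= 5
    print(rows_needed("=", 5))                             # row_number() = 5
    print(rows_needed(">", 5, row_number_on_left=False))   # 5 > row_number()
    print(rows_needed(">=", 5, row_number_on_left=False))  # 5 >= row_number()
```

Even this toy has to enumerate operator and operand-order combinations by hand, which gives some feel for why the real thing could turn into special-case logic.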
From: Hitoshi Harada on 25 Mar 2010 22:06

2010/3/26 David Fetter <david(a)fetter.org>:
> On Wed, Mar 24, 2010 at 06:31:59PM +0100, A. Kretschmer wrote:
>> Hello @all,
>>
>> I know, i can do:
>>
>> select * from (select ... row_number() over (...) ...) foo where
>> row_number < N
>>
>> to limit the rows per group, but the inner select has to retrieve
>> the whole set of records and in the outer select most of them
>> discarded.
>
> That sounds like the optimizer's falling down on the job. Would this
> be difficult to fix?

I believe this isn't the task of window functions. In fact,
"over( ... LIMIT n)" or an optimizer hack will fail on multiple window
definitions.

To take the top N items of each group (I agree this is quite a common
job), I'd suggest a syntax that extends DISTINCT ON:

SELECT DISTINCT n ON(key1, key2) ...

where "n" means the top "n" items in each "key1, key2" group. The
current DISTINCT ON() syntax is equivalent to DISTINCT 1 ON() in this
way. That'll be fairly easy to implement, and you won't be bothered by
issues like multiple window definitions. The downside is that it can
be applied only to row_number logic; you may sometimes want to use
rank, dense_rank, etc.

Regards,

--
Hitoshi Harada
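To make the intended semantics of the proposed "DISTINCT n ON" concrete, here is a small Python sketch of "keep the first n rows of each key group". The syntax is only a proposal in this thread, and the function distinct_n_on, its parameters, and the sample rows are invented for illustration.

```python
from itertools import groupby, islice
from operator import itemgetter


def distinct_n_on(rows, n, key, order):
    """Keep the first n rows of each key group, ordered by `order` within the group.

    Hypothetical emulation of the proposed DISTINCT n ON(key...) ... ORDER BY ...;
    with n = 1 it behaves like today's DISTINCT ON.
    """
    # Sort by group key first, then by the in-group ordering, as DISTINCT ON requires.
    ordered = sorted(rows, key=lambda r: (key(r), order(r)))
    result = []
    for _, group in groupby(ordered, key=key):
        # islice stops after n rows, so the rest of the group is skipped.
        result.extend(islice(group, n))
    return result


if __name__ == "__main__":
    rows = [("a", 3), ("a", 1), ("a", 2), ("b", 9), ("b", 7)]
    # Top 2 per group: [('a', 1), ('a', 2), ('b', 7), ('b', 9)]
    print(distinct_n_on(rows, 2, key=itemgetter(0), order=itemgetter(1)))
    # n = 1 matches plain DISTINCT ON: [('a', 1), ('b', 7)]
    print(distinct_n_on(rows, 1, key=itemgetter(0), order=itemgetter(1)))
```

As noted in the message, this covers only row_number-style numbering: ties are cut off at n rows rather than kept together as rank or dense_rank would.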