From: Tom Lane on 25 May 2010 16:06 Scott Marlowe <scott.marlowe(a)gmail.com> writes: > So, Tom, so you think it's possible that the planner isn't noticing > all those nulls and thinks it'll just take a row or two to get to the > value it needs to join on? I dug through this and have concluded that it's really an oversight in the patch I wrote some years ago in response to this: http://archives.postgresql.org/pgsql-performance/2005-05/msg00219.php That patch taught nodeMergejoin that a row containing a NULL key can't possibly match anything on the other side. However, its response to observing a NULL is just to advance to the next row of that input. What we should do, if the NULL is in the first merge column and the sort order is nulls-high, is realize that every following row in that input must also contain a NULL and so we can just terminate the mergejoin immediately. The original patch works well for cases where there are just a few nulls in one input and the important factor is to not read all the rest of the other input --- but it fails to cover the case where there are many nulls and the important factor is to not read all the rest of the nulls. The problem can be demonstrated if you modify the example given in the above-referenced message so that table t1 contains lots of nulls rather than just a few: explain analyze will show that all of t1 gets read by the mergejoin, and that's not necessary. I'm inclined to think this is a performance bug and should be back-patched, assuming the fix is simple (which I think it is, but haven't coded/tested yet). It'd probably be reasonable to go back to 8.3; before that, sorting nulls high versus nulls low was pretty poorly defined and so there'd be risk of breaking cases that gave the right answers before. Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
|
Pages: 1 Prev: tsvector pg_stats seems quite a bit off. Next: Fwd: Hiding data in postgresql |