Prev: pg_dump and join aliases (was Re: [BUGS] ERROR: cannot handle unplanned sub-select)
Next: patch: preload dictionary new version
From: Tom Lane on 16 Jul 2010 19:01 Andres Freund <andres(a)anarazel.de> writes: > What could the join removal path (and similar places) *possibly* do against > such a case? Without stopping to use SnapshotNow I dont see any way :-( But the planner, along with most of the rest of the backend, *does* use SnapshotNow when examining the system catalogs. I share your feeling of discomfort but so far I don't see a hole in Simon's argument. Adding a constraint should never make a previously-correct plan incorrect. Removing one is a very different story, but he says he's not changing that case. (Disclaimer: I have not read the patch.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Robert Haas on 16 Jul 2010 19:53 On Jul 16, 2010, at 6:01 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote: > Andres Freund <andres(a)anarazel.de> writes: >> What could the join removal path (and similar places) *possibly* do against >> such a case? Without stopping to use SnapshotNow I dont see any way :-( > > But the planner, along with most of the rest of the backend, *does* use > SnapshotNow when examining the system catalogs. > > I share your feeling of discomfort but so far I don't see a hole in > Simon's argument. Adding a constraint should never make a > previously-correct plan incorrect. Removing one is a very different > story, but he says he's not changing that case. (Disclaimer: I have > not read the patch.) Perhaps we should start by deciding whether Andres' case is a bug in the first place, and then we can argue about whether it's a join-removal bug, a lock-weakening bug, or a preexisting bug. ....Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Andres Freund on 16 Jul 2010 20:20 On Saturday 17 July 2010 01:53:24 Robert Haas wrote: > On Jul 16, 2010, at 6:01 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote: > > Andres Freund <andres(a)anarazel.de> writes: > >> What could the join removal path (and similar places) *possibly* do > >> against such a case? Without stopping to use SnapshotNow I dont see any > >> way :-( > > > > But the planner, along with most of the rest of the backend, *does* use > > SnapshotNow when examining the system catalogs. > > > > I share your feeling of discomfort but so far I don't see a hole in > > Simon's argument. Neither do I. > > Adding a constraint should never make a > > previously-correct plan incorrect. Removing one is a very different > > story, but he says he's not changing that case. (Disclaimer: I have > > not read the patch.) > > Perhaps we should start by deciding whether Andres' case is a bug in the > first place, and then we can argue about whether it's a join-removal bug, > a lock-weakening bug, or a preexisting bug. The case where its possible to produce such a case *after* having used/locked all participating relations is new I think. Being able to create invalid results by doing DDL in another connection on not-yet-used tables is at least as old as constraint exclusion. Its a bit easier to work around, but thats it. So I personally would not consider this patch as having a bug anymore (thinking helps...). Whether the general issue is a bug or a to-be-more-exhausitive-documented- gotcha I have no idea. I know two people having hit it in production - I dont think its a that common issue though. Starting with the fact that not that many people use serializable. Just to help me: The primary reasons for using SnapshotNow is speed and in some cases correctness (referential integrity). Right? Any other reasons? Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Tom Lane on 16 Jul 2010 20:45 Andres Freund <andres(a)anarazel.de> writes: > Just to help me: The primary reasons for using SnapshotNow is speed and in > some cases correctness (referential integrity). Right? Any other reasons? Well, the main point for system catalog accesses is that you *must* have an up-to-date view of the table schemas. As an example, if someone just added an index to an existing table, it would not do for an INSERT to fail to update that index --- no matter whether it's from a serializable transaction or not. So the DDL-executing transaction must hold a lock that would block any operation that had better be able to see what it did, and once another transaction has acquired the lock that lets it go ahead with another operation, it had better see the results of the DDL transaction. However that argument mostly applies to what the executor does. A plan could still be usable despite having been made against a now-obsolete version of the table schema. In the case at hand, I think most constraint-adding situations would require at least ShareLock, because they had better block execution of INSERT/UPDATE/DELETE operations that could fail to honor the constraint if they didn't see it in the catalogs. But AFAICS, addition of a constraint need not block SELECT, and it need not invalidate existing plans. CREATE INDEX uses ShareLock because it's okay to run multiple CREATE INDEXes in parallel (thanks to some rather dodgy coding of the catalog updates). For other cases of constraint additions, it might not be practical to run two constraint additions in parallel. In that case we could use ShareRowExclusive instead, which is self-exclusive but is not any stronger than Share from the perspective of DML commands. Since it's not, I'm unconvinced that it's worth taking any great pains to try to make constraint additions run in parallel. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Andres Freund on 18 Jul 2010 11:28
On Saturday 17 July 2010 09:55:37 Simon Riggs wrote: > On Fri, 2010-07-16 at 23:03 +0200, Andres Freund wrote: > > Sure its not that bad, but at least it needs to get documented imho. > > Likely others should chime in here ;-) > > Don't understand you. This is a clear bug in join removal, test case > attached, a minor rework of your original test case. As shown below the same issue exists in other codepaths that we cant easily fix in a stable release :-( - so I think documenting it is the only viable action for the back-branches. > > What could the join removal path (and similar places) *possibly* do > > against such a case? Without stopping to use SnapshotNow I dont see > > any way :-( > The bug is caused by allowing join removal to work in serializable > transactions. The fix for 9.0 is easy and clear: disallow join removal > when planning a query as the second or subsequent query in a > serializable transaction. > > A wider fix might be worth doing for 9.1, not sure. Unfortunately the same issue exists with constraint exclusion - and we can hardly disable that for serializable transactions... CREATE TABLE testconstr(data int); INSERT INTO testconstr VALUES(1),(10); T1: test=# explain analyze SELECT * FROM testconstr WHERE data > 5; QUERY PLAN ------------------------------------------------------------------------------------------------------- Seq Scan on testconstr (cost=0.00..40.00 rows=800 width=4) (actual time=0.029..0.032 rows=1 loops=1) Filter: (data > 5) Total runtime: 0.097 ms (3 rows) test=# BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE; BEGIN --make sure we do have a snapshot test=# SELECT * FROM pg_class WHERE 0 = 1 T2: DELETE FROM testconstr WHERE data >= 5; ALTER TABLE testconstr ADD CONSTRAINT t CHECK(data < 5); T1: test=# explain analyze SELECT * FROM testconstr WHERE data > 5; QUERY PLAN ------------------------------------------------------------------------------------ Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.003..0.003 rows=0 loops=1) One-Time Filter: false Total runtime: 0.045 ms (3 rows) test=# SET constraint_exclusion = false; SET test=# explain analyze SELECT * FROM testconstr WHERE data > 5; QUERY PLAN ------------------------------------------------------------------------------------------------------- Seq Scan on testconstr (cost=0.00..40.00 rows=800 width=4) (actual time=0.030..0.033 rows=1 loops=1) Filter: (data > 5) Total runtime: 0.099 ms (3 rows) Thats seems to be an issue that you realistically can hit in production... I think the same problem exists with inheritance planning - i.e. a child table added to a relation in T1 while T2 already holds a snapshot but hasnt used that specific table was created will see the new child. Thats less severe but still annoying. Beside using an actual Snapshot in portions of the planner (i.e. stats should continue using SnapshotNow) I dont really see a fix here. Andres Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |