From: Heikki Linnakangas on 23 Sep 2009 12:07 Simon Riggs wrote: > On Wed, 2009-09-23 at 12:07 +0300, Heikki Linnakangas wrote: >> seems to be broken > > Agreed. Looking at the relation lock stuff a bit more... When someone tries to acquire an AccessExclusiveLock, but can't get it immediately, we sleep while holding RecoveryInfoLock. That doesn't affect normal queries, but it will in turn block any attempt to get a running-xacts snapshot, and will thus block checkpoint from finishing. It could take a very long time until you get an AccessExclusiveLock on a busy table, so that seems pretty bad. I think we need to think a bit harder about that locking. The problem becomes a lot easier if we accept that it's OK to have a lock included in the running-xacts snapshot and also appear in a XLOG_RELATION_LOCK record later. The standby should handle that gracefully already. If we just remove RecoveryInfoLock, that can happen, but it still won't be possible for a lock to be missed out which is what we really care about. Rather than keep the numHeldLocks counters per-proc in proc array, I think it would be simpler to have a single (or one per lock partition) counter in shared memory in lock.c. It's just an optimization to make it faster to find out that there is no loggable AccessExclusiveLocks in the system, so it really rather belongs into the lock manager. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Josh Berkus on 23 Sep 2009 12:07 Simon, > Patch withdrawn for correction and rework. Nothing serious, but not much > point doing further testing to all current issues resolved. :-( Good thing we went for 4 CFs. Is there a GIT branch of Simon's current working version up somewhere? -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Heikki Linnakangas on 23 Sep 2009 12:30 Josh Berkus wrote: >> Patch withdrawn for correction and rework. Nothing serious, but not much >> point doing further testing to all current issues resolved. > > :-( > > Good thing we went for 4 CFs. I think we should try to hammer this in in this commitfest. None of the issues found this far are too serious, nothing that requires major rewrites. > Is there a GIT branch of Simon's current working version up somewhere? I have my git repository at git.postgresql.org, branch 'hs-riggs'. I've pushed a number of small changes there over Simon's patch. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Josh Berkus on 23 Sep 2009 12:36 Heikki, > I think we should try to hammer this in in this commitfest. None of the > issues found this far are too serious, nothing that requires major rewrites. It would certainly be valuable to get users testing it starting with Alpha2 instead of waiting 2 months. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Heikki Linnakangas on 24 Sep 2009 06:33
Heikki Linnakangas wrote: > The problem becomes a lot easier if we accept that it's OK to have a > lock included in the running-xacts snapshot and also appear in a > XLOG_RELATION_LOCK record later. The standby should handle that > gracefully already. If we just remove RecoveryInfoLock, that can happen, > but it still won't be possible for a lock to be missed out which is what > we really care about. I see the problem with that now. Without the lock, it's possible that the XLOG_RELATION_LOCK WAL record is written before the XLOG_RUNNING_XACTS record. If the lock is not included in the snapshot, we want the lock WAL record to be after the snapshot record. So i guess we'll need the RecoveryInfoLock. But we don't need to hold it across the wait. I think it's enough to acquire it just before writing the WAL record in LockAcquire. That will ensure that the WAL record isn't written until the snapshot is completely finished. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |