From: Simon Riggs on 25 Sep 2009 06:26

On Fri, 2009-09-25 at 11:49 +0300, Heikki Linnakangas wrote:

> Looking at the startup sequence now.
>
> I see that you modified ExtendSUBTRANS so that it doesn't wipe out
> previously set values if it's called with out-of-order xids. I guess
> that works, although I think it can leave pages unzeroed if it's called
> with a large enough gap between xids, so that the previous one was on
> e.g. page 10 and the next one on page 12. Page 11 would not be zeroed,
> AFAICS. Not sure if such big gaps in the xid sequence are possible, but
> it seems so if you have a very large max_connections setting and a lot
> of subtransactions.

Yeh, it happens and I've seen it - which is why the code is there.

> Nevertheless, I'd feel better if we kept ExtendSUBTRANS unmodified, and
> instead arranged things so that ExtendSUBTRANS is always called in
> xid order. We can call it from RecordKnownAssignedTransactionIds, which
> handles the out-of-order problem for other purposes already.
>
> I think we have a similar problem with clog. ExtendCLOG is currently not
> called during recovery, so isn't it possible for the problem with
> subtransaction commits that's described in the comments in StartupCLOG
> to arise during hot standby? Again, I think we should call ExtendCLOG()
> in RecordKnownAssignedTransactionIds. RecordKnownAssignedTransactionIds
> is the hot standby version of GetNewTransactionId(), so to speak.

OK. We record xids in sequence there, so it is now a much more natural
place to do this. Zeroing the pages makes them dirty, unfortunately, but
that is another issue.

RecordKnownAssignedTransactionIds() only works once the snapshot has been
initialised. That could leave a gap, so we will need to run it
continuously if InHotStandby, which may have knock-on changes also.
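As a rough sketch of what that hook could look like - illustrative only,
not the patch; it assumes RecordKnownAssignedTransactionIds() is passed
each xid observed in WAL and that latestObservedXid gets seeded from the
checkpoint record elsewhere:

    #include "postgres.h"
    #include "access/clog.h"
    #include "access/subtrans.h"
    #include "access/transam.h"

    /* assumed to be initialised from the checkpoint's nextXid at startup */
    static TransactionId latestObservedXid = InvalidTransactionId;

    void
    RecordKnownAssignedTransactionIds(TransactionId xid)
    {
        if (!TransactionIdIsValid(latestObservedXid))
            latestObservedXid = xid;

        /*
         * Step through every xid between the last one we saw and the new
         * one, so the SLRUs are extended in xid order and no intervening
         * page is left unzeroed, just as GetNewTransactionId() does during
         * normal running.
         */
        while (TransactionIdPrecedes(latestObservedXid, xid))
        {
            TransactionIdAdvance(latestObservedXid);
            ExtendCLOG(latestObservedXid);
            ExtendSUBTRANS(latestObservedXid);
        }

        /* ... then record the xid as known assigned for standby snapshots ... */
    }

Stepping one xid at a time keeps the page-zeroing behaviour identical to
the normal-running path, at the cost of a loop whenever there is a gap.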
> If we go with that, I think we'll need to call StartupSUBTRANS/CLOG
> earlier in the startup sequence too, when we establish the startup
> snapshot and let backends in.

Yes, I think an earlier patch had that, but it turns out that there really
isn't anything for those functions to do. Really we could rename those
functions EndOfRecoverySUBTRANS and EndOfRecoveryCLOG to illustrate their
role better.

> - I removed the new DBConsistentUpToLSN() function and moved its
> functionality into XLogNeedsFlush(). Since XLogFlush() updates
> minRecoveryPoint instead of flushing WAL during recovery, it makes sense
> for XLogNeedsFlush() to correspondingly check if minRecoveryPoint needs
> to be updated when in recovery. That's a better definition for the call
> in bufmgr.c too.
>
> - I went ahead with the changes with RecoveryInfoLock and tracking the
> number of held AccessExclusive locks in lmgr.c instead of the proc array.
>
> Can we find a better name for "loggable locks"? It means locks that need
> to be WAL-logged when acquired, for hot standby, and "loggable lock"
> doesn't sound right for that. "Loggable" implies that it can be logged,
> but it really means that it *must* be logged.

The distinction of "loggable" is somewhat false, since we rely on the fact
that only one person is holding it. So we may as well just call them what
they are: AccessExclusiveLocks.

> Keep an eye on my git repository for latest changes.

No, I'm not doing that. If you want to submit changes, please do so to the
list, or just mention what needs work and I will do it. Having multiple
versions of a patch isn't helpful, as I have already said.

I have already been burned multiple times by accepting trial code, and you
yourself have rewritten particular parts multiple times. I am very, very
grateful for your reviews and detailed suggestions, so this comment is
only about coherency and not tripping each other up. (If you want to
"editorialize", it needs to be just prior to commit, but making changes to
a patch just prior to commit has historically been shown to introduce bugs
where there weren't any before.)

There are enough changes already to demand a full re-test, so everything
discovered so far plus testing is about two weeks' work. I will commit
things onto git as agreed as I complete coding, but that won't imply full
testing.

-- 
Simon Riggs
www.2ndQuadrant.com
From: Heikki Linnakangas on 25 Sep 2009 06:23

Simon Riggs wrote:
> On Wed, 2009-09-23 at 17:45 +0300, Heikki Linnakangas wrote:
>> XactClearRecoveryTransactions() when we see a shutdown checkpoint, which
>> clears all recovery locks. But doesn't that prematurely clear all locks
>> belonging to prepared transactions as well?
>
> Much better to read your second post(s). :-)
>
> Yes, you have found a(nother) issue. This was the first one that gave me
> pause to think of the answer. The locks currently aren't tracked as to
> whether they are 2PC or not, so we would need to store that info also so
> that we can selectively release locks later.
>
> Question: is it possible to do a fast shutdown when we have a prepared
> transaction?

Yes.

> Would it be better to take a different approach there for
> prepared transactions? It seems strange to write a shutdown checkpoint
> when the system isn't yet "clean".

Hmm, I guess you could define prepared transactions as active backends
from the shutdown point of view, and wait for them to finish. I can see
one problem, though: once you issue shutdown, fast or smart, we no longer
accept new connections, so you can't connect to issue the ROLLBACK/COMMIT
PREPARED anymore. Anyway, it would be a change from the current behavior,
so it would be better to cope with prepared transactions in the standby.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
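To illustrate the "store that info" idea quoted above, a minimal sketch of
how each recovery lock could carry a prepared-transaction flag, so that a
shutdown checkpoint releases only the ordinary ones. The struct and
function names are assumptions for illustration, not code from the patch:

    #include "postgres.h"
    #include "storage/lock.h"

    typedef struct RecoveryLockEntry
    {
        TransactionId xid;          /* xid that took the AccessExclusiveLock */
        LOCKTAG       locktag;      /* the object it was taken on */
        bool          isPrepared;   /* set when we replay its PREPARE record */
    } RecoveryLockEntry;

    /*
     * At a shutdown checkpoint, ordinary recovery locks can be dropped
     * (their backends are gone on the master), but locks belonging to
     * prepared transactions must be kept until their COMMIT PREPARED or
     * ROLLBACK PREPARED record is replayed.
     */
    static bool
    RecoveryLockReleasableAtShutdown(const RecoveryLockEntry *entry)
    {
        return !entry->isPrepared;
    }

The release loop in XactClearRecoveryTransactions() would then skip any
entry for which this returns false.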
From: Simon Riggs on 25 Sep 2009 06:28

On Fri, 2009-09-25 at 13:14 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Wed, 2009-09-23 at 19:07 +0300, Heikki Linnakangas wrote:
> >
> >> Rather than keep the numHeldLocks counters per-proc in the proc array,
> >> I think it would be simpler to have a single (or one per lock
> >> partition) counter in shared memory in lock.c. It's just an
> >> optimization to make it faster to find out that there are no loggable
> >> AccessExclusiveLocks in the system, so it really rather belongs in the
> >> lock manager.
> >
> > What lock would protect that value? The whole purpose is to avoid taking
> > the LockMgrLocks and to give something that is accessible under the
> > locks already held by GetRunningTransactionData().
>
> The lock partition lock (so we really need one counter per partition; a
> single counter would need additional locking). We're already holding
> that in LockAcquire/LockRelease when we need to increment/decrement the
> counter.

Again: the whole purpose is to avoid taking those locks. Why would we put
something behind a lock we are trying to avoid taking?

-- 
Simon Riggs
www.2ndQuadrant.com
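For reference, a minimal sketch of the per-partition counter scheme being
debated here. The write side piggybacks on the partition lock already held
in LockAcquire/LockRelease; the read side, without those locks, is exactly
the disputed part. All names below are illustrative assumptions, not the
actual patch:

    #include "postgres.h"
    #include "storage/lwlock.h"

    /*
     * One counter per lock partition; in a real patch this array would
     * live in the lock manager's shared-memory segment, not in
     * process-local static storage as shown here.
     */
    static int  numLoggedAccessExclusiveLocks[NUM_LOCK_PARTITIONS];

    /* Caller already holds this partition's LWLock (from LockAcquire). */
    static void
    CountAccessExclusiveLockAcquired(int partition)
    {
        numLoggedAccessExclusiveLocks[partition]++;
    }

    /* Caller already holds this partition's LWLock (from LockRelease). */
    static void
    CountAccessExclusiveLockReleased(int partition)
    {
        numLoggedAccessExclusiveLocks[partition]--;
    }

    /*
     * Quick check for GetRunningTransactionData(): reading the counters
     * without taking the partition locks is the contentious point.
     */
    static bool
    AnyAccessExclusiveLocksHeld(void)
    {
        int     i;

        for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
        {
            if (numLoggedAccessExclusiveLocks[i] > 0)
                return true;
        }
        return false;
    }

Whether an unlocked read of these counters gives a safe enough answer is
the open question in the exchange above.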
From: Simon Riggs on 25 Sep 2009 06:32

On Fri, 2009-09-25 at 13:23 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Wed, 2009-09-23 at 17:45 +0300, Heikki Linnakangas wrote:
> >> XactClearRecoveryTransactions() when we see a shutdown checkpoint, which
> >> clears all recovery locks. But doesn't that prematurely clear all locks
> >> belonging to prepared transactions as well?
> >
> > Much better to read your second post(s). :-)
> >
> > Yes, you have found a(nother) issue. This was the first one that gave me
> > pause to think of the answer. The locks currently aren't tracked as to
> > whether they are 2PC or not, so we would need to store that info also so
> > that we can selectively release locks later.
> >
> > Question: is it possible to do a fast shutdown when we have a prepared
> > transaction?
>
> Yes.
>
> > Would it be better to take a different approach there for
> > prepared transactions? It seems strange to write a shutdown checkpoint
> > when the system isn't yet "clean".
>
> Hmm, I guess you could define prepared transactions as active backends
> from the shutdown point of view, and wait for them to finish. I can see
> one problem, though: once you issue shutdown, fast or smart, we no longer
> accept new connections, so you can't connect to issue the ROLLBACK/COMMIT
> PREPARED anymore. Anyway, it would be a change from the current behavior,
> so it would be better to cope with prepared transactions in the standby.

Definitely we need to cope with them for Hot Standby. My point was a
general one: that behaviour is not very useful for users with prepared
transactions, since it just causes manual effort by a DBA each time the
system is shut down.

-- 
Simon Riggs
www.2ndQuadrant.com
From: Heikki Linnakangas on 25 Sep 2009 07:00

Simon Riggs wrote:
> Definitely we need to cope with them for Hot Standby. My point was a
> general one: that behaviour is not very useful for users with prepared
> transactions, since it just causes manual effort by a DBA each time the
> system is shut down.

The transaction manager is supposed to reconnect and finish any in-doubt
transactions, eventually.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com