From: Simon Riggs on 4 May 2010 19:42

On Tue, 2010-05-04 at 13:23 -0400, Tom Lane wrote:

> * LogStandbySnapshot is merest fantasy: no guarantee that either the
> XIDs list or the locks list will be consistent with the point in WAL
> where it will get inserted. What's worse, locking things down enough
> to guarantee consistency would be horrid for performance, or maybe
> even deadlock-inducing. Could lose both ways: list might contain an
> XID whose commit/abort went to WAL before the snapshot did, or list
> might be missing an XID started just after snap was taken. The latter
> case could possibly be dealt with via nextXid filtering, but that
> doesn't fix the former case, and anyway we have both ends of the same
> problem for locks.

This was the only serious complaint on your list, so let's address it.

Clearly we don't want to lock everything down, for all the reasons you
say. That creates a gap between when data is derived and when data is
logged to WAL.

LogStandbySnapshot() occurs during online checkpoints on or after the
logical checkpoint location and before the physical checkpoint location.
We start recovery from a checkpoint, so we have a starting point in WAL
for our processing.

The time sequence on the primary of these related events is

  Logical Checkpoint location
  newxids/commits/locks "Before1"
  AccessExclusiveLocks derived
  newxids/commits/locks "Before2"
  AccessExclusiveLocks WAL record inserted
  newxids/commits/locks "After1"
  RunningXact derived
  newxids/commits/locks "After2"
  RunningXact WAL record inserted

though when we read them back from WAL, they will be in this order, and
we cannot tell the difference between events at Before 1 & 2 or
After 1 & 2.

  Logical Checkpoint location        <= STANDBY_INITIALIZED
  newxids/commits/locks "Before1"
  newxids/commits/locks "Before2"
  AccessExclusiveLocks WAL record
  newxids/commits/locks "After1"
  newxids/commits/locks "After2"
  RunningXact WAL record             <= STANDBY_SNAPSHOT_READY

We're looking for a consistent point.
We don't know what the exact time-synchronised point is on master, so we
have to use an exact point in WAL and work from there. We need to
understand that the serialization of events in the log can be slightly
different to how they occurred on the primary, but that doesn't change
anything important.

So to get a set of xids + locks that are consistent at the moment the
RunningXact WAL record is read we need to

1. Begin processing incoming changes from the time we are
   STANDBY_INITIALIZED, though forgive any errors for removals of
   missing items until we hit STANDBY_SNAPSHOT_READY
   a) locks - we ignore missing locks in StandbyReleaseLocks()
   b) xids - we ignore missing xids in KnownAssignedXidsRemove()

2. Any transaction commits/aborts from the time we are
   STANDBY_INITIALIZED, through to STANDBY_SNAPSHOT_READY need to be
   saved, so that we can remove them again from the snapshot state.
   That is because events might otherwise exist in the standby that
   will never be removed from the snapshot. We do this by a simple test
   of whether the related xid has already completed.
   a) locks - we ignore locks for already completed xids in
      StandbyAcquireAccessExclusiveLock()
   b) xids - we ignore already completed xids in
      ProcArrayApplyRecoveryInfo()

We currently do all of the above. So it looks correct to me.

-- 
 Simon Riggs           www.2ndQuadrant.com
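
To make the two rules above concrete, here is a toy replay model in Python. It is only a sketch under stated assumptions, not the PostgreSQL code: the class StandbyModel, its methods, the event tuples and the explicit "completed" set are all hypothetical names invented for illustration (the real functions named above are C and would consult transaction status rather than keep such a set). The sample WAL order covers both of Tom's cases: xid 100 commits before either snapshot record reaches WAL, and xid 102 starts after the RunningXact data was derived.

```python
from dataclasses import dataclass, field


@dataclass
class StandbyModel:
    """Toy model of standby snapshot state (hypothetical, not PostgreSQL's structures)."""
    snapshot_ready: bool = False                  # False ~ STANDBY_INITIALIZED, True ~ STANDBY_SNAPSHOT_READY
    known_xids: set = field(default_factory=set)  # xids believed to be running
    locks: set = field(default_factory=set)       # (xid, relation) pairs
    completed: set = field(default_factory=set)   # xids that finished before the snapshot was ready

    def xid_assigned(self, xid):
        # Incoming changes are processed from STANDBY_INITIALIZED onwards.
        self.known_xids.add(xid)

    def lock_acquired(self, xid, relation):
        # Rule 2a: ignore locks that belong to already completed xids.
        if xid in self.completed:
            return
        self.locks.add((xid, relation))

    def xact_completed(self, xid):
        # Rule 1: removing a missing item is forgiven until the snapshot is ready.
        # Raising afterwards is a choice made for this toy model only.
        if xid not in self.known_xids and self.snapshot_ready:
            raise LookupError(f"xid {xid} is not known to be running")
        self.known_xids.discard(xid)
        self.locks = {(x, rel) for (x, rel) in self.locks if x != xid}
        if not self.snapshot_ready:
            # Rule 2: remember completions seen before the snapshot is ready.
            self.completed.add(xid)

    def lock_snapshot(self, locks):
        # Stands in for reading the AccessExclusiveLocks WAL record.
        for xid, relation in locks:
            self.lock_acquired(xid, relation)

    def running_xacts(self, xids):
        # Stands in for reading the RunningXact WAL record.
        # Rule 2b: skip xids that already completed; merge with what we know.
        self.known_xids |= {x for x in xids if x not in self.completed}
        self.snapshot_ready = True


def replay(wal):
    """Apply (event_name, argument) pairs in WAL order and return the final state."""
    state = StandbyModel()
    for name, arg in wal:
        getattr(state, name)(arg)
    return state


# Events in the order they are read back from WAL, not the order they
# happened on the primary.
wal = [
    ("xact_completed", 100),                        # commit logged before the snapshot records
    ("lock_snapshot", [(100, "t1"), (101, "t2")]),  # derived before that commit, logged after it
    ("xid_assigned", 102),                          # started after RunningXact data was derived
    ("running_xacts", [100, 101]),                  # stale: still lists 100, misses 102
]

state = replay(wal)
print(sorted(state.known_xids), sorted(state.locks))
```

In this model the stale entries for xid 100 are dropped from both lists, while xid 102 is kept because it was picked up from the ordinary WAL stream, so the state at the RunningXact record is xids {101, 102} with one lock held by xid 101.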