From: Simon Riggs on 8 Apr 2010 06:16

On Tue, 2010-04-06 at 10:22 +0100, Simon Riggs wrote:

> Initial patch. I will be testing over next day. No commit before at
> least midday on Wed 7 Apr.

Various previous discussions sidelined a very important point: what exactly does it mean to "start recovery from a shutdown checkpoint"?

If standby_mode is enabled and there is no source of WAL, then we get a stream of messages saying

  LOG: record with zero length at 0/C000088
  ...

but most importantly we never get to the main recovery loop, so Hot Standby never gets to start at all. We can't keep retrying the request for WAL and at the same time enter the retry loop, executing lots of things that expect non-NULL pointers using a NULL xlog pointer.

What we are asking for here is a completely new state: the database is not "in recovery" - by definition there is nothing at all to recover.

The following patch adds "Snapshot Mode", a very simple variation on the existing code - emphasis on the "simple":

  LOG: entering snapshot mode
  LOG: record with zero length at 0/C000088
  LOG: consistent recovery state reached at 0/C000088
  LOG: database system is ready to accept read only connections

This mode does *not* continually check to see if new WAL files have been added. Startup just sits and waits, backends allowed. If a trigger file is specified, then we can leave recovery. Otherwise the Startup process just sits doing nothing.

There's possibly an argument for inventing some more special modes where we do allow read only connections but don't start the bgwriter. I don't personally wish to do this at this stage of the release cycle. The attached patch is non-invasive and safe and I want to leave it at that.

I will be committing later today, unless there are major objections, but I ask you to read the patch before you sharpen your pen. It's simple.

--
Simon Riggs www.2ndQuadrant.com
From: Heikki Linnakangas on 8 Apr 2010 06:33

Simon Riggs wrote:
> On Tue, 2010-04-06 at 10:22 +0100, Simon Riggs wrote:
>
>> Initial patch. I will be testing over next day. No commit before at
>> least midday on Wed 7 Apr.
>
> Various previous discussions sidelined a very important point: what
> exactly does it mean to "start recovery from a shutdown checkpoint"?

Hot standby should be possible as soon as we know that the database is consistent. That is, as soon as we've replayed WAL past the minRecoveryPoint/backupStartPoint point indicated in pg_control.

> If standby_mode is enabled and there is no source of WAL, then we get a
> stream of messages saying
>
> LOG: record with zero length at 0/C000088
> ...
>
> but most importantly we never get to the main recovery loop, so Hot
> Standby never gets to start at all. We can't keep retrying the request
> for WAL and at the same time enter the retry loop, executing lots of
> things that expect non-NULL pointers using a NULL xlog pointer.

You mean it can't find even the checkpoint record to start replaying? I think the behavior in that scenario is fine as it is. The database isn't consistent (or at least we can't know if it is, because we don't know the redo pointer) until you read and replay the first checkpoint record.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
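The consistency condition Heikki describes can be sketched roughly as below. This is only an illustration, not the server's code: `WalPos`, `ControlInfo` and `reached_consistency` are hypothetical names, WAL positions are flattened to a single integer, and treating "past" as "at or beyond" both points is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flat WAL position; 0/C000088 would be 0x0C000088 here. */
typedef uint64_t WalPos;

/* Hypothetical subset of what pg_control records for recovery. */
typedef struct ControlInfo {
    WalPos min_recovery_point;
    WalPos backup_start_point;
} ControlInfo;

/*
 * Assumption: the database counts as consistent once replay has reached
 * or passed both positions recorded in the control data.
 */
bool reached_consistency(const ControlInfo *ctl, WalPos replayed_upto)
{
    return replayed_upto >= ctl->min_recovery_point &&
           replayed_upto >= ctl->backup_start_point;
}
```

Under this model, a clean shutdown checkpoint at 0/C000088 with nothing further to replay would already satisfy the condition once the checkpoint record itself has been processed.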
From: Simon Riggs on 8 Apr 2010 07:32

On Thu, 2010-04-08 at 13:33 +0300, Heikki Linnakangas wrote:

> > If standby_mode is enabled and there is no source of WAL, then we get a
> > stream of messages saying
> >
> > LOG: record with zero length at 0/C000088
> > ...
> >
> > but most importantly we never get to the main recovery loop, so Hot
> > Standby never gets to start at all. We can't keep retrying the request
> > for WAL and at the same time enter the retry loop, executing lots of
> > things that expect non-NULL pointers using a NULL xlog pointer.
>
> You mean it can't find even the checkpoint record to start replaying?

Clearly I don't mean that. Otherwise it wouldn't be "start from a shutdown checkpoint". I think you are misunderstanding me. Let me explain in more detail, though please also read the patch before replying, if you do.

The patch I submitted at top of this thread works for allowing Hot Standby during recovery. Yes, of course that occurs when the database is consistent. The trick is to get recovery to the point where it can be enabled. The second patch on this thread presents a way to get the database to that point; it touches some of the other recovery code that you and Masao have worked on. We *must* touch that code if we are to enable Hot Standby in the way you desire.

In StartupXLOG() when we get to the point where we "Find the first record that logically follows the checkpoint", in the current code ReadRecord() loops forever, spitting out

  LOG: record with zero length at 0/C000088
  ...

That prevents us from going further down StartupXLOG() to the point where we start the InRedo loop and hence start hot standby. As long as we retry we cannot progress further: this is the main problem.

So in the patch, I have modified the retry test in ReadRecord() so it no longer retries iff there is no WAL source defined. Now, when ReadRecord() exits, record == NULL at that point and so we do not (and cannot) enter the redo loop.

So I have introduced the new mode ("snapshot mode") to enter hot standby anyway. That avoids us having to screw around with the loop logic for redo. I don't see any need to support the case where we have no WAL source defined, yet we want Hot Standby but we also want to allow somebody to drop a WAL file into pg_xlog at some future point. That has no use case of value AFAICS and is too complex to add at this stage of the release cycle.

--
Simon Riggs www.2ndQuadrant.com
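The control flow Simon describes can be sketched roughly as follows. This is a simplified model, not the patch itself: `RecoveryConfig`, `NextStep`, `should_retry_read` and `after_first_read` are hypothetical names, "WAL source" is reduced to a single boolean, and the behaviour outside standby mode (simply ending recovery when no record is found) is an assumption.

```c
#include <stdbool.h>

/* Hypothetical outcomes after trying to read the record following the checkpoint. */
typedef enum NextStep {
    KEEP_RETRYING,        /* stay in ReadRecord(), logging "record with zero length" */
    ENTER_REDO_LOOP,      /* a record was found, replay it */
    ENTER_SNAPSHOT_MODE,  /* no record and no WAL source: allow read-only backends */
    END_RECOVERY          /* assumption: not a standby, so recovery just finishes */
} NextStep;

/* Hypothetical configuration summary. */
typedef struct RecoveryConfig {
    bool standby_mode;
    bool has_wal_source;  /* assumption: an archive or streaming source is configured */
} RecoveryConfig;

/* Retry only when a WAL source exists that could eventually supply a record. */
bool should_retry_read(const RecoveryConfig *cfg)
{
    return cfg->standby_mode && cfg->has_wal_source;
}

NextStep after_first_read(const RecoveryConfig *cfg, bool record_found)
{
    if (record_found)
        return ENTER_REDO_LOOP;
    if (should_retry_read(cfg))
        return KEEP_RETRYING;
    if (cfg->standby_mode)
        return ENTER_SNAPSHOT_MODE;  /* record == NULL, redo loop is skipped */
    return END_RECOVERY;
}
```

In this model the case from the start of the thread (standby_mode on, no WAL source, no record after the shutdown checkpoint) lands in `ENTER_SNAPSHOT_MODE` instead of retrying forever.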
From: Robert Haas on 8 Apr 2010 09:49

On Thu, Apr 8, 2010 at 6:16 AM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> If standby_mode is enabled and there is no source of WAL, then we get a
> stream of messages saying
>
> LOG: record with zero length at 0/C000088
> ...
>
> but most importantly we never get to the main recovery loop, so Hot
> Standby never gets to start at all. We can't keep retrying the request
> for WAL and at the same time enter the retry loop, executing lots of
> things that expect non-NULL pointers using a NULL xlog pointer.

This is pretty much a corner case, so I don't think it's a good idea to add a new mode to handle it. It also seems like it would be pretty inconsistent if we allow WAL to be dropped in pg_xlog, but only if we are also doing archive recovery or streaming replication. If we can't support this case with the same code path we use otherwise, I think we should revert to disallowing it.

Having said that, I guess I don't understand how having a source of WAL solves the problem described above. Do we always have to read at least 1 byte of WAL from either SR or the archive before starting up? If not, why do we need to do so here?

...Robert
From: Heikki Linnakangas on 8 Apr 2010 11:35

Simon Riggs wrote:
> In StartupXLOG() when we get to the point where we "Find the first
> record that logically follows the checkpoint", in the current code
> ReadRecord() loops forever, spitting out
> LOG: record with zero length at 0/C000088
> ...
>
> That prevents us from going further down StartupXLOG() to the point
> where we start the InRedo loop and hence start hot standby. As long as
> we retry we cannot progress further: this is the main problem.
>
> So in the patch, I have modified the retry test in ReadRecord() so it no
> longer retries iff there is no WAL source defined. Now, when
> ReadRecord() exits, record == NULL at that point and so we do not (and
> cannot) enter the redo loop.

Oh, I see.

> So I have introduced the new mode ("snapshot mode") to enter hot standby
> anyway. That avoids us having to screw around with the loop logic for
> redo. I don't see any need to support the case of where we have no WAL
> source defined, yet we want Hot Standby but we also want to allow
> somebody to drop a WAL file into pg_xlog at some future point. That has
> no use case of value AFAICS and is too complex to add at this stage of
> the release cycle.

You don't need a new mode for that. Just do the same "are we consistent now?" check you do in the loop once before calling ReadRecord to fetch the record that follows the checkpoint pointer. Attached is a patch to show what I mean. We just need to let postmaster know that recovery has started a bit earlier, right after processing the checkpoint record, not delaying it until we've read the first record after it.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
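The ordering change Heikki proposes can be illustrated with a toy model. It is not taken from his attached patch: `StartupTrace` and `startup_sketch` are hypothetical names, the endless retry in ReadRecord() is bounded by `max_read_attempts` so the sketch terminates, and consistency is reduced to a single flag that is already known once the checkpoint record has been processed.

```c
#include <stdbool.h>

/* Hypothetical record of what the startup process managed to do. */
typedef struct StartupTrace {
    bool hot_standby_signalled;  /* postmaster told it may accept read-only connections */
    bool entered_redo_loop;
    int  read_attempts;
} StartupTrace;

StartupTrace startup_sketch(bool check_before_first_read,
                            bool consistent_after_checkpoint,
                            bool next_record_available,
                            int max_read_attempts)
{
    StartupTrace t = { false, false, 0 };

    /* Proposed change: run the "are we consistent now?" check once, up front. */
    if (check_before_first_read && consistent_after_checkpoint)
        t.hot_standby_signalled = true;

    /* Stand-in for the retry inside ReadRecord(), bounded for the sketch. */
    while (t.read_attempts < max_read_attempts) {
        t.read_attempts++;
        if (next_record_available) {
            /* Existing behaviour: the check only runs inside the redo loop. */
            t.entered_redo_loop = true;
            if (consistent_after_checkpoint)
                t.hot_standby_signalled = true;
            break;
        }
    }
    return t;
}
```

With `check_before_first_read` false and no record available, the model never signals hot standby however many attempts are made; with it true, the signal is sent before the first read, which is the effect described without adding a new mode.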