From: Fujii Masao on 18 May 2010 04:34 On Tue, May 18, 2010 at 5:10 PM, Simon Riggs <simon(a)2ndquadrant.com> wrote: > On Tue, 2010-05-18 at 16:05 +0900, Fujii Masao wrote: >> >>> >> ISTM that we can use XLogCtl->SharedRecoveryInProgress for that. >> >>> >> Is this OK? >> >>> > >> >>> > That can change state at any time. Would that work? >> >>> >> >>> Yes. XLogCtl->SharedRecoveryInProgress is set to TRUE only when >> >>> XLogCtl structure is initialized (i.e., XLOGShmemInit), and it's >> >>> set to FALSE only at the end of recovery. >> >> >> >> You should be using RecoveryInProgress() >> > >> > Isn't access to a bool variable atomic? >> >> If it's not atomic, I'll add the following comment into CancelBackup(): >> >> Don't bother with lock to access XLogCtl->SharedRecoveryInProgress, >> because there should be no other processes running when this code >> is reached. > > Call it via a function. There is no need for postmaster to know the > innards of xlog.c, which could change in future. Modularity. In the patch, it's accessed in CancelBackup() which is in xlog.c. CancelBackup() should call further wrapping function? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Robert Haas on 18 May 2010 07:35 On Mon, May 17, 2010 at 10:40 PM, Fujii Masao <masao.fujii(a)gmail.com> wrote: > On Mon, May 17, 2010 at 10:20 PM, Robert Haas <robertmhaas(a)gmail.com> wrote: >> OK, I think I understand now. But, the SIGTERM sent by the postmaster >> doesn't kill the recovery process unconditionally. It will invoke >> StartupProcShutdownHandler(), which will set set shutdown_requested = >> true. That gets checked by RestoreArchivedFile() and >> HandleStartupProcInterrupts(), and I think that neither of those can >> get invoked until after the control file has been updated. Do you see >> a way it can happen? > > Yeah, the way is: > StartupXLOG() --> ReadCheckpointRecord() --> ReadRecord() --> > XLogPageRead() --> XLogFileReadAnyTLI() --> XLogFileRead() --> > RestoreArchivedFile() > > ReadCheckpointRecord() is called before pg_control is updated. OK. In that case, I'm wondering if we should reverse course and rejigger the logic so that the shutdown gets processed when we transition to PM_RECOVERY. Seems like that might be simpler. > ISTM that walreceiver might be invoked even after shutdown is requested. > We should prevent the postmaster from starting up walreceiver if > Shutdown > NoShutdown? Well, when we did the previous shutdown patch, we decided it was not right to kill walreceiver until all backends had exited, so it seems inconsistent to make the opposite decision here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Fujii Masao on 18 May 2010 22:40 On Tue, May 18, 2010 at 8:35 PM, Robert Haas <robertmhaas(a)gmail.com> wrote: > On Mon, May 17, 2010 at 10:40 PM, Fujii Masao <masao.fujii(a)gmail.com> wrote: >> On Mon, May 17, 2010 at 10:20 PM, Robert Haas <robertmhaas(a)gmail.com> wrote: >>> OK, I think I understand now. But, the SIGTERM sent by the postmaster >>> doesn't kill the recovery process unconditionally. It will invoke >>> StartupProcShutdownHandler(), which will set set shutdown_requested = >>> true. That gets checked by RestoreArchivedFile() and >>> HandleStartupProcInterrupts(), and I think that neither of those can >>> get invoked until after the control file has been updated. Do you see >>> a way it can happen? >> >> Yeah, the way is: >> StartupXLOG() --> ReadCheckpointRecord() --> ReadRecord() --> >> XLogPageRead() --> XLogFileReadAnyTLI() --> XLogFileRead() --> >> RestoreArchivedFile() >> >> ReadCheckpointRecord() is called before pg_control is updated. > > OK. In that case, I'm wondering if we should reverse course and > rejigger the logic so that the shutdown gets processed when we > transition to PM_RECOVERY. Seems like that might be simpler. You mean keeping shutdown waiting until the postmaster has reached PM_RECOVERY, i.e., the startup process has sent PMSIGNAL_RECOVERY_STARTED? The startup process must call ReadCheckpointRecord() before sending that signal. ReadCheckpointRecord() might get stuck for some reasons, e.g., neither master nor standby might have the recovery starting checkpoint WAL record. So that signal might not be sent forever, in this case, shutdown would get stuck. >> ISTM that walreceiver might be invoked even after shutdown is requested. >> We should prevent the postmaster from starting up walreceiver if >> Shutdown > NoShutdown? > > Well, when we did the previous shutdown patch, we decided it was not > right to kill walreceiver until all backends had exited, so it seems > inconsistent to make the opposite decision here. Oh, right. How about allowing the postmaster only in PM_STARTUP, PM_RECOVERY, PM_HOT_STANDBY or PM_WAIT_READONLY state to invoke walreceiver? We can keep walreceiver alive until all read only backends have gone, and prevent unexpected startup of walreceiver. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Robert Haas on 18 May 2010 23:59 On Tue, May 18, 2010 at 10:40 PM, Fujii Masao <masao.fujii(a)gmail.com> wrote: > On Tue, May 18, 2010 at 8:35 PM, Robert Haas <robertmhaas(a)gmail.com> wrote: >> On Mon, May 17, 2010 at 10:40 PM, Fujii Masao <masao.fujii(a)gmail.com> wrote: >>> On Mon, May 17, 2010 at 10:20 PM, Robert Haas <robertmhaas(a)gmail.com> wrote: >>>> OK, I think I understand now. But, the SIGTERM sent by the postmaster >>>> doesn't kill the recovery process unconditionally. It will invoke >>>> StartupProcShutdownHandler(), which will set set shutdown_requested = >>>> true. That gets checked by RestoreArchivedFile() and >>>> HandleStartupProcInterrupts(), and I think that neither of those can >>>> get invoked until after the control file has been updated. Do you see >>>> a way it can happen? >>> >>> Yeah, the way is: >>> StartupXLOG() --> ReadCheckpointRecord() --> ReadRecord() --> >>> XLogPageRead() --> XLogFileReadAnyTLI() --> XLogFileRead() --> >>> RestoreArchivedFile() >>> >>> ReadCheckpointRecord() is called before pg_control is updated. >> >> OK. In that case, I'm wondering if we should reverse course and >> rejigger the logic so that the shutdown gets processed when we >> transition to PM_RECOVERY. Seems like that might be simpler. > > You mean keeping shutdown waiting until the postmaster has reached > PM_RECOVERY, i.e., the startup process has sent PMSIGNAL_RECOVERY_STARTED? > > The startup process must call ReadCheckpointRecord() before sending > that signal. ReadCheckpointRecord() might get stuck for some reasons, > e.g., neither master nor standby might have the recovery starting > checkpoint WAL record. So that signal might not be sent forever, > in this case, shutdown would get stuck. Ah, OK. In terms of removing the backup label file, can we simply have an additional boolean in the postmaster that indicates whether we've ever reached PM_RUN, and only consider removing the backup file if so? >>> ISTM that walreceiver might be invoked even after shutdown is requested. >>> We should prevent the postmaster from starting up walreceiver if >>> Shutdown > NoShutdown? >> >> Well, when we did the previous shutdown patch, we decided it was not >> right to kill walreceiver until all backends had exited, so it seems >> inconsistent to make the opposite decision here. > > Oh, right. How about allowing the postmaster only in PM_STARTUP, > PM_RECOVERY, PM_HOT_STANDBY or PM_WAIT_READONLY state to invoke > walreceiver? We can keep walreceiver alive until all read only > backends have gone, and prevent unexpected startup of walreceiver. Yes, that seems like something we should be checking, if we aren't already. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Fujii Masao on 19 May 2010 01:47
On Wed, May 19, 2010 at 12:59 PM, Robert Haas <robertmhaas(a)gmail.com> wrote: > In terms of removing the backup label file, can we simply have an > additional boolean in the postmaster that indicates whether we've ever > reached PM_RUN, and only consider removing the backup file if so? Yes, but I prefer XLogCtl->SharedRecoveryInProgress, which is the almost same indicator as the boolean you suggested. Thought? >>>> ISTM that walreceiver might be invoked even after shutdown is requested. >>>> We should prevent the postmaster from starting up walreceiver if >>>> Shutdown > NoShutdown? >>> >>> Well, when we did the previous shutdown patch, we decided it was not >>> right to kill walreceiver until all backends had exited, so it seems >>> inconsistent to make the opposite decision here. >> >> Oh, right. How about allowing the postmaster only in PM_STARTUP, >> PM_RECOVERY, PM_HOT_STANDBY or PM_WAIT_READONLY state to invoke >> walreceiver? We can keep walreceiver alive until all read only >> backends have gone, and prevent unexpected startup of walreceiver. > > Yes, that seems like something we should be checking, if we aren't already. I'll do that. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |