From: Robert Haas on
On Wed, May 19, 2010 at 1:47 AM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
> On Wed, May 19, 2010 at 12:59 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
>> In terms of removing the backup label file, can we simply have an
>> additional boolean in the postmaster that indicates whether we've ever
>> reached PM_RUN, and only consider removing the backup file if so?
>
> Yes, but I prefer XLogCtl->SharedRecoveryInProgress, which is the almost
> same indicator as the boolean you suggested. Thought?

It feels cleaner and simpler to me to use the information that the
postmaster already collects rather than having it take locks and check
shared memory, but I might be wrong. Why do you prefer doing it that
way?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on
Robert Haas <robertmhaas(a)gmail.com> writes:
> On Wed, May 19, 2010 at 1:47 AM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
>> Yes, but I prefer XLogCtl->SharedRecoveryInProgress, which is the almost
>> same indicator as the boolean you suggested. Thought?

> It feels cleaner and simpler to me to use the information that the
> postmaster already collects rather than having it take locks and check
> shared memory, but I might be wrong. Why do you prefer doing it that
> way?

The postmaster must absolutely not take locks (once there are competing
processes). This is non negotiable from a system robustness standpoint.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on
On Wed, May 19, 2010 at 8:49 AM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> On Wed, 2010-05-19 at 08:21 -0400, Tom Lane wrote:
>> Robert Haas <robertmhaas(a)gmail.com> writes:
>> > On Wed, May 19, 2010 at 1:47 AM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
>> >> Yes, but I prefer XLogCtl->SharedRecoveryInProgress, which is the almost
>> >> same indicator as the boolean you suggested. Thought?
>>
>> > It feels cleaner and simpler to me to use the information that the
>> > postmaster already collects rather than having it take locks and check
>> > shared memory, but I might be wrong.  Why do you prefer doing it that
>> > way?
>>
>> The postmaster must absolutely not take locks (once there are competing
>> processes).  This is non negotiable from a system robustness standpoint.
>
> Masao has not proposed this, in fact his proposal was to deliberately
> avoid do so.
>
> I proposed using the state recorded in xlog.c rather than attempting to
> duplicate that with a second boolean in postmaster because that seems
> likely to be more buggy.

Well then how are we reading XLogCtl?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Fujii Masao on
On Wed, May 19, 2010 at 10:03 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
> On Wed, May 19, 2010 at 8:49 AM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
>> On Wed, 2010-05-19 at 08:21 -0400, Tom Lane wrote:
>>> Robert Haas <robertmhaas(a)gmail.com> writes:
>>> > On Wed, May 19, 2010 at 1:47 AM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
>>> >> Yes, but I prefer XLogCtl->SharedRecoveryInProgress, which is the almost
>>> >> same indicator as the boolean you suggested. Thought?
>>>
>>> > It feels cleaner and simpler to me to use the information that the
>>> > postmaster already collects rather than having it take locks and check
>>> > shared memory, but I might be wrong.  Why do you prefer doing it that
>>> > way?
>>>
>>> The postmaster must absolutely not take locks (once there are competing
>>> processes).  This is non negotiable from a system robustness standpoint.
>>
>> Masao has not proposed this, in fact his proposal was to deliberately
>> avoid do so.
>>
>> I proposed using the state recorded in xlog.c rather than attempting to
>> duplicate that with a second boolean in postmaster because that seems
>> likely to be more buggy.
>
> Well then how are we reading XLogCtl?

In my patch, XLogCtl is directly read in xlog.c without any lock since
there should be no other processes running when CancelBackup() is called.


*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 8975,8980 **** CancelBackup(void)
--- 8975,8987 ----
{
struct stat stat_buf;

+ /*
+ * During recovery, we don't rename the "backup_label" file since
+ * it might be required for subsequent recovery.
+ */
+ if (XLogCtl->SharedRecoveryInProgress)
+ return;
+
/* if the file is not there, return */
if (stat(BACKUP_LABEL_FILE, &stat_buf) < 0)
return;

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Fujii Masao on
On Wed, May 19, 2010 at 2:47 PM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
>>> Oh, right. How about allowing the postmaster only in PM_STARTUP,
>>> PM_RECOVERY, PM_HOT_STANDBY or PM_WAIT_READONLY state to invoke
>>> walreceiver? We can keep walreceiver alive until all read only
>>> backends have gone, and prevent unexpected startup of walreceiver.
>>
>> Yes, that seems like something we should be checking, if we aren't already.
>
> I'll do that.

Here is the updated version. I added the above-mentioned check
into the patch.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center