From: Robert Haas on
On Fri, Apr 23, 2010 at 4:44 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Simon Riggs <simon(a)2ndQuadrant.com> writes:
>> No intention of doing that. This change allows people to see what the
>> dependency actually is once the bug has been fixed. Change needs to
>> start from here, not from where we were before.
>
> Well, actually, now that I've looked at the patch I think it's starting
> from a fundamentally wrong position anyway.  Checkpoint records are a
> completely wrong mechanism for transmitting this data to slaves, because
> a checkpoint is emitted *after* we do something, not *before* we do it.
> In particular it's ludicrous to be looking at shutdown checkpoints to
> try to determine whether the subsequent WAL will meet the slave's
> requirements.  There's no connection at all between what the GUC state
> was at shutdown and what it might be after starting again.
>
> A design that might work is
> (1) store the active value of wal_mode in pg_control (but NOT as part of
> the last-checkpoint-record image).
> (2) invent a new WAL record type that is transmitted when we change
> wal_mode.
>
> Then, slaves could check whether the master's wal_mode is high enough
> by looking at pg_control when they start plus any wal_mode_change
> records they come across.
>
> If we did this then we could get rid of those WAL record types that were
> added to signify that information had been omitted from WAL at specific
> times.

<dons project manager hat>

I notice that Heikki's patch doesn't include doing the above. Should
we? If so, who's going to do it?

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Fujii Masao on
On Tue, Apr 27, 2010 at 5:09 PM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
> Ok, here's a patch that includes the changes to add new wal_mode GUC
> (http://archives.postgresql.org/message-id/4BD581A6.60602(a)enterprisedb.com),
> and implements Tom's design to keep a copy of wal_mode and the
> max_connections, max_prepared_xacts and max_locks_per_xact settings in
> pg_control.

I have some comments:

config.sgml
> <literal>on</literal>. It is thought that there is little
> measurable difference in performance from using this feature, so
> feedback is welcome if any production impacts are noticeable.
> It is likely that this parameter will be removed in later releases.

Is this description still required for recovery_connections?


> if (!XLogArchivingActive())
> ereport(ERROR,
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> errmsg("WAL archiving is not active"),
> errhint("archive_mode must be enabled at server start.")));

You need to change the error messages which refer to archive_mode,
like the above.


+ /*
+ * For Hot Standby, the WAL must be generated with 'hot_standby' mode,
+ * and we must have at least as many backend slots as the primary.
+ */
+ if (InArchiveRecovery && XLogRequestRecoveryConnections)
+ {
+ if (ControlFile->wal_mode < WAL_MODE_HOT_STANDBY)
+ ereport(ERROR,
+ (errmsg("recovery connections cannot start because
wal_mode was not set to 'hot_standby' on the WAL source server")));

This seems to always prevent the server from doing an archive recovery
since wal_mode is expected to be WAL_MODE_ARCHIVE in that case.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Fujii Masao on
On Tue, Apr 27, 2010 at 6:49 PM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
> Fujii Masao wrote:
>> config.sgml
>>> <literal>on</literal>.  It is thought that there is little
>>> measurable difference in performance from using this feature, so
>>> feedback is welcome if any production impacts are noticeable.
>>> It is likely that this parameter will be removed in later releases.
>>
>> Is this description still required for recovery_connections?
>
> Hmm, I guess it was referring to setting recovery_connections in the
> master, I don't see us removing that option from the standby in the
> future. recovery_connections in the master is being replaced with the
> wal_mode setting, so I guess that's not required anymore.

Agreed.

>>> if (!XLogArchivingActive())
>>>   ereport(ERROR,
>>>     (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>>>      errmsg("WAL archiving is not active"),
>>>      errhint("archive_mode must be enabled at server start.")));
>>
>> You need to change the error messages which refer to archive_mode,
>> like the above.
>
> Hmm, I think we should change not only the error message, but the logic
> too. There's two related checks there:
>
>>       if (!XLogArchivingActive())
>>               ereport(ERROR,
>>                               (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>>                                errmsg("WAL archiving is not active"),
>>                                errhint("archive_mode must be enabled at server start.")));
>>
>>       if (!XLogArchiveCommandSet())
>>               ereport(ERROR,
>>                               (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>>                                errmsg("WAL archiving is not active"),
>>                                errhint("archive_command must be defined before "
>>                                                "online backups can be made safely.")));
>
> You can use streaming replication too to transport the WAL generated
> during the backup, so I think we should just check that wal_mode>='archive'.

It's OK in pg_start_backup(), but seems NG in pg_stop_backup() since
it waits until some WAL files have been archived by the archiver. No?

>> + /*
>> +  * For Hot Standby, the WAL must be generated with 'hot_standby' mode,
>> +  * and we must have at least as many backend slots as the primary.
>> +  */
>> + if (InArchiveRecovery && XLogRequestRecoveryConnections)
>> + {
>> +   if (ControlFile->wal_mode < WAL_MODE_HOT_STANDBY)
>> +       ereport(ERROR,
>> +              (errmsg("recovery connections cannot start because
>> wal_mode was not set to 'hot_standby' on the WAL source server")));
>>
>> This seems to always prevent the server from doing an archive recovery
>> since wal_mode is expected to be WAL_MODE_ARCHIVE in that case.
>
> No, it doesn't prevent archive recovery. It only prevents hot standby if
> wal_mode was not 'hot_standby' in the master. I think you missed the "&&
> XLogRequestRecoveryConnections" condition above.

Even if we do only archive recovery, XLogRequestRecoveryConnections
might be TRUE. Or we need to ensure that the recovery_connection is
FALSE in the postgresql.conf before starting archive recovery?

And I tried archive recovery, and encountered the following error.

LOG: starting archive recovery
LOG: restored log file "000000010000000000000001" from archive
FATAL: recovery connections cannot start because wal_mode was not
set to 'hot_standby' on the WAL source server
LOG: startup process (PID 32512) exited with exit code 1
LOG: aborting startup due to startup process failure

XLOG-related parameters in postgresql.conf
archive_mode = on
archive_command = 'cp %p ../data.arh/%f'
wal_mode = archive
recovery_connections = on

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Fujii Masao on
On Tue, Apr 27, 2010 at 7:50 PM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
>> It's OK in pg_start_backup(), but seems NG in pg_stop_backup() since
>> it waits until some WAL files have been archived by the archiver. No?
>
> Good point, that logic would need to be changed too. Should it simply
> return immediately if archive_mode=off?

What if we wrongly set archive_mode to on and wal_mode to minimal?
I think that checking XLogArchivingActive() in pg_stop_backup() is
adequate.

>>>> + /*
>>>> +  * For Hot Standby, the WAL must be generated with 'hot_standby' mode,
>>>> +  * and we must have at least as many backend slots as the primary.
>>>> +  */
>>>> + if (InArchiveRecovery && XLogRequestRecoveryConnections)
>>>> + {
>>>> +   if (ControlFile->wal_mode < WAL_MODE_HOT_STANDBY)
>>>> +       ereport(ERROR,
>>>> +              (errmsg("recovery connections cannot start because
>>>> wal_mode was not set to 'hot_standby' on the WAL source server")));
>>>>
>>>> This seems to always prevent the server from doing an archive recovery
>>>> since wal_mode is expected to be WAL_MODE_ARCHIVE in that case.
>>> No, it doesn't prevent archive recovery. It only prevents hot standby if
>>> wal_mode was not 'hot_standby' in the master. I think you missed the "&&
>>> XLogRequestRecoveryConnections" condition above.
>>
>> Even if we do only archive recovery, XLogRequestRecoveryConnections
>> might be TRUE. Or we need to ensure that the recovery_connection is
>> FALSE in the postgresql.conf before starting archive recovery?
>
> Umm, yes, if you have recovery_connnections=on, it means you want hot
> standby. And for that you need wal_mode='hot_standby'.

Since the default value of recovery_connections is TRUE, I think that
the trouble which I encountered would often happen. We should disable
recovery_connections by default? Furthermore should move it from
postgresql.conf to recovery.conf?

On the other hand, I feel that recovery_connections=on in an archive
recovery is valid configuration *until* any read only connections are
requested. How about moving the above check to postmaster or backend?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on
On Tue, Apr 27, 2010 at 7:24 AM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
> On Tue, Apr 27, 2010 at 7:50 PM, Heikki Linnakangas
> <heikki.linnakangas(a)enterprisedb.com> wrote:
>>> It's OK in pg_start_backup(), but seems NG in pg_stop_backup() since
>>> it waits until some WAL files have been archived by the archiver. No?
>>
>> Good point, that logic would need to be changed too. Should it simply
>> return immediately if archive_mode=off?
>
> What if we wrongly set archive_mode to on and wal_mode to minimal?
> I think that checking XLogArchivingActive() in pg_stop_backup() is
> adequate.

That case should be rejected at primary startup.

>>>>> + /*
>>>>> +  * For Hot Standby, the WAL must be generated with 'hot_standby' mode,
>>>>> +  * and we must have at least as many backend slots as the primary.
>>>>> +  */
>>>>> + if (InArchiveRecovery && XLogRequestRecoveryConnections)
>>>>> + {
>>>>> +   if (ControlFile->wal_mode < WAL_MODE_HOT_STANDBY)
>>>>> +       ereport(ERROR,
>>>>> +              (errmsg("recovery connections cannot start because
>>>>> wal_mode was not set to 'hot_standby' on the WAL source server")));
>>>>>
>>>>> This seems to always prevent the server from doing an archive recovery
>>>>> since wal_mode is expected to be WAL_MODE_ARCHIVE in that case.
>>>> No, it doesn't prevent archive recovery. It only prevents hot standby if
>>>> wal_mode was not 'hot_standby' in the master. I think you missed the "&&
>>>> XLogRequestRecoveryConnections" condition above.
>>>
>>> Even if we do only archive recovery, XLogRequestRecoveryConnections
>>> might be TRUE. Or we need to ensure that the recovery_connection is
>>> FALSE in the postgresql.conf before starting archive recovery?
>>
>> Umm, yes, if you have recovery_connnections=on, it means you want hot
>> standby. And for that you need wal_mode='hot_standby'.
>
> Since the default value of recovery_connections is TRUE, I think that
> the trouble which I encountered would often happen. We should disable
> recovery_connections by default? Furthermore should move it from
> postgresql.conf to recovery.conf?
>
> On the other hand, I feel that recovery_connections=on in an archive
> recovery is valid configuration *until* any read only connections are
> requested. How about moving the above check to postmaster or backend?

Or just not starting recovery connections, but still doing archive
recovery? I think in this case a WARNING might be adequate.

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers