Streaming replication, and walsender during recovery [PgSql]

From: Fujii Masao on 29 Jan 2010 06:25

On Fri, Jan 29, 2010 at 5:41 PM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
>> To improve the situation, I think that we need to use
>> checkpoint_segment/timeout as a trigger of restartpoint, regardless
>> of the checkpoint record. Though I'm not sure that is possible and
>> should be included in v9.0.
>
> Yes, that is a simple change. I think it is needed now.

On second thought, it's difficult to force a restartpoint without
a checkpoint record. Recovery always has to start from a checkpoint
redo location; otherwise a torn page could result, because the
full-page image needed to repair it would not be replayed. So a
restartpoint cannot start without a checkpoint record.

But at least we might have to change the bgwriter so that it uses
not only checkpoint_timeout but also checkpoint_segments as a
trigger for restartpoints. That would be useful for people who want
to control the checkpoint cycle by using only checkpoint_segments.
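
To make the two conditions above concrete, here is a minimal sketch of the
decision, not the actual bgwriter code. The struct and function names
(StandbyState, Settings, restartpoint_due) are hypothetical and chosen only
for illustration. It assumes that a restartpoint is considered only once a
checkpoint record has been replayed, and that either checkpoint_timeout or
checkpoint_segments can then trigger it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical standby state; these names are not from the PostgreSQL source. */
typedef struct {
    bool have_checkpoint_record; /* checkpoint record replayed since last restartpoint */
    int  seconds_since_last;     /* seconds elapsed since last restartpoint */
    int  segments_since_last;    /* WAL segments replayed since last restartpoint */
} StandbyState;

/* Hypothetical container for the two settings discussed in the thread. */
typedef struct {
    int checkpoint_timeout;      /* seconds */
    int checkpoint_segments;     /* number of WAL segments */
} Settings;

/* Returns true when a restartpoint should be attempted. */
static bool restartpoint_due(const StandbyState *s, const Settings *cfg)
{
    assert(cfg->checkpoint_timeout > 0 && cfg->checkpoint_segments > 0);

    /*
     * Recovery must restart from a checkpoint redo location, so without a
     * replayed checkpoint record there is no safe point to restart from.
     */
    if (!s->have_checkpoint_record)
        return false;

    /* Either trigger suffices: elapsed time or replayed WAL volume. */
    return s->seconds_since_last >= cfg->checkpoint_timeout ||
           s->segments_since_last >= cfg->checkpoint_segments;
}
```

With only the timeout as a trigger, the second condition would be absent, so
a standby replaying WAL quickly would still wait out the full timeout.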

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Fujii Masao on 18 Feb 2010 01:23

On Mon, Jan 18, 2010 at 2:19 PM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
> When I configured a cascaded standby (i.e., made the additional
> standby server connect to the standby), I got the following
> errors, and a cascaded standby didn't start replication.
>
> ERROR: timeline 0 of the primary does not match recovery target timeline 1
>
> I didn't care about that case so far. To avoid a confusing error
> message, we should forbid a startup of walsender during recovery,
> and emit a suitable message? Or support such cascade-configuration?
> Though I don't think that the latter is difficult to be implemented,
> ISTM it's not the time to do that now.

We reached consensus that the cascading standby feature should be
postponed to v9.1 or later. But when we mistakenly make a standby
connect to another standby, the following confusing message is still
output.

FATAL: timeline 0 of the primary does not match recovery target timeline 1

How about emitting the following message instead? Here is the patch.

FATAL: recovery is in progress
HINT: cannot accept the standby server during recovery.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

From: Heikki Linnakangas on 16 Mar 2010 05:11

Fujii Masao wrote:
> On Mon, Jan 18, 2010 at 2:19 PM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
>> When I configured a cascaded standby (i.e., made the additional
>> standby server connect to the standby), I got the following
>> errors, and a cascaded standby didn't start replication.
>>
>> ERROR: timeline 0 of the primary does not match recovery target timeline 1
>>
>> I didn't care about that case so far. To avoid a confusing error
>> message, we should forbid a startup of walsender during recovery,
>> and emit a suitable message? Or support such cascade-configuration?
>> Though I don't think that the latter is difficult to be implemented,
>> ISTM it's not the time to do that now.
>
> We reached consensus that the cascading standby feature should be
> postponed to v9.1 or later. But when we mistakenly make a standby
> connect to another standby, the following confusing message is still
> output.
>
> FATAL: timeline 0 of the primary does not match recovery target timeline 1
>
> How about emitting the following message instead? Here is the patch.
>
> FATAL: recovery is in progress
> HINT: cannot accept the standby server during recovery.

Committed. I edited the message and error code a bit:

    ereport(FATAL,
            (errcode(ERRCODE_CANNOT_CONNECT_NOW),
             errmsg("recovery is still in progress, can't accept WAL streaming connections")));

ERRCODE_CANNOT_CONNECT_NOW is what we use when the system is shutting
down etc., so that seems appropriate.
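
As a rough illustration of the behaviour described above, here is a minimal
self-contained sketch of rejecting a WAL streaming connection while recovery
is in progress. It does not use the PostgreSQL headers; the Rejection struct
and check_walsender_startup() are hypothetical names. In the server the
report goes through ereport() as in the committed snippet. The SQLSTATE shown
for ERRCODE_CANNOT_CONNECT_NOW is 57P03 (cannot_connect_now).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a refused connection. */
typedef struct {
    const char *severity;  /* error level reported to the client */
    const char *sqlstate;  /* five-character SQLSTATE code */
    const char *message;   /* primary error message */
} Rejection;

/*
 * Returns NULL when a walsender may start, or a pointer to the rejection
 * to report when the server is still in recovery.
 */
static const Rejection *check_walsender_startup(bool recovery_in_progress)
{
    static const Rejection in_recovery = {
        "FATAL",
        "57P03", /* SQLSTATE for ERRCODE_CANNOT_CONNECT_NOW */
        "recovery is still in progress, can't accept WAL streaming connections"
    };

    return recovery_in_progress ? &in_recovery : NULL;
}
```

A standby that is mistakenly pointed at another standby would then see this
message instead of the timeline mismatch error.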

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From: Fujii Masao on 16 Mar 2010 20:29

On Tue, Mar 16, 2010 at 6:11 PM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
> Committed. I edited the message and error code a bit:
>
>     ereport(FATAL,
>             (errcode(ERRCODE_CANNOT_CONNECT_NOW),
>              errmsg("recovery is still in progress, can't accept WAL streaming connections")));
>
> ERRCODE_CANNOT_CONNECT_NOW is what we use when the system is shutting
> down etc., so that seems appropriate.

Thanks! I agree that ERRCODE_CANNOT_CONNECT_NOW is more suitable.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
