From: Fujii Masao on 1 Jun 2010 23:23

On Mon, May 31, 2010 at 7:17 PM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
> 4) Change it so that checkpoint_segments takes effect in standby mode,
> but not during recovery otherwise

I revised the patch to achieve 4). This will enable checkpoint_segments
to trigger a restartpoint like checkpoint_timeout already does, in
standby mode (i.e., streaming replication or file-based log shipping).

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
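The behavior described above amounts to a check along these lines during WAL
replay (a minimal sketch only; the variable and function names below are
illustrative stand-ins, not the actual xlog.c identifiers from the patch):

    #include <stdbool.h>
    #include <stdio.h>

    static int  checkpoint_segments = 3;     /* the checkpoint_segments GUC   */
    static bool in_standby_mode     = true;  /* SR or file-based log shipping */

    /*
     * checkpoint_timeout already triggers restartpoints during any recovery;
     * the point of option 4) is that the segment-based trigger fires only in
     * standby mode, not during ordinary archive or crash recovery.
     */
    static bool
    restartpoint_needed_by_segments(int segs_replayed_since_last_redo)
    {
        if (!in_standby_mode)
            return false;
        return segs_replayed_since_last_redo >= checkpoint_segments;
    }

    int
    main(void)
    {
        printf("restartpoint needed: %d\n", restartpoint_needed_by_segments(4));
        return 0;
    }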
From: Fujii Masao on 2 Jun 2010 09:24

On Wed, Jun 2, 2010 at 8:40 PM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
> On 02/06/10 06:23, Fujii Masao wrote:
>> On Mon, May 31, 2010 at 7:17 PM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
>>> 4) Change it so that checkpoint_segments takes effect in standby mode,
>>> but not during recovery otherwise
>>
>> I revised the patch to achieve 4). This will enable checkpoint_segments
>> to trigger a restartpoint like checkpoint_timeout already does, in
>> standby mode (i.e., streaming replication or file-based log shipping).
>
> Hmm, XLogCtl->Insert.RedoRecPtr is not updated during recovery, so this
> doesn't work.

Oops! I revised the patch, which changes CreateRestartPoint() so that
it updates XLogCtl->Insert.RedoRecPtr.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
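The revision being described, having CreateRestartPoint() refresh the shared
redo pointer so that the count of segments replayed since the last
restartpoint stays meaningful, would look roughly like the following. This is
a simplified sketch, not the committed code: the struct and mutex below merely
stand in for the real XLogCtl shared-memory struct and its
WALInsertLock/info_lck protection.

    #include <pthread.h>
    #include <stdint.h>

    typedef uint64_t WalPtr;

    typedef struct
    {
        pthread_mutex_t lock;         /* stands in for WALInsertLock/info_lck */
        WalPtr          insert_redo;  /* stands in for Insert.RedoRecPtr      */
        WalPtr          shared_redo;  /* stands in for XLogCtl->RedoRecPtr    */
    } WalCtlMock;

    /*
     * When a restartpoint is established, also advance the shared redo
     * pointer to the redo location of the checkpoint record it is based on,
     * so "segments replayed since the last restartpoint" can be computed
     * against an up-to-date value during recovery.
     */
    static void
    restartpoint_update_redo(WalCtlMock *ctl, WalPtr last_checkpoint_redo)
    {
        pthread_mutex_lock(&ctl->lock);
        ctl->insert_redo = last_checkpoint_redo;
        ctl->shared_redo = last_checkpoint_redo;
        pthread_mutex_unlock(&ctl->lock);
    }

    int
    main(void)
    {
        WalCtlMock ctl = { PTHREAD_MUTEX_INITIALIZER, 0, 0 };

        restartpoint_update_redo(&ctl, 0x01000000);
        return 0;
    }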
From: Fujii Masao on 8 Jun 2010 22:26

On Wed, Jun 2, 2010 at 10:24 PM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
> On Wed, Jun 2, 2010 at 8:40 PM, Heikki Linnakangas
> <heikki.linnakangas(a)enterprisedb.com> wrote:
>> On 02/06/10 06:23, Fujii Masao wrote:
>>> On Mon, May 31, 2010 at 7:17 PM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
>>>> 4) Change it so that checkpoint_segments takes effect in standby mode,
>>>> but not during recovery otherwise
>>>
>>> I revised the patch to achieve 4). This will enable checkpoint_segments
>>> to trigger a restartpoint like checkpoint_timeout already does, in
>>> standby mode (i.e., streaming replication or file-based log shipping).
>>
>> Hmm, XLogCtl->Insert.RedoRecPtr is not updated during recovery, so this
>> doesn't work.
>
> Oops! I revised the patch, which changes CreateRestartPoint() so that
> it updates XLogCtl->Insert.RedoRecPtr.

This is one of the open items. Please review the patch I submitted, and
please feel free to comment!

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
From: Fujii Masao on 10 Jun 2010 02:14

On Thu, Jun 10, 2010 at 12:09 AM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
> Ok, committed with some cosmetic changes.

Thanks!

> BTW, should there be doc changes for this? I didn't find anything explaining
> how restartpoints are triggered, we should add a paragraph somewhere.

+1

What about the attached patch?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
From: Fujii Masao on 10 Jun 2010 06:59
On Thu, Jun 10, 2010 at 7:19 PM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
>> --- 1902,1908 ----
>>           for standby purposes, and the number of old WAL segments available
>>           for standbys is determined based only on the location of the previous
>>           checkpoint and status of WAL archiving.
>> +         This parameter has no effect on a restartpoint.
>>           This parameter can only be set in the <filename>postgresql.conf</>
>>           file or on the server command line.
>>         </para>
>
> Hmm, I wonder if wal_keep_segments should take effect during recovery too?
> We don't support cascading slaves, but if you have two slaves connected to
> one master (without an archive), and you perform failover to one of them,
> without wal_keep_segments the 2nd slave might not find all the files it
> needs in the new master. Then again, that won't work without an archive
> anyway, because we error out at a TLI mismatch in replication. Seems like
> this is 9.1 material..

Yep. Since SR currently cannot cross a TLI gap, there is no point in
having wal_keep_segments take effect during recovery.

>> *** a/doc/src/sgml/wal.sgml
>> --- b/doc/src/sgml/wal.sgml
>> ***************
>> *** 424,429 ****
>> --- 424,430 ----
>>    <para>
>>     There will always be at least one WAL segment file, and will normally
>>     not be more than (2 + <varname>checkpoint_completion_target</varname>)
>>     * <varname>checkpoint_segments</varname> + 1
>> +   or <varname>checkpoint_segments</> + <xref linkend="guc-wal-keep-segments"> + 1
>>     files.  Each segment file is normally 16 MB (though this size can be
>>     altered when building the server).  You can use this to estimate space
>>     requirements for <acronym>WAL</acronym>.
>
> That's not true, wal_keep_segments is the minimum number of files retained,
> independently of checkpoint_segments. The correct formula is (2 +
> checkpoint_completion_target * checkpoint_segments, wal_keep_segments)

You mean that the maximum number of WAL files is

    max { (2 + checkpoint_completion_target) * checkpoint_segments,
          wal_keep_segments }

?

Just after a checkpoint removes old WAL files, there might be
wal_keep_segments WAL files. Additionally, checkpoint_segments WAL files
might be generated before the subsequent checkpoint removes old WAL
files. So I think that the maximum number is

    max { (2 + checkpoint_completion_target) * checkpoint_segments,
          wal_keep_segments + checkpoint_segments }

Am I missing something?
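(As a sanity check on that formula, using illustrative values rather than
numbers from this thread: checkpoint_segments = 10,
checkpoint_completion_target = 0.5 and wal_keep_segments = 32 give
max { 2.5 * 10, 32 + 10 } = 42 segments, i.e. about 672 MB at the default
16 MB segment size.)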
>>    <para>
>> +   In archive recovery or standby mode, the server periodically performs
>> +   <firstterm>restartpoints</><indexterm><primary>restartpoint</></>
>> +   which are similar to checkpoints in normal operation: the server forces
>> +   all its state to disk, updates the <filename>pg_control</> file to
>> +   indicate that the already-processed WAL data need not be scanned again,
>> +   and then recycles old log segment files if they are in the
>> +   <filename>pg_xlog</> directory. Note that this recycling is not affected
>> +   by <varname>wal_keep_segments</> at all. A restartpoint is triggered,
>> +   if at least one checkpoint record has been replayed since the last
>> +   restartpoint, every <varname>checkpoint_timeout</> seconds, or every
>> +   <varname>checkpoint_segments</> log segments only in standby mode,
>> +   whichever comes first....
>
> That last sentence is a bit unclear. How about:
>
> A restartpoint is triggered if at least one checkpoint record has been
> replayed and <varname>checkpoint_timeout</> seconds have passed since last
> restartpoint. In standby mode, a restartpoint is also triggered if
> <varname>checkpoint_segments</> log segments have been replayed since last
> restartpoint and at least one checkpoint record has been replayed since.

Thanks! Seems good.

>> ...  In log shipping case, the checkpoint interval
>> +   on the standby is normally smaller than that on the master.
>> +  </para>
>
> What does that mean? Restartpoints can't be performed more frequently than
> checkpoints in the master because restartpoints can only be performed at
> checkpoint records.

Yes, that's what I meant.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center