From: Simon Riggs on
On Wed, 2010-05-26 at 16:22 -0700, Josh Berkus wrote:
> > Just this second posted about that, as it turns out.
> >
> > I have a v3 *almost* ready of the keepalive patch. It still makes sense
> > to me after a few days reflection, so is worth discussion and review. In
> > or out, I want this settled within a week. Definitely need some R&R
> > here.
>
> Does the keepalive fix all the issues with max_standby_delay? Tom?

OK, here's v4.

Summary

* WALSender adds a timestamp onto the header of every WAL chunk sent.

* Each WAL record now has a conceptual "send timestamp" that remains
constant while that record is replayed. This is used as the basis from
which max_standby_delay is calculated when required during replay.

* Send timestamp is calculated as the later of the timestamp of chunk in
which WAL record was sent and the latest XLog time.

* WALSender sends an empty message as a keepalive when nothing else to
send. (No longer a special message type for the keepalive).
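
A minimal sketch of the calculation the summary describes, not the patch itself. The type `ts_usec` and the functions `send_timestamp()` and `standby_delay_exceeded()` are hypothetical names chosen for illustration; the real patch may structure this differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Microseconds since an arbitrary epoch (illustrative stand-in type). */
typedef int64_t ts_usec;

/*
 * Hypothetical helper: the send timestamp of a WAL record is the later of
 * the timestamp in the header of the chunk that carried it and the latest
 * XLog time seen so far.
 */
ts_usec
send_timestamp(ts_usec chunk_ts, ts_usec latest_xlog_ts)
{
    return (chunk_ts > latest_xlog_ts) ? chunk_ts : latest_xlog_ts;
}

/*
 * Hypothetical check: true once the standby's current time is more than
 * max_delay_usec past the record's send timestamp. "now" is read on the
 * standby while send_ts originates on the master, so this comparison
 * assumes the two clocks are synchronized.
 */
bool
standby_delay_exceeded(ts_usec now, ts_usec send_ts, ts_usec max_delay_usec)
{
    assert(max_delay_usec >= 0);
    return (now - send_ts) > max_delay_usec;
}
```

Because the send timestamp is fixed per record, the result of the check changes only as the standby's clock advances, which matches the "remains constant while that record is replayed" point above.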

I think it's close, but if there's a gaping hole here somewhere then I'll
punt for this release.

--
Simon Riggs www.2ndQuadrant.com
From: Bruce Momjian on

Uh, we have three days before we package 9.0beta2. It would be good if
we could decide on the max_standby_delay issue soon.

--
Bruce Momjian <bruce(a)momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +


--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Heikki Linnakangas on
On 27/05/10 20:26, Simon Riggs wrote:
> On Wed, 2010-05-26 at 16:22 -0700, Josh Berkus wrote:
>>> Just this second posted about that, as it turns out.
>>>
>>> I have a v3 *almost* ready of the keepalive patch. It still makes sense
>>> to me after a few days reflection, so is worth discussion and review. In
>>> or out, I want this settled within a week. Definitely need some R&R
>>> here.
>>
>> Does the keepalive fix all the issues with max_standby_delay? Tom?
>
> OK, here's v4.
>
> Summary
>
> * WALSender adds a timestamp onto the header of every WAL chunk sent.
>
> * Each WAL record now has a conceptual "send timestamp" that remains
> constant while that record is replayed. This is used as the basis from
> which max_standby_delay is calculated when required during replay.
>
> * Send timestamp is calculated as the later of the timestamp of chunk in
> which WAL record was sent and the latest XLog time.
>
> * WALSender sends an empty message as a keepalive when nothing else to
> send. (No longer a special message type for the keepalive).
>
> I think it's close, but if there's a gaping hole here somewhere then I'll
> punt for this release.

This certainly alleviates some of the problems. You still need to ensure
that master and standby have synchronized clocks, and you still get zero
grace time after a long period of inactivity when not using streaming
replication, however.

Sending a keep-alive message every 100ms seems overly aggressive to me.


If we really want to try to salvage max_standby_delay with a meaning
similar to what it has now, I think we should go with the idea some
people bashed around earlier and define the grace period as the time
between a WAL record becoming available to the standby for replay and
the record actually being replayed. An approximation of that is to do
"lastIdle = gettimeofday()" in XLogPageRead() whenever it needs to wait
for new WAL to arrive, whether that's via streaming replication or by a
success return code from restore_command, and to compare that value
against the current timestamp in WaitExceedsMaxStandbyDelay().

That's very simple, doesn't require synchronized clocks, and works the
same with file- and stream-based setups.
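
A rough sketch of that approximation, assuming a single file-level variable for the last idle time. Only the names XLogPageRead(), WaitExceedsMaxStandbyDelay() and lastIdle come from the description above; `now_usec()`, `note_wal_wait()` and `wait_exceeds_max_standby_delay()` are hypothetical stand-ins, not PostgreSQL functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Standby-local time of the last wait for new WAL (the "lastIdle" value). */
static int64_t last_idle_usec = 0;

/* Current standby-local wall-clock time in microseconds. */
int64_t
now_usec(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return (int64_t) tv.tv_sec * 1000000 + (int64_t) tv.tv_usec;
}

/*
 * Hypothetical hook: call this at the point where XLogPageRead() would
 * have to wait for new WAL, whether it arrives by streaming or by a
 * successful restore_command.
 */
void
note_wal_wait(int64_t now)
{
    last_idle_usec = now;
}

/*
 * Hypothetical stand-in for the test in WaitExceedsMaxStandbyDelay():
 * true once more than max_delay_usec has passed since the last wait.
 * Both values come from the standby's own clock, so no synchronization
 * with the master is needed.
 */
bool
wait_exceeds_max_standby_delay(int64_t now, int64_t max_delay_usec)
{
    assert(max_delay_usec >= 0);
    return (now - last_idle_usec) > max_delay_usec;
}
```

In use, the replay loop would call `note_wal_wait(now_usec())` before blocking for WAL and `wait_exceeds_max_standby_delay(now_usec(), limit)` when a conflict has to be resolved. The time is passed in as a parameter here only to keep the sketch easy to test.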

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Simon Riggs on
Thanks for the review.

On Tue, 2010-06-01 at 13:36 +0300, Heikki Linnakangas wrote:

> If we really want to try to salvage max_standby_delay with a meaning
> similar to what it has now, I think we should go with the idea some
> people bashed around earlier and define the grace period as the time
> between a WAL record becoming available to the standby for replay and
> the record actually being replayed. An approximation of that is to do
> "lastIdle = gettimeofday()" in XLogPageRead() whenever it needs to wait
> for new WAL to arrive, whether that's via streaming replication or by a
> success return code from restore_command, and to compare that value
> against the current timestamp in WaitExceedsMaxStandbyDelay().

That wouldn't cope with a continuous stream of records arriving, unless
you also include the second half of the patch.

> That's very simple, doesn't require synchronized clocks, and works the
> same with file- and stream-based setups.

Nor does it provide a mechanism for monitoring of SR. standby_delay is
explicitly defined in terms of the gap between two servers, so is a
useful real world concept. apply_delay is somewhat less interesting.

I'm sure most people would rather have monitoring and therefore the
requirement for synchronised-ish clocks, than no monitoring. If you
think no monitoring is OK, I don't, but there are other ways, so it's not
a point to fight about.

> This certainly alleviates some of the problems. You still need to ensure
> that master and standby have synchronized clocks, and you still get zero
> grace time after a long period of inactivity when not using streaming
> replication, however.

Second issue can be added once we approve the rest of this if you like.

> Sending a keep-alive message every 100ms seems overly aggressive to me.

It's sent every wal_sender_delay. Why is that a negative?

--
Simon Riggs www.2ndQuadrant.com



From: Simon Riggs on
On Mon, 2010-05-31 at 14:40 -0400, Bruce Momjian wrote:

> Uh, we have three days before we package 9.0beta2. It would be good if
> we could decide on the max_standby_delay issue soon.

I've heard something from Heikki, not from anyone else. Those comments
amount to "let's replace max_standby_delay with max_apply_delay".

Got a beta idea?

--
Simon Riggs www.2ndQuadrant.com

