From: Tom Lane on
Fujii Masao <masao.fujii(a)gmail.com> writes:
> When I configured a cascaded standby (i.e., made the additional
> standby server connect to the standby), I got the following
> errors, and a cascaded standby didn't start replication.

> ERROR: timeline 0 of the primary does not match recovery target timeline 1

> I didn't care about that case so far. To avoid a confusing error
> message, we should forbid a startup of walsender during recovery,
> and emit a suitable message? Or support such cascade-configuration?
> Though I don't think that the latter is difficult to implement,
> ISTM it's not the time to do that now.

It would be kind of silly to add code to forbid it if making it work
would be about the same amount of effort. I think it'd be worth looking
closer to find out what the problem is.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Simon Riggs on
On Mon, 2010-01-18 at 09:31 -0500, Tom Lane wrote:
> Fujii Masao <masao.fujii(a)gmail.com> writes:
> > When I configured a cascaded standby (i.e., made the additional
> > standby server connect to the standby), I got the following
> > errors, and a cascaded standby didn't start replication.
>
> > ERROR: timeline 0 of the primary does not match recovery target timeline 1
>
> > I didn't care about that case so far. To avoid a confusing error
> > message, we should forbid a startup of walsender during recovery,
> > and emit a suitable message? Or support such cascade-configuration?
> > Though I don't think that the latter is difficult to implement,
> > ISTM it's not the time to do that now.
>
> It would be kind of silly to add code to forbid it if making it work
> would be about the same amount of effort. I think it'd be worth looking
> closer to find out what the problem is.

There is an ERROR, but no problem AFAICS. The tli isn't set until end of
recovery because it doesn't need to have been set yet. That shouldn't
prevent retrieving WAL data.
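
To illustrate where the reported error comes from, here is a minimal sketch in C. It assumes the cascaded standby compares the timeline ID reported by its upstream server with its own recovery target timeline, and that an upstream still in recovery reports 0 because its tli is not set yet. The names `TimeLineID` and `timeline_matches` are hypothetical and used for illustration only; this is not the actual PostgreSQL code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical type for this sketch; 0 stands for "not set yet". */
typedef uint32_t TimeLineID;

/*
 * Compare the timeline reported by the upstream server with the local
 * recovery target timeline. On mismatch, write a message shaped like the
 * one quoted in this thread into errbuf and return false.
 */
static bool
timeline_matches(TimeLineID upstream_tli, TimeLineID target_tli,
                 char *errbuf, size_t errlen)
{
    assert(errbuf != NULL && errlen > 0);
    errbuf[0] = '\0';

    if (upstream_tli == target_tli)
        return true;

    snprintf(errbuf, errlen,
             "timeline %u of the primary does not match recovery target timeline %u",
             (unsigned) upstream_tli, (unsigned) target_tli);
    return false;
}
```

Under these assumptions, an upstream standby that is still in recovery reports timeline 0, so a check against target timeline 1 fails even though the WAL itself could be served.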

--
Simon Riggs www.2ndQuadrant.com



From: Fujii Masao on
On Mon, Jan 18, 2010 at 11:42 PM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> On Mon, 2010-01-18 at 09:31 -0500, Tom Lane wrote:
>> Fujii Masao <masao.fujii(a)gmail.com> writes:
>> > When I configured a cascaded standby (i.e., made the additional
>> > standby server connect to the standby), I got the following
>> > errors, and a cascaded standby didn't start replication.
>>
>> >   ERROR:  timeline 0 of the primary does not match recovery target timeline 1
>>
>> > I didn't care about that case so far. To avoid a confusing error
>> > message, we should forbid a startup of walsender during recovery,
>> > and emit a suitable message? Or support such cascade-configuration?
>> > Though I don't think that the latter is difficult to implement,
>> > ISTM it's not the time to do that now.
>>
>> It would be kind of silly to add code to forbid it if making it work
>> would be about the same amount of effort.  I think it'd be worth looking
>> closer to find out what the problem is.
>
> There is an ERROR, but no problem AFAICS. The tli isn't set until end of
> recovery because it doesn't need to have been set yet. That shouldn't
> prevent retrieving WAL data.

OK. Here is a patch that supports running a walsender process during recovery:

* Change walsender so as to send the WAL written by the walreceiver
if it has been started during recovery.
* Kill the walsenders started during recovery at the end of recovery
because replication cannot survive the change of timeline ID.
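
The two points above can be sketched roughly as follows. This is a toy model rather than the patch itself: the types `ServerState` and `WalSender` and the functions `send_limit` and `end_of_recovery` are hypothetical names invented for illustration, and WAL positions are simplified to plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t WalPos;            /* simplified WAL position (assumption) */

typedef struct
{
    bool   in_recovery;             /* server is still replaying WAL */
    WalPos receiver_written;        /* how far the walreceiver has written */
    WalPos local_flushed;           /* how far locally generated WAL is flushed */
} ServerState;

typedef struct
{
    bool   active;
    bool   started_during_recovery;
    WalPos sent;                    /* how far this walsender has sent */
} WalSender;

/* A walsender started during recovery may send only what the walreceiver wrote. */
static WalPos
send_limit(const ServerState *srv, const WalSender *ws)
{
    assert(srv != NULL && ws != NULL);
    return ws->started_during_recovery ? srv->receiver_written
                                       : srv->local_flushed;
}

/*
 * At the end of recovery the timeline ID changes, so every walsender that
 * was started during recovery is stopped. Returns the number stopped.
 */
static size_t
end_of_recovery(ServerState *srv, WalSender *senders, size_t n)
{
    size_t stopped = 0;

    assert(srv != NULL && (senders != NULL || n == 0));
    srv->in_recovery = false;

    for (size_t i = 0; i < n; i++)
    {
        if (senders[i].active && senders[i].started_during_recovery)
        {
            senders[i].active = false;
            stopped++;
        }
    }
    return stopped;
}
```

In this model a cascaded standby would have to reconnect after its upstream finishes recovery, which matches the statement that replication cannot survive the change of timeline ID.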

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

From: Simon Riggs on
On Tue, 2010-01-19 at 15:04 +0900, Fujii Masao wrote:
> >
> > There is an ERROR, but no problem AFAICS. The tli isn't set until end of
> > recovery because it doesn't need to have been set yet. That shouldn't
> > prevent retrieving WAL data.
>
> OK. Here is the patch which supports a walsender process during recovery;
>
> * Change walsender so as to send the WAL written by the walreceiver
> if it has been started during recovery.
> * Kill the walsenders started during recovery at the end of recovery
> because replication cannot survive the change of timeline ID.

Good patch.

I think we need to add a longer comment explaining the tli issues. I
agree with your handling of them.

It would be useful to have the ps display differentiate between multiple
walsenders, and in this case have it indicate cascading also.

--
Simon Riggs www.2ndQuadrant.com



From: Fujii Masao on
On Tue, Jan 19, 2010 at 4:41 PM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> It would be useful to have the ps display differentiate between multiple
> walsenders, and in this case have it indicate cascading also.

Since a normal walsender and a "cascading" one will not be running
at the same time, I don't think that it's worth adding that label
into the PS display. Am I missing something?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
