From: Tom Lane on
Fujii Masao <masao.fujii(a)gmail.com> writes:
> Should the standby also have to follow the WAL rule during recovery?
> The current patch doesn't care about the write order of the data page
> and WAL in the standby. So, after both servers fail, restarting the
> ex-standby by itself might corrupt the data.

Surely the receiver should fsync the WAL itself to disk before
acknowledging it. Assuming you've done that, I don't see any
corruption risk.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Fujii Masao on
On Thu, Nov 12, 2009 at 12:03 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Fujii Masao <masao.fujii(a)gmail.com> writes:
>> Should the standby also have to follow the WAL rule during recovery?
>> The current patch doesn't care about the write order of the data page
>> and WAL in the standby. So, after both servers fail, restarting the
>> ex-standby by itself might corrupt the data.
>
> Surely the receiver should fsync the WAL itself to disk before
> acknowledging it.  Assuming you've done that, I don't see any
> corruption risk.

Does "acknowledging it" mean "letting the startup process know about the
arrival of WAL records"? If so, I agree that there is no risk of data corruption.

The problem is that fsync needs to be issued too frequently, which would
be harmless in asynchronous replication, but not in synchronous one.
A transaction would have to wait for the primary's and standby's fsync
before returning a "success" to a client.

So I'm inclined to change the startup process and bgwriter, instead of
walreceiver, so as to fsync the WAL for the WAL rule.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Heikki Linnakangas on
Fujii Masao wrote:
> The problem is that fsync needs to be issued too frequently, which would
> be harmless in asynchronous replication, but not in synchronous one.
> A transaction would have to wait for the primary's and standby's fsync
> before returning a "success" to a client.
>
> So I'm inclined to change the startup process and bgwriter, instead of
> walreceiver, so as to fsync the WAL for the WAL rule.

Let's keep it simple for now. Just make the walreceiver do the fsync. We
can optimize later. For now, we're only going to have async mode anyway.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Fujii Masao on
Hi,

On Thu, Nov 12, 2009 at 4:32 PM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
> Fujii Masao wrote:
>> The problem is that fsync needs to be issued too frequently, which would
>> be harmless in asynchronous replication, but not in synchronous one.
>> A transaction would have to wait for the primary's and standby's fsync
>> before returning a "success" to a client.
>>
>> So I'm inclined to change the startup process and bgwriter, instead of
>> walreceiver, so as to fsync the WAL for the WAL rule.
>
> Let's keep it simple for now. Just make the walreceiver do the fsync. We
> can optimize later. For now, we're only going to have async mode anyway.

Okay, I'll do that: the walreceiver will issue an fsync each time WAL
records arrive, and the startup process will replay only records that have
already been fsynced.
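
A minimal sketch of that ordering, in C with plain POSIX calls, may make the
invariant easier to see. It is not the patch's code: `WalRecvState`,
`open_segment`, `receive_chunk` and `replayable_upto` are hypothetical names
invented for illustration, and real walreceiver code would handle segments,
shared memory and signalling. The only point shown is that the flushed
position advances after fsync() succeeds, and replay is limited to it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical receiver-side state; not a PostgreSQL structure. */
typedef struct {
    int      fd;            /* open WAL file on the standby */
    uint64_t written_upto;  /* bytes handed to write() so far */
    uint64_t flushed_upto;  /* bytes known to be durable on disk */
} WalRecvState;

/* Open (or create) the file that received WAL is appended to. */
static int open_segment(WalRecvState *st, const char *path)
{
    st->fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (st->fd < 0)
        return -1;
    st->written_upto = 0;
    st->flushed_upto = 0;
    return 0;
}

/*
 * Write one received chunk, then fsync it. Only after fsync() succeeds
 * is the flushed position advanced, so nothing un-fsynced is advertised.
 * Returns 0 on success, -1 on error.
 */
static int receive_chunk(WalRecvState *st, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = write(st->fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry the write */
            return -1;
        }
        p += n;
        left -= (size_t) n;
    }
    st->written_upto += len;

    if (fsync(st->fd) != 0)
        return -1;              /* do not advance flushed_upto */
    st->flushed_upto = st->written_upto;
    return 0;
}

/* The replay side may consume WAL only up to this position. */
static uint64_t replayable_upto(const WalRecvState *st)
{
    return st->flushed_upto;
}
```

With this shape, every received chunk costs one fsync, which is the
simplicity-versus-latency trade-off discussed above.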

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Fujii Masao on
On Thu, Nov 12, 2009 at 6:27 PM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> I agree with you, though it has taken some time to understand what you
> said and at first my reaction was to disagree. I think the responses you
> got on this are because you dived straight in with a question before
> explaining other things around this.

Thanks for clarifying this topic ;)

> If recovery starts reading WAL records that have not been fsynced then
> we may need to flush a shared buffer to disk that depends upon a
> non-fsynced(yet) WAL record. Fsyncing WAL after *every* WAL record is
> going to make performance suck even worse and is completely out of the
> question. So implementing the fsync-WAL-before-buffer-flush rule during
> recovery makes much more sense. It's also only small change during
> XlogFlush().

Agreed. This approach has less impact on performance.
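
As a rough illustration of the fsync-WAL-before-buffer-flush rule, here is a
small in-memory sketch in C. Everything in it is an assumption made for the
example: `StandbyWal`, `Buffer`, `wal_flush_upto`, `receive_record`,
`replay_record` and `flush_buffer` are hypothetical names, the "fsync" is
only a counter, and this is not how XLogFlush() is actually implemented. It
shows that replay can run ahead of the flushed position as long as a dirty
page is never written before the WAL up to its LSN is flushed.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Lsn;

/* Hypothetical standby WAL bookkeeping. */
typedef struct {
    Lsn      received_upto;  /* WAL received from the primary */
    Lsn      flushed_upto;   /* WAL known to be fsynced locally */
    unsigned fsync_calls;    /* how many fsyncs we have paid for */
} StandbyWal;

/* Hypothetical shared buffer holding one data page. */
typedef struct {
    Lsn page_lsn;  /* end LSN of the last record applied to the page */
    int dirty;     /* nonzero if the page differs from its disk copy */
} Buffer;

/* Make WAL durable at least up to target; a no-op if it already is. */
static void wal_flush_upto(StandbyWal *wal, Lsn target)
{
    assert(target <= wal->received_upto);
    if (target <= wal->flushed_upto)
        return;
    /* Real code would write() and fsync() the WAL file here. */
    wal->fsync_calls++;
    wal->flushed_upto = wal->received_upto;
}

/* Receiving a record does not fsync in this scheme. */
static void receive_record(StandbyWal *wal, Lsn end_lsn)
{
    wal->received_upto = end_lsn;
}

/* Replay may apply records that are received but not yet flushed. */
static void replay_record(const StandbyWal *wal, Buffer *buf, Lsn end_lsn)
{
    assert(end_lsn <= wal->received_upto);
    buf->page_lsn = end_lsn;
    buf->dirty = 1;
}

/* The WAL rule: flush WAL up to the page's LSN, then write the page. */
static void flush_buffer(StandbyWal *wal, Buffer *buf)
{
    if (!buf->dirty)
        return;
    wal_flush_upto(wal, buf->page_lsn);
    /* Real code would write the data page to disk here. */
    buf->dirty = 0;
}
```

Replaying ten records into one page and then flushing that page costs a
single WAL fsync in this model, instead of one per record.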

But, as I said in my first post on this thread, even such an infrequent
fsync-WAL-before-buffer-flush might cause a response time spike on the
primary, because the walreceiver must sleep during that fsync. I think
that leaving the WAL-logging business to another process like walwriter
is a good idea for further reducing the impact on the walreceiver. In
the typical case:

* The walreceiver receives WAL records, returns the ACK to the primary,
  saves them in the wal_buffers, and notifies the startup process of
  their arrival.

* The walwriter writes and fsyncs the WAL records in the wal_buffers.

* The startup process applies the WAL records in the wal_buffers
  when it receives the notice of the arrival.

* The startup process and bgwriter fsync the WAL before the buffer
  flush.

Of course, this approach is too complicated to be in scope for v8.5
development.

> But I also agree with Heikki. Let's plan to do this later in this
> release.

Okay. I won't implement anything around this topic until the core part of
asynchronous replication has been committed.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
