From: Robert Haas on 14 Jun 2010 09:13
On Mon, Jun 14, 2010 at 8:41 AM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
> On Mon, Jun 14, 2010 at 8:10 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
>> Maybe.  That sounds like a pretty enormous foot-gun to me, considering
>> that we have no way of recovering from the situation where the standby
>> gets ahead of the master.
>
> No, we can do that by reconstructing the standby from the backup.
>
> And, that situation is not a problem for users including me who prefer to
> perform a failover when the master goes down.

You don't get to pick - if a backend crashes on the master, it will
restart right away and come up, but the slave will now be hosed...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 14 Jun 2010 11:02
Fujii Masao <masao.fujii(a)gmail.com> writes:
> On Fri, Jun 11, 2010 at 11:47 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>> Well, we're already not waiting for fsync, which is the slowest part.

> No, currently walsender waits for fsync.

No, you're mistaken.

> Walsender tries to send WAL up to xlogctl->LogwrtResult.Write. OTOH,
> xlogctl->LogwrtResult.Write is updated after XLogWrite() performs fsync.

Wrong.  LogwrtResult.Write tracks how far we've written out data,
but it is only (known to be) fsync'd as far as LogwrtResult.Flush.

> But that change would cause the problem that Robert pointed out.
> http://archives.postgresql.org/pgsql-hackers/2010-06/msg00670.php

Yes.  Possibly walsender should only be allowed to send as far as
LogwrtResult.Flush.

regards, tom lane

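The Write/Flush distinction described in this message can be sketched in C. This is a minimal model under stated assumptions, not PostgreSQL source: `WalPos`, `LogwrtResultModel`, `SendPolicy`, `send_limit` and `unflushed_exposure` are hypothetical names invented for illustration, and a WAL position is reduced to a plain 64-bit byte offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a WAL position: a plain byte offset. */
typedef uint64_t WalPos;

/* Simplified model of the two progress pointers discussed in the thread. */
typedef struct
{
    WalPos write;   /* WAL handed to the kernel up to here */
    WalPos flush;   /* WAL known to be fsync'd up to here */
} LogwrtResultModel;

/* Illustrative policy switch; not an actual PostgreSQL setting. */
typedef enum { SEND_UP_TO_WRITE, SEND_UP_TO_FLUSH } SendPolicy;

/* How far a sender may stream under the chosen policy. */
static WalPos
send_limit(const LogwrtResultModel *r, SendPolicy policy)
{
    assert(r->flush <= r->write);   /* flushed data was written first */
    return (policy == SEND_UP_TO_FLUSH) ? r->flush : r->write;
}

/*
 * Bytes a standby could already hold that the master would lose if it
 * crashed right now, i.e. how far the standby could get "ahead".
 */
static WalPos
unflushed_exposure(const LogwrtResultModel *r, SendPolicy policy)
{
    return send_limit(r, policy) - r->flush;
}
```

With `write = 1000` and `flush = 600`, sending up to the write pointer leaves 400 bytes on the standby that the master has not made durable; capping at the flush pointer leaves none, at the cost of waiting for the fsync.
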
From: Fujii Masao on 15 Jun 2010 00:46
On Mon, Jun 14, 2010 at 10:13 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
> On Mon, Jun 14, 2010 at 8:41 AM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
>> On Mon, Jun 14, 2010 at 8:10 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
>>> Maybe.  That sounds like a pretty enormous foot-gun to me, considering
>>> that we have no way of recovering from the situation where the standby
>>> gets ahead of the master.
>>
>> No, we can do that by reconstructing the standby from the backup.
>>
>> And, that situation is not a problem for users including me who prefer to
>> perform a failover when the master goes down.
>
> You don't get to pick - if a backend crashes on the master, it will
> restart right away and come up, but the slave will now be hosed...

You are concerned about the case where postmaster automatically restarts
the crash recovery, in particular? Yes, this case is more problematic.
If the standby is ahead of the master, the standby might find an invalid
record and run into the infinite retry loop, or keep working without
noticing the inconsistency between the database and the WAL.

I'm thinking that walreceiver should throw a PANIC when it receives the
record which is in the LSN older than the last WAL receive location,
except the beginning of streaming (because the standby always requests
for streaming from the starting of WAL file at first even if some records
have already been received in previous time). Thought?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

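The check proposed in this message might look roughly like the following C sketch. It is one possible reading of the proposal, not walreceiver code: `ReceiverModel`, `RecvVerdict` and `check_incoming` are hypothetical names, and "the beginning of streaming" is interpreted here as "until the stream has caught up to the last receive location", which is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a WAL position: a plain byte offset. */
typedef uint64_t WalPos;

typedef enum { RECV_OK, RECV_PANIC } RecvVerdict;

typedef struct
{
    WalPos last_received;  /* end of the WAL the standby already holds */
    bool   catching_up;    /* true from (re)connect until the stream
                            * reaches last_received again */
} ReceiverModel;

/* Judge one incoming chunk covering the byte range [start, end). */
static RecvVerdict
check_incoming(ReceiverModel *rx, WalPos start, WalPos end)
{
    assert(start <= end);

    if (rx->catching_up)
    {
        /* Re-sent WAL is expected here: streaming restarts at the
         * beginning of a WAL file, so old positions are tolerated. */
        if (end >= rx->last_received)
        {
            rx->last_received = end;
            rx->catching_up = false;
        }
        return RECV_OK;
    }

    /* After catch-up, an older position suggests the master rewound,
     * meaning the standby had gotten ahead of it. */
    if (start < rx->last_received)
        return RECV_PANIC;

    rx->last_received = end;
    return RECV_OK;
}
```

A caller would set `catching_up = true` on every (re)connect. The sketch only detects the rewind; what to do after the PANIC (for example, rebuilding the standby from a backup, as suggested earlier in the thread) is outside its scope.
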
From: Fujii Masao on 15 Jun 2010 00:47
On Tue, Jun 15, 2010 at 12:02 AM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Fujii Masao <masao.fujii(a)gmail.com> writes:
>> On Fri, Jun 11, 2010 at 11:47 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>>> Well, we're already not waiting for fsync, which is the slowest part.
>
>> No, currently walsender waits for fsync.
>
> No, you're mistaken.
>
>> Walsender tries to send WAL up to xlogctl->LogwrtResult.Write. OTOH,
>> xlogctl->LogwrtResult.Write is updated after XLogWrite() performs fsync.
>
> Wrong.  LogwrtResult.Write tracks how far we've written out data,
> but it is only (known to be) fsync'd as far as LogwrtResult.Flush.

Hmm.. I agree that xlogctl->LogwrtResult.Write indicates the byte position
we've written. But in the current XLogWrite() code, it's updated after
XLogWrite() calls issue_xlog_fsync(). No?

Of course, the backend-local LogwrtResult.Write is updated before
issue_xlog_fsync(), but it's not available by walsender.

Am I missing something?

>> But that change would cause the problem that Robert pointed out.
>> http://archives.postgresql.org/pgsql-hackers/2010-06/msg00670.php
>
> Yes.  Possibly walsender should only be allowed to send as far as
> LogwrtResult.Flush.

Yes, in order to avoid that problem, walsender should wait for WAL
to be fsync'd before sending it.

But I'm worried that this would slow down the performance on the master
significantly because WAL flush and WAL streaming are not performed
concurrently and the backend must wait for both in a serial manner.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

From: Heikki Linnakangas on 15 Jun 2010 01:16
On 15/06/10 07:47, Fujii Masao wrote:
> On Tue, Jun 15, 2010 at 12:02 AM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>> Fujii Masao <masao.fujii(a)gmail.com> writes:
>>> Walsender tries to send WAL up to xlogctl->LogwrtResult.Write. OTOH,
>>> xlogctl->LogwrtResult.Write is updated after XLogWrite() performs fsync.
>>
>> Wrong. LogwrtResult.Write tracks how far we've written out data,
>> but it is only (known to be) fsync'd as far as LogwrtResult.Flush.
>
> Hmm.. I agree that xlogctl->LogwrtResult.Write indicates the byte position
> we've written. But in the current XLogWrite() code, it's updated after
> XLogWrite() calls issue_xlog_fsync(). No?

issue_xlog_fsync() is only called if the caller requested a flush by
advancing WriteRqst.Flush.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
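
The point about the flush being conditional can be illustrated with a small C model. It is a sketch of the behaviour described in this message, not the actual XLogWrite() code: `WalModel`, `WalPosPair` and `model_xlog_write` are hypothetical names, the write and the fsync are reduced to pointer updates, and sync methods that flush as part of the write itself are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a WAL position: a plain byte offset. */
typedef uint64_t WalPos;

/* A (write, flush) pair, used for both the request and the result. */
typedef struct
{
    WalPos write;
    WalPos flush;
} WalPosPair;

typedef struct
{
    WalPosPair result;      /* how far WAL has been written / flushed */
    int        fsync_calls; /* counts simulated fsync invocations */
} WalModel;

/* Simulate one write request against the model. */
static void
model_xlog_write(WalModel *m, WalPosPair rqst)
{
    /* Write step: advance the write pointer as far as requested. */
    if (rqst.write > m->result.write)
        m->result.write = rqst.write;

    /* Flush step: runs only when the caller asked for a flush beyond
     * the current flush pointer and there is unflushed data. */
    if (rqst.flush > m->result.flush &&
        m->result.flush < m->result.write)
    {
        m->fsync_calls++;                  /* stands in for the fsync */
        m->result.flush = m->result.write;
    }

    assert(m->result.flush <= m->result.write);
}
```

A request of `{write = 100, flush = 0}` advances only the write pointer, so in this model the write pointer can run ahead of the flush pointer; a later request of `{write = 200, flush = 200}` triggers the single simulated fsync and brings both pointers to 200.
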