From: Fujii Masao on 14 Jun 2010 04:42

On Sat, Jun 12, 2010 at 12:15 AM, Stefan Kaltenbrunner <stefan(a)kaltenbrunner.cc> wrote:
> hmm ok - but assuming sync rep we would end up with something like the
> following (hypothetically assuming each operation takes 1 time unit):
>
> originally:
>
> write 1
> sync 1
> network 1
> write 1
> sync 1
>
> total: 5
>
> whereas in the new case we would basically have the write+sync compete with
> network+write+sync in parallel (total 3 units) and we would only have to wait
> for the slower of those two sets of operations instead of the total time of
> both or am I missing something.

Yeah, this is what I'd like to say. Thanks!

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

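The arithmetic in Stefan's example can be sketched as a small Python model. The function and parameter names are made up for illustration, and the one-unit costs are the hypothetical values from the quoted message, not measurements.

```python
# Toy latency model for the example quoted above. All names are hypothetical;
# each step costs one abstract time unit, as in Stefan's message.

def serial_latency(write, sync, network, remote_write, remote_sync):
    """Master writes and syncs first, then ships WAL; the standby then
    writes and syncs. Every step waits for the previous one."""
    return write + sync + network + remote_write + remote_sync


def parallel_latency(write, sync, network, remote_write, remote_sync):
    """WAL is sent without waiting for the local write+sync, so the commit
    waits only for the slower of the two paths."""
    local_path = write + sync
    remote_path = network + remote_write + remote_sync
    return max(local_path, remote_path)


costs = dict(write=1, sync=1, network=1, remote_write=1, remote_sync=1)
print(serial_latency(**costs))    # 5
print(parallel_latency(**costs))  # 3
```

With unequal costs the parallel figure is still the maximum of the two paths, so the gain shrinks when one path dominates the other.
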
From: Robert Haas on 14 Jun 2010 07:10

On Mon, Jun 14, 2010 at 4:14 AM, Fujii Masao <masao.fujii(a)gmail.com> wrote:
> On Fri, Jun 11, 2010 at 11:24 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
>> I think the failover case might be OK. But if the master crashes and
>> restarts, the slave might be left thinking its xlog position is ahead
>> of the xlog position on the master.
>
> Right. Unless we perform a failover in this case, the standby might go down
> because of inconsistency of WAL after restarting the master. To avoid this
> problem, walsender must wait for WAL to be not only written but also *fsynced*
> on the master before sending it as 9.0 does. Though this would degrade the
> performance, this might be useful for some cases. We should provide the knob
> to specify whether to allow the standby to go ahead of the master or not?

Maybe. That sounds like a pretty enormous foot-gun to me, considering
that we have no way of recovering from the situation where the standby
gets ahead of the master. Right now, I believe we're still in the
situation where the standby goes into an infinite CPU-chewing,
log-spewing loop, but even after we fix that it's not going to be good
enough to really handle that case sensibly, which we probably need to
do if we want to make this change.

Come to think of it, can this happen already? Can the master stream
WAL to the standby after it's written but before it's fsync'd?

We should get the open item fixed for 9.0 here before we start
worrying about 9.1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From: Simon Riggs on 14 Jun 2010 07:54

On Mon, 2010-06-14 at 17:39 +0900, Fujii Masao wrote:
> On Fri, Jun 11, 2010 at 11:47 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> > Stefan Kaltenbrunner <stefan(a)kaltenbrunner.cc> writes:
> >> hmm not sure that is what fujii tried to say - I think his point was
> >> that in the original case we would have serialized all the operations
> >> (first write+sync on the master, network afterwards and write+sync on
> >> the slave) and now we could try parallelizing by sending the wal before
> >> we have synced locally.
> >
> > Well, we're already not waiting for fsync, which is the slowest part.
>
> No, currently walsender waits for fsync.
>
> Walsender tries to send WAL up to xlogctl->LogwrtResult.Write. OTOH,
> xlogctl->LogwrtResult.Write is updated after XLogWrite() performs fsync.
> As the result, walsender cannot send WAL not fsynced yet. We should
> update xlogctl->LogwrtResult.Write before XLogWrite() performs fsync
> for 9.0?
>
> But that change would cause the problem that Robert pointed out.
> http://archives.postgresql.org/pgsql-hackers/2010-06/msg00670.php

ISTM you just defined some clear objectives for next work.

Copying the data from WAL buffers is mostly irrelevant. The majority of
time is lost waiting for fsync. The biggest issue is about how to allow
WAL write and WALSender to act concurrently and have backend wait for
both.

Sure, copying data from wal_buffers will be faster still, but it will
cause you to address some subtle data structure locking operations that
we could solve at a later time. And it still gives the problem of how
the master resets itself if the standby really is ahead.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

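The ordering question in the quoted text (send WAL only after fsync, or as soon as it is written) can be illustrated with a toy Python model. This is a sketch only: ToyMaster, send_limit and wait_for_fsync are hypothetical names, not PostgreSQL identifiers, and the model assumes the worst case in which a crash loses all WAL that was written but not yet fsynced.

```python
# Toy model of the two WAL positions discussed above. Hypothetical names;
# this is not PostgreSQL code.

class ToyMaster:
    def __init__(self):
        self.written = 0   # WAL position handed to the OS (may be lost in a crash)
        self.flushed = 0   # WAL position known to be fsynced to disk

    def write(self, nbytes):
        self.written += nbytes

    def fsync(self):
        self.flushed = self.written

    def send_limit(self, wait_for_fsync):
        """Position up to which a walsender may stream."""
        return self.flushed if wait_for_fsync else self.written

    def crash_and_restart(self):
        # Assumption: everything not yet fsynced is lost (worst case).
        self.written = self.flushed


def standby_ahead_after_crash(wait_for_fsync):
    master = ToyMaster()
    master.write(100)
    master.fsync()
    master.write(50)                        # written but not yet fsynced
    standby_pos = master.send_limit(wait_for_fsync)
    master.crash_and_restart()
    return standby_pos > master.written


print(standby_ahead_after_crash(wait_for_fsync=True))   # False
print(standby_ahead_after_crash(wait_for_fsync=False))  # True
```

In this model, streaming only fsynced WAL keeps the standby at or behind the master after a restart, while streaming written-but-unsynced WAL can leave the standby ahead, which is the situation Robert describes.
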
From: Simon Riggs on 14 Jun 2010 07:55

On Mon, 2010-06-14 at 17:39 +0900, Fujii Masao wrote:
> No, currently walsender waits for fsync.
> ...
> But that change would cause the problem that Robert pointed out.
> http://archives.postgresql.org/pgsql-hackers/2010-06/msg00670.php

Presumably this means that if synchronous_commit = off on primary that
SR in 9.0 will no longer work correctly if the primary crashes?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

From: Fujii Masao on 14 Jun 2010 08:41

On Mon, Jun 14, 2010 at 8:10 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
> Maybe. That sounds like a pretty enormous foot-gun to me, considering
> that we have no way of recovering from the situation where the standby
> gets ahead of the master.

No, we can do that by reconstructing the standby from the backup.

And, that situation is not a problem for users including me who prefer
to perform a failover when the master goes down. Of course, we can just
restart the master in that case, but it's likely to take longer than a
failover because there would be a cause of the crash. For example, if
the master goes down because of a media crash, the master would never
start up unless PITR is performed. So I'm not sure how many users prefer
a restart to a failover.

> We should get the open item fixed for 9.0 here before we start
> worrying about 9.1.

Yep, so I was submitting some patches in these days :)

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center