From: Fujii Masao on 21 Jul 2010 22:51

On Wed, Jul 21, 2010 at 9:52 PM, Aidan Van Dyk <aidan(a)highrise.ca> wrote:
> * Fujii Masao <masao.fujii(a)gmail.com> [100721 03:49]:
>
>> >> The patch provides quorum parameter in postgresql.conf, which
>> >> specifies how many standby servers transaction commit will wait for
>> >> WAL records to be replicated to, before the command returns a
>> >> "success" indication to the client. The default value is zero, which
>> >> always doesn't make transaction commit wait for replication without
>> >> regard to replication_mode. Also transaction commit always doesn't
>> >> wait for replication to asynchronous standby (i.e., replication_mode
>> >> is set to async) without regard to this parameter. If quorum is more
>> >> than the number of synchronous standbys, transaction commit returns
>> >> a "success" when the ACK has arrived from all of synchronous standbys.
>> >
>> > There should be a way to specify "wait for *all* connected standby servers
>> > to acknowledge"
>>
>> Agreed. I'll allow -1 as the valid value of the quorum parameter, which
>> means that transaction commit waits for all connected standbys.
>
> Hm... so if my 1 synchronous standby is operating normally, and quorum
> is set to 1, I'll get what I want (commit waits until it's safely on both
> servers). But what happens if my standby goes bad. Suddenly the quorum
> setting is ignored (because it's > number of connected standby
> servers?) Is there a way for me to not allow any commits if the quorum
> setting number of standbys is *not* available? Yes, I want my db to
> "halt" in that situation, and yes, alarmbells will be ringing...
>
> In reality, I'm likely to run 2 synchronous slaves, with quorum of 1.
> So 1 slave can fail and I can still have 2 going. But if that 2nd slave
> ever failed while the other was down, I definitely don't want the master
> to forge on ahead!
>
> Of course, this won't be for everyone, just as the current "just
> connected standbys" isn't for everything either...

Yeah, we need to clear up the detailed design of the quorum commit feature,
and reach consensus on that.

How should the synchronous replication behave when the number of connected
standby servers is less than quorum?

1. Ignore quorum. The current patch adopts this. If the ACKs from all
   connected standbys have arrived, transaction commit is successful
   even if the number of standbys is less than quorum. If there is no
   connected standby, transaction commit always is successful without
   regard to quorum.

2. Observe quorum. Aidan wants this. Until the number of connected
   standbys has become more than or equal to quorum, transaction commit
   waits.

Which is the right behavior of quorum commit? Or should we add a new
parameter specifying the behavior of quorum commit?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
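The difference between the two behaviours can be sketched in a few lines of Python. This is only an illustration of the rules as described in the message above, not code from the patch: the names QuorumPolicy, required_acks and commit_may_return are hypothetical, and treating quorum = -1 as "all currently connected standbys" under both policies is an assumption.

```python
from enum import Enum


class QuorumPolicy(Enum):
    """Hypothetical names for the two behaviours under discussion."""
    IGNORE = "ignore"    # option 1: never wait for more standbys than are connected
    OBSERVE = "observe"  # option 2: wait until `quorum` ACKs have actually arrived


def required_acks(quorum: int, connected_sync: int, policy: QuorumPolicy) -> int:
    """Return how many ACKs a commit must collect before reporting success."""
    if quorum == 0:
        # Default described in the thread: commit never waits for replication.
        return 0
    if quorum == -1:
        # Proposed special value: wait for all connected standbys (assumed
        # here to mean the same thing under both policies).
        return connected_sync
    if policy is QuorumPolicy.IGNORE:
        # Option 1: cap the requirement at the number of connected standbys.
        return min(quorum, connected_sync)
    # Option 2: the requirement is fixed, so commit blocks while too few
    # standbys are connected to ever satisfy it.
    return quorum


def commit_may_return(acks_received: int, quorum: int,
                      connected_sync: int, policy: QuorumPolicy) -> bool:
    """True when the commit can report success to the client."""
    return acks_received >= required_acks(quorum, connected_sync, policy)


# Aidan's scenario: quorum = 1 and the only standby has failed.
print(commit_may_return(0, 1, 0, QuorumPolicy.IGNORE))   # master carries on
print(commit_may_return(0, 1, 0, QuorumPolicy.OBSERVE))  # master halts
```

With quorum = 1 and no connected standby, the sketch lets the commit through under option 1 and blocks it under option 2, which is exactly the disagreement in the thread.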
From: Yeb Havinga on 22 Jul 2010 04:37

Fujii Masao wrote:
> How should the synchronous replication behave when the number of connected
> standby servers is less than quorum?
>
> 1. Ignore quorum. The current patch adopts this. If the ACKs from all
>    connected standbys have arrived, transaction commit is successful
>    even if the number of standbys is less than quorum. If there is no
>    connected standby, transaction commit always is successful without
>    regard to quorum.
>
> 2. Observe quorum. Aidan wants this. Until the number of connected
>    standbys has become more than or equal to quorum, transaction commit
>    waits.
>
> Which is the right behavior of quorum commit? Or should we add a new
> parameter specifying the behavior of quorum commit?
>
Initially I also expected the quorum to behave like described by
Aidan/option 2. Also, IMHO the name "quorum" is a bit short, like having
"maximum" but not saying a max_something.

quorum_min_sync_standbys
quorum_max_sync_standbys

The question remains what are the sync standbys? Does it mean not-async?
Intuitively by looking at the enumeration of replication_mode I'd think
that the sync standbys are all standby's that operate in a not async mode.
That would be clearer with a boolean sync (or not) and for sync standbys
the replication_mode specified.

regards,
Yeb Havinga
From: Fujii Masao on 26 Jul 2010 02:56

On Thu, Jul 22, 2010 at 5:37 PM, Yeb Havinga <yebhavinga(a)gmail.com> wrote:
> Fujii Masao wrote:
>>
>> How should the synchronous replication behave when the number of connected
>> standby servers is less than quorum?
>>
>> 1. Ignore quorum. The current patch adopts this. If the ACKs from all
>>    connected standbys have arrived, transaction commit is successful
>>    even if the number of standbys is less than quorum. If there is no
>>    connected standby, transaction commit always is successful without
>>    regard to quorum.
>>
>> 2. Observe quorum. Aidan wants this. Until the number of connected
>>    standbys has become more than or equal to quorum, transaction commit
>>    waits.
>>
>> Which is the right behavior of quorum commit? Or should we add a new
>> parameter specifying the behavior of quorum commit?
>>
>
> Initially I also expected the quorum to behave like described by
> Aidan/option 2.

OK. But some people (including me) would like to prevent the master
from halting when the standby fails, so I think that 1. also should
be supported. So I'm inclined to add a new parameter specifying the
behavior of quorum commit when the number of synchronous standbys
becomes less than quorum.

> Also, IMHO the name "quorum" is a bit short, like having
> "maximum" but not saying a max_something.
>
> quorum_min_sync_standbys
> quorum_max_sync_standbys

What about quorum_standbys?

> The question remains what are the sync standbys? Does it mean not-async?

It's the standby which sets replication_mode to "recv", "fsync", or "replay".

> Intuitively by looking at the enumeration of replication_mode I'd think that
> the sync standbys are all standby's that operate in a not async mode. That
> would be clearer with a boolean sync (or not) and for sync standbys the
> replication_mode specified.

You mean that something like synchronous_replication as the recovery.conf
parameter should be added in addition to replication_mode? Since increasing
the number of similar parameters would confuse users, I don't like doing that.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
From: Yeb Havinga on 26 Jul 2010 04:27

Fujii Masao wrote:
>> Intuitively by looking at the enumeration of replication_mode I'd think that
>> the sync standbys are all standby's that operate in a not async mode. That
>> would be clearer with a boolean sync (or not) and for sync standbys the
>> replication_mode specified.
>>
>
> You mean that something like synchronous_replication as the recovery.conf
> parameter should be added in addition to replication_mode? Since increasing
> the number of similar parameters would confuse users, I don't like doing that.
>
I think what would be confusing is if there is a mismatch between
implemented concepts and parameters.

1 does the master wait for standby servers on commit?
2 how many acknowledgements must the master receive before it can continue?
3 is a standby server a synchronous one, i.e. does it acknowledge a commit?
4 when do standby servers acknowledge a commit?
5 does it only wait when the standby's are connected, or also when they are
  not connected?
6..?

When trying to match parameter names for the concepts above:
1 - does not exist, but can be answered with quorum_standbys = 0
2 - quorum_standbys
3 - yes, if replication_mode != async (here is where I thought I had to think
    too much)
4 - replication modes recv, fsync and replay but not async
5 - Zoltan's strict_sync_replication parameter

Just an idea, what about
for 4: acknowledge_commit = {no|recv|fsync|replay}
then 3 = yes, if acknowledge_commit != no

regards,
Yeb Havinga
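The concept-to-parameter matching above can be written down as a small Python sketch. It is illustrative only: the ReplicationSettings class and the helper functions are hypothetical, grouping master-side and standby-side settings in one object is done purely for readability, and treating strict_sync_replication as a boolean is an assumption, since its type is not stated in this thread.

```python
from dataclasses import dataclass
from typing import Optional

# Modes in which a standby acknowledges commits, per the thread.
SYNC_MODES = ("recv", "fsync", "replay")


@dataclass
class ReplicationSettings:
    """Hypothetical bundle of the parameters named in the discussion."""
    quorum_standbys: int = 0               # concepts 1 and 2
    replication_mode: str = "async"        # concepts 3 and 4
    strict_sync_replication: bool = False  # concept 5 (type assumed)


def master_waits_on_commit(s: ReplicationSettings) -> bool:
    """Concept 1: there is no dedicated switch; quorum_standbys = 0 means no."""
    return s.quorum_standbys != 0


def standby_is_synchronous(s: ReplicationSettings) -> bool:
    """Concept 3: derived indirectly from replication_mode != async."""
    return s.replication_mode in SYNC_MODES


def ack_point(s: ReplicationSettings) -> Optional[str]:
    """Concept 4: when the standby acknowledges, or None if it never does."""
    return s.replication_mode if standby_is_synchronous(s) else None


settings = ReplicationSettings(quorum_standbys=1, replication_mode="fsync")
print(master_waits_on_commit(settings), standby_is_synchronous(settings),
      ack_point(settings))
```

Written this way, concepts 1 and 3 have no parameter of their own and have to be inferred from quorum_standbys and replication_mode, which is the mismatch the message points at.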
From: Fujii Masao on 26 Jul 2010 04:41

On Mon, Jul 26, 2010 at 5:27 PM, Yeb Havinga <yebhavinga(a)gmail.com> wrote:
> Fujii Masao wrote:
>>>
>>> Intuitively by looking at the enumeration of replication_mode I'd think
>>> that the sync standbys are all standby's that operate in a not async
>>> mode. That would be clearer with a boolean sync (or not) and for sync
>>> standbys the replication_mode specified.
>>>
>>
>> You mean that something like synchronous_replication as the recovery.conf
>> parameter should be added in addition to replication_mode? Since
>> increasing the number of similar parameters would confuse users, I don't
>> like doing that.
>>
>
> I think what would be confusing is if there is a mismatch between
> implemented concepts and parameters.
>
> 1 does the master wait for standby servers on commit?
> 2 how many acknowledgements must the master receive before it can continue?
> 3 is a standby server a synchronous one, i.e. does it acknowledge a commit?
> 4 when do standby servers acknowledge a commit?
> 5 does it only wait when the standby's are connected, or also when they are
>   not connected?
> 6..?
>
> When trying to match parameter names for the concepts above:
> 1 - does not exist, but can be answered with quorum_standbys = 0
> 2 - quorum_standbys
> 3 - yes, if replication_mode != async (here is where I thought I had to
>     think too much)
> 4 - replication modes recv, fsync and replay but not async
> 5 - Zoltan's strict_sync_replication parameter
>
> Just an idea, what about
> for 4: acknowledge_commit = {no|recv|fsync|replay}
> then 3 = yes, if acknowledge_commit != no

Thanks for the clarification. I still like

    replication_mode = {async|recv|fsync|replay}

rather than

    synchronous_replication = {on|off}
    acknowledge_commit = {no|recv|fsync|replay}

because the former is more intuitive for me and I don't want
to increase the number of parameters.

We need to hear from some users in this respect. If most want
the latter, of course, I'd love to adopt it.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
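One way to compare the two spellings is to convert between them. The Python sketch below is only an illustration: split_mode and merge_mode are hypothetical helpers, and rejecting contradictory two-parameter settings is an assumption, since the thread does not say how such combinations would be handled.

```python
from typing import Tuple

MODES = ("async", "recv", "fsync", "replay")


def split_mode(replication_mode: str) -> Tuple[bool, str]:
    """Map the single parameter to (synchronous_replication, acknowledge_commit)."""
    if replication_mode not in MODES:
        raise ValueError(f"unknown replication_mode: {replication_mode!r}")
    if replication_mode == "async":
        return (False, "no")
    return (True, replication_mode)


def merge_mode(synchronous_replication: bool, acknowledge_commit: str) -> str:
    """Map the two-parameter form back to a single replication_mode value."""
    if acknowledge_commit not in ("no", "recv", "fsync", "replay"):
        raise ValueError(f"unknown acknowledge_commit: {acknowledge_commit!r}")
    # The two-parameter form can express contradictory settings, for example
    # synchronous_replication = on with acknowledge_commit = no. Rejecting
    # them here is an assumption made for this sketch.
    if synchronous_replication != (acknowledge_commit != "no"):
        raise ValueError("synchronous_replication contradicts acknowledge_commit")
    return acknowledge_commit if synchronous_replication else "async"


for mode in MODES:
    # Every single-parameter value survives the round trip unchanged.
    print(mode, split_mode(mode), merge_mode(*split_mode(mode)))
```

Every replication_mode value round-trips cleanly, while the two-parameter form has eight combinations of which only four are consistent, which is one concrete reading of the concern about adding parameters.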