From: "Kevin Grittner" on 2 Jun 2010 12:59 Simon Riggs <simon(a)2ndQuadrant.com> wrote: > On Mon, 2010-05-31 at 14:40 -0400, Bruce Momjian wrote: > >> Uh, we have three days before we package 9.0beta2. It would be >> good if we could decide on the max_standby_delay issue soon. > > I've heard something from Heikki, not from anyone else. Those > comments amount to "lets replace max_standby_delay with > max_apply_delay". > > Got a beta idea? Given the incessant ticking of the clock, I have a hard time believing we have any real options besides max_standby_delay or a boolean which corresponds to the -1 and 0 settings of max_standby_delay. I think it's pretty clear that there's a use case for the positive values, although there are bound to be some who try it and are surprised by behavior at transition from idle to active. The whole debate seems to boil down to how important a middle ground is versus how damaging the surprise factor is. (I don't really buy the argument that we won't be able to remove it later if we replace it with something better.) I know there were initially some technical problems, too; have those been resolved? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 2 Jun 2010 13:14

Simon Riggs <simon(a)2ndQuadrant.com> writes:
> OK, here's v4.

I've been trying to stay out of this thread, but with beta2 approaching and no resolution in sight, I'm afraid I have to get involved.

This patch seems to me to be going in fundamentally the wrong direction. It's adding complexity and overhead (far more than is needed), and it's failing utterly to resolve the objections that I raised to start with.

In particular, Simon seems to be basically refusing to do anything about the complaint that the code fails unless master and standby clocks are in close sync. I do not believe that this is acceptable, and since he won't fix it, I guess I'll have to.

The other basic problem that I had with the current code, which this does nothing about, is that it believes that timestamps pulled from WAL archive should be treated on the same footing as timestamps of "live" actions. That might work in certain limited scenarios but it's going to be a disaster in general.

I believe that the motivation for treating archived timestamps as live is, essentially, to force rapid catchup if a slave falls behind so far that it's reading from archive instead of SR. There are certainly use-cases where that's appropriate (though, again, not all); but even when you do want it, it's a pretty inflexible implementation. For realistic values of max_standby_delay the behavior is going to pretty much be "instant kill whenever we're reading from archive".

I have backed off my original suggestion that we should reduce max_standby_delay to a boolean: that was based on the idea that delays would only occur in response to DDL on the master, but since vacuum and btree page splits can also trigger delays, it seems clear that a "kill queries immediately" policy isn't really very useful in practice. So we need to make max_standby_delay work rather than just get rid of it.

What I think might be a realistic compromise is this:

1. Separate max_standby_delay into two GUCs, say "max_streaming_delay" and "max_archive_delay".

2. When applying WAL that came across SR, use max_streaming_delay and let the time measurement be current time minus time of receipt of the current WAL send chunk.

3. When applying WAL that came from archive, use max_archive_delay and let the time measurement be current time minus time of acquisition of the current WAL segment from the archive.

The current code's behavior in the latter case could effectively be modeled by setting max_archive_delay to zero, but that isn't the only plausible setting. More likely DBAs would set max_archive_delay to something smaller than max_streaming_delay, but still positive so as to not kill conflicting queries instantly.

An important property of this design is that all relevant timestamps are taken on the slave, so clock skew isn't an issue.

I'm still inclined to apply the part of Simon's patch that adds a transmit timestamp to each SR send chunk. That would actually be completely unused by the slave given my proposal above, but I think that it is an important step to take to future-proof the SR protocol against possible changes in the slave-side timing logic. I don't however see the value of transmitting "keepalive" records when we'd otherwise not have anything to send. The value of adding timestamps to the SR protocol is really to let the slave determine the current amount of clock skew between it and the master; which is a number that should not change so rapidly that it has to be updated every 100ms even in an idle system.

Comments?
			regards, tom lane
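
[A rough sketch in C of the slave-side computation Tom is proposing; the names (max_streaming_delay, max_archive_delay) and the receipt-time bookkeeping here are illustrative assumptions, not the actual patch. The point it demonstrates: both timestamps in the comparison come from the standby's own clock, so master/standby skew never enters into it.]

/*
 * Illustrative sketch only -- not PostgreSQL source.  The standby remembers,
 * on its own clock, when it received the WAL it is currently applying, and
 * compares "now - receipt time" against a per-source limit.
 */
#include <stdbool.h>
#include <time.h>

typedef enum { WAL_FROM_STREAM, WAL_FROM_ARCHIVE } WalSource;

/* hypothetical GUCs, in seconds; -1 would mean "wait forever" */
static int max_streaming_delay = 30;
static int max_archive_delay = 5;

/* set (on the standby's clock) when a send chunk arrives over SR or an
 * archived segment is restored */
static time_t wal_receipt_time;
static WalSource wal_source;

/* Should a conflicting standby query be cancelled right now? */
static bool
conflict_delay_exceeded(time_t now)
{
    int limit = (wal_source == WAL_FROM_STREAM)
        ? max_streaming_delay
        : max_archive_delay;

    if (limit < 0)
        return false;                   /* never cancel */
    return difftime(now, wal_receipt_time) > limit;
}

int
main(void)
{
    wal_source = WAL_FROM_ARCHIVE;
    wal_receipt_time = time(NULL) - 10; /* segment restored 10 s ago */
    return conflict_delay_exceeded(time(NULL)) ? 0 : 1;
}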

From: Stephen Frost on 2 Jun 2010 13:36

* Tom Lane (tgl(a)sss.pgh.pa.us) wrote:
> An important property of this design is that all relevant timestamps
> are taken on the slave, so clock skew isn't an issue.

I agree that this is important, and I do run NTP on all my servers and even monitor it using Nagios. It's still not a cure-all for time skew issues.

> Comments?

I'm not really a huge fan of adding another GUC, to be honest. I'm more inclined to say we treat 'max_archive_delay' as '0', and turn max_streaming_delay into what you've described. If we fall back so far that we have to go back to reading WALs, then we need to hurry up and catch up, and damn the torpedoes. I'd also prefer that we only wait the delay time once until we're fully caught up again (and have gotten back around to waiting for new data).

Thanks,

	Stephen

From: Andrew Dunstan on 2 Jun 2010 13:44

Tom Lane wrote:
> I'm still inclined to apply the part of Simon's patch that adds a
> transmit timestamp to each SR send chunk. That would actually be
> completely unused by the slave given my proposal above, but I think that
> it is an important step to take to future-proof the SR protocol against
> possible changes in the slave-side timing logic.
>

+1.

From a radically different perspective, I had to do something similar in the buildfarm years ago to protect us from machines reporting with grossly inaccurate timestamps. This was part of the solution. The client adds its current timestamp setting just before transmitting the data to the server.

cheers

andrew
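
[A toy C illustration of the trick Andrew describes and the skew measurement Tom alludes to; none of this is buildfarm or SR protocol code, and the names are invented. The sender stamps each message with its own clock just before transmitting, and the receiver can then estimate clock skew, up to one-way network latency, by comparing that stamp with its own clock on arrival.]

/* Toy example: a sender-side timestamp lets the receiver estimate clock
 * skew (the estimate also absorbs the one-way transmission delay). */
#include <stdio.h>
#include <time.h>

typedef struct
{
    time_t sent_at;                 /* sender's clock, set just before send */
    char   payload[64];
} Message;

int
main(void)
{
    Message msg = { .payload = "some data" };
    msg.sent_at = time(NULL);           /* sender side */

    /* ... message travels over the wire ... */

    time_t received_at = time(NULL);    /* receiver side, its own clock */
    double skew_estimate = difftime(received_at, msg.sent_at);
    printf("apparent skew (includes latency): %.0f s\n", skew_estimate);
    return 0;
}
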
From: Tom Lane on 2 Jun 2010 13:45

Stephen Frost <sfrost(a)snowman.net> writes:
> * Tom Lane (tgl(a)sss.pgh.pa.us) wrote:
>> Comments?

> I'm not really a huge fan of adding another GUC, to be honest. I'm more
> inclined to say we treat 'max_archive_delay' as '0', and turn
> max_streaming_delay into what you've described. If we fall back so far
> that we have to go back to reading WALs, then we need to hurry up and
> catch-up and damn the torpedos.

If I thought that 0 were a generally acceptable value, I'd still be pushing the "simplify it to a boolean" agenda ;-). The problem is that that will sometimes kill standby queries even when they are quite short and doing nothing objectionable.

> I'd also prefer that we only wait the
> delay time once until we're fully caught up again (and have gotten
> back around to waiting for new data).

The delays will be measured from a receipt instant to current time, which means that the longer it takes to apply a WAL segment or WAL send chunk, the less grace period there will be. (Which is the same as what CVS HEAD does --- I'm just arguing about where we get the start time from.) I believe this does what you suggest and more.

			regards, tom lane
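
[To make that arithmetic concrete with assumed numbers: if max_streaming_delay were 30 seconds and the standby had already spent 25 seconds applying the current send chunk when a conflict arose, only about 5 seconds of grace would remain before the query is cancelled. A minimal C sketch:]

/* Grace remaining is measured against the receipt instant, so time already
 * spent applying WAL eats into it.  Numbers are assumed, for illustration. */
#include <stdio.h>

int
main(void)
{
    double max_streaming_delay = 30.0;  /* seconds */
    double spent_applying = 25.0;       /* seconds since the chunk arrived */
    double grace_left = max_streaming_delay - spent_applying;

    printf("grace remaining: %.0f s\n", grace_left > 0 ? grace_left : 0);
    return 0;
}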