From: Robert Haas on 3 May 2010 16:10

On Mon, May 3, 2010 at 3:39 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas(a)gmail.com> writes:
>> On Mon, May 3, 2010 at 11:37 AM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>>> I'm inclined to think that we should throw away all this logic and just
>>> have the slave cancel competing queries if the replay process waits
>>> more than max_standby_delay seconds to acquire a lock.
>
>> What if we somehow get into a situation where the replay process is
>> waiting for a lock over and over and over again, because it keeps
>> killing conflicting processes but something restarts them and they
>> take locks over again?
>
> They won't be able to take locks "over again", because the lock manager
> won't allow requests to pass a pending previous request, except in
> very limited circumstances that shouldn't hold here. They'll queue
> up behind the replay process's lock request, not in front of it.
> (If that isn't the case, it needs to be fixed, quite independently
> of this concern.)

Well, the new backends needn't try to take "the same" locks as the
existing backends - the point is that in the worst case this proposal
means waiting max_standby_delay for EACH replay that requires taking a
lock. And that might be a LONG time.

One idea I had while thinking this over was to bound the maximum amount
of unapplied WAL rather than the absolute amount of time lag. Now,
that's a little fruity, because your WAL volume might fluctuate
considerably, so you wouldn't really know how far the slave was behind
the master chronologically. However, it would avoid all the time skew
issues, and it would also more accurately model the idea of a bound on
recovery time should we need to promote the standby to master, so maybe
it works out to a win. You could still end up stuck semi-permanently
behind, but never by more than N segments.
Stephen's idea of a mode where we wait up to max_standby_delay for a
lock but then kill everything in our path until we've caught up again
is another possible way of approaching this problem, although it may
lead to "kill storms". Some of that may be inevitable, though: a bound
on WAL lag has the same issue - if the primary is generating WAL faster
than the standby can apply it, the standby will eventually decide to
slaughter everything in its path.

...Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
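Robert's suggestion of bounding unapplied WAL rather than elapsed time amounts to comparing the master's write position with the standby's replay position in WAL-segment units. A minimal sketch of that arithmetic, assuming the textual `hi/lo` LSN format and the default 16 MB segment size (the function names are illustrative, not part of any PostgreSQL API):

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # default WAL segment size: 16 MB

def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN string like '16/B374D848' to an absolute byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) + int(lo, 16)

def segments_behind(master_lsn: str, standby_lsn: str) -> int:
    """Whole WAL segments the standby lags behind the master."""
    lag_bytes = lsn_to_bytes(master_lsn) - lsn_to_bytes(standby_lsn)
    return lag_bytes // WAL_SEGMENT_SIZE
```

A bound of "no more than N segments behind" would then be a check of `segments_behind(...) <= N` before deciding whether to start cancelling queries.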
From: Josh Berkus on 3 May 2010 18:04

Greg, Robert,

> Certainly that one particular case can be solved by making the
> servers be in time sync a prereq for HS working (in the traditional way).
> And by "prereq" I mean a "user beware" documentation warning.

Last I checked, you work with *lots* of web developers and web
companies. I'm sure you can see the issue with the above.

> Stephen's idea of a mode where we wait up to max_standby_delay for a
> lock but then kill everything in our path until we've caught up again
> is another possible way of approaching this problem, although it may
> lead to "kill storms".

Personally, I thought that the kill storms were exactly what was wrong
with max_standby_delay. That is, with MSD, no matter *what* your
settings or traffic are, you're going to get query cancel occasionally.

I don't see the issue with Tom's approach from a wait perspective. The
max wait becomes 1.001X max_standby_delay; there's no way I can think
of that replay would wait longer than that. I've yet to see an
explanation why it would be longer.

Simon's assertion that not all operations take a conventional lock is a
much more serious potential flaw.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
From: Bruce Momjian on 3 May 2010 22:45

Simon Riggs wrote:
> On Mon, 2010-05-03 at 13:13 -0400, Stephen Frost wrote:
>
> > Perhaps you could speak to the specific user
> > experience difference that you think there would be from this change?
>
> The difference is really to do with the weight you give to two
> different considerations:
>
> * avoid query cancellations
> * avoid having recovery fall behind, so that failover time is minimised
>
> Some people recognise the trade-offs and are planning multiple standby
> servers dedicated to different roles/objectives.

I understand Simon's point that the two behaviors have different
benefits. However, I believe few users will be able to understand when
to use which.

As I remember, 9.0 has two behaviors:

  o  master delays vacuum cleanup
  o  slave delays WAL application

and in 9.1 we will be adding:

  o  slave communicates snapshots to master

How would this figure into what we ultimately want in 9.1?

--
Bruce Momjian <bruce(a)momjian.us>    http://momjian.us
EnterpriseDB                        http://enterprisedb.com
From: Simon Riggs on 4 May 2010 03:36

On Mon, 2010-05-03 at 15:04 -0700, Josh Berkus wrote:
> I don't see the issue with Tom's approach from a wait perspective. The
> max wait becomes 1.001X max_standby_delay; there's no way I can think of
> that replay would wait longer than that. I've yet to see an explanation
> why it would be longer.

Yes, the max wait on any *one* blocker will be max_standby_delay. But
if you wait for two blockers, then the total time by which the standby
lags will now be 2*max_standby_delay. Add a third, fourth etc. and the
standby lag keeps rising.

We need to avoid confusing these two measurables:

* standby lag - the total delay from when a WAL record is written to
the time the WAL record is applied. This includes both transfer time
and any delays imposed by Hot Standby.

* standby query delay - the time that recovery will wait for a query
to complete before a cancellation takes place. (We could complicate
this by asking what happens when recovery is blocked twice by the same
query: would it wait twice, or does it have to track how much it has
waited for each query in total so far?)

Currently max_standby_delay seeks to constrain the standby lag to a
particular value, as a way of providing a bounded time for failover,
and also to constrain the amount of WAL that needs to be stored as the
lag increases.

Currently, there is no guaranteed minimum query delay given to each
query. If every query is guaranteed its requested query delay, then the
standby lag will be unbounded. Fewer cancellations, higher lag. Some
people do want this, though it is not currently available.
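The distinction between the two measurables can be sketched numerically: each conflicting lock holder can stall replay for up to max_standby_delay, and those stalls accumulate in the standby lag on top of WAL transfer time. A toy model (illustrative only, not PostgreSQL code):

```python
MAX_STANDBY_DELAY = 30.0  # seconds; illustrative value, not a recommendation

def total_standby_lag(transfer_time: float, blocker_waits: list) -> float:
    """Toy model of standby lag: WAL transfer time plus the wait
    recovery incurs for *each* blocker, capped per blocker at
    max_standby_delay. Two blockers can double the lag, three
    can triple it, and so on."""
    waits = [min(w, MAX_STANDBY_DELAY) for w in blocker_waits]
    return transfer_time + sum(waits)
```

So with two blockers each holding out for the full grace period, the lag is already 2*max_standby_delay, which is the accumulation Josh's "1.001X" estimate misses.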
We can do this with two new GUCs:

* standby_query_delay - USERSET parameter that allows a user to specify
a guaranteed query delay, anywhere from 0 to max_standby_query_delay

* max_standby_query_delay - SIGHUP parameter that provides the DBA with
a limit on the USERSET standby_query_delay, though I can see some would
say this is optional

Current behaviour is the same as the global settings:

  standby_query_delay = 0
  max_standby_query_delay = 0
  max_standby_delay = X

So if people want minimal cancellations they would specify:

  standby_query_delay = Y (e.g. 30)
  max_standby_query_delay = Z (e.g. 300)
  max_standby_delay = -1

--
Simon Riggs
www.2ndQuadrant.com
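In postgresql.conf terms, the "minimal cancellations" setup Simon describes might look like the fragment below. Note that standby_query_delay and max_standby_query_delay are parameters *proposed in this thread*, not settings that exist in any released PostgreSQL; only max_standby_delay was a real 9.0-era GUC.

```
# Hypothetical settings from this proposal (not actual GUCs)
standby_query_delay = 30        # each query guaranteed 30s before cancellation
max_standby_query_delay = 300   # DBA ceiling on the per-user setting

# Existing 9.0-era setting: -1 means never cancel, letting lag grow unbounded
max_standby_delay = -1
```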
From: Simon Riggs on 4 May 2010 04:37
On Mon, 2010-05-03 at 22:45 -0400, Bruce Momjian wrote:
> As I remember, 9.0 has two behaviors:
>
>   o  master delays vacuum cleanup
>   o  slave delays WAL application
>
> and in 9.1 we will be adding:
>
>   o  slave communicates snapshots to master
>
> How would this figure into what we ultimately want in 9.1?

We would still want all the options, since "slave communicates
snapshots to master" doesn't solve the problem; it just moves the
problem elsewhere. It's a question of which factors the user wishes to
emphasise for their specific use.

> I understand Simon's point that the two behaviors have different
> benefits. However, I believe few users will be able to understand when
> to use which.

If users can understand how to set n_distinct for a column, they can
understand this. It's not about complexity of UI; it's about solving
problems. When people hit an issue, I don't want to be telling them "we
thought you wouldn't understand it, so we removed the parachute". They
might not understand it *before* they hit a problem - so what? Users
certainly will afterwards, and they won't say "thanks" if you withhold
an option from them, especially for the stated reason.

(My point about n_distinct: 99% of users have no idea it exists or when
to use it, but it still exists as an option because it solves a known
issue, just like this.)

--
Simon Riggs
www.2ndQuadrant.com
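For readers unfamiliar with the analogy: Simon is referring to the per-column n_distinct override added in PostgreSQL 9.0, an expert knob most users never touch but which solves a real estimation problem when they need it. For example (table and column names are illustrative):

```sql
-- Override the planner's distinct-value estimate for one column,
-- then re-analyze so the new setting takes effect (PostgreSQL 9.0+).
ALTER TABLE orders ALTER COLUMN customer_id SET (n_distinct = 50000);
ANALYZE orders;
```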