From: Heikki Linnakangas on 26 Feb 2010 10:21

Richard Huxton wrote:
> Can we not wait to cancel the transaction until *any* new lock is
> attempted though? That should protect all the single-statement
> long-running transactions that are already underway. Aggregates etc.

Hmm, that's an interesting thought. You'll still need to somehow tell the
victim backend "you have to fail if you try to acquire any more locks", but a
single per-backend flag in the procarray would suffice. You could also clear
the flag whenever you free the last snapshot in the transaction (i.e. between
each query in read committed mode).

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
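To make the idea concrete, here is a minimal, self-contained C model of the mechanism Heikki sketches: the startup process sets a per-backend flag instead of cancelling the victim outright, the backend errors out only if it tries to acquire a new lock while the flag is set, and the flag is cleared when the last snapshot of the transaction goes away. All names (struct backend, flag_recovery_conflict, and so on) are illustrative, not actual PostgreSQL APIs, and the real signalling would go through the procarray rather than a direct function call.

/*
 * Toy model (not PostgreSQL source) of a per-backend recovery-conflict flag.
 */
#include <stdbool.h>
#include <stdio.h>

struct backend
{
    int  snapshot_count;            /* snapshots currently held */
    bool recovery_conflict_pending; /* set by the startup process */
};

/* Startup process: mark the backend instead of killing it outright. */
static void flag_recovery_conflict(struct backend *b)
{
    b->recovery_conflict_pending = true;
}

/* Victim backend: the flag is only checked when a new lock is requested,
 * so a long-running statement that takes no new locks keeps going. */
static bool acquire_lock(struct backend *b, const char *relation)
{
    if (b->recovery_conflict_pending)
    {
        printf("ERROR: canceling query: lock on %s conflicts with recovery\n",
               relation);
        return false;
    }
    printf("lock on %s granted\n", relation);
    return true;
}

/* Clearing the flag when the last snapshot is freed means the next
 * statement in READ COMMITTED mode starts with a clean slate. */
static void release_snapshot(struct backend *b)
{
    if (--b->snapshot_count == 0)
        b->recovery_conflict_pending = false;
}

int main(void)
{
    struct backend b = { .snapshot_count = 1,
                         .recovery_conflict_pending = false };

    acquire_lock(&b, "orders");    /* granted: no conflict yet */
    flag_recovery_conflict(&b);    /* conflicting cleanup record arrives */
    acquire_lock(&b, "invoices");  /* refused: flag is set */
    release_snapshot(&b);          /* statement ends, flag is cleared */
    acquire_lock(&b, "invoices");  /* next statement proceeds normally */
    return 0;
}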
From: Richard Huxton on 26 Feb 2010 11:43

Replying to my own post - first sign of madness...

Let's see if I've got the concepts clear here, and hopefully my thinking it
through will help others reading the archives.

There are two queues:
1. Cleanup on the master
2. Replay on the slave

Running write queries on the master adds to both queues. Running (read-only)
queries on the slave prevents you removing from both queues.

There are two interesting measurements of "age"/"size":
1. Oldest item in / length of the queue (knowable)
2. How long it will take to clear the queue (estimable at best)

You'd like to know #2 to keep up with your workload. Unfortunately, you can't
know it for certain unless you have control over new incoming queries (on both
master and slave).

You might want four separate GUCs for the two measurements on the two queues.
We currently have two that (sort of) match #1 "oldest item"
(vacuum_defer_cleanup_age, max_standby_delay).

Delaying replay on a slave has no effect on the master. If a slave falls too
far behind, it's responsible for catching up (via normal WAL archives).

There is no point in delaying cleanup on the master unless it's going to help
one or more slaves. In fact, you don't want to start delaying cleanup until
you have to, otherwise you're wasting your delay time. This seems to be the
case with vacuum_defer_cleanup_age. If I have a heavily-updated table and I
defer vacuuming, then by the time any given query starts on the slave the
deferral is already half used up.

There's also no point in deferring cleanup on the master if the standby is
already waiting on a conflict that will cause its queries to be cancelled
anyway. Not only won't it help, it might make things worse, since transactions
will be cancelled, the conflict will be replayed and (presumably) queries will
be re-submitted only to be cancelled again. This is what Greg Smith's
discussion of the keep-alives was about: giving the master enough information
to be smarter about cleanup (and making the conflicts more fine-grained).

The situation with deferring on one or both ends of the process just gets more
complicated with multiple slaves. There are all sorts of unpleasant feedback
loops I can envisage there.

For the case of a single slave being used to run long reporting queries, the
ideal scenario would be the following. The master starts deferring vacuum
activity just before the query starts. When that times out, the slave receives
the cleanup info, refuses to replay it and starts its own delay. This gives
you a total available query time of:

  natural time between vacuums + vacuum delay + WAL transfer time + standby delay

(a rough worked example with made-up numbers follows this mail).

I can think of five useful things we should be doing (and might be already -
don't know).

1. On the master, deduce whether the slave is already waiting on a query. If
so, don't bother delaying cleanup. Clearly you don't want to be signalling
hundreds of times a second though. Does the slave pause fetching via streaming
replication if replay is blocked on a query? Could we signal "half-way to
max-age" or some such?

2. Perhaps simpler than trying to make the master smarter: just allow
SET this_transaction_is_probably_a_long_one = true on the slave. That
(a) clears the queue on the slave and (b) sends the signal to the master,
which then starts deferring vacuum.

3. Do a burst of cleanup activity on the master after blocking. This should
concentrate conflicts together when they reach the slave. Perhaps
vacuum_defer_cleanup_age should be vacuum_deferred_queue_size and measure the
amount of work to do, rather than the max age of the oldest cleanup (if I've
understood correctly).

4. Do a burst of replay on the slave after blocking. Perhaps every time it
cancels a transaction it should replay at least half the queued WAL before
letting new transactions start. Or perhaps it replays any vacuum activity it
comes across and then stops. That should sync with #3, assuming the slave
doesn't lag the master too much.

5. I've been mixing "defer" and "delay", as do the docs. We should probably
settle on one or the other. I think "defer" conveys the meaning more
precisely, but what about non-native English speakers?

--
  Richard Huxton
  Archonet Ltd
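As a rough illustration of the query-time budget Richard describes, here is a toy C calculation. All numbers are invented for the example; the point is only that the slack in each stage is additive.

/* Toy arithmetic for the "total available query time" on the slave. */
#include <stdio.h>

int main(void)
{
    double gap_between_vacuums = 300.0; /* natural time between vacuums (s) */
    double vacuum_delay        = 600.0; /* cleanup deferred on the master (s) */
    double wal_transfer_time   =   5.0; /* shipping the cleanup record (s) */
    double standby_delay       = 600.0; /* replay held off on the slave (s) */

    double budget = gap_between_vacuums + vacuum_delay +
                    wal_transfer_time + standby_delay;

    printf("approximate query budget: %.0f seconds (~%.0f minutes)\n",
           budget, budget / 60.0);
    return 0;
}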
From: Josh Berkus on 26 Feb 2010 14:02

On 2/26/10 6:57 AM, Richard Huxton wrote:
>
> Can we not wait to cancel the transaction until *any* new lock is
> attempted though? That should protect all the single-statement
> long-running transactions that are already underway. Aggregates etc.

I like this approach. Is it fragile in some non-obvious way?

--Josh Berkus
From: Josh Berkus on 26 Feb 2010 14:06

On 2/26/10 10:53 AM, Tom Lane wrote:
> I think that what we are going to have to do before we can ship 9.0
> is rip all of that stuff out and replace it with the sort of closed-loop
> synchronization Greg Smith is pushing. It will probably be several
> months before everyone is forced to accept that, which is why 9.0 is
> not going to ship this year.

I don't think that publishing visibility info back to the master ... and
subsequently burdening the master substantially for each additional slave ...
are what most users want. Certainly for use cases like NTT's it is, but not
for most of our users.

In fact, I seem to remember specifically discussing the approach of trying to
publish snapshots back to the master, and rejecting it on this list during the
development of SR.

Does anyone know how Oracle solves these issues? Does their structure
(separate rollback log) make it easier for them?

--Josh Berkus
From: Josh Berkus on 26 Feb 2010 15:02
> I don't see a "substantial additional burden" there. What I would
> imagine is needed is that the slave transmits a single number back
> --- its current oldest xmin --- and the walsender process publishes
> that number as its transaction xmin in its PGPROC entry on the master.

If the main purpose of the slave is long-running queries, though, this could
cause a lot of bloat on the master. That's a special case, but a reason why we
would want to preserve the stop-replication functionality.

> I don't doubt that this approach will have its own gotchas that we
> find as we get into it. But it looks soluble. I have no faith in
> either the correctness or the usability of the approach currently
> being pursued.

So, why not start working on it now, instead of arguing about it? It'll be
easy to prove the approach once we have some test code.
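For readers who want to see the shape of the closed loop Tom describes in the quoted mail, here is a small self-contained C model: the standby reports its oldest running xmin, the walsender publishes it on the master, and the master's vacuum cutoff honours it. The names and the plain integer comparison are simplifications (real transaction IDs need wraparound-aware comparison); this is not the actual PostgreSQL implementation.

/* Toy model of standby xmin feedback limiting vacuum on the master. */
#include <stdint.h>
#include <stdio.h>

typedef uint32_t xid_t;

#define INVALID_XID 0

/* xmin the walsender advertises on behalf of the standby (master side). */
static xid_t walsender_xmin = INVALID_XID;

/* Master: called whenever a feedback message arrives from the standby. */
static void receive_standby_feedback(xid_t standby_oldest_xmin)
{
    walsender_xmin = standby_oldest_xmin;
}

/* Master: vacuum may only remove row versions deleted before this cutoff,
 * so it holds back as long as a standby query still needs older rows. */
static xid_t vacuum_cutoff(xid_t oldest_local_xmin)
{
    if (walsender_xmin != INVALID_XID && walsender_xmin < oldest_local_xmin)
        return walsender_xmin;
    return oldest_local_xmin;
}

int main(void)
{
    /* A long reporting query on the standby still needs xid 5000. */
    receive_standby_feedback(5000);

    /* Local backends on the master are all past xid 5200, but vacuum
     * holds back to 5000, so the standby query never conflicts. */
    printf("vacuum cutoff = %u\n", vacuum_cutoff(5200));
    return 0;
}

The cost Josh raises is visible in the model as well: for as long as the standby's query runs, the master keeps dead row versions back to the reported xmin, which is exactly the bloat trade-off under discussion.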