From: Josh Berkus on 2 Mar 2010 14:03

On 3/2/10 10:30 AM, Bruce Momjian wrote:
> Right now you can't choose "master bloat", but you can choose the other
> two. I think that is acceptable for 9.0, assuming the other two don't
> have the problems that Tom foresees.

Actually, if vacuum_defer_cleanup_age can be used, "master bloat" is an option. Hopefully I'll get some time for serious testing this weekend.

--Josh Berkus

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
From: Greg Smith on 2 Mar 2010 15:11

Bruce Momjian wrote:
>> Right now you can't choose "master bloat", but you can choose the other
>> two. I think that is acceptable for 9.0, assuming the other two don't
>> have the problems that Tom foresees.
>
> I was wrong. You can choose "master bloat" with
> vacuum_defer_cleanup_age, but only crudely because it is measured in
> xids and the master defers no matter what queries are running on the
> slave...

OK with you finding the situation acceptable, so long as it's an informed decision. From how you're writing about this, I'm comfortable you (and everybody else still involved here) have absorbed the issues enough that we're all talking about the same thing now. Since there are a couple of ugly user-space hacks possible for prioritizing "master bloat", and nobody is stepping up to work on resolving this via my suggestion involving better SR integration, seems to me heated discussion of code changes has come to a resolution of sorts I (and Simon, just checked) can live with. Sounds like we have three action paths here:

-Tom already said he was planning a tour through the HS/SR code; I wanted that to happen with him aware of this issue.
-Josh will continue doing his testing, also better informed about this particular soft spot.
-I'll continue test-case construction for the problems that there are still concerns about (pathologic max_standby_delay and b-tree split issues being the top two on that list), and keep sharing particularly interesting ones here to help everyone else's testing.

If it turns out any of those paths leads to a must-fix problem that doesn't have an acceptable solution, at least the idea of this as a "plan B" is both documented and more widely understood than when I started ringing this particular bell.

I just updated the Open Items list: http://wiki.postgresql.org/wiki/PostgreSQL_9.0_Open_Items to officially put myself on the hook for the following HS related documentation items that have come up recently, aiming to get them all wrapped up before or during early beta:

-Update Hot Standby documentation: clearly explain relationships between the 3 major setup trade-offs, "buffer cleanup lock", notes on which queries are killed once max_standby_delay is reached, measuring XID churn on master for setting vacuum_defer_cleanup_age
-Clean up archive_command docs related to recent "/bin/true" addition. Given that's where I expect people who run into the recently added pg_stop_backup warning message will end up, noting its value for escaping from that particular case might be useful too.

To finish airing my personal 9.0 TODO list now that I've gone this far, I'm also still working on completing the following patches, for which initial versions have already been submitted. I was close to finishing both before getting side-tracked onto this larger issue:

-pgbench > 4000 scale bug fix: http://archives.postgresql.org/message-id/4B621BA3.7090306(a)2ndquadrant.com
-Improving the logging/error reporting/no timestamp issues in pg_standby re-raised recently by Selena: http://archives.postgresql.org/message-id/2b5e566d1001250945oae17be8n6317f827e3bd7492(a)mail.gmail.com

If nobody else claims them as something they're working on before then, I suspect I'll move on to building some of the archiver UI improvements discussed most recently as part of the "pg_stop_backup does not complete" thread, despite Heikki having crushed my dreams of a simple solution to those by pointing out the shared memory limitation involved.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(a)2ndQuadrant.com www.2ndQuadrant.us
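The documentation item about "measuring XID churn on master" can be illustrated with a small calculation. The following is only a sketch, not something taken from the thread: it assumes the transaction counter is sampled twice on the master (for example with SELECT txid_current(), which itself consumes an XID), and the XidSample type, the function name, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XidSample:
    """One observation of the master's transaction counter (hypothetical type)."""
    seconds: float  # wall-clock time of the sample, in seconds
    xid: int        # counter value observed at that time, e.g. from txid_current()


def xid_churn_per_second(first: XidSample, second: XidSample) -> float:
    """Average number of XIDs consumed per second between two samples."""
    elapsed = second.seconds - first.seconds
    if elapsed <= 0:
        raise ValueError("samples must be taken in increasing time order")
    return (second.xid - first.xid) / elapsed


# Made-up samples taken five minutes apart on a busy master.
before = XidSample(seconds=0.0, xid=4_200_000)
after = XidSample(seconds=300.0, xid=4_300_000)
rate = xid_churn_per_second(before, after)
print(f"{rate:.1f} XIDs/second")
```

Sampling over a longer interval, or at several times of day, would likely give a more representative rate, since vacuum_defer_cleanup_age is expressed in transactions rather than seconds.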
From: Josh Berkus on 2 Mar 2010 19:22

On 3/2/10 12:47 PM, Marc Munro wrote:
> To take it further still, if vacuum on the master could be prevented
> from touching records that are less than max_standby_delay seconds old,
> it would be safe to apply WAL from the very latest vacuum. I guess HOT
> could be handled similarly though that may eliminate much of the
> advantage of HOT updates.

Aside from the inability to convert between transaction count and time, isn't this what vacuum_defer_cleanup_age is supposed to do? Or does it not help with HOT updates?

--Josh Berkus
From: Josh Berkus on 10 Mar 2010 01:29

All,

I've been playing with vacuum_defer_cleanup_age in reference to the query cancel problem. It really seems to me that this is the way forward in terms of dealing with query cancel for normal operation rather than max_standby_delay, or maybe in combination with it.

As a first test, I set up a deliberately pathological situation with pgbench and a max_standby_delay of 1 second. This allowed me to trigger query cancel on a relatively simple reporting query; in fact, to make it impossible to complete.

Then I increased vacuum_defer_cleanup_age to 100000, which represents about 5 minutes of transactions on the test system. This eliminated all query cancels for the reporting query, which takes an average of 10s.

Next is a database bloat test, but I'll need to do that on a system with more free space than my laptop.

--Josh Berkus
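The arithmetic behind the test above can be sketched as follows. This is an illustration rather than a tuning recommendation: the two helper functions and the safety_factor parameter are assumptions introduced here, and the only figures taken from the message are 100000 XIDs, roughly 5 minutes, and a 10 second query.

```python
import math


def seconds_covered(defer_age_xids: int, xids_per_second: float) -> float:
    """Approximate wall-clock window protected by a given defer age."""
    if xids_per_second <= 0:
        raise ValueError("xids_per_second must be positive")
    return defer_age_xids / xids_per_second


def defer_age_for_query(xids_per_second: float, query_seconds: float,
                        safety_factor: float = 2.0) -> int:
    """XIDs of deferred cleanup needed to cover a query of the given length.

    safety_factor is an assumed margin for bursts in transaction rate;
    it does not come from the thread.
    """
    if xids_per_second < 0 or query_seconds < 0 or safety_factor < 1:
        raise ValueError("rate and duration must be >= 0, safety_factor >= 1")
    return math.ceil(xids_per_second * query_seconds * safety_factor)


# Figures from the test: 100000 XIDs is about 5 minutes of transactions.
rate = 100_000 / 300  # roughly 333 XIDs per second on that test system

window = seconds_covered(100_000, rate)
print(f"a defer age of 100000 covers about {window:.0f} s")

needed = defer_age_for_query(rate, query_seconds=10)
print(f"a 10 s query needs roughly {needed} XIDs with a 2x margin")
```

Under these assumptions the setting used in the test leaves a wide margin over the 10 second reporting query, at the cost of deferring cleanup on the master for the whole window regardless of what the standby is running.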
From: Josh Berkus on 10 Mar 2010 13:59

On 3/10/10 3:38 AM, Greg Stark wrote:
> I think that means that a
> vacuum_defer_cleanup of up to about 100 or so (it depends on the width
> of your counter record) might be reasonable as a general suggestion
> but anything higher will depend on understanding the specific system.

100 wouldn't be useful at all. It would increase bloat without doing anything about query cancel except on a very lightly used system.

> With vacuum_defer_cleanup that will no longer be true.
> It will be as if you always have a query lasting n transactions in
> your system at all times.

Yep, but until we get XID-publish-to-master working in 9.1, I think it's probably the best we can do. At least it's no *worse* than having a long-running query on the master at all times.

--Josh Berkus