From: Greg Stark on 26 Feb 2010 16:22

On Fri, Feb 26, 2010 at 9:19 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> There's *definitely* not going to be enough information in the WAL
> stream coming from a master that doesn't think it has HS slaves.
> We can't afford to record all that extra stuff in installations for
> which it's just useless overhead. BTW, has anyone made any attempt
> to measure the performance hit that the patch in its current form is
> creating via added WAL entries?

What extra entries?

--
greg
From: Dimitri Fontaine on 26 Feb 2010 16:39

Tom Lane <tgl(a)sss.pgh.pa.us> writes:
> Well, as Heikki said, a stop-and-go WAL management approach could deal
> with that use-case. What I'm concerned about here is the complexity,
> reliability, maintainability of trying to interlock WAL application
> with slave queries in any sort of fine-grained fashion.

Some admin functions for Hot Standby were removed from the patch to
ease its integration; there was a pause() and resume() feature. I think
offering this explicit control would let users choose between an HA
setup and a reporting setup easily enough: just pause the replay while
running the reports, then resume it to get fresh data again. If you
don't pause, any query can get killed; replay is the priority.

As far as the feedback loop is concerned, I guess the pause() function
would cause the slave to stop publishing any xmin in the master's
procarray, so that the master is free to vacuum and archive whatever it
wants. Should the slave accumulate too much lag, it would resume from
the archive rather than live from the SR link. How much would that
help?

Regards,
--
dim
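For illustration, the explicit control Dimitri describes might look like
this from a psql session on the standby. The function names below are
hypothetical stand-ins for the pause()/resume() admin functions that were
dropped from the patch:

    -- Hold WAL replay steady while the report runs. pg_pause_recovery()
    -- and pg_resume_recovery() are hypothetical names for the removed
    -- pause()/resume() admin functions.
    SELECT pg_pause_recovery();

    -- The long reporting query: with replay paused, it cannot be
    -- cancelled by a conflicting cleanup record arriving from the master.
    SELECT region, sum(amount) FROM sales GROUP BY region;

    -- Resume replay; the standby catches up, from the archive if it has
    -- fallen too far behind the streaming link.
    SELECT pg_resume_recovery();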
From: Tom Lane on 26 Feb 2010 16:44

Greg Stark <gsstark(a)mit.edu> writes:
> On Fri, Feb 26, 2010 at 9:19 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>> There's *definitely* not going to be enough information in the WAL
>> stream coming from a master that doesn't think it has HS slaves.
>> We can't afford to record all that extra stuff in installations for
>> which it's just useless overhead. BTW, has anyone made any attempt
>> to measure the performance hit that the patch in its current form is
>> creating via added WAL entries?

> What extra entries?

Locks, just for starters. I haven't read enough of the code yet to know
what else Simon added. In the past it's not been necessary to record
any transient information in WAL, but now we'll have to.

regards, tom lane
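The locks in question are AccessExclusiveLocks taken on the master,
which the standby must learn about so it can block conflicting queries.
A sketch of what such a WAL record needs to carry, modelled loosely on
the structure in the Hot Standby code (treat the exact layout as an
assumption, not a reading of the committed source):

    #include "postgres.h"        /* TransactionId, Oid */

    /*
     * Sketch of a standby-lock WAL record: just enough for the standby
     * to re-acquire an AccessExclusiveLock on behalf of a transaction
     * still running on the master.
     */
    typedef struct xl_standby_lock
    {
        TransactionId xid;    /* xid of the lock holder on the master */
        Oid           dbOid;  /* database containing the locked relation */
        Oid           relOid; /* the locked relation itself */
    } xl_standby_lock;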
From: Dimitri Fontaine on 26 Feb 2010 17:11

Bruce Momjian <bruce(a)momjian.us> writes:
> Doesn't the system already adjust the delay based on the length of
> slave transactions, e.g. max_standby_delay. It seems there is no need
> for a user switch --- just max_standby_delay really high.

Well, that GUC looks like it lets you set a compromise between HA and
reporting, not say "never give priority to the replay while I'm running
my reports". At least that's how I understand it.

The feedback loop might get expensive on the master when running
reporting queries on the slave, unless you can "pause" it explicitly, I
think. I don't see how the system will guess that you're running a
reporting server rather than an HA node; max_standby_delay is just a
way to tell the standby to please be nice in case of abuse.

Regards,
--
dim
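To make Bruce's suggestion concrete, the standby's role would come down
to a single setting in its postgresql.conf. The values below are
illustrative; in the patch as discussed here, -1 meant "wait forever":

    # HA-oriented standby: cancel conflicting queries quickly so
    # replay never falls far behind the master.
    #max_standby_delay = 30        # seconds (illustrative)

    # Reporting-oriented standby, Bruce's "max_standby_delay really
    # high": replay waits out long report queries instead.
    max_standby_delay = -1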
From: Greg Stark on 26 Feb 2010 19:43
On Fri, Feb 26, 2010 at 11:56 PM, Greg Smith <greg(a)2ndquadrant.com> wrote:
> This is also the reason why the whole "pause recovery" idea is a
> fruitless path to wander down. The whole point of this feature is that
> people have a secondary server available for high-availability, *first
> and foremost*, but they'd like it to do something more interesting than
> leave it idle all the time. The idea that you can hold off on applying
> standby updates for long enough to run seriously long reports is
> completely at odds with the idea of high-availability.

Well, you can go sit in the same corner as Simon with your
high-availability servers. I want the ability to run large batch
queries without any performance or reliability impact on the primary
server. You can have one or the other, but you can't get both. If you
set max_standby_delay low, you get your high-availability server; if
you set it high, you get a useful reporting server.

If you build sync replication, which we don't have today and which will
open another huge can of usability worms when we haven't even finished
bottling the two we've already opened, then you lose the lack of impact
on the primary. Suddenly the queries you run on the slaves cause your
production database to bloat. Plus you have extra network connections
which take resources on your master and have to be kept up at all times
or you lose your slaves.

I think the design constraint of not allowing any upstream data flow is
actually very valuable. Eventually we'll have it for sync replication,
but it's much better that we've built things incrementally and can be
sure that nothing really depends on it for basic functionality. This is
what allows us to know that the slave imposes no reliability impact on
the master. It's what allows us to know that everything will work
identically regardless of whether you have a walreceiver running or are
running off archived log files.

Remember, I wanted to entirely abstract away the walreceiver and allow
multiple WAL communication methods. I think it would make more sense to
use something like Spread to distribute the logs, so the master only
has to send them once and as many slaves as you want can pick them up.
The current architecture doesn't scale very well if you want to have
hundreds of slaves for one master.

--
greg
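Greg's claim that the standby behaves identically whether it streams or
reads archived segments is reflected in how a 9.0-era standby is
configured: both sources can be given side by side in recovery.conf. A
sketch, with illustrative host names and paths:

    # recovery.conf on the standby (illustrative values)
    standby_mode = 'on'

    # Live streaming-replication link to the master...
    primary_conninfo = 'host=master.example.com port=5432 user=replicator'

    # ...and the archive fallback: if streaming is interrupted or the
    # standby falls too far behind, finished segments are fetched from
    # the archive instead.
    restore_command = 'cp /mnt/wal_archive/%f %p'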