From: Greg Stark on 26 Feb 2010 21:30

On Sat, Feb 27, 2010 at 1:53 AM, Greg Smith <greg(a)2ndquadrant.com> wrote:
> Greg Stark wrote:
>>
>> Well you can go sit in the same corner as Simon with your high
>> availability servers.
>>
>> I want my ability to run large batch queries without any performance
>> or reliability impact on the primary server.
>>
>
> Thank you for combining a small personal attack with a selfish commentary
> about how yours is the only valid viewpoint. Saves me a lot of trouble
> replying to your messages, can just ignore them instead if this is how
> you're going to act.

Eh? That's not what I meant at all. Actually it's kind of the exact opposite of what I meant.

What I meant was that your description of the "High Availability first and foremost" is only one possible use case. Simon in the past expressed the same single-minded focus on that use case. It's a perfectly valid use case and I would probably agree if we had to choose just one it would be the most important.

But we don't have to choose just one. There are other valid use cases such as load balancing and isolating your large batch queries from your production systems. I don't want us to throw out all these other use cases because we only consider high availability as the only use case we're interested in.

--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
From: Greg Stark on 26 Feb 2010 23:02

On Fri, Feb 26, 2010 at 9:44 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Greg Stark <gsstark(a)mit.edu> writes:
>
>> What extra entries?
>
> Locks, just for starters. I haven't read enough of the code yet to know
> what else Simon added. In the past it's not been necessary to record
> any transient information in WAL, but now we'll have to.

Haven't we been writing locks to the WAL since two-phase commit?

--
greg
From: Robert Haas on 27 Feb 2010 20:00

On Fri, Feb 26, 2010 at 1:53 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Greg Stark <gsstark(a)mit.edu> writes:
>> In the model you describe any long-lived queries on the slave cause
>> tables in the master to bloat with dead records.
>
> Yup, same as they would do on the master.
>
>> I think this model is on the roadmap but it's not appropriate for
>> everyone and I think one of the benefits of having delayed it is that
>> it forces us to get the independent model right before throwing in
>> extra complications. It would be too easy to rely on the slave
>> feedback as an answer for hard questions about usability if we had it
>> and just ignore the question of what to do when it's not the right
>> solution for the user.
>
> I'm going to make an unvarnished assertion here. I believe that the
> notion of synchronizing the WAL stream against slave queries is
> fundamentally wrong and we will never be able to make it work.
> The information needed isn't available in the log stream and can't be
> made available without very large additions (and consequent performance
> penalties). As we start getting actual beta testing we are going to
> uncover all sorts of missed cases that are not going to be fixable
> without piling additional ugly kluges on top of the ones Simon has
> already crammed into the system. Performance and reliability will both
> suffer.
>
> I think that what we are going to have to do before we can ship 9.0
> is rip all of that stuff out and replace it with the sort of closed-loop
> synchronization Greg Smith is pushing. It will probably be several
> months before everyone is forced to accept that, which is why 9.0 is
> not going to ship this year.

Somewhat unusually for me, I haven't been able to keep up with my email over the last few days, so I'm weighing in on this one a bit late. It seems to me that if we're forced to pass the xmin from the slave back to the master, that would be a huge step backward in terms of both scalability and performance, so I really hope it doesn't come to that. I wish I understood better exactly what you mean by "the notion of synchronizing the WAL stream against slave queries" and why you don't think it will work. Can you elaborate?

...Robert
From: Greg Stark on 28 Feb 2010 08:54

On Sun, Feb 28, 2010 at 6:07 AM, Greg Smith <greg(a)2ndquadrant.com> wrote:
> Not forced to--have the option of. There are obviously workloads where you
> wouldn't want this. At the same time, I think there are some pretty common
> ones people are going to expect HS+SR to work on transparently where this
> would obviously be the preferred trade-off to make, were it available as one
> of the options. The test case I put together shows an intentionally
> pathological but not completely unrealistic example of such a workload.

Well if we're forced to eventually have both then it kind of takes the wind out of Tom's arguments. We had better get both features working so it becomes only a question of which is worth doing first and which can be held off. Since there aren't any actual bugs in evidence for the current setup and we already have it that's a pretty easy decision.

> What I am sure of is that a SR-based xmin passing approach is simpler,
> easier to explain, more robust for some common workloads, and less likely to
> give surprised "wow, I didn't think *that* would cancel my standby query"
> reports from the field

Really? I think we get lots of surprised wows from the field from the idea that a long-running read-only query can cause your database to bloat. I think the only reason that's obvious to us is that we've been grappling with that problem for so long.

> And since I never like to bet against Tom's gut feel, having it
> around as a "plan B" in case he's right about an overwhelming round of bug
> reports piling up against the max_standby_delay etc. logic doesn't hurt
> either.

Agreed. Though I think it'll be bad in that case even if we have a plan B. It'll mean no file-based log shipping replicas and no guarantee that what you run on the standby can't affect the master -- which is a pretty nice guarantee. It'll also mean it'll be much more fragile against network interruptions.

--
greg
From: Joachim Wieland on 28 Feb 2010 10:56

On Sun, Feb 28, 2010 at 2:54 PM, Greg Stark <gsstark(a)mit.edu> wrote:
> Really? I think we get lots of surprised wows from the field from the
> idea that a long-running read-only query can cause your database to
> bloat. I think the only reason that's obvious to us is that we've been
> grappling with that problem for so long.

It seems to me that the scenario that you are looking at is one where people run different queries with and without HS, i.e. that they will run longer read-only queries than now once they have HS. I don't think that is the case. If it isn't, you cannot really speak of a master "bloat".

Instead, I assume that most people who will grab 9.0 and use HS+SR do already have a database with a certain query profile. Now with HS+SR they will try to put the most costly and longest read-only queries to the standby but in the end will run the same number of queries with the same overall complexity.

Now let's take a look at both scenarios from the administrators' point of view:

1) With the current implementation they will see better performance on the master and more aggressive vacuum (!), since they have fewer long-running queries now on the master and autovacuum can kick in and clean up with less delay than before. On the other hand their queries on the standby might fail and they will start thinking that this HS+SR feature is not as convincing as they thought it was... Next step for them is to take the documentation and study it for a few days to learn all about vacuum, different delays, transaction ids and age parameters and experiment a few weeks until no more queries fail - for a while... But they can never be sure... In the end they might also modify the parameters in the wrong direction or overshoot because of lack of time to experiment and lose another important property without noticing (like being as close as possible to the master).

2) On the other hand if we could ship 9.0 with the xmin-propagation feature, people would still see a better performance and have a hot standby system but this time without query cancellations. Again: the read-only queries that will be processed by the HS in the future are being processed by the master today anyway, so why should it get worse? The first impression will be that it just works nicely out of the box, is easy to set up and has no negative effect (query cancellation) that has not already shown up before (vacuum lag).

I guess that most people will just run fine with this setup and never get to know about the internals. Of course we should still offer an expert mode where you can turn all kinds of knobs and where you can avoid the vacuum dependency but it would be nice if this could be the expert mode only. Tuning this is highly installation specific and you need to have a deep understanding of how PostgreSQL and HS work internally and what you actually want to achieve...

> Agreed. Though I think it'll be bad in that case even if we have a
> plan B. It'll mean no file-based log shipping replicas and no
> guarantee that what you run on the standby can't affect the master --
> which is a pretty nice guarantee. It'll also mean it'll be much more
> fragile against network interruptions.

Regarding the network interruptions... in reality if you have network interruptions of several minutes between your primary and your standby, you have worse problems anyway... If the standby does not renew its xmin for n seconds, log a message and just go on...

Joachim
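
The timeout idea in the last paragraph (a standby that does not renew its xmin for n seconds gets logged and ignored) can be illustrated with a toy model. This is only a sketch and not PostgreSQL code: the names `StandbyFeedback` and `cleanup_horizon` are hypothetical, a single standby is assumed, and transaction id wraparound is ignored.

```python
import logging
from dataclasses import dataclass
from typing import Optional

log = logging.getLogger("xmin_feedback")


@dataclass
class StandbyFeedback:
    """Last xmin a standby reported, and when it arrived (in seconds)."""
    xmin: int
    reported_at: float


def cleanup_horizon(master_xmin: int,
                    feedback: Optional[StandbyFeedback],
                    now: float,
                    timeout: float) -> int:
    """Return the oldest transaction id whose row versions must be kept.

    Hypothetical model: plain integer comparison, no wraparound handling.
    """
    if feedback is None:
        # No standby has reported anything: only the master's xmin counts.
        return master_xmin
    age = now - feedback.reported_at
    if age > timeout:
        # The standby did not renew its xmin in time: log and just go on.
        log.warning("standby xmin %d is %.0fs old, ignoring it",
                    feedback.xmin, age)
        return master_xmin
    # Fresh feedback holds cleanup back to the older of the two xmins.
    return min(master_xmin, feedback.xmin)
```

Under this model a network interruption longer than `timeout` stops the standby from holding back cleanup on the master, at the price that standby queries relying on the old xmin may once again be cancelled.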