From: Tom Lane on 8 May 2010 20:57

Andres Freund <andres(a)anarazel.de> writes:
> On Sunday 09 May 2010 01:34:18 Bruce Momjian wrote:
>> I think everyone agrees the current code is unusable, per Heikki's
>> comment about a WAL file arriving after a period of no WAL activity, and
>> look how long it took our group to even understand why that fails so
>> badly.

> To be honest it's not *that* hard to simply make sure WAL is generated
> regularly to combat that. While it surely ain't a nice workaround it's not
> much of a problem either.

Well, that's dumping a kluge onto users; but really that isn't the
point. What we have here is a badly designed and badly implemented
feature, and we need to not ship it like this so as to not
institutionalize a bad design.

I like the proposal of a boolean because it provides only the minimal
feature set of two cases that are both clearly needed and easily
implementable. Whatever we do later is certain to provide a superset
of those two cases. If we do something else (and that includes my own
proposal of a straight lock timeout), we'll be implementing something
we might wish to take back later. Taking out features after they've
been in a release is very hard, even if we realize they're badly
designed.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Greg Smith on 8 May 2010 23:00

Tom Lane wrote:
> Taking out features after they've been in a release is very hard, even if
> we realize they're badly designed.

It doesn't have to be; that's the problem the "release often" part takes
care of. If a release has only been out a year, and a new one comes out
saying "oh, that thing we released for the first time in the last
version, it didn't work as well as we'd hoped in the field; you should
try to avoid that and use this new implementation that works better
instead once you can upgrade", that's not only not hard, it's exactly
what people using an X.0 release expect to happen.

I've read the message from you that started off this thread several
times now. Your low-level code implementation details shared later
obviously need to be addressed. But all of the "fundamental" and
"fatal" issues you mentioned at the start continue to strike me as
either situations where you don't agree with the use case this was
designed for, or spots where you feel the userland workarounds required
to make it work right are too onerous. Bruce's objections seem to fall
mainly into the latter category.

I've been wandering around talking to people about that exact
subject--what do people want and expect from Hot Standby, and what would
they do to gain its benefits--for over six months now, independently of
Simon's work which did a lot of that before me too. The use cases are
covered as best they can be without better support from expected future
SR features like heartbeats and XID loopback. As for the workarounds
required to make things work, the responses I get match what we just saw
from Andres. When the required details are explained, people say
"that's annoying but I can do that", and off we go. There are
significant documentation issues I know need to be cleaned up here, and
I've already said I'll take care of that as soon as freeze is really
here and I have a stable target. (That this discussion is still going
on says that's not yet.)

What I fail to see are problems significant enough to not ship the parts
of this feature that are done, so that it can be used by those it is
appropriate for, allow feedback, and make it easy to test individual
improvements upon what's already there.

I can't make you prioritize based on what people are telling me. All I
can do is suggest you reconsider handing control over the decision to
use this feature or not to the users of the software, so they can make
their own choice. I'm tired of arguing about this instead of doing
productive work, and I've done all I can here to try and work within the
development process of the community.

If talk of removing the max_standby_delay feature clears up, I'll
happily provide my promised round of documentation updates, to make its
limitations and associated workarounds as clear as they can be, within a
week of being told "go" on that. If instead this capability goes away,
making those moot, I'll maintain my own release for the 2ndQuadrant
customers who have insisted they need this capability if I have to.
That would be really unfortunate, because the only bucket I can pull
time out of for that is the one I currently allocate to answering
questions on the mailing lists here most days. I'd rather spend that
helping out the PostgreSQL community, but we do need to deliver what our
customers want too.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg(a)2ndQuadrant.com   www.2ndQuadrant.us

From: Bruce Momjian on 8 May 2010 23:46

Greg Smith wrote:
> Tom Lane wrote:
> > Taking out features after they've been in a release is very hard, even
> > if we realize they're badly designed.
>
> It doesn't have to be; that's the problem the "release often" part takes
> care of. If a release has only been out a year, and a new one comes out
> saying "oh, that thing we released for the first time in the last
> version, it didn't work as well as we'd hoped in the field; you should
> try to avoid that and use this new implementation that works better
> instead once you can upgrade", that's not only not hard, it's exactly
> what people using a X.0 release expect to happen.

I think this is the crux of the issue. Tom and I are saying that
historically we have shipped only complete features, or as complete as
reasonable, and have removed items during beta that we found didn't meet
these criteria, in an attempt to reduce the amount of feature set churn in
Postgres. A database is complex, so modifying the API between major
releases is something we only do when we find a significant benefit.

In this case, if we keep max_standby_delay as non-boolean, we know it
will have to be redesigned in 9.1, and it is unclear to me what
additional knowledge we will gain by shipping it in 9.0, except to have
to tell people that it doesn't work well or requires complex
work-arounds, and that doesn't thrill any of us.

(I already suggested that statement_timeout might supply a reasonable
and predictable workaround for non-boolean usage of max_standby_delay.)

--
  Bruce Momjian  <bruce(a)momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

From: Robert Haas on 8 May 2010 23:50

On Sat, May 8, 2010 at 6:51 PM, Bruce Momjian <bruce(a)momjian.us> wrote:
> Robert Haas wrote:
>> On Sat, May 8, 2010 at 3:40 PM, Bruce Momjian <bruce(a)momjian.us> wrote:
>> > Robert Haas wrote:
>> >> On Sat, May 8, 2010 at 2:48 PM, Bruce Momjian <bruce(a)momjian.us> wrote:
>> >> > I think the consensus is to change this setting to a boolean.  If you
>> >> > don't want to do it, I am sure we can find someone who will.
>> >>
>> >> I still think we should revert to Tom's original proposal.
>> >
>> > And Tom's proposal was to do it on WAL slave arrival time?  If we could
>> > get agreement from everyone that that is the proper direction, fine, but
>> > I am hearing things like plugins, and other complexity that makes it
>> > seem we are not getting closer to an agreed solution, and without
>> > agreement, the simplest approach seems to be just to remove the part we
>> > can't agree upon.
>> >
>> > I think the big question is whether this issue is significant enough
>> > that we should ignore our policy of no feature design during beta.
>>
>> Tom's proposal was basically to define recovery_process_lock_timeout.
>> The recovery process would wait X seconds for a lock, then kill
>> whoever held it. It's not the greatest knob in the world for the
>> reasons already pointed out, but I think it's still better than a
>> boolean and will be useful to some users. And it's pretty simple.
>
> I thought there was concern about lock stacking causing
> unpredictable/unbounded delays. I am not sure boolean has a majority
> vote, but I am suggesting that because it is the _minimal_ feature set,
> and when we can't agree during beta, the minimal feature set seems like
> the best choice.
>
> Clearly, anything is more feature-full than boolean --- the big question
> is whether Tom's proposal is significantly better than boolean that we
> should spend the time designing and implementing it, with the
> possibility it will all be changed in 9.1.

I doubt it's likely to be thrown out completely. We might decide to
fine-tune it in some way. My fear is that if we ship this with only a
boolean, we're shipping crippleware. If that fear turns out to be
unfounded, I will of course be happy, but that's my concern, and I
don't believe that it's entirely unfounded.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

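[Editorial illustration, not part of the original message.] The lock-timeout idea quoted above ("wait X seconds for a lock, then kill whoever held it") can be sketched in a few lines. This is only a toy model under stated assumptions, not PostgreSQL code: `replay_action`, `total_replay_delay`, and the `lock_timeout_s` parameter are hypothetical names standing in for the proposed recovery_process_lock_timeout setting, and the second function is one reading of the "lock stacking" concern, in which each conflict met during replay can cost up to the full timeout.

```python
from typing import Optional, Sequence


def replay_action(waited_s: float, lock_timeout_s: Optional[float]) -> str:
    """Decide what WAL replay does about a query holding a conflicting lock.

    lock_timeout_s is a hypothetical stand-in for the proposed
    recovery_process_lock_timeout; None models "wait indefinitely".
    """
    if lock_timeout_s is None:
        return "wait"
    # Once replay has waited at least the timeout, the holder is cancelled.
    return "cancel_holder" if waited_s >= lock_timeout_s else "wait"


def total_replay_delay(hold_times_s: Sequence[float], lock_timeout_s: float) -> float:
    """Add up the replay delay from conflicts encountered one after another.

    Each conflict stalls replay until the holder finishes or the timeout
    fires, whichever comes first, so the per-lock waits accumulate.
    """
    return sum(min(hold, lock_timeout_s) for hold in hold_times_s)
```

Assuming the two boolean cases discussed in the thread are "cancel conflicting queries immediately" and "wait indefinitely", they correspond here to a timeout of 0 and of None, which is the sense in which a timeout would be a superset of the boolean. With a 30-second timeout, three successive conflicts whose holders would run 10, 50, and 40 seconds delay replay by 10 + 30 + 30 seconds in this model, which shows why a per-lock timeout alone does not bound the total standby lag.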
From: Bruce Momjian on 9 May 2010 00:08

Robert Haas wrote:
> > Clearly, anything is more feature-full than boolean --- the big question
> > is whether Tom's proposal is significantly better than boolean that we
> > should spend the time designing and implementing it, with the
> > possibility it will all be changed in 9.1.
>
> I doubt it's likely to be thrown out completely. We might decide to
> fine-tune it in some way. My fear is that if we ship this with only a
> boolean, we're shipping crippleware. If that fear turns out to be
> unfounded, I will of course be happy, but that's my concern, and I
> don't believe that it's entirely unfounded.

Well, historically, we have been willing to not ship features if we
can't get them right. No one has ever accused us of crippleware, but our
hesitancy has caused slower user adoption, though long-term, it has
helped us grow a dedicated user base that trusts us.

--
  Bruce Momjian  <bruce(a)momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com