From: Tom Lane on
Andres Freund <andres(a)anarazel.de> writes:
> On Sunday 09 May 2010 01:34:18 Bruce Momjian wrote:
>> I think everyone agrees the current code is unusable, per Heikki's
>> comment about a WAL file arriving after a period of no WAL activity, and
>> look how long it took our group to even understand why that fails so
>> badly.

> To be honest, it's not *that* hard to simply make sure WAL is generated regularly
> to combat that. While it surely ain't a nice workaround, it's not much of a
> problem either.
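
A rough sketch of why regular WAL generation helps, under an assumed model
that is not spelled out in this thread: the standby treats "current time
minus the timestamp of the newest WAL it has seen" as its lag, and a
conflicting query gets whatever is left of max_standby_delay. The function
names and the heartbeat_interval parameter are hypothetical, chosen only
for illustration.

```python
def grace_period(now, last_wal_timestamp, max_standby_delay):
    """Seconds a conflicting standby query may keep running (0 = cancelled at once).

    Assumed model: lag is measured from the timestamp of the newest WAL seen.
    """
    apparent_lag = now - last_wal_timestamp
    return max(0, max_standby_delay - apparent_lag)


def worst_case_grace(idle_seconds, heartbeat_interval, max_standby_delay):
    """Grace period when conflicting WAL arrives after idle_seconds of no user activity.

    heartbeat_interval=None models no workaround. Otherwise some WAL is
    assumed to be generated at least every heartbeat_interval seconds, so the
    newest timestamp the standby knows about is never older than that.
    """
    if heartbeat_interval is None:
        staleness = idle_seconds
    else:
        staleness = min(idle_seconds, heartbeat_interval)
    # Place the newest known WAL timestamp at t=0 and "now" at t=staleness.
    return grace_period(now=staleness, last_wal_timestamp=0,
                        max_standby_delay=max_standby_delay)


if __name__ == "__main__":
    print(worst_case_grace(600, None, 30))  # no workaround
    print(worst_case_grace(600, 10, 30))    # WAL generated every 10 seconds
```

In this model, 600 seconds of idle time with max_standby_delay at 30 leaves
no grace period at all without the workaround, and 20 seconds when WAL is
generated every 10 seconds.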

Well, that's dumping a kluge onto users; but really that isn't the
point. What we have here is a badly designed and badly implemented
feature, and we need to not ship it like this so as to not
institutionalize a bad design.

I like the proposal of a boolean because it provides only the minimal
feature set of two cases that are both clearly needed and easily
implementable. Whatever we do later is certain to provide a superset
of those two cases. If we do something else (and that includes my own
proposal of a straight lock timeout), we'll be implementing something
we might wish to take back later. Taking out features after they've
been in a release is very hard, even if we realize they're badly
designed.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Greg Smith on
Tom Lane wrote:

> Taking out features after they've been in a release is very hard, even if we realize they're badly
> designed.
>

It doesn't have to be; that's the problem the "release often" part takes
care of. If a release has only been out a year, and a new one comes out
saying "oh, that thing we released for the first time in the last
version, it didn't work as well as we'd hoped in the field; you should
try to avoid that and use this new implementation that works better
instead once you can upgrade", that's not only not hard, it's exactly
what people using an X.0 release expect to happen.

I've read the message from you that started off this thread several
times now. Your low-level code implementation details shared later
obviously need to be addressed. But all of the "fundamental" and
"fatal" issues you mentioned at the start continue to strike me as
either situations where you don't agree with the use case this was
designed for, or spots where you feel the userland workarounds required
to make it work right are too onerous. Bruce's objections seem to fall
mainly into the latter category.

I've been wandering around talking to people about that exact
subject--what do people want and expect from Hot Standby, and what would
they do to gain its benefits--for over six months now, independently of
Simon's work which did a lot of that before me too. The use cases are
covered as best they can be without better support from expected future
SR features like heartbeats and XID loopback. As for the workarounds
required to make things work, the responses I get match what we just saw
from Andres. When the required details are explained, people say
"that's annoying but I can do that", and off we go. There are
significant documentation issues I know need to be cleaned up here, and
I've already said I'll take care of that as soon as freeze is really
here and I have a stable target. (That this discussion is still going
on says freeze isn't here yet.)

What I fail to see are problems significant enough to not ship the parts
of this feature that are done, so that it can be used by those it is
appropriate for, allow feedback, and make it easy to test individual
improvements upon what's already there. I can't make you prioritize
based on what people are telling me. All I can do is suggest you
reconsider handing control over the decision to use this feature or not
to the users of the software, so they can make their own choice.

I'm tired of arguing about this instead of doing productive work, and
I've done all I can here to try and work within the development process
of the community. If talk of removing the max_standby_delay feature
clears up, I'll happily provide my promised round of documentation
updates, to make its limitations and associated workarounds as clear as
they can be, within a week of being told go on that. If instead this
capability goes away, making those moot, I'll maintain my own release
for the 2ndQuadrant customers who have insisted they need this
capability if I have to. That would be really unfortunate, because the
only bucket I can pull time out of for that is the one I currently
allocate to answering questions on the mailing lists here most days.
I'd rather spend that helping out the PostgreSQL community, but we do
need to deliver what our customers want too.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(a)2ndQuadrant.com www.2ndQuadrant.us



From: Bruce Momjian on
Greg Smith wrote:
> Tom Lane wrote:
>
> > Taking out features after they've been in a release is very hard, even if we realize they're badly
> > designed.
> >
>
> It doesn't have to be; that's the problem the "release often" part takes
> care of. If a release has only been out a year, and a new one comes out
> saying "oh, that thing we released for the first time in the last
> version, it didn't work as well as we'd hoped in the field; you should
> try to avoid that and use this new implementation that works better
> instead once you can upgrade", that's not only not hard, it's exactly
> what people using an X.0 release expect to happen.

I think this is the crux of the issue. Tom and I are saying that
historically we have shipped only complete features, or as complete as
reasonable, and have removed items during beta that we found didn't meet
this criterion, in an attempt to reduce the amount of feature set churn
in Postgres. A database is complex, so modifying the API between major
releases is something we only do when we find a significant benefit.

In this case, if we keep max_standby_delay as non-boolean, we know it
will have to be redesigned in 9.1, and it is unclear to me what
additional knowledge we will gain by shipping it in 9.0, except to have
to tell people that it doesn't work well or requires complex
work-arounds, and that doesn't thrill any of us. (I already suggested
that statement_timeout might supply a reasonable and predictable
workaround for non-boolean usage of max_standby_delay.)

--
Bruce Momjian <bruce(a)momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com


From: Robert Haas on
On Sat, May 8, 2010 at 6:51 PM, Bruce Momjian <bruce(a)momjian.us> wrote:
> Robert Haas wrote:
>> On Sat, May 8, 2010 at 3:40 PM, Bruce Momjian <bruce(a)momjian.us> wrote:
>> > Robert Haas wrote:
>> >> On Sat, May 8, 2010 at 2:48 PM, Bruce Momjian <bruce(a)momjian.us> wrote:
>> >> > I think the consensus is to change this setting to a boolean.  If you
>> >> > don't want to do it, I am sure we can find someone who will.
>> >>
>> >> I still think we should revert to Tom's original proposal.
>> >
>> > And Tom's proposal was to do it on WAL slave arrival time?  If we could
>> > get agreement from everyone that that is the proper direction, fine, but
>> > I am hearing things like plugins, and other complexity that makes it
>> > seem we are not getting closer to an agreed solution, and without
>> > agreement, the simplest approach seems to be just to remove the part we
>> > can't agree upon.
>> >
>> > I think the big question is whether this issue is significant enough
>> > that we should ignore our policy of no feature design during beta.
>>
>> Tom's proposal was basically to define recovery_process_lock_timeout.
>> The recovery process would wait X seconds for a lock, then kill
>> whoever held it.  It's not the greatest knob in the world for the
>> reasons already pointed out, but I think it's still better than a
>> boolean and will be useful to some users.  And it's pretty simple.
>
> I thought there was concern about lock stacking causing
> unpredictable/unbounded delays.   I am not sure boolean has a majority
> vote, but I am suggesting that because it is the _minimal_ feature set,
> and when we can't agree during beta, the minimal feature set seems like
> the best choice.
>
> Clearly, anything is more feature-full than boolean --- the big question
> is whether Tom's proposal is sufficiently better than boolean that we
> should spend the time designing and implementing it, with the
> possibility it will all be changed in 9.1.

I doubt it's likely to be thrown out completely. We might decide to
fine-tune it in some way. My fear is that if we ship this with only a
boolean, we're shipping crippleware. If that fear turns out to be
unfounded, I will of course be happy, but that's my concern, and I
don't believe that it's entirely unfounded.
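
One reading of the "lock stacking" concern quoted above can be made concrete
with a small model. It is only a sketch of the idea, not PostgreSQL code:
assume recovery needs several conflicting locks one after another, waits at
most lock_timeout seconds for each, and then kills the holder. The function
and parameter names are hypothetical.

```python
def replay_delay(lock_hold_times, lock_timeout):
    """Total seconds recovery spends waiting under a per-lock timeout.

    lock_hold_times: for each conflicting lock, in the order recovery needs
    them, how many more seconds the holder would keep it if left alone.
    """
    total = 0
    for remaining in lock_hold_times:
        # Recovery waits until the holder finishes or the timeout fires,
        # whichever comes first; at timeout the holder is killed.
        total += min(remaining, lock_timeout)
    return total


if __name__ == "__main__":
    # Three long-running holders, each hitting a 30-second timeout.
    print(replay_delay([60, 60, 60], 30))
```

With a 30-second timeout and three stacked conflicts, the model delays
replay by 90 seconds, so the per-lock knob alone does not bound the total
delay.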

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Bruce Momjian on
Robert Haas wrote:
> > Clearly, anything is more feature-full than boolean --- the big question
> > is whether Tom's proposal is sufficiently better than boolean that we
> > should spend the time designing and implementing it, with the
> > possibility it will all be changed in 9.1.
>
> I doubt it's likely to be thrown out completely. We might decide to
> fine-tune it in some way. My fear is that if we ship this with only a
> boolean, we're shipping crippleware. If that fear turns out to be
> unfounded, I will of course be happy, but that's my concern, and I
> don't believe that it's entirely unfounded.

Well, historically, we have been willing to not ship features if we
can't get them right. No one has ever accused us of crippleware, but our
hesitancy has caused slower user adoption, though long-term, it has
helped us grow a dedicated user base that trusts us.

--
Bruce Momjian <bruce(a)momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
