From: Dimitri Fontaine on
Simon Riggs <simon(a)2ndQuadrant.com> writes:

> In the original patch I had Pause/Resume feature for controlling
> recovery during Hot Standby. It was removed for lack of time.
>
> With all the discussion around the HS UI, it would be something that
> could be back very easily.

Please!

Manual control over recovery is the best solution ever proposed for
giving the user explicit control over the trade-off between HA and slave
queries.

It would allow us to say that by default, conflict favors WAL recovery
no matter what. If you want to ensure your queries won't get canceled,
pause the recovery, run your report, resume the recovery.
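
As a rough sketch of that pause / report / resume workflow, assuming the
standby exposes SQL-callable pause and resume functions. The names used
below, pg_wal_replay_pause() and pg_wal_replay_resume(), are those of later
PostgreSQL releases (10 and up); nothing like them ships in 9.0. The
`execute` callable, `recovery_paused` and `run_report` are hypothetical
names for illustration only:

```python
from contextlib import contextmanager

# Assumption: function names taken from later PostgreSQL releases (10+).
# They are not part of 9.0, where the pause/resume feature was left out.
PAUSE_SQL = "SELECT pg_wal_replay_pause()"
RESUME_SQL = "SELECT pg_wal_replay_resume()"


@contextmanager
def recovery_paused(execute):
    """Pause WAL replay on the standby for the duration of the block.

    `execute` is any callable that runs one SQL statement on the standby,
    for example a DB-API cursor's `execute` method (hypothetical wiring).
    """
    execute(PAUSE_SQL)
    try:
        yield
    finally:
        # Resume even if the report fails, so the standby does not keep
        # falling behind the primary.
        execute(RESUME_SQL)


def run_report(execute, report_sql):
    """Run one long report query while recovery is paused."""
    with recovery_paused(execute):
        execute(report_sql)
```

The try/finally is the point of the sketch: while recovery is paused the
standby lags the primary, so the resume has to happen whether or not the
report succeeds.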

I understand that automated and flexible conflict resolution still is
needed or wanted even with this UI, but that would allow a much more
crude automated tool to be acceptable. Specifically, it could only
target short queries on the standby; for long-running queries you don't
want to get cancelled, pause the recovery.

Regards,
--
dim

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on
On Tue, May 4, 2010 at 4:02 AM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> In the original patch I had Pause/Resume feature for controlling
> recovery during Hot Standby. It was removed for lack of time.

Well, it's not like we have more time now than we did then. I think
we need to postpone this discussion to 9.1. If we're going to start
accepting patches for new features, then why should we accept only
patches for HS/SR? I have two patches already in the queue that I'd
like to see committed and if I thought that there was a chance of
getting anything further done for 9.0, there'd be several more. Many
other people have patches waiting also, or are holding off development
because we are in feature freeze right now. Hot Standby is a great
feature, but I don't see any reason to say that we're going to allow
new feature development just for HS but not for anything else.

I also think that worrying about fine-tuning HS at this point is a bit
like complaining that the jump suits of the crew of the Space Shuttle
Challenger were not made of 100% recyclable materials. Just yesterday
we had a report of an HS server getting into a state where it failed
to shut down properly; and I believe that we never fully resolved the
issue of occasional extremely-long spikes in HS response time, either.
Heikki just fixed a bug in our btree recovery code which is apparently
new to 9.0 since he did not backpatch it. I think that getting into a
discussion of pausing and resuming recovery, or even the parallel
discussion on max_standby_delay, are fiddling with things that,
granted, are probably not ideal, and yes, we should improve them in a
future release, but they're not what we should be worrying about right
now. What I think we SHOULD be worried about right now - VERY worried
- is stabilizing the existing Hot Standby code to the point where it
won't be an embarrassment to us when we ship it. The rate at which
we're finding new problems even with the small number of people who
test alpha releases and nightly snapshots suggests to me that we're
not there yet.

....Robert


From: Simon Riggs on

On Tue, 2010-05-04 at 09:36 -0400, Robert Haas wrote:
> On Tue, May 4, 2010 at 4:02 AM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> > In the original patch I had Pause/Resume feature for controlling
> > recovery during Hot Standby. It was removed for lack of time.
>
> Well, it's not like we have more time now than we did then. I think
> we need to postpone this discussion to 9.1. If we're going to start
> accepting patches for new features, then why should we accept only
> patches for HS/SR?

Robert,

This is clearly a response to issues raised about HS and not a new
feature. It's also proposed in the most minimal way possible with
respect for the current state of release. Why is it you think I want to go
to beta less quickly than anyone else? I have many other items to work
on in the new release also, none of them have been even discussed, again
out of respect for the timing and the process.

> I also think that worrying about fine-tuning HS at this point is a bit
> like complaining that the jump suits of the crew of the Space Shuttle
> Challenger were not made of 100% recyclable materials. Just yesterday
> we had a report of an HS server getting into a state where it failed
> to shut down properly; and I believe that we never fully resolved the
> issue of occasional extremely-long spikes in HS response time, either.
> Heikki just fixed a bug in our btree recovery code which is apparently
> new to 9.0 since he did not backpatch it. I think that getting into a
> discussion of pausing and resuming recovery, or even the parallel
> discussion on max_standby_delay, are fiddling with things that,
> granted, are probably not ideal, and yes, we should improve them in a
> future release, but they're not what we should be worrying about right
> now. What I think we SHOULD be worried about right now - VERY worried
> - is stabilizing the existing Hot Standby code to the point where it
> won't be an embarrassment to us when we ship it. The rate at which
> we're finding new problems even with the small number of people who
> test alpha releases and nightly snapshots suggests to me that we're
> not there yet.

There hasn't been anything more than a minor bug in weeks, so not really
sure how you arrive at the idea that the code needs "stabilising".

But even if you think we need "stabilising", how do you propose I do
that? What exact action?

When people complain, I propose solutions. If you then object that the
proposed solution is actually a new feature, that leaves us in a
deadlock.

There is no evidence that Erik's strange performance has anything to do
with HS; it hasn't been seen elsewhere and he didn't respond to
questions about the test setup to provide background. The profile didn't
fit any software problem I can see.

--
Simon Riggs www.2ndQuadrant.com



From: Tom Lane on
Simon Riggs <simon(a)2ndQuadrant.com> writes:
> In the original patch I had Pause/Resume feature for controlling
> recovery during Hot Standby. It was removed for lack of time.

> With all the discussion around the HS UI, it would be something that
> could be back very easily.

Sure. In 9.1. You have enough bugs to fix that you have *no* business
thinking about adding features for 9.0, even if that were permissible
under the ground rules for beta. Pretending that it's a contrib module
is just a transparent end-run around that.

regards, tom lane


From: Simon Riggs on
On Tue, 2010-05-04 at 11:12 -0400, Tom Lane wrote:
> Simon Riggs <simon(a)2ndQuadrant.com> writes:
> > In the original patch I had Pause/Resume feature for controlling
> > recovery during Hot Standby. It was removed for lack of time.
>
> > With all the discussion around the HS UI, it would be something that
> > could be back very easily.
>
> Sure. In 9.1. You have enough bugs to fix that you have *no* business
> thinking about adding features for 9.0, even if that were permissible
> under the ground rules for beta. Pretending that it's a contrib module
> is just a transparent end-run around that.

As stated, this was proposed as a response to your gripes elsewhere.

If people gripe, I propose a solution. I'm happy if you say No to the
proposed solution, but let's not pretend I'm breaking rules all the time
when I do.

What bugs do I have to fix? I am not aware of any.

--
Simon Riggs www.2ndQuadrant.com

