From: Robert Haas on
On Sun, May 9, 2010 at 3:09 PM, Dimitri Fontaine <dfontaine(a)hi-media.com> wrote:
> Florian Pflug <fgp(a)phlo.org> writes:
>> The only remaining option is to continue applying WAL until you reach
>> a point where no locks are held, then pause. But from a user's POV
>> that is nearly indistinguishable from simply setting
>> hot_standby_conflict_winner to in the first place I think.
>
> Not really, the use case would be using the slave as a reporting server,
> you know you have say 4 hours of reporting queries during which you will
> pause the recovery. So it's ok for the pause command to take time.

Seems like it could take FOREVER on a busy system. Surely that's not
OK. The fact that Hot Standby has to take exclusive locks that can't
be released until WAL replay has progressed to a certain point seems
like a fairly serious wart. We had a discussion on another thread of
how this can make the database fail to shut down properly, a problem
we're not addressing because we're too busy arguing about
max_standby_delay. In fact, if we knew how to pause replay without
leaving random locks lying around, we could rearrange the whole smart
shutdown sequence so that we paused replay FIRST and then waited for
all backends to exit, but the consensus on the thread where we
discussed this was that we did not know how to do that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Simon Riggs on
On Sun, 2010-05-09 at 16:01 -0400, Robert Haas wrote:

> The fact that Hot Standby has to take exclusive locks that can't
> be released until WAL replay has progressed to a certain point seems
> like a fairly serious wart.

LOL

And people lecture me about design.

--
Simon Riggs www.2ndQuadrant.com



From: Florian Pflug on
On May 9, 2010, at 22:01 , Robert Haas wrote:
> On Sun, May 9, 2010 at 3:09 PM, Dimitri Fontaine <dfontaine(a)hi-media.com> wrote:
>> Florian Pflug <fgp(a)phlo.org> writes:
>>> The only remaining option is to continue applying WAL until you reach
>>> a point where no locks are held, then pause. But from a user's POV
>>> that is nearly indistinguishable from simply setting
>>> hot_standby_conflict_winner to in the first place I think.
>>
>> Not really, the use case would be using the slave as a reporting server,
>> you know you have say 4 hours of reporting queries during which you will
>> pause the recovery. So it's ok for the pause command to take time.
>
> Seems like it could take FOREVER on a busy system. Surely that's not
> OK. The fact that Hot Standby has to take exclusive locks that can't
> be released until WAL replay has progressed to a certain point seems
> like a fairly serious wart.

If this is a serious wart then it's not one of hot standby, but one of postgres proper. AccessExclusiveLocks (SELECT-blocking locks that is, as opposed to UPDATE/DELETE-blocking locks) are never necessary from a correctness POV, they're only there for implementation reasons.

Getting rid of them doesn't seem completely insurmountable either - just as multiple row versions remove the need to block SELECTs due to concurrent UPDATEs, multiple datafile versions could remove the need to block SELECTs due to concurrent ALTERs. But people seem to live with them quite well, judging by the amount of work put into getting rid of them (zero). I therefore fail to see why they should pose a significant problem in HS setups.

> We had a discussion on another thread of
> how this can make the database fail to shut down properly, a problem
> we're not addressing because we're too busy arguing about
> max_standby_delay. In fact, if we knew how to pause replay without
> leaving random locks lying around, we could rearrange the whole smart
> shutdown sequence so that we paused replay FIRST and then waited for
> all backends to exit, but the consensus on the thread where we
> discussed this was that we did not know how to do that.

Yeah, this was exactly my line of thought too.

best regards,
Florian Pflug



From: Andres Freund on
On Monday 10 May 2010 00:25:44 Florian Pflug wrote:
> On May 9, 2010, at 22:01 , Robert Haas wrote:
> > On Sun, May 9, 2010 at 3:09 PM, Dimitri Fontaine <dfontaine(a)hi-media.com> wrote:
> >> Florian Pflug <fgp(a)phlo.org> writes:
> >>> The only remaining option is to continue applying WAL until you reach
> >>> a point where no locks are held, then pause. But from a user's POV
> >>> that is nearly indistinguishable from simply setting
> >>> hot_standby_conflict_winner to in the first place I think.
> >>
> >> Not really, the use case would be using the slave as a reporting server,
> >> you know you have say 4 hours of reporting queries during which you will
> >> pause the recovery. So it's ok for the pause command to take time.
> >
> > Seems like it could take FOREVER on a busy system. Surely that's not
> > OK. The fact that Hot Standby has to take exclusive locks that can't
> > be released until WAL replay has progressed to a certain point seems
> > like a fairly serious wart.
>
> If this is a serious wart then it's not one of hot standby, but one of
> postgres proper. AccessExclusiveLocks (SELECT-blocking locks that is, as
> opposed to UPDATE/DELETE-blocking locks) are never necessary from a
> correctness POV, they're only there for implementation reasons.
>
> Getting rid of them doesn't seem completely insurmountable either - just as
> multiple row versions remove the need to block SELECTs due to concurrent
> UPDATEs, multiple datafile versions could remove the need to block SELECTs
> due to concurrent ALTERs. But people seem to live with them quite well,
> judged from the amount of work put into getting rid of them (zero). I
> therefore fail to see why they should pose a significant problem in HS
> setups.
The difference is that in HS you have to wait for a moment where *no exclusive
lock at all* exists, possibly without contending for any of them, while on the
master you might not even be blocked by the existence of any of those locks.

If you have two sessions which in overlapping transactions lock different
tables exclusively you have no problem shutting the master down, but you will
never reach a point where no exclusive lock is taken on the slave.
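
To make the overlapping-transactions case concrete, here is a minimal
sketch in Python. It is a toy model, not PostgreSQL code: time is a
sequence of integer ticks, each lock is a half-open (acquire, release)
interval, and the names first_lock_free_point and session_locks are
made up for this illustration. Each session on its own releases its
lock regularly, yet the combined set of locks leaves no gap until both
sessions stop.

```python
from typing import List, Optional, Tuple

# (acquire_tick, release_tick), half-open: held for acquire <= t < release
Interval = Tuple[int, int]


def first_lock_free_point(locks: List[Interval], start: int,
                          horizon: int) -> Optional[int]:
    """Earliest tick in [start, horizon) at which no lock is held, else None."""
    for t in range(start, horizon):
        if not any(acquire <= t < release for acquire, release in locks):
            return t
    return None


def session_locks(n_txns: int, offset: int) -> List[Interval]:
    """Hypothetical session: every 4 ticks it holds an exclusive lock for 3."""
    return [(4 * k + offset, 4 * k + offset + 3) for k in range(n_txns)]


n = 3
session_a = session_locks(n, offset=0)   # locks table A: [0,3) [4,7) [8,11)
session_b = session_locks(n, offset=2)   # locks table B: [2,5) [6,9) [10,13)

# Each session alone has regular gaps between its transactions.
gap_a = first_lock_free_point(session_a, 0, 4 * n + 1)

# Replaying both sessions' locks together, there is never a lock-free tick
# while the workload is running.
gap_both_running = first_lock_free_point(session_a + session_b, 0, 4 * n + 1)

# Only after both sessions stop does a lock-free point appear.
gap_both_after = first_lock_free_point(session_a + session_b, 0, 4 * n + 10)
```

In this model gap_a is an early tick, while gap_both_running stays None
for any n: a pause that waits for a lock-free moment would wait as long
as the workload continues.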

Andres


From: Robert Haas on
On Sun, May 9, 2010 at 6:58 PM, Andres Freund <andres(a)anarazel.de> wrote:
> On Monday 10 May 2010 00:25:44 Florian Pflug wrote:
>> On May 9, 2010, at 22:01 , Robert Haas wrote:
>> > On Sun, May 9, 2010 at 3:09 PM, Dimitri Fontaine <dfontaine(a)hi-media.com> wrote:
>> >> Florian Pflug <fgp(a)phlo.org> writes:
>> >>> The only remaining option is to continue applying WAL until you reach
>> >>> a point where no locks are held, then pause. But from a user's POV
>> >>> that is nearly indistinguishable from simply setting
>> >>> hot_standby_conflict_winner to in the first place I think.
>> >>
>> >> Not really, the use case would be using the slave as a reporting server,
>> >> you know you have say 4 hours of reporting queries during which you will
>> >> pause the recovery. So it's ok for the pause command to take time.
>> >
>> > Seems like it could take FOREVER on a busy system.  Surely that's not
>> > OK.  The fact that Hot Standby has to take exclusive locks that can't
>> > be released until WAL replay has progressed to a certain point seems
>> > like a fairly serious wart.
>>
>> If this is a serious wart then it's not one of hot standby, but one of
>> postgres proper. AccessExclusiveLocks (SELECT-blocking locks that is, as
>> opposed to UPDATE/DELETE-blocking locks) are never necessary from a
>> correctness POV, they're only there for implementation reasons.
>>
>> Getting rid of them doesn't seem completely insurmountable either - just as
>> multiple row versions remove the need to block SELECTs due to concurrent
>> UPDATEs, multiple datafile versions could remove the need to block SELECTs
>> due to concurrent ALTERs. But people seem to live with them quite well,
>> judged from the amount of work put into getting rid of them (zero). I
>> therefore fail to see why they should pose a significant problem in HS
>> setups.
> The difference is that in HS you have to wait for a moment where *no exclusive
> lock at all* exists, possibly without contending for any of them, while on the
> master you might not even be blocked by the existence of any of those locks.
>
> If you have two sessions which in overlapping transactions lock different
> tables exclusively you have no problem shutting the master down, but you will
> never reach a point where no exclusive lock is taken on the slave.

A possible solution to this in the shutdown case is to kill anyone
waiting on a lock held by the startup process at the same time we kill
the startup process, and to kill anyone who subsequently waits for
such a lock as soon as they attempt to take it. I'm not sure if this
would also make sense in the pause case.

Another possible solution would be to try to figure out if there's a
way to delay application of WAL that requires the taking of AELs to
the point where we could apply it all at once. That might not be
feasible, though, or only in some cases, and it's certainly 9.1
material (at least) in any case.

Anyway, this is all a little off-topic. We need to get back to
arguing about how best to cut the legs out from under a feature that's
been in the tree for six months but Tom didn't get around to looking
at until last week. I'll restate my position: now that I understand
what the issues are (I think), the feature as currently implemented
seems pretty wonky, but cutting it down to a boolean seems like an
exercise in excessive pessimism about our ability to predict future
development directions, as well as possibly quite inconvenient for
people attempting to use Hot Standby. Therefore I think we should
adopt Tom's original proposal (with +1 also from Stephen Frost), but
that doesn't seem likely to fly because, on the one hand, we have Tom
himself arguing (along with Bruce and possibly Heikki) that we should
whack it down all the way to a boolean; and on the other hand Simon
and Greg Smith and I think also Andres Freund and Kevin Grittner
arguing that the original feature is OK as-is.

Other people who weighed in include Stefan Kaltenbrunner (who opined
that Tom had a legitimate complaint about the current design but
didn't vote for a specific resolution), Greg Sabino Mullane (who
pointed out that SOME of the issues that Tom raised could be solved
with proper time synchronization), Josh Drake (who thought requiring
NTP to be working was a bad idea, and therefore presumably favors
changing something), Josh Berkus (who changed his vote at least once
and whose priority seems to have more to do with releasing before the
turn of the century than with the actual technical option we select,
apologies if I'm misreading his emails), Greg Stark (who seems to
think that a boolean will be bad news but didn't specifically vote for
another option), Dimitri Fontaine (who wants a boolean plus
pause/resume functions, or maybe a plugin facility of some kind), Rob
Wultsch (who doesn't ever want to kill queries and therefore would be
happy with a boolean), Yeb Havinga (who never wants to stall recovery
and therefore would also be happy with a boolean), and Florian Pflug
(who points out that pause/resume is actually a nontrivial feature).
Apologies if I've left anyone out or misrepresented their position.

Overall I would say opinion is about evenly split between:

- leave it as-is
- make it a Boolean
- change it in some way but to something more expressive than a Boolean

I can't presume to extract a consensus from that; I don't think there
is one. You could say "the majority of people want to change
something" and that would be true; you could also say "the majority of
people don't want a Boolean" and that would also be true.

IF we adopt "leave it as-is", then we need to document that you will
need to both run ntp and run some sort of heartbeat process on the
master to make sure that at least a small amount of WAL keeps getting
generated; or else you'll have massive query cancellations. IF we
decide to make it a Boolean, then we need to document that you have to
choose between the possibility of recovery falling arbitrarily behind
as a result of even one query holding an exclusive lock, or
alternatively instantaneously canceling queries that conflict, however
briefly, with replay. IF we adopt Tom's original proposal, then we'll
need to document that the timeout given is per-lock-wait, and
therefore if the lock timeout is not zero and there are many lock
waits the standby may fall far behind and have difficulty catching up.
IF we decide to do something else, then I don't know.
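
The trade-off between the Boolean and the per-lock-wait timeout can be
sketched with a little arithmetic. The Python below is a toy model
under stated assumptions, not the actual recovery code: conflicts are
handled one after another, each is described only by how long the
conflicting query would run, and replay_outcome and its timeout
parameter are hypothetical names. It does not model the timestamp-based
behaviour of the current max_standby_delay.

```python
from typing import List, Optional, Tuple


def replay_outcome(conflict_durations: List[float],
                   timeout: Optional[float]) -> Tuple[float, int]:
    """Return (seconds of replay delay added, number of queries cancelled).

    timeout=None  -> never cancel: replay waits for every conflicting query
    timeout=0     -> cancel at once: replay never waits
    timeout=T > 0 -> wait up to T seconds per lock wait, then cancel
    """
    delay = 0.0
    cancelled = 0
    for duration in conflict_durations:
        if timeout is None or duration <= timeout:
            delay += duration        # the query finishes; replay waited for it
        else:
            delay += timeout         # replay gave up waiting after the timeout
            cancelled += 1
    return delay, cancelled


# Hypothetical workload: three short conflicts and one 2-hour report.
conflicts = [0.5, 30.0, 2.0, 7200.0]

wait_forever = replay_outcome(conflicts, None)   # Boolean, queries win
cancel_now = replay_outcome(conflicts, 0)        # Boolean, replay wins
per_wait_10s = replay_outcome(conflicts, 10.0)   # per-lock-wait timeout

# Many waits that each stay within the timeout still add up.
many_waits = replay_outcome([10.0] * 1000, 10.0)
```

With a 10 second per-wait timeout and 1000 waits of 10 seconds each,
nothing is cancelled but the standby ends up 10000 seconds behind,
which is the "many lock waits" caveat that would need documenting.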

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
