From: Simon Riggs on
On Sun, 2010-05-09 at 20:56 -0400, Robert Haas wrote:

> >> > Seems like it could take FOREVER on a busy system. Surely that's not
> >> > OK. The fact that Hot Standby has to take exclusive locks that can't
> >> > be released until WAL replay has progressed to a certain point seems
> >> > like a fairly serious wart.
> >>
> >> If this is a serious wart then it's not one of hot standby, but one of
> >> postgres proper. AccessExclusiveLocks (SELECT-blocking locks that is, as
> >> opposed to UPDATE/DELETE-blocking locks) are never necessary from a
> >> correctness POV, they're only there for implementation reasons.
> >>
> >> Getting rid of them doesn't seem completely insurmountable either - just as
> >> multiple row versions remove the need to block SELECTs due to concurrent
> >> UPDATEs, multiple datafile versions could remove the need to block SELECTs
> >> due to concurrent ALTERs. But people seem to live with them quite well,
> >> judging from the amount of work put into getting rid of them (zero). I
> >> therefore fail to see why they should pose a significant problem in HS
> >> setups.
> > The difference is that in HS you have to wait for a moment where *no exclusive
> > lock at all* exists, possibly without contending for any of them, while on the
> > master you might not even be blocked by the existence of any of those locks.
> >
> > If you have two sessions which in overlapping transactions lock different
> > tables exclusively you have no problem shutting the master down, but you will
> > never reach a point where no exclusive lock is taken on the slave.
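The overlapping-lock scenario quoted above can be sketched with a toy timeline (hypothetical numbers and names, not PostgreSQL code): as long as the lock-hold intervals overlap, replay on the standby never sees an instant at which zero AccessExclusiveLocks are held, even though the master releases each lock at commit and shuts down fine.

```python
# Toy timeline illustrating the scenario above: two sessions hold
# AccessExclusiveLocks on different tables over overlapping intervals, so
# there is no instant between the first acquire and the last release at
# which no exclusive lock is held on the standby.

# (acquire, release) times of each AccessExclusiveLock, in WAL order
locks = [
    (1, 4),  # session A: LOCK TABLE t1 from t=1 until commit at t=4
    (3, 6),  # session B: LOCK TABLE t2 from t=3 until commit at t=6
]

def lock_free_instants(locks, horizon):
    """Return the instants in [0, horizon] at which no lock is held."""
    return [t for t in range(horizon + 1)
            if not any(a <= t < r for a, r in locks)]

# Replay can only pause or shut down cleanly at a lock-free instant:
print(lock_free_instants(locks, 8))  # [0, 6, 7, 8] -- nothing in [1, 5]
```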
>
> A possible solution to this in the shutdown case is to kill anyone
> waiting on a lock held by the startup process at the same time we kill
> the startup process, and to kill anyone who subsequently waits for
> such a lock as soon as they attempt to take it.

I already explained that killing the startup process first is a bad idea
for many reasons when shutdown was discussed. Can't remember who added
the new standby shutdown code recently, but it sounds like their design
was pretty poor if it didn't include shutting down properly with HS. I
hope they fix the bug they have introduced. HS was never designed to
work that way, so there is no flaw there; it certainly worked when
committed.

> I'm not sure if this
> would also make sense in the pause case.

Not sure why pausing replay would make any difference at all. Being
between one WAL record and the next is a valid and normal state that
exists many thousands of times per second. If making that state longer
would cause problems, we would already have seen issues. There are
none; it will work fine.

> Another possible solution would be to try to figure out if there's a
> way to delay application of WAL that requires the taking of AELs to
> the point where we could apply it all at once. That might not be
> feasible, though, or only in some cases, and it's certainly 9.1
> material (at least) in any case.

Locks usually protect users from accessing a table while it's being
clustered or dropped or something like that. Locks are not bad. They are
also used by some developers to specifically serialize access to an
object. AccessExclusiveLocks are rare in normal running and not to be
avoided when they do exist. HS correctly supports locking, as and when
such locks are taken on the master.

--
Simon Riggs www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Dimitri Fontaine on
Robert Haas <robertmhaas(a)gmail.com> writes:
> On Sun, May 9, 2010 at 6:58 PM, Andres Freund <andres(a)anarazel.de> wrote:
>> The difference is that in HS you have to wait for a moment where *no exclusive
>> lock at all* exists, possibly without contending for any of them, while on the
>> master you might not even be blocked by the existence of any of those locks.
>>
>> If you have two sessions which in overlapping transactions lock different
>> tables exclusively you have no problem shutting the master down, but you will
>> never reach a point where no exclusive lock is taken on the slave.
>
> A possible solution to this in the shutdown case is to kill anyone
> waiting on a lock held by the startup process at the same time we kill
> the startup process, and to kill anyone who subsequently waits for
> such a lock as soon as they attempt to take it. I'm not sure if this
> would also make sense in the pause case.

Well, wait, I'm getting lost here. It seems to me that no query on the
slave is ever allowed to take an AEL, no matter what. The only case is a
query waiting for replay to release its locks.

The only consequence of pausing without waiting for replay's locks to be
released is that those backends will be, well, paused. But the same
applies to any backend started after we pause.

Waiting for replay to release all its locks before pausing would mean
there's a possibility that the activity on the master is such that you
never reach a pause point in the WAL stream. Let's assume we want any
new code we throw in at this stage to be a magic wand making every user
happy at once.

So we'd need a pause function taking either 1 or 2 arguments: the first
says we pause now, even if we know replay is holding some locks that
might pause the reporting queries too; the other says we wait until the
locks are no longer held, with a timeout (default 1 min?).

Ok, that's designing the API we're missing, and we should not be in the
process of doing any design at this stage. But we are.

> [good summary of current positions]
> I can't presume to extract a consensus from that; I don't think there
> is one.

All we know for sure is that Tom does not want to release as-is, and he
rightfully insists on several objectives as far as the editing is
concerned:
- no addition of code we might want to throw away later
- avoid having to deprecate released behavior, it's too hard
- minimal change set, possibly with no new features.

One more: pausing the replay is *already* in the code base. It's exactly
what happens under the hood if you favor queries rather than replay, to
the point that I don't understand why the pause design needs to happen
now. We're only talking about having an *explicit* version of it.

Regards,
--
dim

I too am growing tired of insisting this much. I only continue because I
really can't understand why-o-why considering a new API over an existing
feature is not possible at this stage. I'm hitting my head on the wal,
so to say…


From: Heikki Linnakangas on
Robert Haas wrote:
> On Sun, May 9, 2010 at 6:58 PM, Andres Freund <andres(a)anarazel.de> wrote:
>> On Monday 10 May 2010 00:25:44 Florian Pflug wrote:
>>> On May 9, 2010, at 22:01 , Robert Haas wrote:
>>>> On Sun, May 9, 2010 at 3:09 PM, Dimitri Fontaine <dfontaine(a)hi-media.com>
>> wrote:
>>>> Seems like it could take FOREVER on a busy system. Surely that's not
>>>> OK. The fact that Hot Standby has to take exclusive locks that can't
>>>> be released until WAL replay has progressed to a certain point seems
>>>> like a fairly serious wart.
>>> If this is a serious wart then it's not one of hot standby, but one of
>>> postgres proper. AccessExclusiveLocks (SELECT-blocking locks that is, as
>>> opposed to UPDATE/DELETE-blocking locks) are never necessary from a
>>> correctness POV, they're only there for implementation reasons.
>>>
>>> Getting rid of them doesn't seem completely insurmountable either - just as
>>> multiple row versions remove the need to block SELECTs due to concurrent
>>> UPDATEs, multiple datafile versions could remove the need to block SELECTs
>>> due to concurrent ALTERs. But people seem to live with them quite well,
>>> judging from the amount of work put into getting rid of them (zero). I
>>> therefore fail to see why they should pose a significant problem in HS
>>> setups.
>> The difference is that in HS you have to wait for a moment where *no exclusive
>> lock at all* exists, possibly without contending for any of them, while on the
>> master you might not even be blocked by the existence of any of those locks.
>>
>> If you have two sessions which in overlapping transactions lock different
>> tables exclusively you have no problem shutting the master down, but you will
>> never reach a point where no exclusive lock is taken on the slave.
>
> A possible solution to this in the shutdown case is to kill anyone
> waiting on a lock held by the startup process at the same time we kill
> the startup process, and to kill anyone who subsequently waits for
> such a lock as soon as they attempt to take it.

If you're not going to apply any more WAL records before shutdown, you
could also just release all the AccessExclusiveLocks held by the startup
process. Whatever the transaction was doing with the locked relation, if
we're not going to replay any more WAL records before shutdown, we will
not see the transaction committing or doing anything else with the
relation, so we should be safe. Whatever state the data on disk is in,
it must be valid, or we would have a problem with crash recovery
recovering up to this WAL record and then starting up too.

I'm not 100% clear if that reasoning applies to AccessExclusiveLocks
taken explicitly with LOCK TABLE. It's not clear what the application
would use the lock for.

Nevertheless, maybe killing the transactions that wait for the locks
would be more intuitive anyway.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Heikki Linnakangas on
Robert Haas wrote:
> On Thu, May 6, 2010 at 2:47 PM, Josh Berkus <josh(a)agliodbs.com> wrote:
>>> Now that I've realized what the real problem is with max_standby_delay
>>> (namely, that inactivity on the master can use up the delay), I think
>>> we should do what Tom originally suggested here. It's not as good as
>>> a really working max_standby_delay, but we're not going to have that
>>> for 9.0, and it's clearly better than a boolean.
>> I guess I'm not clear on how what Tom proposed is fundamentally
>> different from max_standby_delay = -1. If there's enough concurrent
>> queries, recovery would never catch up.
>
> If your workload is that the standby server is getting pounded with
> queries like crazy, then it's probably not that different: it will
> fall progressively further behind. But I suspect many people will set
> up standby servers where most of the activity happens on the primary,
> but they run some reporting queries on the standby. If you expect
> your reporting queries to finish in <10s, you could set the max delay
> to say 60s. In the event that something gets wedged, recovery will
> eventually kill it and move on rather than just getting stuck forever.
> If the volume of queries is known not to be too high, it's reasonable
> to expect that a few good whacks will be enough to get things back on
> track.

Yeah, I could live with that.

A problem with using the name "max_standby_delay" for Tom's suggestion
is that it sounds like a hard limit, which it isn't. But if we name it
something like:

# -1 = no timeout
# 0 = kill conflicting queries immediately
# > 0 wait for N seconds, then kill query
standby_conflict_timeout = -1

it's more clear that the setting is a timeout for each *conflict*, and
it's less surprising that the standby can fall indefinitely behind in
the worst case. If we name the setting along those lines, I could live
with that.
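The distinction matters because a per-conflict timeout bounds each wait, not the total. A back-of-the-envelope sketch (illustrative numbers, not PostgreSQL code) shows how the standby can still fall arbitrarily far behind:

```python
def total_replay_delay(conflict_waits, conflict_timeout):
    """With a per-conflict timeout, each conflicting query may hold up
    replay for up to conflict_timeout seconds before being killed, so the
    total delay is the sum over conflicts -- there is no hard cap."""
    return sum(min(wait, conflict_timeout) for wait in conflict_waits)

# Three long-running queries, each killed after a 60s conflict timeout:
print(total_replay_delay([120, 90, 75], 60))  # 180: 3 minutes behind, not 60s
```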

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Florian Pflug on
On May 10, 2010, at 11:43 , Heikki Linnakangas wrote:
> If you're not going to apply any more WAL records before shutdown, you
> could also just release all the AccessExclusiveLocks held by the startup
> process. Whatever the transaction was doing with the locked relation, if
> we're not going to replay any more WAL records before shutdown, we will
> not see the transaction committing or doing anything else with the
> relation, so we should be safe. Whatever state the data on disk is in,
> it must be valid, or we would have a problem with crash recovery
> recovering up to this WAL record and then starting up too.

Sounds plausible. But wouldn't this imply that HS could *always* postpone the acquisition of an AccessExclusiveLock until right before the corresponding commit record is replayed? I fail to see a case where that would fail, yet where recovery from an intermediate crash would still be correct.

best regards,
Florian Pflug

