From: Aidan Van Dyk on
* Heikki Linnakangas <heikki.linnakangas(a)enterprisedb.com> [100510 06:03]:

> A problem with using the name "max_standby_delay" for Tom's suggestion
> is that it sounds like a hard limit, which it isn't. But if we name it
> something like:

I'd still rather have an "if you're killing something, make sure you kill
enough to get all the way current" behaviour, but that's just me....

I want to run my standbys in an always-current mode... But if I decide
to play with a lagged HS, I really want to make sure there is some
mechanism to cap the lag, and that the "cap" is something I can understand
and use to make a reasonable estimate of when data I know is live on
the primary will be seen on the standby...

bonus points if it works similarly for archive recovery ;-)

a.


--
Aidan Van Dyk                                           Create like a god,
aidan(a)highrise.ca                                    command like a king,
http://www.highrise.ca/                                 work like a slave.
From: Robert Haas on
On Mon, May 10, 2010 at 2:27 AM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> I already explained that killing the startup process first is a bad idea
> for many reasons when shutdown was discussed. Can't remember who added
> the new standby shutdown code recently, but it sounds like their design
> was pretty poor if it didn't include shutting down properly with HS. I
> hope they fix the bug they have introduced. HS was never designed to
> work that way, so there is no flaw there; it certainly worked when
> committed.

The patch was written by Fujii Masao and committed, after review, by
me. Prior to that patch, smart shutdown never worked; now it works,
or so I believe, unless recovery is stalled holding a lock on which
a regular backend is blocking. Clearly that is better, but still not
all that good. If you have any ideas to improve the situation
further, I'm all ears.
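
To make the stall concrete, here's a tiny standalone C sketch of the
dependency involved (the names are invented for illustration; this is
not the actual PostgreSQL source):

    #include <stdbool.h>
    #include <stdio.h>

    /* Hypothetical model of the smart-shutdown stall described above. */
    typedef struct
    {
        bool    startup_holds_lock; /* startup holds an AccessExclusiveLock */
        bool    backend_blocked;    /* a regular backend waits on that lock */
    } StandbyState;

    /*
     * Smart shutdown waits for all regular backends to exit, but a backend
     * blocked on a lock held by the startup process never exits, and the
     * startup process keeps replaying WAL (and holding locks) until the
     * backends are gone, so the shutdown can wait forever.
     */
    static bool
    smart_shutdown_can_finish(const StandbyState *s)
    {
        return !(s->startup_holds_lock && s->backend_blocked);
    }

    int
    main(void)
    {
        StandbyState s = {true, true};

        printf("smart shutdown %s\n",
               smart_shutdown_can_finish(&s) ? "completes" : "stalls");
        return 0;
    }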

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From: Robert Haas on
On Mon, May 10, 2010 at 6:03 AM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
> Yeah, I could live with that.
>
> A problem with using the name "max_standby_delay" for Tom's suggestion
> is that it sounds like a hard limit, which it isn't. But if we name it
> something like:
>
> # -1 = no timeout
> # 0 = kill conflicting queries immediately
> # > 0 wait for N seconds, then kill query
> standby_conflict_timeout = -1
>
> it's more clear that the setting is a timeout for each *conflict*, and
> it's less surprising that the standby can fall indefinitely behind in
> the worst case. If we name the setting along those lines, I could live
> with that.

Yeah, if we do it that way, +1 for changing the name, and your
suggestion seems good.
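
To spell out the "per conflict" part, here's an illustrative standalone
C sketch (invented names; not the real implementation):

    #include <stdbool.h>
    #include <stdio.h>
    #include <unistd.h>

    /*
     * Illustrative sketch of the proposed standby_conflict_timeout
     * semantics: -1 waits forever, 0 cancels conflicting queries
     * immediately, and N > 0 waits up to N seconds for each individual
     * conflict before cancelling.
     */
    static void
    resolve_conflict(int timeout_secs, bool (*conflict_gone) (void))
    {
        int     waited = 0;

        while (!conflict_gone())
        {
            if (timeout_secs >= 0 && waited >= timeout_secs)
            {
                printf("cancelling conflicting query\n");
                return;
            }
            sleep(1);
            waited++;
        }
        /* conflict resolved itself; WAL replay can continue */
    }

    static bool
    never_resolves(void)
    {
        return false;
    }

    int
    main(void)
    {
        resolve_conflict(5, never_resolves);    /* cancels after ~5 seconds */
        return 0;
    }

Since the wait is charged per conflict rather than against total replay
lag, a steady stream of conflicts can still delay the standby without
bound; that's exactly why the name shouldn't sound like a hard cap.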

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From: Robert Haas on
On Mon, May 10, 2010 at 6:13 AM, Florian Pflug <fgp(a)phlo.org> wrote:
> On May 10, 2010, at 11:43 , Heikki Linnakangas wrote:
>> If you're not going to apply any more WAL records before shutdown, you
>> could also just release all the AccessExclusiveLocks held by the startup
>> process. Whatever the transaction was doing with the locked relation, if
>> we're not going to replay any more WAL records before shutdown, we will
>> not see the transaction committing or doing anything else with the
>> relation, so we should be safe. Whatever state the data on disk is in,
>> it must be valid, or we would have a problem with crash recovery
>> recovering up to this WAL record and then starting up too.
>
> Sounds plausible. But wouldn't this imply that HS could *always* postpone the acquisition of an AccessExclusiveLock until right before the corresponding commit record is replayed? I fail to see a case where this would fail yet recovery from an intermediate crash would still be correct.

Yeah, I'd like to understand this, too. It's not clear to me when HS
needs to take locks here in the first place.

[removing Josh Berkus's persistently bouncing email from the CC line]

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From: Heikki Linnakangas on
Florian Pflug wrote:
> On May 10, 2010, at 11:43 , Heikki Linnakangas wrote:
>> If you're not going to apply any more WAL records before shutdown, you
>> could also just release all the AccessExclusiveLocks held by the startup
>> process. Whatever the transaction was doing with the locked relation, if
>> we're not going to replay any more WAL records before shutdown, we will
>> not see the transaction committing or doing anything else with the
>> relation, so we should be safe. Whatever state the data on disk is in,
>> it must be valid, or we would have a problem with crash recovery
>> recovering up to this WAL record and then starting up too.
>
> Sounds plausible. But wouldn't this imply that HS could *always* postpone the acquisition of an AccessExclusiveLock until right before the corresponding commit record is replayed? I fail to see a case where this would fail yet recovery from an intermediate crash would still be correct.

I guess it could in some situations, but, for example, the
AccessExclusiveLock taken at the end of lazy vacuum to truncate the
relation must be held during the truncation itself, or concurrent
readers will get upset.
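
A standalone sketch of that failure mode, as an illustrative model only
(not PostgreSQL code):

    #include <stdio.h>

    /*
     * Illustrative model only, not PostgreSQL code.  A reader plans its
     * scan from the relation size; then the startup process replays the
     * vacuum truncation record.  If the AccessExclusiveLock were deferred
     * to the commit record, nothing would stop a reader from being
     * admitted in between and reading past the new end of file.
     */
    int
    main(void)
    {
        int     nblocks = 100;      /* size the reader sees at plan time */
        int     scan_end = nblocks;

        nblocks = 60;               /* truncation record replayed */

        for (int blk = 0; blk < scan_end; blk++)
        {
            if (blk >= nblocks)
            {
                printf("ERROR: could not read block %d: past end of file\n",
                       blk);
                return 1;
            }
        }
        return 0;
    }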

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
