Hot standby, recent changes [PgSql]


From: Simon Riggs on 6 Dec 2009 06:18

On Sun, 2009-12-06 at 12:32 +0200, Heikki Linnakangas wrote:
> 1. The XLogFlush() call you added to dbase_redo doesn't help where it
> is. You need to call XLogFlush() after the *commit* record of the DROP
> DATABASE. The idea is to minimize the window where the files have already
> been deleted but the entry in pg_database is still visible, if someone
> kills the standby and starts it up as a new master. This isn't really
> hot standby's fault, you have the same window with PITR, so I think it
> would be better to handle this as a separate patch. Also, please handle
> all the commands that use ForceSyncCommit().

I think I'll just add a flag to the commit record to do this. That way
anybody that sets ForceSyncCommit will do as you propose.
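
To make the idea concrete, here is a minimal sketch of a commit record that carries a "force sync" flag, with the replay side flushing WAL once the commit record has been processed. It is not code from the patch: the flag name, the structs, and flush_wal() (a stand-in for XLogFlush()) are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical; they only model the idea in the thread. */
#define XACT_INFO_FORCE_SYNC_COMMIT 0x01u   /* hypothetical flag bit */

typedef uint64_t WalPosition;

typedef struct CommitRecord
{
    uint32_t xid;     /* committing transaction */
    uint32_t xinfo;   /* flag bits carried in the commit record */
} CommitRecord;

typedef struct ReplayState
{
    WalPosition flushed_upto;   /* WAL known to be flushed on the standby */
    int         flush_count;    /* number of flushes performed */
} ReplayState;

/* Primary side: copy the "force sync commit" request into the record. */
CommitRecord
make_commit_record(uint32_t xid, bool force_sync_commit)
{
    CommitRecord rec = { xid, 0 };

    if (force_sync_commit)
        rec.xinfo |= XACT_INFO_FORCE_SYNC_COMMIT;
    return rec;
}

/* Stand-in for XLogFlush(): remember the furthest flushed position. */
static void
flush_wal(ReplayState *state, WalPosition upto)
{
    if (upto > state->flushed_upto)
    {
        state->flushed_upto = upto;
        state->flush_count++;
    }
}

/*
 * Replay side: after the commit record has been applied, flush WAL up to
 * the end of that record if the flag is set. Returns true if a flush was
 * requested.
 */
bool
replay_commit_record(ReplayState *state, const CommitRecord *rec,
                     WalPosition end_of_record)
{
    /* ... the transaction would be marked committed here ... */
    if (rec->xinfo & XACT_INFO_FORCE_SYNC_COMMIT)
    {
        flush_wal(state, end_of_record);
        return true;
    }
    return false;
}
```

With this shape, any command that requests a forced sync commit gets the flush on replay without a per-command change in its own redo routine.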

--
Simon Riggs www.2ndQuadrant.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Simon Riggs on 6 Dec 2009 06:20

On Sun, 2009-12-06 at 12:32 +0200, Heikki Linnakangas wrote:

> 3. The "Out of lock mem killer" in StandbyAcquireAccessExclusiveLock is
> quite harsh. It aborts all read-only transactions. It should be enough
> to kill just one random one, or maybe the one that's holding most locks.
> Also, if there still isn't enough shared memory after killing all
> backends, it goes into an infinite loop. I guess that shouldn't happen,
> but it seems a bit squishy anyway. It would be nice to differentiate
> between "out of shared mem" and "someone else is holding the lock" more
> accurately. Maybe add a new return value to LockAcquire() for "out of
> shared mem".

OK, will abort infinite loop by adding new return value.

If people don't like having everything killed, they can add more lock
memory. It isn't worth adding more code because it's hard to test and
unlikely to be an issue in real usage, and it has a clear workaround
that is already mentioned in a hint.

--
Simon Riggs www.2ndQuadrant.com

From: Simon Riggs on 6 Dec 2009 09:20

On Sun, 2009-12-06 at 11:20 +0000, Simon Riggs wrote:
> On Sun, 2009-12-06 at 12:32 +0200, Heikki Linnakangas wrote:
>
> > 3. The "Out of lock mem killer" in StandbyAcquireAccessExclusiveLock is
> > quite harsh. It aborts all read-only transactions. It should be enough
> > to kill just one random one, or maybe the one that's holding most locks.
> > Also, if there still isn't enough shared memory after killing all
> > backends, it goes into an infinite loop. I guess that shouldn't happen,
> > but it seems a bit squishy anyway. It would be nice to differentiate
> > between "out of shared mem" and "someone else is holding the lock" more
> > accurately. Maybe add a new return value to LockAcquire() for "out of
> > shared mem".
>
> OK, will abort infinite loop by adding new return value.

You had me for a minute, but there is no infinite loop. Once we start
killing people it reverts to throwing an ERROR on an out of memory
condition.
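
A toy model of that control flow, assuming a lock call that reports "out of shared memory" separately from "someone else holds the lock". Everything here (the enum, LockTable, try_acquire(), cancel_all_readers()) is hypothetical and much simpler than the real lock manager; it only shows why the retry cannot spin forever.

```c
#include <stdbool.h>

/* Hypothetical result codes: out-of-memory is distinct from a conflict. */
typedef enum
{
    ACQUIRE_OK,
    ACQUIRE_NOT_AVAIL,      /* someone else is holding the lock */
    ACQUIRE_OUT_OF_SHMEM    /* no room left in the shared lock table */
} AcquireResult;

/* Toy lock table: just enough state to model the two failure modes. */
typedef struct LockTable
{
    int  free_slots;              /* unused lock table entries */
    int  slots_held_by_readers;   /* entries freed if readers are cancelled */
    bool conflicting_holder;      /* a read-only backend holds a conflicting lock */
} LockTable;

typedef enum { STANDBY_LOCK_ACQUIRED, STANDBY_LOCK_ERROR } StandbyLockOutcome;

AcquireResult
try_acquire(LockTable *t)
{
    if (t->free_slots <= 0)
        return ACQUIRE_OUT_OF_SHMEM;
    if (t->conflicting_holder)
        return ACQUIRE_NOT_AVAIL;
    t->free_slots--;
    return ACQUIRE_OK;
}

/* Cancel every read-only transaction, releasing its locks and entries. */
void
cancel_all_readers(LockTable *t)
{
    t->free_slots += t->slots_held_by_readers;
    t->slots_held_by_readers = 0;
    t->conflicting_holder = false;
}

StandbyLockOutcome
standby_acquire_exclusive(LockTable *t)
{
    bool readers_cancelled = false;

    for (;;)
    {
        AcquireResult r = try_acquire(t);

        if (r == ACQUIRE_OK)
            return STANDBY_LOCK_ACQUIRED;

        /* Still out of memory after cancelling everyone: give up. */
        if (r == ACQUIRE_OUT_OF_SHMEM && readers_cancelled)
            return STANDBY_LOCK_ERROR;

        cancel_all_readers(t);
        readers_cancelled = true;
    }
}
```

In this model the loop body runs at most twice: after the readers are cancelled, the next attempt either succeeds or reports out of shared memory, which ends the loop with an error.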

--
Simon Riggs www.2ndQuadrant.com

From: Simon Riggs on 6 Dec 2009 17:40

On Sun, 2009-12-06 at 17:26 -0500, Robert Haas wrote:

> For what it's worth, this doesn't seem particularly unlikely or
> unusual to me.

I don't know many people who shut down both nodes of a highly available
application at the same time. If they did, I wouldn't expect them to
complain they couldn't run queries on the standby when two obvious
and simple workarounds exist to allow them to access their data: start
the master again, or make the standby switchover, both of which are part
of standard operating procedures.

It doesn't seem very high up the list of additional features, at least.

--
Simon Riggs www.2ndQuadrant.com

From: Heikki Linnakangas on 7 Dec 2009 03:02

Simon Riggs wrote:
> On Sun, 2009-12-06 at 17:26 -0500, Robert Haas wrote:
>
>> For what it's worth, this doesn't seem particularly unlikely or
>> unusual to me.
>
> I don't know many people who shut down both nodes of a highly available
> application at the same time. If they did, I wouldn't expect them to
> complain they couldn't run queries on the standby when two obvious
> and simple workarounds exist to allow them to access their data: start
> the master again, or make the standby switchover, both of which are part
> of standard operating procedures.

It might not be common or expected in a typical HA installation, but it
would be a very strange limitation in my mind. It might well happen, e.g.,
in a standby used for reporting, or when you do PITR.

> It doesn't seem very high up the list of additional features, at least.

Well, it's in the patch now. I'm just asking you to not break it.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
