HS/SR and smart shutdown [PgSql]

From: Mark Kirkwood on 20 Jan 2010 20:59

Tom Lane wrote:
> Robert Haas <robertmhaas(a)gmail.com> writes:
>
>> On Wed, Jan 20, 2010 at 8:44 PM, Josh Berkus <josh(a)agliodbs.com> wrote:
>>
>>> Well, as long as streaming rep is running, you can't do a smart shutdown
>>> ... smart shutdown seems to treat the walreceiver as a client
>>> connection. At the very least, this should be in the documentation.
>>>
>
>
>> How hard is it to fix?
>>
>
> I think the first question is do we *want* to fix it, or is it
> appropriate behavior?
>
> If the master shuts down, will the slaves try to fail over to become
> masters? When the master restarts, will the slaves automatically
> reconnect? If these questions have the wrong answers, shutting down the
> master isn't something to be done lightly, and automatically
> disconnecting slaves would be a real bad idea.
>
>
Right - surely people who have been using pg_standby etc have discovered
this behaviour, so documenting it is fine I would think.

regards

Mark

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Fujii Masao on 20 Jan 2010 21:42

On Thu, Jan 21, 2010 at 10:44 AM, Josh Berkus <josh(a)agliodbs.com> wrote:
>
>> If it's "standby", it's a previously-existing behavior that a "smart"
>> shutdown doesn't work immediately during recovery. After a recovery
>> has been completed, it would work. Of course, I agree that such a
>> behavior should be documented.
>
> Well, as long as streaming rep is running, you can't do a smart shutdown
> ... smart shutdown seems to treat the walreceiver as a client
> connection.

Even if SR is not running, as long as the startup process is running,
we can't do a smart shutdown. It's not peculiar to SR.

> At the very least, this should be in the documentation.

Agreed. Something like "smart shutdown is not allowed during recovery"
should be in the following section.
http://developer.postgresql.org/pgdocs/postgres/server-shutdown.html

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

From: Heikki Linnakangas on 21 Jan 2010 02:27

Fujii Masao wrote:
> On Thu, Jan 21, 2010 at 10:44 AM, Josh Berkus <josh(a)agliodbs.com> wrote:
>>> If it's "standby", it's a previously-existing behavior that a "smart"
>>> shutdown doesn't work immediately during recovery. After a recovery
>>> has been completed, it would work. Of course, I agree that such a
>>> behavior should be documented.
>> Well, as long as streaming rep is running, you can't do a smart shutdown
>> ... smart shutdown seems to treat the walreceiver as a client
>> connection.
>
> Even if SR is not running, as long as the startup process is running,
> we can't do a smart shutdown. It's not peculiar to SR.

Right, that's the way a standby server (= one still in recovery) has
always behaved. It has made sense in the past: it's not in the spirit of
smart shutdown to kill the WAL replay immediately. "smart" means wait
for recovery to finish, then shut down.

It's a good question if that still makes sense with Hot Standby. Perhaps
we should redefine smart shutdown in standby mode to shut down as soon
as all read-only connections have died.

>> At the very least, this should be in the documentation.
>
> Agreed. Something like "smart shutdown is not allowed during recovery"
> should be in the following section.
> http://developer.postgresql.org/pgdocs/postgres/server-shutdown.html

It's allowed, it just doesn't do what you might expect.

In the master, smart shutdown shuts down as soon as all regular backends
are gone. It doesn't wait for the standby connections to die. In fact
they're not killed until after the shutdown checkpoint is written, so
that it gets sent to the standbys too. I think we're good there.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From: Greg Smith on 21 Jan 2010 03:35

Heikki Linnakangas wrote:
> It's a good question if that still makes sense with Hot Standby. Perhaps
> we should redefine smart shutdown in standby mode to shut down as soon
> as all read-only connections have died.
>

I've advocated in the past that an escalating shutdown procedure would
be helpful in general to have available. Start kicking off clients with
smart, continue to fast if there are any left, and if there are still any
left after that (I have seen COPY clients that ignore fast), disconnect
them and go to immediate to kill them completely. Once you've started
the server on the road to shutdown, even with smart, you've basically
committed to going all the way down by whatever means is available
anyway, so why not make that more automated and easier.

If something like that were available, I could see inserting a step in
the middle there specifically aimed at resolving this issue. Maybe it's
just a change to the beginning of fast shutdown, or to the end of smart
as I think you're suggesting. Perhaps you only get it if you do one of
these escalating shutdowns I'm proposing, making that the preferred way
to handle HS servers.
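
[Editorial sketch, not part of the original message.] The escalation
described above could be approximated from outside the server by calling
pg_ctl once per shutdown mode, each time with a bounded wait. This is only
a minimal sketch under assumptions: the function names, the 60-second
default, and the injectable `run` parameter are hypothetical, and it does
not implement the in-server behaviour being proposed.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Shutdown modes accepted by `pg_ctl stop -m`, mildest first.
MODES = ("smart", "fast", "immediate")


def run_pg_ctl(args: Sequence[str]) -> int:
    """Run pg_ctl with the given arguments and return its exit status."""
    return subprocess.run(["pg_ctl", *args]).returncode


def escalating_shutdown(
    data_dir: str,
    timeout_s: int = 60,  # assumed per-step wait; tune for your workload
    run: Callable[[Sequence[str]], int] = run_pg_ctl,
) -> Optional[str]:
    """Try smart, then fast, then immediate; return the mode that worked.

    Returns None if the server is still up after all three attempts.
    """
    for mode in MODES:
        # -w waits for the shutdown to complete, -t bounds that wait (seconds).
        status = run(["stop", "-D", data_dir, "-m", mode, "-w", "-t", str(timeout_s)])
        if status == 0:
            return mode
    return None
```

Passing a fake `run` callable makes the escalation order easy to check
without a running server; in real use the default `run_pg_ctl` would be
left in place.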

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(a)2ndQuadrant.com www.2ndQuadrant.com

From: Fujii Masao on 29 Jan 2010 08:28

On Thu, Jan 21, 2010 at 4:27 PM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
> It's a good question if that still makes sense with Hot Standby. Perhaps
> we should redefine smart shutdown in standby mode to shut down as soon
> as all read-only connections have died.

Okay. Let's work out the details.

I guess that the startup process and the walreceiver should wait
for all read-only backends to exit in the smart shutdown case, because
those backends might be waiting for a record that conflicts with their
queries to be replayed. Is this OK? Or should we kill the startup
process and the walreceiver first?

If my guess is right, we would need to add a new PMState to cancel
recovery and replication after all read-only connections have died.
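
[Editorial sketch, not part of the original message.] The ordering
proposed above can be pictured as a small state machine: a smart shutdown
first waits for read-only backends to drain, and only then cancels
recovery and replication. The state names and functions below are
hypothetical illustrations, not the postmaster's actual code.

```python
from enum import Enum, auto


class PMState(Enum):
    """Hypothetical postmaster states for a hot standby (illustrative names)."""
    HOT_STANDBY = auto()    # replaying WAL and serving read-only queries
    WAIT_READONLY = auto()  # proposed: smart shutdown requested, backends draining
    STOP_RECOVERY = auto()  # cancelling the startup process and walreceiver
    STOPPED = auto()


def request_smart_shutdown(state: PMState) -> PMState:
    """A smart shutdown request moves a running standby into the draining state."""
    return PMState.WAIT_READONLY if state is PMState.HOT_STANDBY else state


def advance(state: PMState, readonly_backends: int, recovery_running: bool) -> PMState:
    """Advance one step.

    readonly_backends: number of read-only backends still connected.
    recovery_running: True while the startup process or walreceiver is alive.
    """
    if state is PMState.WAIT_READONLY and readonly_backends == 0:
        return PMState.STOP_RECOVERY
    if state is PMState.STOP_RECOVERY and not recovery_running:
        return PMState.STOPPED
    return state
```

In this model the startup process and walreceiver keep running throughout
WAIT_READONLY, so a backend waiting for a conflicting record to be
replayed can still make progress before it exits.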

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
