pending patch: Re: [HACKERS] HS/SR and smart shutdown [PgSql]

Prev: [HACKERS] I am interested in the MERGE command implementation as my gSoC project
Next: pending patch: Re: [HACKERS] Streaming replication and pg_xlogfile_name()

From: Fujii Masao on 31 Mar 2010 04:48

On Wed, Mar 31, 2010 at 5:00 PM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> Please add some docs that a) explains what the patch does and b) notes
> any changes from behaviour in previous releases. ISTM this is a major
> change in behaviour.

How about adding the following description into "17.5. Shutting Down
the Server"?

If the server is in recovery, it waits for all of the read only
connections to be closed.

And, where should the note be put in? AFAIK, the previous behavior
has not been documented anywhere.

> >From what I have seen, the comment about PM_WAIT_BACKENDS is incorrect.
> "backends might be waiting for the WAL record that conflicts with their
> queries to be replayed". Recovery sometimes waits for backends, but
> backends never wait for recovery.

Really? As Heikki explained before, backends might wait for the lock
taken by the startup process.
http://archives.postgresql.org/pgsql-hackers/2010-01/msg02984.php

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Simon Riggs on 31 Mar 2010 05:02

On Wed, 2010-03-31 at 17:48 +0900, Fujii Masao wrote:
> On Wed, Mar 31, 2010 at 5:00 PM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> > Please add some docs that a) explains what the patch does and b) notes
> > any changes from behaviour in previous releases. ISTM this is a major
> > change in behaviour.
>
> How about adding the following description into "17.5. Shutting Down
> the Server"?
>
> If the server is in recovery, it waits for all of the read only
> connections to be closed.

You need to explain which mode you're talking about. Adding minimal
comments isn't my objective. We need good, useful documentation on every
aspect that you add or change.

> And, where should the note be put in? AFAIK, the previous behavior
> has not been documented anywhere.

> > >From what I have seen, the comment about PM_WAIT_BACKENDS is incorrect.
> > "backends might be waiting for the WAL record that conflicts with their
> > queries to be replayed". Recovery sometimes waits for backends, but
> > backends never wait for recovery.
>
> Really? As Heikki explained before, backends might wait for the lock
> taken by the startup process.
> http://archives.postgresql.org/pgsql-hackers/2010-01/msg02984.php

Backends wait for locks, yes, but they could be waiting for user locks
also. That is not "waiting for the WAL record", that concept does not
exist.

--
Simon Riggs www.2ndQuadrant.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Fujii Masao on 31 Mar 2010 05:58

On Wed, Mar 31, 2010 at 6:02 PM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> On Wed, 2010-03-31 at 17:48 +0900, Fujii Masao wrote:
>> On Wed, Mar 31, 2010 at 5:00 PM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
>> > Please add some docs that a) explains what the patch does and b) notes
>> > any changes from behaviour in previous releases. ISTM this is a major
>> > change in behaviour.
>>
>> How about adding the following description into "17.5. Shutting Down
>> the Server"?
>>
>> If the server is in recovery, it waits for all of the read only
>> connections to be closed.
>
> You need to explain which mode you're talking about.

Smart Shutdown mode

> Adding minimal
> comments isn't my objective. We need good, useful documentation on every
> aspect that you add or change.

But the patch doesn't provide anything beyond:

>> If the server is in recovery, it waits for all of the read only
>> connections to be closed.

What additional explanation do you think is required?

>> And, where should the note be put in? AFAIK, the previous behavior
>> has not been documented anywhere.
>
>> > >From what I have seen, the comment about PM_WAIT_BACKENDS is incorrect.
>> > "backends might be waiting for the WAL record that conflicts with their
>> > queries to be replayed". Recovery sometimes waits for backends, but
>> > backends never wait for recovery.
>>
>> Really? As Heikki explained before, backends might wait for the lock
>> taken by the startup process.
>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg02984.php
>
> Backends wait for locks, yes, but they could be waiting for user locks
> also. That is not "waiting for the WAL record", that concept does not
> exist.

How about the following change?

- * waiting for the WAL record that conflicts with their queries to be
- * replayed, recovery and replication need to remain until all read
+ * waiting until the startup process has released the lock that
+ * their queries are waiting for by replaying the WAL record,
+ * recovery and replication need to remain until all read

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on 31 Mar 2010 09:29

On Wed, Mar 31, 2010 at 4:00 AM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> Please add some docs that a) explains what the patch does and b) notes
> any changes from behaviour in previous releases. ISTM this is a major
> change in behaviour.

I guess I see this a little bit differently. If you do a smart
shutdown on 8.4, the autovacuum launcher won't prevent s smart
shutdown from completing successfully. Neither will the background
writer. If they did, we'd consider that a bug and fix it.
walreceiver is just one more system process that needs to get properly
shut down when a smart shutdown is requested. So I don't think this
is a major behavior change - I think it's preserving the behavior
we've had all along.

The current documentation reads:

In stop mode, the server that is running in the specified data
directory is shut down. Three different shutdown methods can be
selected with the -m option: "Smart" mode waits for online backup mode
to finish and all the clients to disconnect. This is the default.
"Fast" mode does not wait for clients to disconnect and will terminate
an online backup in progress. All active transactions are rolled back
and clients are forcibly disconnected, then the server is shut down.
"Immediate" mode will abort all server processes without a clean
shutdown. This will lead to a recovery run on restart.

That all still seems accurate after this patch. I'm not even sure
what to add. I suppose we could add a sentence like

If a smart shutdown is requested while the server is in recovery,
recovery will stop and the server will shut down.

....but if we add that then why don't we have a similar sentence that says:

If a smart shutdown is requested while the autovacuum launcher is
running, the autovacuum launcher will be stopped and the server will
shut down.

I just don't see that we're adding any additional clarity here. I
think what would require documentation is if we DIDN'T apply this
patch. Then we'd need something like:

Smart shutdown mode should not be used if streaming replication is in
use. The server will begin to shut down but, because the streaming
replication process is not automatically shut down, it will never
actually finish shutting down unless the streaming replication process
crashes. If a server using streaming replication is accidentally shut
down using smart mode, the problem can be corrected by shutting down
again using fast or immediate mode.

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on 31 Mar 2010 11:16

On Wed, Mar 31, 2010 at 5:02 AM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
>> > >From what I have seen, the comment about PM_WAIT_BACKENDS is incorrect.
>> > "backends might be waiting for the WAL record that conflicts with their
>> > queries to be replayed". Recovery sometimes waits for backends, but
>> > backends never wait for recovery.
>>
>> Really? As Heikki explained before, backends might wait for the lock
>> taken by the startup process.
>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg02984.php
>
> Backends wait for locks, yes, but they could be waiting for user locks
> also. That is not "waiting for the WAL record", that concept does not
> exist.

Hmm... this is a good point, on two levels. First, the comment is not
as well-phrased as it could be. Second, I wonder why we can't kill
the startup process and WAL receiver right away, and then wait for the
backends to die off afterwards.

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

First | Prev | Next | Last
Pages: 1 2 3 4
Prev: [HACKERS] I am interested in the MERGE command implementation as my gSoC project
Next: pending patch: Re: [HACKERS] Streaming replication and pg_xlogfile_name()