From: Josh Berkus on
Simon, Fujii, All:

While demoing HS/SR at SCALE, I ran into a problem which is likely to be
a commonly encountered bug when people first set up HS/SR. Here's the
sequence:

1) Set up a brand new master with an archive_command and archive_mode=on.

2) Start the master

3) Do a pg_start_backup()

4) Realize, based on log error messages, that I've misconfigured the
archive_command.

5) Attempt to shut down the master. Master tells me that pg_stop_backup
must be run in order to shut down.

6) Execute pg_stop_backup.

7) pg_stop_backup waits forever without ever stopping backup. Every 60
seconds, it gives me a helpful "still waiting" message, but at least in
the amount of time I was willing to wait (5 minutes), it never completed.

8) Do an immediate shutdown, as it's the only way I can get the database
unstuck.

With some experimentation, the problem seems to occur when you have a
failing archive_command and a master which currently has no database
traffic; for example, if I did some database write activity (a createdb)
then pg_stop_backup would complete after about 60 seconds (which, btw,
is extremely annoying, but at least tolerable).

This issue is 100% reproducible.
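
For anyone who wants to retry this, here is a rough sketch of the sequence
above as a Python helper that only builds the command lines (it does not run
them). The data directory, backup label and scratch database name are
made-up placeholders, and `false` is just a stand-in for a broken
archive_command, since the actual misconfigured command isn't shown here.
Treat it as an illustration under those assumptions, not a tested script.

```python
import shlex

# Hypothetical placeholders for illustration; none of these are from the report.
DATA_DIR = "/tmp/hs_sr_demo"
BACKUP_LABEL = "demo"
SCRATCH_DB = "scratch"


def conf_fragment():
    """Settings for step 1, to be appended to postgresql.conf before starting.

    `false` always exits non-zero, so every archive attempt fails. It stands
    in for the misconfigured archive_command (an assumption).
    """
    return "archive_mode = on\narchive_command = 'false'\n"


def psql(sql, dbname="postgres"):
    """Build a psql command line that runs a single SQL statement."""
    return "psql -d {} -c {}".format(shlex.quote(dbname), shlex.quote(sql))


def repro_commands(data_dir=DATA_DIR, write_activity=False):
    """Return the shell commands for the sequence, in order.

    With write_activity=True a createdb is issued before pg_stop_backup,
    matching the variant where pg_stop_backup eventually returned.
    """
    d = shlex.quote(data_dir)
    cmds = [
        "initdb -D " + d,                       # step 1 (then append conf_fragment())
        "pg_ctl -D {} -w start".format(d),      # step 2
        psql("SELECT pg_start_backup('{}')".format(BACKUP_LABEL)),  # step 3
        "pg_ctl -D {} stop".format(d),          # step 5: refused while in backup
    ]
    if write_activity:
        cmds.append("createdb " + shlex.quote(SCRATCH_DB))
    cmds.append(psql("SELECT pg_stop_backup()"))             # steps 6-7
    cmds.append("pg_ctl -D {} stop -m immediate".format(d))  # step 8
    return cmds


if __name__ == "__main__":
    print(conf_fragment())
    for cmd in repro_commands():
        print(cmd)
```

Calling repro_commands(write_activity=True) gives the variant with the
createdb thrown in before pg_stop_backup.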

--Josh Berkus

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)