From: "Kevin Grittner" on
Simon Riggs <simon(a)2ndQuadrant.com> wrote:

> The correct resolution is to put in an archive_command that works.

One really should ensure that WAL files (or should I now say data?
;-) are flowing before running the pg_start_backup()
function. The documentation has always been pretty explicit about
that:

http://www.postgresql.org/docs/8.4/interactive/continuous-archiving.html

| 24.3.2. Making a Base Backup
|
| The procedure for making a base backup is relatively simple:
|
| 1. Ensure that WAL archiving is enabled and working.
|
| 2. Connect to the database as a superuser, and issue the command:
|
| SELECT pg_start_backup('label');
| ...

As long as the SR documentation is equally explicit on this point,
you'd have to be blatantly going against the instructions to hit
this.
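
A rough way to check step 1 ("enabled and working") could look like the
sketch below. It assumes the 8.4 data directory layout, where the archiver
leaves a ".ready" file in pg_xlog/archive_status for each segment still
waiting on archive_command and renames it to ".done" once the command
succeeds. The function name is made up for illustration, and the check is
only meaningful after at least one segment switch has happened.

```python
from pathlib import Path


def pending_segments(archive_status_dir):
    """Return the sorted names of WAL segments still waiting to be archived.

    Assumption: archive_status_dir is <data directory>/pg_xlog/archive_status,
    and a "<segment>.ready" file there means archive_command has not yet
    succeeded for that segment.
    """
    status = Path(archive_status_dir)
    # Path.stem drops the trailing ".ready", leaving the segment name.
    return sorted(p.stem for p in status.glob("*.ready"))
```

If this list is non-empty and does not shrink over time, the
archive_command is probably failing, and the server log should say why.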

Which makes me think that while pg_fail_backup() might actually be a
good idea, it's not really needed to solve this, so it's 9.1
material at best.

-Kevin

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: David Fetter on
On Tue, Feb 23, 2010 at 06:58:22PM +0000, Simon Riggs wrote:
> On Tue, 2010-02-23 at 09:45 -0800, Josh Berkus wrote:
>
> > 1) Set up a brand new master with an archive-command and
> > archive=on.
> >
> > 2) Start the master
> >
> > 3) Do a pg_start_backup()
> >
> > 4) Realize, based on log error messages, that I've misconfigured
> > the archive_command.
> >
> > 5) Attempt to shut down the master. Master tells me that
> > pg_stop_backup must be run in order to shut down.
> >
> > 6) Execute pg_stop_backup.
> >
> > 7) pg_stop_backup waits forever without ever stopping backup.
> > Every 60 seconds, it gives me a helpful "still waiting" message, but
> > at least in the amount of time I was willing to wait (5 minutes),
> > it never completed.
> >
> > 8) do an immediate shutdown, as it's the only way I can get the
> > database unstuck.
> >
> > With some experimentation, the problem seems to occur when you
> > have a failing archive_command and a master which currently has no
> > database traffic; for example, if I did some database write
> > activity (a createdb) then pg_stop_backup would complete after
> > about 60 seconds (which, btw, is extremely annoying, but at least
> > tolerable).
> >
> > This issue is 100% reproducible.
>
> IMHO there is no problem in that behaviour. If somebody requests a
> backup then we should wait for it to complete. Kevin's suggestion of
> pg_fail_backup() is the only sensible conclusion there because it
> gives an explicit way out of deadlock.
>
> ISTM the problem is that you didn't test. Steps 3 and 4 should have
> been reversed. Perhaps we should put something in the docs to say
> "and test". The correct resolution is to put in an archive_command
> that works.

+1 for clarifying and extending the docs.
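
As one idea of what "and test" could mean in practice, here is a minimal
sketch that dry-runs an archive_command outside the server before anyone
calls pg_start_backup(). It relies only on the documented behaviour that %p
is replaced by the path of the file to archive, %f by its file name, %% by a
literal %, and that a zero exit status means success. The function names are
invented for this example, and it is not how the server itself runs the
command.

```python
import subprocess


def expand_archive_command(template, wal_path, wal_name):
    """Substitute %p, %f and %% in an archive_command template."""
    replacements = {"p": wal_path, "f": wal_name, "%": "%"}
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        nxt = template[i + 1] if i + 1 < len(template) else ""
        if ch == "%" and nxt in replacements:
            # Values are inserted verbatim, with no shell quoting added.
            out.append(replacements[nxt])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def archive_command_works(template, wal_path, wal_name):
    """Run the expanded command through the shell; True on exit status 0."""
    command = expand_archive_command(template, wal_path, wal_name)
    return subprocess.run(command, shell=True).returncode == 0
```

Pointing this at a throwaway file rather than a real WAL segment would
surface a misconfigured command before a backup is started, which is the
ordering Simon describes.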

Cheers,
David.
--
David Fetter <david(a)fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter(a)gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Simon Riggs on
On Tue, 2010-02-23 at 11:24 -0800, Joshua D. Drake wrote:

> This will bite us if we release like this.

No it won't. The current behaviour was put there by user request a few
releases back. This isn't a 9.0 issue, and as I've said it's addressing
something that we no longer see as mainstream going forwards.

There are plenty of things that will bite us, but not this.

--
Simon Riggs www.2ndQuadrant.com



From: Robert Haas on
On Tue, Feb 23, 2010 at 12:52 PM, Joshua D. Drake <jd(a)commandprompt.com> wrote:
> On Tue, 2010-02-23 at 09:45 -0800, Josh Berkus wrote:
>> Simon, Fujii, All:
>>
>> While demoing HS/SR at SCALE, I ran into a problem which is likely to be
>> a commonly encountered bug when people first setup HS/SR.  Here's the
>> sequence:
>>
>> 1) Set up a brand new master with an archive-command and archive=on.
>>
>> 2) Start the master
>>
>> 3) Do a pg_start_backup()
>>
>> 4) Realize, based on log error messages, that I've misconfigured the
>> archive_command.
>>
>> 5) Attempt to shut down the master.  Master tells me that pg_stop_backup
>> must be run in order to shut down.
>
> If I issue a shutdown, PostgreSQL should do whatever it needs to do to
> shutdown; including issuing a pg_stop_backup.

Maybe. But for sure, if it doesn't, and instead tells the user to
issue pg_stop_backup(), then pg_stop_backup() had better WORK when the
user tries to execute it. I gather that the problem is that it has to
finish all that outstanding archiving before shutting down, in which
case Kevin's suggestion of having a command to abort the backup seems
reasonable. I might call it pg_abort_backup() rather than
pg_fail_backup(), but...

...Robert


From: "Joshua D. Drake" on
On Tue, 2010-02-23 at 14:49 -0500, Robert Haas wrote:

> > If I issue a shutdown, PostgreSQL should do whatever it needs to do to
> > shutdown; including issuing a pg_stop_backup.
>
> Maybe. But for sure, if it doesn't, and instead tells the user to
> issue pg_stop_backup(), then pg_stop_backup() had better WORK when the
> user tries to execute it.

Right.

> I gather that the problem is that it has to
> finish all that outstanding archiving before shutting down, in which
> case Kevin's suggestion of having a command to abort the backup seems
> reasonable. I might call it pg_abort_backup() rather than
> pg_fail_backup(), but...
>

But...?

Joshua D. Drake


> ...Robert
>


--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
Respect is earned, not gained through arbitrary and repetitive use of Mr. or Sir.

