From: Simon Riggs on 24 Feb 2010 19:04

On Wed, 2010-02-24 at 14:20 -0800, Josh Berkus wrote:
> > Since Kevin suggested this in his first post and I agreed with that in
> > the first paragraph of my first post, I think you've wasted a lot of
> > time here going in circles. 42 posts, more than a dozen people. I
> > think
>
> Please tone down the hostility, Simon. I don't think talking about an
> issue I encountered while testing is a waste of anyone's time, it's
> how we improve the software. In fact, I'm hoping that potential
> testers are noticing the drubbing you're getting over this, because
> belittling anyone's bug reports is not exactly a good way to attract
> new testers to the project.

Saying "it's not a bug" doesn't belittle your bug report. Your first report was not time wasting, but talking endlessly about a subject that you've had clear replies on becomes time wasting. As I've said many times now, this isn't even a 9.0 issue.

Expressing that opinion is not hostility. I'm not sure why you think *I* am receiving a drubbing? You made a mistake on a demo, filed a bug report and wouldn't listen to people telling you it's not a bug. I admire your attempts at oneupmanship.

-- 
Simon Riggs www.2ndQuadrant.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Tom Lane on 24 Feb 2010 19:08

Simon Riggs <simon(a)2ndQuadrant.com> writes:
> On Wed, 2010-02-24 at 16:52 -0500, Tom Lane wrote:
>> * emit a NOTICE as soon as pg_stop_backup's actual work is done and
>> it's starting to wait for the archiver (or maybe after it's waited
>> for a few seconds, but much less than the present 60).

> Pointless really. Nobody runs backups in production by typing
> pg_stop_backup() except in a demo. Nobody will see this.

I agree it's pointless in production, but this isn't about production, it's about friendliness to people who are experimenting. The case will probably never come up in production because a production installation should have a non-broken archive_command.

>> * extend the existing WARNING (and the NOTICE too if we elect to have
>> one) with a HINT message explicitly saying that you can cancel the
>> wait but thus-and-such consequences might ensue.

> If you can see the HINT, you can also see the WARNING. If you can see
> the WARNING and do nothing, I don't think we need a "objects in the
> mirror may be closer than they appear" message. If people can't work out
> that if a) they are running something and b) that something is waiting
> that they should cancel it then we aren't going to have much luck with
> them.

The value of the HINT I think would be to make them (a) not afraid to hit control-C and (b) aware of the fact that their archiver has got a problem.

regards, tom lane
From: "Joshua D. Drake" on 24 Feb 2010 19:18

On Wed, 2010-02-24 at 23:57 +0000, Simon Riggs wrote:
> > * emit a NOTICE as soon as pg_stop_backup's actual work is done and
> > it's starting to wait for the archiver (or maybe after it's waited
> > for a few seconds, but much less than the present 60).
>
> Pointless really. Nobody runs backups in production by typing
> pg_stop_backup() except in a demo. Nobody will see this.

This is not true. It is not uncommon for a pitr setup to get out of sync for any number of production reasons. It is one of the reasons that PITRTools supports executing a pg_stop_backup.

Joshua D. Drake

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
Respect is earned, not gained through arbitrary and repetitive use or Mr. or Sir.
From: Simon Riggs on 24 Feb 2010 19:41

On Wed, 2010-02-24 at 19:08 -0500, Tom Lane wrote:
> Simon Riggs <simon(a)2ndQuadrant.com> writes:
> > On Wed, 2010-02-24 at 16:52 -0500, Tom Lane wrote:
> >> * emit a NOTICE as soon as pg_stop_backup's actual work is done and
> >> it's starting to wait for the archiver (or maybe after it's waited
> >> for a few seconds, but much less than the present 60).
>
> > Pointless really. Nobody runs backups in production by typing
> > pg_stop_backup() except in a demo. Nobody will see this.
>
> I agree it's pointless in production, but this isn't about production,
> it's about friendliness to people who are experimenting. The case will
> probably never come up in production because a production installation
> should have a non-broken archive_command.

No further objection.

-- 
Simon Riggs www.2ndQuadrant.com
From: Greg Smith on 24 Feb 2010 20:36

Tom Lane wrote:
> The value of the HINT I think would be to make them (a) not afraid to
> hit control-C and (b) aware of the fact that their archiver has got
> a problem.
>

Agreed on both points. Patch attached that implements something similar to Josh's wording, tweaking the original warning too. Here's what it looks like when you run into the bad situation (which I easily simulated with "archive_command='/bin/false'") from the client's perspective:

gsmith(a)meddle:~/pgwork/src/master/src$ psql -c "select pg_start_backup('test')"
 pg_start_backup
-----------------
 0/5000020
(1 row)

gsmith(a)meddle:~/pgwork/src/master/src$ psql
psql (9.0devel)
Type "help" for help.

gsmith=# select pg_stop_backup();
NOTICE: pg_stop_backup cleanup done, waiting for required segments to archive
WARNING: pg_stop_backup still waiting for all required segments to archive (60 seconds elapsed)
HINT: Confirm your archive_command is executing successfully. pg_stop_backup can be aborted safely, but the resulting backup will not be usable.
^CCancel request sent
ERROR: canceling statement due to user request

And this is the sort of thing that shows up in the logs with default logging behavior while all this is happening; you don't see the NOTICE, but the WARNING and HINT are both there, which I think is good:

LOG: archive command failed with exit code 1
DETAIL: The failed archive command was: /bin/false
WARNING: transaction log file "000000010000000000000000" could not be archived: too many failures
WARNING: pg_stop_backup still waiting for all required segments to archive (60 seconds elapsed)
HINT: Confirm your archive_command is executing successfully. pg_stop_backup can be aborted safely, but the resulting backup will not be usable.

Does this solve the logging side of this? You can still make a case for a more forceful pg_stop_backup, but this seems to at least remove much of the mystery and frustration from the whole exercise. This patch plus a little documentation suggesting how to recover from this issue might be enough.

-- 
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(a)2ndQuadrant.com www.2ndQuadrant.us
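
For readers following the thread, the messaging behavior discussed above (one NOTICE when the wait on the archiver begins, then a WARNING with a HINT every 60 seconds) can be sketched roughly as follows. This is not the attached patch, which would be C code inside the backend; it is a minimal Python illustration under stated assumptions. The names wait_for_archive, is_archived, emit, sleep and warn_every are all hypothetical; only the message texts are taken from the output shown in the message above.

```python
def wait_for_archive(is_archived, emit, sleep, warn_every=60):
    """Poll once per second until is_archived() returns True.

    Hypothetical sketch, not PostgreSQL source:
      is_archived -- callable, True once all required segments are archived
      emit        -- callable(level, text) that delivers a message
      sleep       -- callable(seconds), injected so the loop can be tested
      warn_every  -- seconds between repeated WARNING/HINT pairs
    """
    # Tell the user right away that only the archiver is being waited on.
    emit("NOTICE",
         "pg_stop_backup cleanup done, waiting for required segments to archive")
    waited = 0
    while not is_archived():
        sleep(1)
        waited += 1
        if waited % warn_every == 0:
            emit("WARNING",
                 "pg_stop_backup still waiting for all required segments "
                 f"to archive ({waited} seconds elapsed)")
            # The HINT says that cancelling is safe and what it costs.
            emit("HINT",
                 "Confirm your archive_command is executing successfully. "
                 "pg_stop_backup can be aborted safely, but the resulting "
                 "backup will not be usable.")
    return True
```

The loop has no timeout on purpose: as in the session above, the wait ends either when archiving succeeds or when the user cancels the statement.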