From: Heikki Linnakangas on 10 Jun 2010 15:49

On 10/06/10 22:24, Dimitri Fontaine wrote:
> Heikki Linnakangas<heikki.linnakangas(a)enterprisedb.com> writes:
>> Maybe we could add a new pg_cleanuparchive binary, but we'll need some
>> discussion...
>
> Would this binary ever be used manually, not invoked by PostgreSQL? As
> it depends on the %r option to be given and to be right, I don't think
> so.

Hmm, actually it would be pretty handy. To make use of a base backup,
you need all the WAL files following the one where pg_start_backup() was
called. We create a .backup file in the archive to indicate that
location, like:

    00000001000000000000002F.00000020.backup

So to clean up all WAL files older than those needed by that base
backup, you would simply copy-paste that location and call
pg_cleanuparchive:

    pg_cleanuparchive /walarchive/ 00000001000000000000002F

Of course, if there's a perl one-liner to do that, we can just put that
in the docs and don't really need pg_cleanuparchive at all.

> Therefore my take on this problem is to provide internal commands here,
> that maybe wouldn't need to be explicitly passed any argument. If
> they're internal they certainly can access to the information they need?

You want more flexibility in more advanced cases. Like if you have
multiple standbys sharing the archive, you only want to remove old WAL
files after they're not needed by *any* of the standbys anymore. Doing
the cleanup directly in the archive_cleanup_command would cause the old
WAL files to be removed prematurely, but you could put a shell script
there to store the location to a file, and call pg_cleanuparchive with
the max() of the locations reported by all standby servers.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

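The cleanup rule described in the message above (take the segment named by the .backup file and remove every WAL file older than it) can be sketched in a few lines of Python. This is only an illustration, not the code of the proposed pg_cleanuparchive: the function names are hypothetical, the sketch only computes which names would be removed, and it assumes the timeline prefix (the first 8 hex digits) is ignored when comparing. For the shared-archive case it takes the oldest restart file reported by any standby, since a segment only becomes removable once no standby needs it; that means the minimum of the reported names, not the maximum.

```python
import re

# A WAL segment file name is 24 hex digits: timeline, log id, segment.
SEGMENT_RE = re.compile(r"^[0-9A-F]{24}$")
# A backup history file name is <segment>.<offset>.backup
BACKUP_RE = re.compile(r"^([0-9A-F]{24})\.[0-9A-F]{8}\.backup$")


def segment_from_backup_file(name):
    """Return the WAL segment that a .backup history file name points at."""
    match = BACKUP_RE.match(name)
    if match is None:
        raise ValueError("not a backup history file name: %r" % name)
    return match.group(1)


def removable_segments(file_names, oldest_kept):
    """List the WAL segments that sort strictly before oldest_kept.

    Assumption: the timeline prefix is ignored and only the log/segment
    part is compared. Anything that is not a plain segment file (for
    example a .backup file) is left alone.
    """
    cutoff = oldest_kept[8:]
    return sorted(
        name for name in file_names
        if SEGMENT_RE.match(name) and name[8:] < cutoff
    )


def cutoff_for_standbys(restart_files):
    """Pick the oldest restart file reported by any standby."""
    return min(restart_files, key=lambda name: name[8:])


if __name__ == "__main__":
    archive = [
        "00000001000000000000002D",
        "00000001000000000000002E",
        "00000001000000000000002F",
        "00000001000000000000002F.00000020.backup",
        "000000010000000000000030",
    ]
    cutoff = segment_from_backup_file(
        "00000001000000000000002F.00000020.backup")
    print(removable_segments(archive, cutoff))
```

With the file names from the message, only the 2D and 2E segments are reported as removable; the 2F segment, its .backup file and everything newer are kept.
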
From: Simon Riggs on 10 Jun 2010 16:09

On Thu, 2010-06-10 at 22:49 +0300, Heikki Linnakangas wrote:
> On 10/06/10 22:24, Dimitri Fontaine wrote:
> > Heikki Linnakangas<heikki.linnakangas(a)enterprisedb.com> writes:
> >> Maybe we could add a new pg_cleanuparchive binary, but we'll need some
> >> discussion...
> >
> > Would this binary ever be used manually, not invoked by PostgreSQL? As
> > it depends on the %r option to be given and to be right, I don't think
> > so.
>
> Hmm, actually it would be pretty handy. To make use of a base backup,
> you need all the WAL files following the one where pg_start_backup() was
> called. We create a .backup file in the archive to indicate that
> location, like:
>
> 00000001000000000000002F.00000020.backup
>
> So to clean up all WAL files older than those needed by that base
> backup, you would simply copy-paste that location and call
> pg_cleanuparchive:
>
> pg_cleanuparchive /walarchive/ 00000001000000000000002F

OK, sounds like we're on the same thought train.

Here's the code.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services

From: Dimitri Fontaine on 11 Jun 2010 14:18

Heikki Linnakangas <heikki.linnakangas(a)enterprisedb.com> writes:
> So to clean up all WAL files older than those needed by that base backup,
> you would simply copy-paste that location and call pg_cleanuparchive:
>
> pg_cleanuparchive /walarchive/ 00000001000000000000002F

Ok, idle thought: what about having a superuser-only SRF doing the
same? That way we have an internal command for the simple case, and an
SRF for use in scripts in more complex cases.

> Of course, if there's a perl one-liner to do that, we can just put that in
> the docs and don't really need pg_cleanuparchive at all.

    psql -c "SELECT * FROM pg_cleanup_archive('00000001000000000000002F');"

>> Therefore my take on this problem is to provide internal commands here,
>> that maybe wouldn't need to be explicitly passed any argument. If
>> they're internal they certainly can access to the information they need?
>
> You want more flexibility in more advanced cases. Like if you have multiple
> standbys sharing the archive, you only want to remove old WAL files after
> they're not needed by *any* of the standbys anymore. Doing the cleanup
> directly in the archive_cleanup_command would cause the old WAL files to be
> removed prematurely, but you could put a shell script there to store the
> location to a file, and call pg_cleanuparchive with the max() of the
> locations reported by all standby servers.

Yes, you still need to support external commands. Dropping them is not
at all what I'm proposing: I'm just after making the simple case dead
simple to set up, so that you don't have to write any script.

Regards,
--
dim

From: Dimitri Fontaine on 12 Jun 2010 16:51

Dimitri Fontaine <dfontaine(a)hi-media.com> writes:
> Also, should I try to send a patch implementing my proposal (internal
> command exposed as a function at the SQL level, and while at it, maybe
> the internal command "pg_archive_bypass" to mimic /usr/bin/true as an
> archive_command)?

I had to have a try at it, even if quick and dirty. I've not tried to
code the pg_archive_bypass internal command for lack of discussion, but
I still think it would be great to have it.

So here's a "see my idea in code" patch that puts the previous code by
Simon into a backend function. As the goal was not to adapt the existing
code, intended as an external program, to use the internal APIs, you'll
find it quite ugly, I'm sure.

For example, this #define XLOG_DATA_FNAME_LEN has to go away, but fixing
that won't help get the idea accepted or not, and as I'm only warming
up, I didn't tackle the problem. If you want me to do it, I'd appreciate
some guidance as to how, though.

It goes like this:

    dim=# select pg_switch_xlog();
     pg_switch_xlog
    ----------------
     0/1000098
    (1 row)

    dim=# select pg_archive_cleanup('0/1000098');
    DEBUG:  removing "pg_xlog/000000010000000000000000"
    DEBUG:  removing "pg_xlog/000000010000000000000001"
     pg_archive_cleanup
    --------------------
     t
    (1 row)

I hope you too will find this way of interfacing easier to deal with
for everybody (from code maintenance to user settings).

Regards,
--
dim

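The session above passes a WAL location ('0/1000098') where the earlier messages passed a segment file name. As a rough illustration of how the two relate, here is a minimal Python sketch that maps a location to the name of the segment containing it. It assumes the default 16 MB segment size and timeline 1, neither of which is stated in the thread, and wal_file_name is a hypothetical helper for illustration only (on a running server, pg_xlogfile_name() does this conversion).

```python
# Assumption: default WAL segment size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_file_name(location, timeline=1, segment_size=WAL_SEGMENT_SIZE):
    """Map a WAL location such as '0/1000098' to its segment file name.

    The location is two hex numbers, "log id/byte offset". The file name
    is three 8-digit hex fields: timeline, log id, and segment number,
    where the segment number is the offset divided by the segment size.
    """
    log_text, offset_text = location.split("/")
    log_id = int(log_text, 16)
    offset = int(offset_text, 16)
    return "%08X%08X%08X" % (timeline, log_id, offset // segment_size)


if __name__ == "__main__":
    # Location returned by pg_switch_xlog() in the session above.
    print(wal_file_name("0/1000098"))   # 000000010000000000000001
    # Location matching the .backup file name quoted earlier in the thread.
    print(wal_file_name("0/2F000020"))  # 00000001000000000000002F
```
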
From: Dimitri Fontaine on 12 Jun 2010 14:00

Dimitri Fontaine <dfontaine(a)hi-media.com> writes:
> Heikki Linnakangas <heikki.linnakangas(a)enterprisedb.com> writes:
>> So to clean up all WAL files older than those needed by that base backup,
>> you would simply copy-paste that location and call pg_cleanuparchive:
>>
>> pg_cleanuparchive /walarchive/ 00000001000000000000002F
>
> Ok, idle thought: what about having a superuser-only SRF doing the
> same?

So I'm looking at what it'd take to have that code, and it seems it
would be quite easy. I wonder if we want to return only a boolean
(command success status) or the list of files we're pruning (that's the
SRF part), but other than that, it's all about having the code provided
by Simon in another place, plus some internal command support.

Something strange though: I notice that the error and signal handling in
pgarch.c::pgarch_archiveXlog (lines 551 and following) and in
xlog.c::ExecuteRecoveryCommand (lines 3143 and following) are very
different, for no reason that I can see. Why is that?

Also, should I try to send a patch implementing my proposal (internal
command exposed as a function at the SQL level, and while at it, maybe
the internal command "pg_archive_bypass" to mimic /usr/bin/true as an
archive_command)?

Regards,
--
dim