From: Bruce Momjian on
Tom Lane wrote:
> I managed to crash the executor in the tablespace.sql test while working
> on a 9.1 patch, and discovered that the postmaster fails to recover
> from that. The end of postmaster.log looks like
>
> LOG: all server processes terminated; reinitializing
> LOG: database system was interrupted; last known up at 2010-07-11 19:30:07 EDT
> LOG: database system was not properly shut down; automatic recovery in progress
> LOG: consistent recovery state reached at 0/EE26F30
> LOG: redo starts at 0/EE26F30
> FATAL: directory "/home/postgres/pgsql/src/test/regress/testtablespace/PG_9.1_201004261" already in use as a tablespace
> CONTEXT: xlog redo create ts: 127158 "/home/postgres/pgsql/src/test/regress/testtablespace"
> LOG: startup process (PID 13914) exited with exit code 1
> LOG: aborting startup due to startup process failure
>
> It looks to me like those well-intentioned recent changes in this area
> broke the crash-recovery case. Not good.

Sorry for the delay. I didn't realize this was my code that was broken
until Heikki told me via IM.

The bug is that we can't replay mkdir()/symlink() and assume those will
always succeed. I looked at the createdb redo code and it basically
drops the directory before creating it.

The tablespace directory/symlink setup is more complex, so I just wrote
the attached patch to trigger a redo-'delete' tablespace operation
before the create tablespace redo operation.

Ignoring mkdir/symlink creation failure is not an option because the
symlink might point to some wrong location or something.

--
Bruce Momjian <bruce(a)momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +
From: Bruce Momjian on
Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas(a)enterprisedb.com> writes:
> > Maybe you should check that it points to the right location? Or drop and
> > recreate the symlink, and ignore failure at mkdir.
>
> More specifically, ignore EEXIST failure when replaying mkdir. Anything
> else is still a problem.

Thanks for the help. I tried to find somewhere else in our recovery
code that was similar but didn't find anything.

The attached patch does as suggested. I added the recovery code to the
create tablespace function so I didn't have to duplicate all the code
that computes the path names.

Attached.

--
Bruce Momjian <bruce(a)momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +
From: Bruce Momjian on
Bruce Momjian wrote:
> Tom Lane wrote:
> > Heikki Linnakangas <heikki.linnakangas(a)enterprisedb.com> writes:
> > > Maybe you should check that it points to the right location? Or drop and
> > > recreate the symlink, and ignore failure at mkdir.
> >
> > More specifically, ignore EEXIST failure when replaying mkdir. Anything
> > else is still a problem.
>
> Thanks for the help. I tried to find somewhere else in our recovery
> code that was similar but didn't find anything.
>
> The attached patch does as suggested. I added the recovery code to the
> create tablespace function so I didn't have to duplicate all the code
> that computes the path names.
>
> Attached.

Uh, another question. Looking at the createdb recovery, I see:

/*
* Our theory for replaying a CREATE is to forcibly drop the target
* subdirectory if present, then re-copy the source data. This may be
* more work than needed, but it is simple to implement.
*/
if (stat(dst_path, &st) == 0 && S_ISDIR(st.st_mode))
{
if (!rmtree(dst_path, true))
ereport(WARNING,
(errmsg("some useless files may be left behind in old database directory \"%s\"",
dst_path)));
}

Should I be using rmtree() on the mkdir target?

Also, the original tablespace recovery code did not drop the symlink
first. I assume that was not a bug only because we don't support moving
tablespaces:

- /* Create the symlink if not already present */
- linkloc = (char *) palloc(OIDCHARS + OIDCHARS + 1);
- sprintf(linkloc, "pg_tblspc/%u", xlrec->ts_id);
-
- if (symlink(location, linkloc) < 0)
- {
- if (errno != EEXIST)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not create symbolic link \"%s\": %m",
- linkloc)));
- }

Still, it seems logical to unlink it before creating it.

--
Bruce Momjian <bruce(a)momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Bruce Momjian on
Bruce Momjian wrote:
> > The attached patch does as suggested. I added the recovery code to the
> > create tablespace function so I didn't have to duplicate all the code
> > that computes the path names.
> >
> > Attached.
>
> Uh, another question. Looking at the createdb recovery, I see:
>
> /*
> * Our theory for replaying a CREATE is to forcibly drop the target
> * subdirectory if present, then re-copy the source data. This may be
> * more work than needed, but it is simple to implement.
> */
> if (stat(dst_path, &st) == 0 && S_ISDIR(st.st_mode))
> {
> if (!rmtree(dst_path, true))
> ereport(WARNING,
> (errmsg("some useless files may be left behind in old database directory \"%s\"",
> dst_path)));
> }
>
> Should I be using rmtree() on the mkdir target?
>
> Also, the original tablespace recovery code did not drop the symlink
> first. I assume that was not a bug only because we don't support moving
> tablespaces:

For consistency with CREATE DATABASE recovery and for reliablity, I
coded the rmtree() call instead. Patch attached.

--
Bruce Momjian <bruce(a)momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +
From: Bruce Momjian on
Bruce Momjian wrote:
> Bruce Momjian wrote:
> > > The attached patch does as suggested. I added the recovery code to the
> > > create tablespace function so I didn't have to duplicate all the code
> > > that computes the path names.
> > >
> > > Attached.
> >
> > Uh, another question. Looking at the createdb recovery, I see:
> >
> > /*
> > * Our theory for replaying a CREATE is to forcibly drop the target
> > * subdirectory if present, then re-copy the source data. This may be
> > * more work than needed, but it is simple to implement.
> > */
> > if (stat(dst_path, &st) == 0 && S_ISDIR(st.st_mode))
> > {
> > if (!rmtree(dst_path, true))
> > ereport(WARNING,
> > (errmsg("some useless files may be left behind in old database directory \"%s\"",
> > dst_path)));
> > }
> >
> > Should I be using rmtree() on the mkdir target?
> >
> > Also, the original tablespace recovery code did not drop the symlink
> > first. I assume that was not a bug only because we don't support moving
> > tablespaces:
>
> For consistency with CREATE DATABASE recovery and for reliablity, I
> coded the rmtree() call instead. Patch attached.

Attached patch applied to HEAD and 9.0. 9.0 open item moved to
completed.

--
Bruce Momjian <bruce(a)momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +