From: Bruce Momjian on
Bruce Momjian wrote:
> 2) Right now pg_migrator renames old tablespaces to .old, which fails
> if the tablespaces are on mount points. I have already received a
> report of such a failure. $PGDATA also has that issue, but that
> renaming has to be done by the user before pg_migrator is run, and only
> if they want to keep the same $PGDATA value after migration, i.e. no
> version-specific directory path. One idea we floated around was to have
> tablespaces use major version directory names under the tablespace
> directory so renaming would not be necessary. I could implement a
> pg_migrator --delete-old flag to cleanly delete the old 8.4 server files
> which are not in a version-specific subdirectory.

I have created a patch to implement per-cluster directories in
tablespaces. This is for use by pg_migrator so it doesn't have to
rename the tablespaces during the migration. Users still need to remove
the old cluster's tablespace subdirectory, and I can add a --delete-old
option to pg_migrator to do that.

The old code used a symlink from pg_tblspc/#### to the location
directory specified in CREATE TABLESPACE. During CREATE TABLESPACE, a
PG_VERSION file is created containing the major version number. Anytime
a database object is created in the tablespace, a per-database directory
is created.

With the new code in this patch, pg_tblspc/#### points to the CREATE
TABLESPACE directory just like before, but a new directory, PG_ +
major_version + catalog_version, e.g. PG_8.5_201001061, is created and
all per-database directories are created under that directory. This
directory has the same purpose as the old PG_VERSION file. One
disadvantage of this approach is that functions that need to look inside
tablespaces must now also specify the version directory, e.g.
pg_tablespace_databases().
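
As a rough illustration of the layout described above, here is a minimal
Python sketch of how the path components compose. The function names and
the sample OIDs are hypothetical, chosen only for this example; the
server itself builds these paths in C.

```python
import posixpath


def version_dir_name(major_version, catalog_version):
    """Return the per-cluster directory name, e.g. PG_8.5_201001061."""
    return "PG_{}_{}".format(major_version, catalog_version)


def database_dir(pgdata, tablespace_oid, db_oid, major_version, catalog_version):
    """Return the per-database directory as reached through the symlink.

    pg_tblspc/<tablespace_oid> is the symlink to the CREATE TABLESPACE
    location; the version directory sits directly below it, and the
    per-database directories sit below the version directory.
    """
    return posixpath.join(
        pgdata,
        "pg_tblspc",
        str(tablespace_oid),
        version_dir_name(major_version, catalog_version),
        str(db_oid),
    )


if __name__ == "__main__":
    # The OIDs 16385 and 16384 are made-up sample values.
    print(database_dir("/data", 16385, 16384, "8.5", 201001061))
```

Any code that walks a tablespace has to add the version directory
component in the same way before it reaches the per-database
directories.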

An alternative approach would be for the pg_tblspc/#### symbolic link to
point to the new version directory, PG_*, but that makes removal of the
version directory complicated, particularly during WAL replay where we
don't have access to the system catalogs, and readlink() to read the
symbolic link target is not supported on all operating systems
(particularly Win32).

I used the version directory pattern "PG_8.5_201001061" because "PG_"
helps people realize the directory is for the use of Postgres
(PG_VERSION is gone in tablespaces), and the catalog version number
enables alpha migrations. The major version number is not necessary but
probably useful for administrators.

pg_migrator is going to need to know about the version directory too,
and it can't use the C macro --- it has to construct the directory
pattern based on the contents of pg_control from the old and new
servers. It is also going to be difficult to run pg_controldata on the
old server for pg_migrator --delete-old after migration because
pg_control is renamed to pg_control.old --- I will need to create a
symbolic link while I run pg_controldata. Also, the contents of the
tablespace directory for an 8.4 to 8.5 migration are going to be ugly
because there will be many numeric directories (for databases), a
PG_VERSION file (for 8.4), and the PG_8.5_201001061 directory, which
should not be touched.
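
A minimal sketch of the construction step, assuming the tool already has
the text output of pg_controldata in hand. The sample text below is
hand-written with illustrative values, the function names are
hypothetical, and the major version is passed in separately as an
assumption of this sketch; this is not pg_migrator's actual code.

```python
import re

# Hand-written sample in the shape of pg_controldata output; the values
# are illustrative only.
SAMPLE_CONTROLDATA = """\
pg_control version number:            843
Catalog version number:               201001061
"""


def catalog_version(controldata_output):
    """Extract the catalog version number from pg_controldata text."""
    match = re.search(
        r"^Catalog version number:\s+(\d+)", controldata_output, re.MULTILINE
    )
    if match is None:
        raise ValueError("no 'Catalog version number' line found")
    return int(match.group(1))


def version_dir_from_controldata(major_version, controldata_output):
    """Build the PG_<major>_<catversion> directory name for one cluster."""
    return "PG_{}_{}".format(major_version, catalog_version(controldata_output))


if __name__ == "__main__":
    print(version_dir_from_controldata("8.5", SAMPLE_CONTROLDATA))
```

The same function would be called once for the old cluster and once for
the new one, each with its own pg_controldata output.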

Can someone explain why TablespaceCreateDbspace() creates a non-symlink
directory during recovery if the symlink is missing? Is it just for
robustness? I would like to document that more clearly.

Comments?

--
Bruce Momjian <bruce(a)momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
From: Bruce Momjian on

Applied.

FYI, I decided to create a pg_migrator_remove_old_cluster.sh/.bat file
that can be run by the user after the upgrade, instead of adding a
--delete-old-cluster option to pg_migrator.
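
To illustrate what such a cleanup has to tell apart in a tablespace
directory after an 8.4 to 8.5 migration, here is a small Python sketch.
It only lists candidates and deletes nothing; the function name is
hypothetical, and this is not the generated script itself.

```python
import os


def old_cluster_entries(tablespace_dir):
    """List entries that belong to the old, unversioned 8.4 layout.

    Numeric directories are per-database directories and PG_VERSION is
    the old version file. Version directories such as PG_8.5_201001061
    are never returned, so the new cluster's files are left alone.
    """
    found = []
    for name in sorted(os.listdir(tablespace_dir)):
        path = os.path.join(tablespace_dir, name)
        if name == "PG_VERSION" and os.path.isfile(path):
            found.append(name)
        elif name.isascii() and name.isdigit() and os.path.isdir(path):
            found.append(name)
    return found


if __name__ == "__main__":
    import sys

    # Usage: python list_old_entries.py /path/to/tablespace
    for entry in old_cluster_entries(sys.argv[1]):
        print(entry)
```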

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Bruce Momjian on

FYI, I consider all the issues below to be addressed (we did all but
#4), and pg_migrator will take advantage of these new facilities for 8.5.

---------------------------------------------------------------------------

Bruce Momjian wrote:
> pg_migrator has become more popular recently, so it seems time to look
> at some enhancements that would improve pg_migrator. None of these are
> required, but rather changes that would be nice to have:
>
> 1) Right now pg_migrator preserves relfilenodes for TOAST files because
> this is required for proper migration. Now that we have shown that
> strategically-placed global variables with a server-side function to set
> them is a viable solution, it would be nice to preserve all relfilenodes
> from the old server. This would simplify pg_migrator by no longer
> requiring place-holder relfilenodes or the renaming of TOAST files. A
> simpler solution would just be to allow TOAST table creation to
> automatically remove placeholder files and create specified relfilenodes
> via global variables.
>
> 2) Right now pg_migrator renames old tablespaces to .old, which fails
> if the tablespaces are on mount points. I have already received a
> report of such a failure. $PGDATA also has that issue, but that
> renaming has to be done by the user before pg_migrator is run, and only
> if they want to keep the same $PGDATA value after migration, i.e. no
> version-specific directory path. One idea we floated around was to have
> tablespaces use major version directory names under the tablespace
> directory so renaming would not be necessary. I could implement a
> pg_migrator --delete-old flag to cleanly delete the old 8.4 server files
> which are not in a version-specific subdirectory.
>
> 3) There is no easy way to analyze all databases. vacuumdb --analyze
> does analyze _and_ vacuum, which for an 8.4 to 8.5 migration does an
> unnecessary vacuum. Right now I recommend ANALYZE in every database,
> but it would be nice if there were a single command which did this.
>
> 4) I have implemented the ability to run pg_migrator --check on a live
> old server. However, pg_migrator uses information from controldata to
> check things, and it also needs xid information that is only available
> via pg_resetxlog -n (no update) to perform the migration. Unfortunately,
> pg_resetxlog -n cannot be run on a live server, so pg_migrator runs
> pg_controldata for --check and pg_resetxlog -n for real upgrades. It
> would simplify pg_migrator if I could run pg_resetxlog -n on a live
> server, but I can understand if people don't want to do that because the
> xid information reported on a live server is inaccurate.
>
> Comments?