From: Robert Haas on
On Fri, Jun 11, 2010 at 9:29 PM, Rafael Martinez
<r.m.guerrero(a)usit.uio.no> wrote:
> I am testing HS + SR in a system running 9.0beta2. What I am doing is
> just trying all kind of crazy combinations and see how the system
> handles them.

Thanks!

> One of the test I knew was going to fail was to create a tablespace in
> the master node with the directory used by the tablespace existing in
> the master and not in the standby node.
>
> What I didn't expect was such a serious consequence. Postgres crashed in
> the standby node and it refused to start until the directory needed by
> the tablespace was created also in the standby.
>
> I suppose there is not an easy way of fixing this, but at least it would
> be a good idea to update the documentation with some information about
> how to fix this error situation (hot-standby.html#HOT-STANDBY-CAVEATS
> will be a nice place to have this information)
>
> Another thing is that the HINT message in the logs was a little
> misleading. The server is down and it will not start without fixing the
> cause of the problem.
> - ----------------------------------------------------
> FATAL: �directory "/var/pgsql/ts_test" does not exist
> CONTEXT: �xlog redo create ts: 20177 "/var/pgsql/ts_test"
> LOG: �startup process (PID 10147) exited with exit code 1
> LOG: �terminating any other active server processes
> WARNING: �terminating connection because of crash of another server process
> DETAIL: �The postmaster has commanded this server process to roll back
> the current transaction and exit, because another server process exited
> abnormally and possibly corrupted shared memory.
> HINT: �In a moment you should be able to reconnect to the database and
> repeat your command.

I think the behavior is correct (what else would we do? we must be
able to replace the subsequent WAL records that use the new
tablespace) but I agree that the hint is a little misleading.
Ideally, it seems like we'd like to issue that hint if we're planning
to restart, but not otherwise. You get that same message, for
example, if the DBA performs an immediate shutdown.

I'm somewhat disinclined to try to address this for 9.0. We've had
this problem for a long time, and I'm not sure that the fact that it
can now happen in a slightly wider set of circumstances is enough
reason to engineer a solution so close to release time, nor am I sure
what that other solution would look like. But I'm open to other
opinions.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Martijn van Oosterhout on
On Sun, Jun 13, 2010 at 12:42:49PM -0400, Robert Haas wrote:
> I think the behavior is correct (what else would we do? we must be
> able to replace the subsequent WAL records that use the new
> tablespace) but I agree that the hint is a little misleading.
> Ideally, it seems like we'd like to issue that hint if we're planning
> to restart, but not otherwise. You get that same message, for
> example, if the DBA performs an immediate shutdown.

A bit of a comment from the sidelines: there's no particular reason why
the tablespaces on the master would need to match the tablespaces on
the slave. For a first cut it would seem to me that you should just be
able to ignore the tablespace commands on the slave. Not sure whether
that's easy or not though.

Have a nice day,
--
Martijn van Oosterhout <kleptog(a)svana.org> http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
> - Charles de Gaulle
From: Robert Haas on
On Sun, Jun 13, 2010 at 4:52 PM, Martijn van Oosterhout
<kleptog(a)svana.org> wrote:
> On Sun, Jun 13, 2010 at 12:42:49PM -0400, Robert Haas wrote:
>> I think the behavior is correct (what else would we do? we must be
>> able to replace the subsequent WAL records that use the new
>> tablespace) but I agree that the hint is a little misleading.
>> Ideally, it seems like we'd like to issue that hint if we're planning
>> to restart, but not otherwise. �You get that same message, for
>> example, if the DBA performs an immediate shutdown.
>
> A bit of a comment from the sidelines: there's no particular reason why
> the tablespaces on the master would need to match the tablespaces on
> the slave. For a first cut it would seem to me that you should just be
> able to ignore the tablespace commands on the slave. Not sure whether
> that's easy or not though.

It's not particularly easy, and it might also not be what you want.

Perhaps in an ideal world we would have some system for mapping
tablespaces on the master to tablespaces on the slave, but I doubt
it's worth the effort: the existing system is not terribly onerous.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Rafael Martinez on
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Haas wrote:
> On Fri, Jun 11, 2010 at 9:29 PM, Rafael Martinez

> I'm somewhat disinclined to try to address this for 9.0. We've had
> this problem for a long time, and I'm not sure that the fact that it
> can now happen in a slightly wider set of circumstances is enough
> reason to engineer a solution so close to release time, nor am I sure
> what that other solution would look like. But I'm open to other
> opinions.
>

A minimum and probably the only feasible thing for 9.0 will be to update
the documentation. We need an entry in the hot-standby caveats section
explaining that if you create a tablespace and the directory needed does
not exist in the the standby, the standby will shutdown itself and will
not be able to start until the directory is also created in the standby.

For a DBA point of view, two possible solutions could be:

1) PostgreSQL creates the directory needed for the tablespace if the
user running postgres has privileges to do so at the OS level.

2) The standby discovers that the directory needed does not exist and
pauses the recovering (without shutting down the server) in the WAL
record that creates the tablespace. The standby will check periodically
if the directory is created before starting the recovery process again.

With this the users will be able to continue using and running queries
in the standby node. In very busy systems with many changes, the standby
will fall behind quite a lot if the error is not discovered and fixed
quickly. But in many other systems the delay will not be a problem as
serious as the loss of access to the standby.

regards,
- --
Rafael Martinez, <r.m.guerrero(a)usit.uio.no>
Center for Information Technology Services
University of Oslo, Norway

PGP Public Key: http://folk.uio.no/rafael/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEUEARECAAYFAkwVSsgACgkQBhuKQurGihQ1HgCXQKdwOEHLkj7g6FpJG663NUiZ
2gCZAT70aIQZ5Wj3IqsLlY6n+leLruI=
=neA1
-----END PGP SIGNATURE-----

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Greg Smith on
Rafael Martinez wrote:
> A minimum and probably the only feasible thing for 9.0 will be to update
> the documentation. We need an entry in the hot-standby caveats section
> explaining that if you create a tablespace and the directory needed does
> not exist in the the standby, the standby will shutdown itself and will
> not be able to start until the directory is also created in the standby.
>

This is not a Hot Standby problem, and it's been documented since at
least http://www.postgresql.org/docs/8.2/static/warm-standby.html ; read
25.2.1 "Planning" in the current
http://developer.postgresql.org/pgdocs/postgres/warm-standby.html where
it's spelled out quite clearly.

It's a mixed blessing that it's now possible to actually get a
replicated server up so much more easily that people don't have to read
that particular document quite as carefully now and still get something
going. But if there's a documentation change to made, it should be
highlighting the warning already in that section better; it's not
something appropriate for the Hot Standby caveats. Since this is
clearly documented already, and there are bigger problems to worry about
for the current release, the real minimum action to perform here (and
the only one I would consider reasonable) is to change nothing at this
point for 9.0 here. I'm sorry you missed where this was covered, but
adding redundant documentation for basics like this invariably leads to
the multiple copies becoming out of sync with one another as changes are
made in the future.

> 1) PostgreSQL creates the directory needed for the tablespace if the
> user running postgres has privileges to do so at the OS level.
> 2) The standby discovers that the directory needed does not exist and
> pauses the recovering (without shutting down the server) in the WAL
> record that creates the tablespace. The standby will check periodically
> if the directory is created before starting the recovery process again.
>

Given that the idea behind a tablespace is that you want to relocate it
to a specific storage path, which may not map in the same way on the
standby, your first idea will never get implemented; it's not something
you want the server to guess about. As for the second, I would rather
see the standby go down--and hopefully set off some serious alarms for
the DBA who has screwed up here--than to stay up in a dysfunctional
polling state. The very serious mistake made is far more likely to be
discovered the way it's built right now.

I wouldn't be adverse to improving the error messages emitted when this
happens by the server to make it more obvious what's gone wrong in 9.1.
That's the only genuine improvement I'd see value in here, to cut down
on other people running into what you did and being as confused by it.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(a)2ndQuadrant.com www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers