From: Fujii Masao on
On Sat, Feb 13, 2010 at 1:10 AM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
> Are you thinking of a scenario where remove_command gets stuck, and
> prevents bgwriter from performing restartpoints while it's stuck?

Yes. If the archive is on a remote server and a network outage
happens, I'm afraid remove_command might get stuck.

> You
> have trouble if restore_command gets stuck like that as well, so I think
> we can require that the remove_command returns in a reasonable period of
> time, ie. in a few minutes.

Oh, you are right!

BTW, we need to note that the remove_command approach would be useless if
one archive is shared by multiple standbys. One standby might wrongly remove
an archived WAL file that is still required by another standby.
In this case, we need a job script that works out which archived
WAL files are no longer required by any standby, and removes them.
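
A minimal sketch of such a job script, in Python. It assumes that each
standby can report the name of the oldest archived WAL file it still needs
(how that is obtained is left open here), and that all files are on the same
timeline, so that file names compare in WAL order. The function name and the
standby names are hypothetical.

```python
import re

# A WAL segment file name is 24 hexadecimal digits. Other files in the
# archive (e.g. backup history files) are never selected for removal.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def removable_segments(archived, oldest_needed_by_standby):
    """Return archived WAL files that no standby still requires.

    archived: iterable of file names found in the shared archive.
    oldest_needed_by_standby: dict mapping a (hypothetical) standby name
        to the oldest WAL file name that standby still needs.
    """
    if not oldest_needed_by_standby:
        # Without a report from the standbys, remove nothing.
        return []
    # A file is safe to remove only if it is older than what the
    # slowest standby needs. Assumes a single timeline, so that
    # string order matches WAL order.
    cutoff = min(oldest_needed_by_standby.values())
    return sorted(f for f in archived if WAL_NAME.match(f) and f < cutoff)


if __name__ == "__main__":
    archive = [
        "000000010000000000000009",
        "00000001000000000000000A",
        "00000001000000000000000B",
        "00000001000000000000000C",
    ]
    needed = {
        "standby1": "00000001000000000000000C",
        "standby2": "00000001000000000000000A",
    }
    print(removable_segments(archive, needed))
```

Here standby2 is the slowest, so only 000000010000000000000009 is reported
as removable, even though standby1 no longer needs ...0A and ...0B.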

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Fujii Masao on
On Fri, Feb 12, 2010 at 2:29 AM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
> So the only major feature we're missing is the ability to clean up old
> files.

I found another missing feature in the new file-based log shipping (i.e.,
standby_mode is enabled and 'cp' is used as restore_command).

After the trigger file is found, the startup process with pg_standby
tries to replay all of the WAL files in both pg_xlog and the archive.
So, when the primary fails, if the latest WAL file in the primary's
pg_xlog can still be read, we can prevent data loss by copying it to
the standby's pg_xlog before creating the trigger file.
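
As a rough sketch, the pg_standby failover procedure described above could
be scripted like this (Python). The directory arguments, the trigger file
location, and the function name are hypothetical, and the primary's pg_xlog
is assumed to be readable from the standby host, for example over a network
mount.

```python
import shutil
from pathlib import Path


def failover_with_latest_wal(primary_xlog, standby_xlog, latest_segment, trigger_file):
    """Copy the primary's latest WAL file, then trigger failover.

    The order matters: the WAL file has to be in the standby's pg_xlog
    before the trigger file appears, so that it gets replayed.
    Returns True if the latest WAL file could be copied.
    """
    src = Path(primary_xlog) / latest_segment
    if src.is_file():
        # The latest WAL file is readable: copy it so its contents are not lost.
        shutil.copy2(src, Path(standby_xlog) / latest_segment)
        copied = True
    else:
        # The primary's pg_xlog is unreadable: fail over anyway.
        copied = False
    # Creating the trigger file is the last step.
    Path(trigger_file).touch()
    return copied
```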

On the other hand, the startup process in standby mode doesn't
replay the WAL files in pg_xlog after the trigger file is found. So
failover always causes data loss, even if the latest WAL file can
be read from the primary. And if the latest WAL file is copied to the
archive instead, it can be replayed, but a PANIC error would occur
because the file is only partially filled.

Should we remove this restriction?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Simon Riggs on
On Wed, 2010-03-17 at 12:35 +0200, Heikki Linnakangas wrote:

> Looking into this, I realized that we have a bigger problem...

A lot of this would be easier if you do the docs first, then work
through the problems. The new system is more complex, since it has two
modes rather than one and also multiple processes and a live connection.
The number of failure cases must be higher than previously.

Documenting how it is supposed to work in the event of failure will help
everyone check those and comment on them.

--
Simon Riggs www.2ndQuadrant.com



From: Fujii Masao on
On Wed, Mar 17, 2010 at 7:35 PM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
> Fujii Masao wrote:
>> I found another missing feature in the new file-based log shipping (i.e.,
>> standby_mode is enabled and 'cp' is used as restore_command).
>>
>> After the trigger file is found, the startup process with pg_standby
>> tries to replay all of the WAL files in both pg_xlog and the archive.
>> So, when the primary fails, if the latest WAL file in the primary's
>> pg_xlog can still be read, we can prevent data loss by copying it to
>> the standby's pg_xlog before creating the trigger file.
>>
>> On the other hand, the startup process in standby mode doesn't
>> replay the WAL files in pg_xlog after the trigger file is found. So
>> failover always causes data loss, even if the latest WAL file can
>> be read from the primary. And if the latest WAL file is copied to the
>> archive instead, it can be replayed, but a PANIC error would occur
>> because the file is only partially filled.
>>
>> Should we remove this restriction?
>
> Looking into this, I realized that we have a bigger problem related to
> this. Although streaming replication stores the streamed WAL files in
> pg_xlog, so that they can be re-replayed after a standby restart without
> connecting to the master, we don't try to replay those either. So if you
> restart standby, it will fail to start up if the WAL it needs can't be
> found in archive or by connecting to the master. That must be fixed.

I agree that this is a bigger problem. Since the standby always starts
walreceiver before replaying any WAL files in pg_xlog, walreceiver tries
to receive the WAL files following the REDO starting point even if they
are already in pg_xlog. IOW, the same WAL files might be shipped
from the primary to the standby many times. This behavior is wasteful
and should be addressed.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


From: Fujii Masao on
Sorry for the delay.

On Fri, Mar 19, 2010 at 8:37 PM, Heikki Linnakangas
<heikki.linnakangas(a)enterprisedb.com> wrote:
> Here's a patch I've been playing with.

Thanks! I'm reading the patch.

> The idea is that in standby mode,
> the server keeps trying to make progress in the recovery by:
>
> a) restoring files from archive
> b) replaying files from pg_xlog
> c) streaming from master
>
> When recovery reaches an invalid WAL record, typically caused by a
> half-written WAL file, it closes the file and moves to the next source.
> If an error is found in a file restored from archive or in a portion
> just streamed from master, however, a PANIC is thrown, because it's not
> expected to have errors in the archive or in the master.

But in the current behavior (v8.4 and earlier), recovery ends normally
when an invalid record is found in an archived WAL file. Otherwise,
the server would never be able to start normal processing when an
archived file is corrupted for some reason. So, that invalid record
should not be treated as a PANIC if the server is not in standby mode
or the trigger file has been created. Thoughts?
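
The rule proposed above can be sketched as a small decision function
(Python, for illustration only; the real logic lives in C around
ReadRecord(), and all names below are hypothetical). It assumes that in
standby mode without a trigger file the patch's behavior is kept, and that
in every other case recovery simply ends at the invalid record, as in 8.4.

```python
def invalid_record_action(source, standby_mode, triggered):
    """Decide what to do when recovery hits an invalid WAL record.

    source: where the record was read from: 'archive', 'pg_xlog' or 'stream'.
    standby_mode: True if the server is running in standby mode.
    triggered: True if the trigger file has been created.
    """
    if source not in ("archive", "pg_xlog", "stream"):
        raise ValueError("unknown WAL source: %r" % (source,))
    if not standby_mode or triggered:
        # Normal archive recovery, or failover requested:
        # end recovery at the invalid record, as 8.4 does.
        return "end_recovery"
    if source in ("archive", "stream"):
        # Patch behavior: errors are not expected in the archive or
        # in WAL just streamed from the master.
        return "panic"
    # Half-written file in pg_xlog: close it and move to the next source.
    return "try_next_source"
```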

When I tested the patch, the following PANIC error was thrown during
normal archive recovery. This seems to derive from the above change.
The detailed error sequence:
1. In ReadRecord(), emode was set to PANIC after 00000001000000000000000B
   was read.
2. The server then tried to read 00000001000000000000000C, which should
   contain the contrecord, using that emode (= PANIC). But since
   00000001000000000000000C did not exist, a PANIC error was thrown.

-----------------
LOG: restored log file "00000001000000000000000B" from archive
cp: cannot stat `../data.arh/00000001000000000000000C': No such file or directory
PANIC: could not open file "pg_xlog/00000001000000000000000C" (log file 0, segment 12): No such file or directory
LOG: startup process (PID 17204) was terminated by signal 6: Aborted
LOG: terminating any other active server processes
-----------------
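
For anyone matching the file name to the "log file 0, segment 12" in the
PANIC message: a WAL segment file name is 24 hexadecimal digits, 8 each for
the timeline ID, the log file number, and the segment number. A small Python
sketch of the decoding (the function name is hypothetical):

```python
def parse_wal_name(name):
    """Split a WAL segment file name into (timeline, log, segment)."""
    if len(name) != 24:
        raise ValueError("WAL file name must be 24 hex digits: %r" % (name,))
    timeline = int(name[0:8], 16)   # timeline ID
    log = int(name[8:16], 16)       # log file number
    seg = int(name[16:24], 16)      # segment number within the log file
    return timeline, log, seg


if __name__ == "__main__":
    print(parse_wal_name("00000001000000000000000C"))  # (1, 0, 12)
```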

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
