From: Jaime Casanova on
Hi,

I'm starting to try Hot Standby & Streaming Replication, so I set up
a replication:

1) Install master server with regression database
2) Start WAL archive (archive_mode=on, archive_command='cp %p
/usr/local/pgsql/wal_archive/%f')
3) select pg_start_backup('standby test');
4) cp -R /usr/local/pgsql/9.0/data /usr/local/pgsql/9.0slave/data
5) select pg_stop_backup();

At this point I checked the wal_archive directory:
"""
postgres@casanova14:/usr/local/pgsql/9.0$ ls ../wal_archive/
000000010000000000000003 000000010000000000000004
000000010000000000000004.00000020.backup
"""
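
For anyone reproducing this, steps 3 to 5 could be scripted roughly as
follows. This is only a sketch: the paths are the ones from this mail, but
the psql invocation, the variable names and the DRY_RUN switch are
assumptions added for illustration.

```shell
#!/bin/sh
# Sketch of steps 3-5: file-system base backup of the master.
# MASTER_DATA / STANDBY_DATA default to the paths used in this mail.
# DRY_RUN and the psql options are assumptions, not part of the original setup.
MASTER_DATA=${MASTER_DATA:-/usr/local/pgsql/9.0/data}
STANDBY_DATA=${STANDBY_DATA:-/usr/local/pgsql/9.0slave/data}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mark the backup start, copy the data directory, mark the backup end.
base_backup() {
    run psql -U postgres -c "select pg_start_backup('standby test');" &&
    run cp -R "$MASTER_DATA" "$STANDBY_DATA" &&
    run psql -U postgres -c "select pg_stop_backup();"
}
```

Calling base_backup with the default DRY_RUN=1 only prints the three
commands; setting DRY_RUN=0 would actually execute them.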

6) Started standby recovery (archive_mode=off, standby_mode=on,
primary_conninfo = 'host=127.0.0.1 port=5432 user=postgres')
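
The standby settings from step 6 could be written out like this. It is a
sketch under assumptions: restore_command is not quoted anywhere in this
mail and is only inferred from the "cp" messages in the log, and the
function name write_recovery_conf is made up. archive_mode=off would go in
the standby's postgresql.conf rather than here.

```shell
#!/bin/sh
# Sketch: write a recovery.conf for the standby described in step 6.
# restore_command is an ASSUMPTION inferred from the cp errors in the log.
write_recovery_conf() {
    # $1 is the standby's data directory
    cat > "$1/recovery.conf" <<'EOF'
standby_mode = 'on'
primary_conninfo = 'host=127.0.0.1 port=5432 user=postgres'
restore_command = 'cp /usr/local/pgsql/wal_archive/%f %p'
EOF
}
```

For example, write_recovery_conf /usr/local/pgsql/9.0slave/data would place
the file in the standby's data directory before it is started.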

I waited a little and checked the logs:
"""
LOG: database system was interrupted; last known up at 2010-04-09 14:48:16 ECT
LOG: entering standby mode
LOG: restored log file "000000010000000000000004" from archive
LOG: redo starts at 0/4000020
LOG: consistent recovery state reached at 0/5000000
LOG: database system is ready to accept read only connections
LOG: restored log file "000000010000000000000005" from archive
cp: cannot stat `/usr/local/pgsql/wal_archive/000000010000000000000006':
No such file or directory
LOG: unexpected pageaddr 0/2000000 in log file 0, segment 6, offset 0
cp: cannot stat `/usr/local/pgsql/wal_archive/000000010000000000000006':
No such file or directory
LOG: streaming replication successfully connected to primary
"""

mmm... are we waiting for a WAL file that doesn't exist?

7) I then restarted the standby server:
"""
LOG: received smart shutdown request
FATAL: terminating walreceiver process due to administrator command
LOG: shutting down
LOG: database system is shut down

LOG: database system was interrupted while in recovery at log time
2010-04-09 15:06:23 ECT
HINT: If this has occurred more than once some data might be
corrupted and you might need to choose an earlier recovery target.
LOG: entering standby mode
cp: cannot stat `/usr/local/pgsql/wal_archive/000000010000000000000006':
No such file or directory
LOG: invalid record length at 0/6000080
cp: cannot stat `/usr/local/pgsql/wal_archive/000000010000000000000006':
No such file or directory
"""

8) I initialized the pgbench tables (bin/pgbench -i), which created the
missing WAL file.

After that the standby could connect to the primary, and some minutes
later it could accept connections:
"""
LOG: streaming replication successfully connected to primary
FATAL: the database system is starting up
FATAL: the database system is starting up
FATAL: the database system is starting up
LOG: redo starts at 0/6000080
LOG: consistent recovery state reached at 0/60000A0
FATAL: the database system is starting up
LOG: database system is ready to accept read only connections
"""

But my main concern is: why was it asking for
"000000010000000000000006"? Is this normal? Is this the standby's way of
saying "I'm working but I have nothing to do"?
And when that happens after a standby restart, is it normal that I have to
wait until the file is created before it can accept connections?
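
As a side note on where that name comes from: as far as I understand, a WAL
file name is the timeline ID, the log ID and the segment number, each
printed as eight hex digits, so the location 0/6000080 from the log falls
in segment 6 of timeline 1. A small sketch of that mapping, assuming the
default 16 MB segment size (the helper name wal_segment_name is made up
for illustration):

```shell
#!/bin/sh
# Sketch: map a WAL location such as 0/6000080 to its segment file name.
# Assumes the default 16 MB (0x1000000 bytes) segment size.
# Usage: wal_segment_name <timeline (decimal)> <location as logid/offset in hex>
wal_segment_name() (
    timeline=$1
    logid=${2%%/*}      # hex digits before the slash
    offset=${2##*/}     # hex digits after the slash
    seg=$(( 0x$offset / 0x1000000 ))   # which 16 MB segment the offset is in
    printf '%08X%08X%08X\n' "$timeline" "$(( 0x$logid ))" "$seg"
)

wal_segment_name 1 0/6000080   # prints 000000010000000000000006
```

The same calculation for "redo starts at 0/4000020" gives
000000010000000000000004, which matches the first file restored from the
archive.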

Sorry if these questions sound very simple, but I haven't been following
all the design details :)

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers