Streaming replication, and walsender during recovery [PgSql]

From: Heikki Linnakangas on 28 Jan 2010 13:49

Simon Riggs wrote:
> I'm a little worried the feature set of streaming rep isn't any better
> than what we have already.

Huh? Are you thinking of the "Record-based Log Shipping" described in
the manual, using a program to poll pg_xlogfile_name_offset() in a tight
loop, as a replacement for streaming replication? First of all, that
requires a big chunk of custom development, so it's a bit of a stretch
to say we have it already. Secondly, with that method, the standby will
still be replaying the WAL one file at a time, which makes a difference
with Hot Standby.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
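
For readers unfamiliar with the polling approach mentioned above, here is
a minimal sketch of such a loop. It is an illustration only, not code from
Skytools or from the manual. The query uses pg_xlogfile_name_offset() and
pg_current_xlog_location() as they existed in releases of that era;
fetch_position and new_wal_ranges are hypothetical names, and the caller
is assumed to supply a function that runs the query and returns one
(file_name, file_offset) row. Completed segments are assumed to be shipped
separately by archive_command, so the poller only tracks the tail of the
current file.

```python
import time
from typing import Callable, Iterator, Optional, Tuple

# Query to run on the primary: which WAL file holds the current write
# position, and how far into that file it is.
POSITION_SQL = (
    "SELECT file_name, file_offset "
    "FROM pg_xlogfile_name_offset(pg_current_xlog_location())"
)


def new_wal_ranges(
    fetch_position: Callable[[], Tuple[str, int]],
    polls: int,
    interval: float = 0.0,
) -> Iterator[Tuple[str, int, int]]:
    """Yield (file_name, start, end) byte ranges written since the last poll.

    fetch_position is a hypothetical callable that executes POSITION_SQL
    and returns (file_name, file_offset).
    """
    last_name: Optional[str] = None
    last_offset = 0
    for _ in range(polls):
        name, offset = fetch_position()
        # On a segment switch, start again from the beginning of the new
        # file; the finished segment is assumed to arrive via archiving.
        start = last_offset if name == last_name else 0
        if offset > start:
            yield (name, start, offset)
        last_name, last_offset = name, offset
        if interval > 0:
            time.sleep(interval)
```

Each yielded range is what a shipping program would copy to the standby.
The standby still only sees a record once the poller has noticed it, and
anything written between polls is exposed to loss if the primary fails.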

From: Simon Riggs on 28 Jan 2010 13:58

On Thu, 2010-01-28 at 20:49 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > I'm a little worried the feature set of streaming rep isn't any better
> > than what we have already.
>
> Huh? Are you thinking of the "Record-based Log Shipping" described in
> the manual, using a program to poll pg_xlogfile_name_offset() in a tight
> loop, as a replacement for streaming replication? First of all, that
> requires a big chunk of custom development, so it's a bit of a stretch
> to say we have it already.

It's been part of Skytools for years now...

> Secondly, with that method, the standby will
> still be replaying the WAL one file at a time, which makes a difference
> with Hot Standby.

I'm not attempting to diss Streaming Rep, or anyone involved. What has
been done is good internal work. I am pointing out and requesting that
we should have a little more added before we stop for this release.

--
Simon Riggs www.2ndQuadrant.com

From: Heikki Linnakangas on 28 Jan 2010 14:00

Simon Riggs wrote:
> On Thu, 2010-01-28 at 10:48 -0500, Tom Lane wrote:
>> Fujii Masao <masao.fujii(a)gmail.com> writes:
>>> How about just making a restore_command copy the WAL files as the
>>> normal one (e.g., 0000...) instead of a pg_xlog/RECOVERYXLOG?
>>> Though we need to worry about deleting them, we can easily leave
>>> the task to the bgwriter.
>> The reason for doing it that way was to limit disk space usage during
>> a long restore. I'm not convinced we can leave the task to the bgwriter
>> --- it shouldn't be deleting anything at that point.
>
> I think "bgwriter" means RemoveOldXlogFiles(), which would normally
> clear down files at checkpoint. If that was added to the end of
> RecoveryRestartPoint() to do roughly the same job then it could
> potentially work.

SR added a RemoveOldXlogFiles() call to CreateRestartPoint().

(Since 8.4, RecoveryRestartPoint() just writes the location of the
checkpoint record in shared memory, but doesn't actually perform the
restartpoint; bgwriter does that in CreateRestartPoint()).

> However, since not every checkpoint is a restartpoint we might easily
> end up with significantly more WAL files on the standby than would
> normally be there when it would be a primary. Not sure if that is an
> issue in this case, but we can't just assume we can store all files
> needed to restart the standby on the standby itself, in all cases. That
> might be an argument to add a restartpoint_segments parameter, so we can
> trigger restartpoints on WAL volume as well as time. But even that would
> not put an absolute limit on the number of WAL files.

I think it is a pretty important safety feature that we keep all the WAL
around that's needed to recover the standby. To avoid an out-of-disk-space
situation, it's probably enough in practice to set checkpoint_timeout
small enough in the standby to trigger restartpoints often enough.

At the moment, we do retain streamed WAL as long as it's needed, but not
the WAL restored from archive.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
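
The disk-space trade-off discussed above can be illustrated with a toy
model. This is a sketch under simplifying assumptions, not the server's
actual logic: WAL arrives one segment at a time, a restartpoint is only
possible at certain segments, and each restartpoint removes every segment
older than the one it occurred in. The function name peak_retained and
the segment numbering are hypothetical.

```python
from typing import Iterable, List


def peak_retained(total_segments: int, restartpoint_segments: Iterable[int]) -> int:
    """Return the largest number of WAL segments on disk at any one time.

    Segments 1..total_segments arrive in order. Whenever the newest
    segment is one where a restartpoint happens, all older segments are
    removed (a simplification of what RemoveOldXlogFiles() would do).
    """
    restartpoints = set(restartpoint_segments)
    on_disk: List[int] = []
    peak = 0
    for seg in range(1, total_segments + 1):
        on_disk.append(seg)
        peak = max(peak, len(on_disk))
        if seg in restartpoints:
            # Keep only the segment holding the restartpoint and newer ones.
            on_disk = [s for s in on_disk if s >= seg]
    return peak


# Frequent restartpoints keep the peak low; rare ones let WAL pile up.
frequent = peak_retained(12, [3, 6, 9, 12])
rare = peak_retained(12, [6, 12])
never = peak_retained(12, [])
```

In this model the peak depends only on the gap between restartpoints, so
shortening checkpoint_timeout lowers it, but nothing caps it if
restartpoints stop happening.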

From: Simon Riggs on 28 Jan 2010 14:13

On Thu, 2010-01-28 at 21:00 +0200, Heikki Linnakangas wrote:
> > However, since not every checkpoint is a restartpoint we might easily
> > end up with significantly more WAL files on the standby than would
> > normally be there when it would be a primary. Not sure if that is an
> > issue in this case, but we can't just assume we can store all files
> > needed to restart the standby on the standby itself, in all cases. That
> > might be an argument to add a restartpoint_segments parameter, so we can
> > trigger restartpoints on WAL volume as well as time. But even that would
> > not put an absolute limit on the number of WAL files.
>
> I think it is a pretty important safety feature that we keep all the
> WAL around that's needed to recover the standby. To avoid an
> out-of-disk-space situation, it's probably enough in practice to set
> checkpoint_timeout small enough in the standby to trigger
> restartpoints often enough.

Hmm, I'm sorry but that's bogus. Retaining so much WAL that we are
strongly in danger of blowing disk space is not what I would call a
safety feature. Since there is no way to control or restrain the number
of files for certain, that approach seems fatally flawed. Reducing
checkpoint_timeout is the opposite of what you would want to do for
performance.

--
Simon Riggs www.2ndQuadrant.com

From: Heikki Linnakangas on 28 Jan 2010 14:16

Joshua D. Drake wrote:
> ...if with SR the entire log must be written before it streams to the slaves.

No.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
