Streaming replication status [PgSql]

Prev: [HACKERS] synchronized snapshots
Next: synchronized snapshots

From: Fujii Masao on 13 Jan 2010 00:34

On Tue, Jan 12, 2010 at 10:16 AM, Bruce Momjian <bruce(a)momjian.us> wrote:
> I am concerned that knowledge of this new read-only replication user
> would have to be spread all over the backend code, which is really not
> something we should be doing at this stage in 8.5 development. I am
> also thinking such a special user might fall out of work on mandatory
> access control, so maybe we should just require super-user for 8.5 and
> revisit this for 8.6.

OK. I leave that code as it is. If the majority feel it's overkill to
require a superuser privilege when authenticating the standby, we can
just drop it later.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Greg Smith on 13 Jan 2010 03:47

Stefan Kaltenbrunner wrote:
> so is there an actually concrete proposal of _what_ interals to expose? '

The pieces are coming together...summary:

-Status quo: really bad, but could probably ship anyway because
existing PITR is no better and people manage to use it
-Add slave pg_current_xlog_location() and something like
pg_standby_received_xlog_location(): Much better, gets rid of the worst
issues here.
-Also add pg_standbys_xlog_location() on the master: while they could
live without it, this really helps out the "alert/monitor" script writer
whose use cases keep popping up here.

Details...the original idea from Fujii was:

"I'm thinking something like pg_standbys_xlog_location() [on the
primary] which returns
one row per standby servers, showing pid of walsender, host name/
port number/user OID of the standby, the location where the standby
has written/flushed WAL. DBA can measure the gap from the
combination of pg_current_xlog_location() and pg_standbys_xlog_location()
via one query on the primary."

After some naming quibbles and questions about what direction that
should happen in, Tom suggested the initial step here is:

"It seems to me that we should have at least two functions available
on the slave: latest xlog location received and synced to disk by
walreceiver (ie, we are guaranteed to be able to replay up to here);
and latest xlog location actually replayed (ie, the state visible
to queries on the slave). The latter perhaps could be
pg_current_xlog_location()."

So there's the first two of them: on the slave,
pg_current_xlog_location() giving the latest location replayed, and a
new one named something like pg_standby_received_xlog_location(). If
you take the position that an unreachable standby does provide answers
to these questions too (you just won't like them), this pair might be
sufficient to ship.

To help a lot at dealing with all the error situations where the standby
isn't reachable and segments are piling up (possibly leading to full
disk), the next figure that seems to answer the most questions is asking
the primary "what's the location of the last WAL segment file in the
pile of ones to be archived/distributed that has been requested (or
processed if that's the easier thing to note) by the standby?". That's
what is named pg_standbys_xlog_location() in the first paragraph I
quoted. If you know enough to identify that segment file on disk, you
can always look at its timestamp (and the ones on the rest of the files
in that directory) in a monitoring script to turn that information into
segments or a time measurement instead--xlog segments are nicely ordered
after all.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(a)2ndQuadrant.com www.2ndQuadrant.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Fujii Masao on 14 Jan 2010 03:33

On Wed, Jan 13, 2010 at 5:47 PM, Greg Smith <greg(a)2ndquadrant.com> wrote:
> The pieces are coming together...summary:

Thanks for the summary!

> -Also add pg_standbys_xlog_location() on the master: while they could live without it, this really helps out the "alert/monitor" script writer whose use cases keep popping up here.
>
> Details...the original idea from Fujii was:
>
> "I'm thinking something like pg_standbys_xlog_location() [on the primary] which returns
> one row per standby servers, showing pid of walsender, host name/
> port number/user OID of the standby, the location where the standby
> has written/flushed WAL. DBA can measure the gap from the
> combination of pg_current_xlog_location() and pg_standbys_xlog_location()
> via one query on the primary."

This function is useful but not essential for troubleshooting, I think.
So I'd like to postpone it.

> "It seems to me that we should have at least two functions available
> on the slave: latest xlog location received and synced to disk by
> walreceiver (ie, we are guaranteed to be able to replay up to here);
> and latest xlog location actually replayed (ie, the state visible
> to queries on the slave). The latter perhaps could be
> pg_current_xlog_location()."
>
> So there's the first two of them: on the slave, pg_current_xlog_location()
> giving the latest location replayed, and a new one named something like
> pg_standby_received_xlog_location(). If you take the position that an
> unreachable standby does provide answers to these questions too (you just
> won't like them), this pair might be sufficient to ship.

Done.

git://git.postgresql.org/git/users/fujii/postgres.git
branch: replication

I added two new functions;

(1) pg_last_xlog_receive_location() reports the last WAL location received
and synced by walreceiver. If streaming replication is still in progress
this will increase monotonically. If streaming replication has completed
then this value will remain static at the value of the last WAL record
received and synced. When the server has been started without a streaming
replication then the return value will be InvalidXLogRecPtr (0/0).

(2) pg_last_xlog_replay_location() reports the last WAL location replayed
during recovery. If recovery is still in progress this will increase
monotonically. If recovery has completed then this value will remain
static at the value of the last WAL record applied. When the server has
been started normally without a recovery then the return value will be
InvalidXLogRecPtr (0/0).

Since it's somewhat odd for me that pg_current_xlog_location() reports the
WAL replay location, I didn't do that. But if the majority feel that it's sane,
I'll merge pg_last_xlog_replay_location() into pg_current_xlog_location().

Thought? Better name?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Greg Smith on 14 Jan 2010 23:07

Fujii Masao wrote:
>> "I'm thinking something like pg_standbys_xlog_location() [on the primary] which returns
>> one row per standby servers, showing pid of walsender, host name/
>> port number/user OID of the standby, the location where the standby
>> has written/flushed WAL. DBA can measure the gap from the
>> combination of pg_current_xlog_location() and pg_standbys_xlog_location()
>> via one query on the primary."
>>
>
> This function is useful but not essential for troubleshooting, I think.
> So I'd like to postpone it.
>

Sure; in a functional system where primary and secondary are both up,
you can assemble the info using the new functions you just added, so
this other one is certainly optional. I just took a brief look at the
code of the features you added, and it looks like it exposes the minimum
necessary to make this whole thing possible to manage. I think it's OK
if you postpone this other bit, more important stuff for you to work on.

So: the one piece of information I though was most important to expose
here at an absolute minimum is there now. Good progress. The other
popular request that keeps popping up here is providing an easy way to
see how backlogged the archive_command is, to make it easier to monitor
for out of disk errors that might prove catastrophic to replication.

I just spent some time looking through the WAL/archiving code in that
context. It looks to me that that this information isn't really stored
anywhere right now. The only thing that knows what segment is currently
queued up to copy over is pgarch_ArchiverCopyLoop via its call to
pgarch_readyXlog. Now, this is a pretty brute-force piece of code: it
doesn't remember its previous work at all, it literally walks the
archive_status directory looking for *.ready files that have names that
look like xlog files, then returns the earliest. That unfortunately
means that it's not even thinking in the same terms as all these other
functions, which are driven by the xlog_location advancing, and then the
filename is computed from that. All you've got is the filename at this
point, and it's not even guaranteed to be real--you could easily fool
this code if you dropped an inappropriately named file into that directory.

I could easily update this code path to save the name of the last
archived file in memory while all this directory scanning is going on
anyway, and then provide a UDF to expose that bit of information. The
result would need to have documentation that disclaims it like this:

pg_last_archived_xlogfile() text: Get the name of the last file the
archive_command [tried to|successfully] archived since the server was
started. If archiving is disabled or no xlog files have become ready to
archive since startup, a blank line will be returned. It is possible
for this function to return a result that does not reflect an actual
xlogfile if files are manually added to the server's archive_status
directory.

I'd find this extremely handy as a hook for monitoring scripts that want
to watch the server but don't have access to the filesystem directly,
even given those limitations. I'd prefer to have the "tried to"
version, because it will populate with the name of the troublesome file
it's stuck on even if archiving never gets its first segment delivered.

I'd happily write a patch to handle all that if I thought it would be
accepted. I fear that the whole approach will be considered a bit too
hackish and get rejected on that basis though. Not really sure of a
"right" way to handle this though. Anything better is going to be more
complicated because it requires passing more information into the
archiver, with little gain for that work beyond improving the quality of
this diagnostic routine. And I think most people would find what I
described above useful enough.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(a)2ndQuadrant.com www.2ndQuadrant.com

From: Simon Riggs on 14 Jan 2010 23:20

On Thu, 2010-01-14 at 23:07 -0500, Greg Smith wrote:

> pg_last_archived_xlogfile() text: Get the name of the last file the
> archive_command [tried to|successfully] archived since the server was
> started. If archiving is disabled or no xlog files have become ready
> to archive since startup, a blank line will be returned.

OK

> It is possible for this function to return a result that does not
> reflect an actual xlogfile if files are manually added to the server's
> archive_status directory.

> I'd find this extremely handy as a hook for monitoring scripts that
> want to watch the server but don't have access to the filesystem
> directly, even given those limitations. I'd prefer to have the "tried
> to" version, because it will populate with the name of the troublesome
> file it's stuck on even if archiving never gets its first segment
> delivered.
>
> I'd happily write a patch to handle all that if I thought it would be
> accepted. I fear that the whole approach will be considered a bit too
> hackish and get rejected on that basis though. Not really sure of a
> "right" way to handle this though. Anything better is going to be
> more complicated because it requires passing more information into the
> archiver, with little gain for that work beyond improving the quality
> of this diagnostic routine. And I think most people would find what I
> described above useful enough.

Yes, please write it. It's separate from SR, so will not interfere.

--
Simon Riggs www.2ndQuadrant.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Prev: [HACKERS] synchronized snapshots
Next: synchronized snapshots