From: Greg Smith on
Tom Lane wrote:
> If we were actually going in this direction we'd
> want to write a much better WAL-text-dump tool than we have, and then
> in principle somebody could sanitize the text output before shipping
> it off. But going through a large volume of data that way could be
> pretty impractical. Also, we (or at least I) have nearly zip experience
> with trying to debug problems by examining WAL, so it's not real clear
> to me which details might be important.
>

There's another interesting thing about moving in this direction too.
Systems that have WAL archiving setup that run into a problem could end
up being a much richer source for historical analysis of how the system
got into the bad state than is available right now. Typically those can
have longer histories available than you'll find on a primary that's
recycling segments all the time.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(a)2ndQuadrant.com www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Stephen Frost on
* Tom Lane (tgl(a)sss.pgh.pa.us) wrote:
> If we were actually going in this direction we'd
> want to write a much better WAL-text-dump tool than we have, and then
> in principle somebody could sanitize the text output before shipping
> it off. But going through a large volume of data that way could be
> pretty impractical. Also, we (or at least I) have nearly zip experience
> with trying to debug problems by examining WAL, so it's not real clear
> to me which details might be important.

Just an off-the-wall thought, but, would it be possible to have a tool
which read WAL backwards and compared entries in the WAL against entries
on disk? I realize that you'd only see one version of a particular
block and then have to skip any updates which are earlier than it, but
it seems like you could cover a pretty large chunk of the recent changes
to the database using this approach..

Thanks,

Stephen
From: Greg Stark on
On Tue, Jun 8, 2010 at 2:26 AM, Stephen Frost <sfrost(a)snowman.net> wrote:
> Just an off-the-wall thought, but, would it be possible to have a tool
> which read WAL backwards and compared entries in the WAL against entries
> on disk? �I realize that you'd only see one version of a particular
> block and then have to skip any updates which are earlier than it, but
> it seems like you could cover a pretty large chunk of the recent changes
> to the database using this approach..

I assume you mean back out the changes incrementally until you find a
full_page_write and see if it matches? And continue comparing with
full_page_writes once per checkpoint? I don't think the WAL has enough
information to replay backwards though. For example vacuum cleanup
records just list the tids to remove. They don't have the contents to
replace there.



--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Jan Wieck on
On 6/8/2010 8:27 AM, Greg Stark wrote:
> On Tue, Jun 8, 2010 at 2:26 AM, Stephen Frost <sfrost(a)snowman.net> wrote:
>> Just an off-the-wall thought, but, would it be possible to have a tool
>> which read WAL backwards and compared entries in the WAL against entries
>> on disk? I realize that you'd only see one version of a particular
>> block and then have to skip any updates which are earlier than it, but
>> it seems like you could cover a pretty large chunk of the recent changes
>> to the database using this approach..
>
> I assume you mean back out the changes incrementally until you find a
> full_page_write and see if it matches? And continue comparing with
> full_page_writes once per checkpoint? I don't think the WAL has enough
> information to replay backwards though. For example vacuum cleanup
> records just list the tids to remove. They don't have the contents to
> replace there.

You can't back out changes. WAL does not contain before images.


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Stephen Frost on
* Greg Stark (gsstark(a)mit.edu) wrote:
> I assume you mean back out the changes incrementally until you find a
> full_page_write and see if it matches?

To be honest, you're already assuming I know more about how this all
works than I do. :) The gist of my thought was simply- we write out
block changes to the WAL, including data in many cases. If we were to
look at the very end of the WAL, at the last piece of data written
there, and the data files have supposedly been flushed, then what's in
the WAL at that point should match what's in the data files, right? If
it doesn't, that'd be bad.

> And continue comparing with
> full_page_writes once per checkpoint?

If we could only do it when there's a full page write, then perhaps that
would work as well, but I thought we tracked them at a lower level. In
any case, the idea is the same- compare what's in WAL to what's supposed
to be on disk, and alarm whenever there's a clear error.

> I don't think the WAL has enough
> information to replay backwards though. For example vacuum cleanup
> records just list the tids to remove. They don't have the contents to
> replace there.

Right, you couldn't actually move the database backwards in time using
this tool (because we only write out new data, we don't write out what
was in that block/page before the write)- that isn't the idea or intent.
It would just be a tool that someone could run against a database where
they've detected corruption (or, I dunno, more frequently, to perhaps
catch corruption faster?), to see if the problem is a PG bug or a
hardware/kernel/etc issue. In fact, if you could somehow do this
against a backup that's been taken using pg_start_backup/pg_stop_backup,
that would be pretty awesome.

I know that if such a tool existed, I'd be happy to run it as part of my
regular backup routines- I *always* have all the WALs from my last
backup to my next backup (and typically farther back than that, eg: if I
run full backups weekly, I'll have 4 full backups + all 4 weeks of WALs,
to be able to replay back to any point in the month..).

The big question that I have is- would this actually be productive?
Would it actually be able to catch hardware corruption or help at all
with PG bugs? Those are the things I'm not really sure about.

Thanks,

Stephen