From: August Zajonc on
Gregory Stark wrote:
> "Zeugswetter Andreas ADI SD" <ZeugswetterA(a)spardat.at> writes:
>
>>> First we must run the query in serializable mode and replace
>>> the snapshot with a synthetic one, which defines visibility
>>> at the start of the desired transaction
>> We could use something that controls "global xmin".
>> It would ensure that global xmin does not advance beyond
>> what still needs to be visible. This would probably be a
>> sliding time window, or a fixed point in time that is
>> released by the dba/user.
>
> Well there's another detail you have to cover aside from rolling back your
> xmin. You have to find the rest of the snapshot including knowing what other
> transactions were in-progress at the time you want to flash back to.
>
> If you just roll back xmin and set xmax to the same value you'll get a
> consistent view of the database but it may not match a view that was ever
> current. That is, some of the transactions after the target xmin may have
> committed before that xmin. So there was never a time in the database when
> they were invisible but your new xmin was visible.
>
> [...]
> Incidentally this is one of the things that would be useful for read-only
> access to PITR warm standby machines.
>

Couldn't you define things simply to be that you get a consistent view
including all transactions started before transaction x? This is time
travel lite, but with low overhead, which I think is a key benefit of
this approach.

A huge value for this would be in the "oops, I deleted my data"
category. PostgreSQL rarely loses data, but clients seem to have a
habit of doing so, and then going oops. This seems to happen most often
when facing something like a reporting deadline, where they are moving
lots of stuff around and making copies, and sometimes delete the wrong
"company" recordset or equivalent, even with confirmation dialogs at
the app level.

This would give the client a quick and easy oops procedure. The DBA
sets the GUC to 1hr and tells the client: if you make a big mistake,
stop the database server as follows and call. Frankly, it would bail a
few DBAs out as well.

The key is how lightweight the setup could be, which matters because
clients are not always willing to pay for a PITR setup. The low overhead
would mean you'd feel fine about setting guc to 1hr or so.
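
To make the lightweight setup a bit more concrete, here is a rough
sketch in C of how a time-based window might be turned into an xid
horizon that VACUUM has to respect. Everything here (the sampling
scheme, the names, the fixed slot count) is invented for illustration;
it is not existing backend code, and it ignores xid wraparound.

/*
 * Hypothetical sketch: map a time-based setting such as "keep 1hr of
 * history" onto an xid horizon by periodically sampling (time, next xid)
 * pairs.  VACUUM would then be told not to remove row versions deleted
 * at or after the horizon, so an "oops" query can still flash back
 * within the window.  Ignores xid wraparound for simplicity.
 */
#include <time.h>

typedef unsigned int xid_t;

#define SAMPLE_SLOTS 64

typedef struct
{
    time_t when;       /* when the sample was taken */
    xid_t  next_xid;   /* next xid to be assigned at that moment */
} XidSample;

static XidSample samples[SAMPLE_SLOTS];
static int       nsamples = 0;

/* Called periodically (say, once a minute) by some background task. */
static void
record_xid_sample(time_t now, xid_t next_xid)
{
    samples[nsamples % SAMPLE_SLOTS].when = now;
    samples[nsamples % SAMPLE_SLOTS].next_xid = next_xid;
    nsamples++;
}

/*
 * Return the newest sampled xid that is at least window_secs old.  Row
 * versions deleted by transactions at or after this xid must be kept.
 */
static xid_t
flashback_horizon(time_t now, int window_secs, xid_t fallback)
{
    xid_t horizon = fallback;
    int   count = nsamples < SAMPLE_SLOTS ? nsamples : SAMPLE_SLOTS;
    int   i;

    for (i = 0; i < count; i++)
    {
        if (difftime(now, samples[i].when) >= window_secs &&
            samples[i].next_xid > horizon)
            horizon = samples[i].next_xid;
    }
    return horizon;
}

This only sketches the desired behaviour; what honouring such a horizon
would actually cost is a separate question.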

As a percentage of total installed instances, I suspect the share with
PITR is small. I've got stuff I snapshot nightly, but that's it, so I
don't have an easy out from the oops query either.

- August

From: "Florian G. Pflug" on
August Zajonc wrote:
> Gregory Stark wrote:
>> "Zeugswetter Andreas ADI SD" <ZeugswetterA(a)spardat.at> writes:
>>
>>>> First we must run the query in serializable mode and replace
>>>> the snapshot with a synthetic one, which defines visibility
>>>> at the start of the desired transaction
>>> We could use something that controls "global xmin".
>>> It would ensure that global xmin does not advance beyond
>>> what still needs to be visible. This would probably be a
>>> sliding time window, or a fixed point in time that is
>>> released by the dba/user.
>> Well there's another detail you have to cover aside from rolling back your
>> xmin. You have to find the rest of the snapshot including knowing what other
>> transactions were in-progress at the time you want to flash back to.
>>
>> If you just roll back xmin and set xmax to the same value you'll get a
>> consistent view of the database but it may not match a view that was ever
>> current. That is, some of the transactions after the target xmin may have
>> committed before that xmin. So there was never a time in the database when
>> they were invisible but your new xmin was visible.
>>
>> [...]
>> Incidentally this is one of the things that would be useful for read-only
>> access to PITR warm standby machines.
>>
>
> Couldn't you define things simply to be that you get a consistent view
> including all transactions started before transaction x? This is time
> travel lite, but with low overhead, which I think is a key benefit of
> this approach.

I was thinking along the same lines. Flashback is probably only really
useful on databases that are mostly read-only, but with a few users
who update data. You'd use flashback to undo catastrophic changes done
by accident, and will probably gladly accept that you undo a little
more work than strictly necessary.

By contrast, if you're running an online shop where people buy stuff
24/7, and, say, somebody accidentally deletes some products, then you
won't want to lose the orders that happened during that last hour, but
will rather try to regenerate those products from your last backup.

So I don't think it's too important exactly which snapshot you get,
which makes the xmin=xmax idea feasible.
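
To make that concrete, here is a minimal sketch in C of the xmin=xmax
idea. The struct and names (FlashbackSnapshot, did_commit) are
simplified stand-ins invented for this mail, not the backend's real
SnapshotData or clog API:

typedef unsigned int xid_t;

/*
 * Simplified stand-in for a snapshot.  With xmin == xmax there is no
 * in-progress window at all: everything below the cutoff is decided,
 * everything at or above it is invisible.
 */
typedef struct
{
    xid_t xmin;    /* all xids below this are decided */
    xid_t xmax;    /* all xids at or above this are invisible */
} FlashbackSnapshot;

static FlashbackSnapshot
make_flashback_snapshot(xid_t cutoff_xid)
{
    FlashbackSnapshot snap;

    snap.xmin = cutoff_xid;
    snap.xmax = cutoff_xid;
    return snap;
}

/* did_commit() would be answered from pg_clog in the real system. */
static int
xid_visible(const FlashbackSnapshot *snap, xid_t xid,
            int (*did_commit)(xid_t))
{
    if (xid >= snap->xmax)
        return 0;              /* started at/after the cutoff */
    return did_commit(xid);    /* below the cutoff: visible iff committed */
}

As Gregory pointed out, the view this produces may never have existed
at any single instant, but it is consistent, which seems to be all the
use cases above actually need.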

The same holds true for PITR warm standby (read-only queries on PITR
slaves). This would be used for reporting, or for load balancing of
searches on fairly static data - none of which depends on the exact
snapshot you get.

greetings, Florian Pflug

From: "August Zajonc" on

On Mon, 19 Feb 2007 20:30:59 +0100, "Florian G. Pflug" <fgp(a)phlo.org>
said:
> August Zajonc wrote:
> > Gregory Stark wrote:
> >
> > Couldn't you define things simply to be that you get a consistent view
> > including all transactions started before transaction x? This is time
> > travel lite, but with low overhead, which I think is a key benefit of
> > this approach.
>
> I was thinking along the same lines. Flashback is probably only really
> useful on databases that are mostly read-only, but with a few users
> who update data. You'd use flashback to undo catastrophic changes done
> by accident, and will probably gladly accept that you undo a little
> more work than strictly necessary.
>
> By contrast, if you're running an online shop where people buy stuff
> 24/7, and, say, somebody accidentally deletes some products, then you
> won't want to lose the orders that happened during that last hour, but
> will rather try to regenerate those products from your last backup.

Hopefully people doing order systems are using PITR or similar :)

For the time travel lite case, it's just a matter of clear definition.
You get all transactions that were *started* before and up to trx x. If
a transaction rolled back, you still won't see it, so you're still
getting a consistent view. But if it committed after your marker, you
will see it. That seems OK to me. In fact, I suspect folks think of
transactions as happening more or less when they get sent to the DB, so
this may map more directly to what people expect.

The one caveat would be if you started a long-running transaction, then
did the oops trx 5 minutes later, and then started time travel *before*
the long-running trx committed. In that case you wouldn't see that
long-running trx, so the definition would need to be modified to
something like: all trx started before x that were no longer running
when you time travel. I don't know if it is worth a NOTICE in the logs
if you time travel back but there are IDs of transactions from before
your xmin that are still running (and which, if you waited a bit, might
become visible in your time travel view).
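
Continuing the earlier simplified sketch (again with invented names,
not backend API), the caveat could be handled by carrying a small list
of pre-cutoff transactions that were still running when the flashback
snapshot was taken and treating them as invisible; a NOTICE would just
report that list whenever it is non-empty:

typedef unsigned int xid_t;

#define MAX_RUNNING 64

typedef struct
{
    xid_t cutoff;               /* "all trx started before x" */
    int   nrunning;             /* pre-cutoff xids still running right now */
    xid_t running[MAX_RUNNING];
} FlashbackSnapshotEx;

static int
xid_still_running(const FlashbackSnapshotEx *snap, xid_t xid)
{
    int i;

    for (i = 0; i < snap->nrunning; i++)
        if (snap->running[i] == xid)
            return 1;
    return 0;
}

static int
xid_visible_ex(const FlashbackSnapshotEx *snap, xid_t xid,
               int (*did_commit)(xid_t))
{
    if (xid >= snap->cutoff)
        return 0;               /* started at or after the marker */
    if (xid_still_running(snap, xid))
        return 0;               /* long-running trx: not visible yet */
    return did_commit(xid);     /* otherwise: visible iff it committed */
}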

If Jan gets his way with a timestamp on trx commit, then you could do
"started before time x", which may be more user-friendly.

For PITR I'd imagine you might actually be able to get the visibility
right, no? I haven't looked deeply enough into the WAL logs to
understand how the partial playback scenario works. If the WAL logs are
ordered by trx commit time, then you'd get proper visibility.

- August

From: Tom Lane on
August Zajonc <augustz(a)augustz.com> writes:
> The key is how lightweight the setup could be, which matters because
> clients are not always willing to pay for a PITR setup. The low overhead
> would mean you'd feel fine about setting guc to 1hr or so.

This would have exactly the same performance consequences as always
having an hour-old open transaction. I'm afraid that describing it
as "low overhead" is mere wishful thinking: it would cripple vacuuming
of high-update tables and greatly increase the typical load on pg_clog
and pg_subtrans. We already know that pg_subtrans contention can be a
source of context-swap storms, with the size of the window back to
GlobalXmin being the controlling factor for how bad it gets.

It's possible that this last could be addressed by separating the
concept of "old enough to be vacuumed" from GlobalXmin, but it's
certainly not a trivial thing.
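
Just to illustrate the distinction (names invented here, and this
deliberately ignores all the hard parts): the hope would be to let
GlobalXmin advance as it does today and hold back only the horizon
VACUUM consults when deciding which dead rows it may remove:

typedef unsigned int xid_t;

typedef struct
{
    xid_t global_xmin;        /* oldest xmin of any running transaction */
    xid_t flashback_horizon;  /* oldest xid a flashback window still needs */
} Horizons;

/* Snapshots, pg_subtrans etc. would keep using GlobalXmin, as today. */
static xid_t
snapshot_horizon(const Horizons *h)
{
    return h->global_xmin;
}

/* VACUUM may only remove rows whose deleter is below both horizons. */
static xid_t
vacuum_horizon(const Horizons *h)
{
    return (h->flashback_horizon < h->global_xmin)
               ? h->flashback_horizon
               : h->global_xmin;
}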

regards, tom lane

From: "Jonah H. Harris" on
On 2/17/07, Joshua D. Drake <jd(a)commandprompt.com> wrote:
> My understanding is that the main difference is that rollbacks are
> inexpensive for us, but expensive for Oracle.

Yes, Oracle is optimized for COMMIT, we're optimized for ROLLBACK :)

In all seriousness, last time I checked Oracle's MVCC was covered by
two patents.

--
Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
33 Wood Ave S, 3rd Floor | jharris(a)enterprisedb.com
Iselin, New Jersey 08830 | http://www.enterprisedb.com/
