New feature request: FlashBack Query [PgSql]

Prev: Piggybacking vacuum I/O
Next: pg_proc without oid?

From: "Zeugswetter Andreas ADI SD" on 19 Feb 2007 09:32

> > Well this is certainly interesting. What do we think it
> would take to
> > enable the functionality?
>
> First we must run the query in serializable mode and replace
> the snapshot with a synthetic one, which defines visibility
> at the start of the desired transaction

We could use something that controls "global xmin".
It would ensure, that global xmin does not advance bejond
what still needs to be visible. This would probably be a
sliding time window, or a fixed point in time that is
released by the dba/user.

Then all below is not really different from a situation where
you had a long running tx.

> probably it is a good idea to take a lock on all tables
> involved to avoid a vacuum to be started on them when the
> query is running.
>
> also, we can't trust the DELETED flags in index pages, so we
> should forbid index scans, or just always re-check the
> visibility in heap.
>
> Otherways it would probably be enough to just scan tuples as
> usual, and check if they were visible to desired transaction,
> that is they were inserted before that transaction and they
> are not deleted before that trx.
>
> Of course this will not be true, once we have HOT/WIP with
> in-page vacuuming.

Currently I think HOT does honor "global xmin". There is no
lookup for relevant xids, so parts of an update chain where
only a previous tuple or a later tuple can be visible are reused.
Else Hot would need to be told not to, in a scenario where
a backend can choose a snapshot at will.

Andreas

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo(a)postgresql.org so that your
message can get through to the mailing list cleanly

From: "Zeugswetter Andreas ADI SD" on 19 Feb 2007 09:38

> > >First we must run the query in serializable mode and replace the
> > >snapshot with a synthetic one, which defines visibility at the
start
> > >of the desired transaction
> > >
> > >probably it is a good idea to take a lock on all tables involved to

> > >avoid a vacuum to be started on them when the query is running.
> > Would the xmin exported by that transaction prevent vacuum from
> > removing any tuples still needed for the flashback snapshot?
>
> Sure, and that makes the mentioned lock unnecessary.

Problem is, that that transaction sets a historic snapshot at a later
time, so it is not yet running when vacuum looks at "global xmin".
So something else needs to hold up global xmin (see prev post).

Andreas

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

From: "Florian G. Pflug" on 19 Feb 2007 10:00

Zeugswetter Andreas ADI SD wrote:
>>>> First we must run the query in serializable mode and replace the
>>>> snapshot with a synthetic one, which defines visibility at the
> start
>>>> of the desired transaction
>>>>
>>>> probably it is a good idea to take a lock on all tables involved to
>
>>>> avoid a vacuum to be started on them when the query is running.
>>> Would the xmin exported by that transaction prevent vacuum from
>>> removing any tuples still needed for the flashback snapshot?
>> Sure, and that makes the mentioned lock unnecessary.
>
> Problem is, that that transaction sets a historic snapshot at a later
> time, so it is not yet running when vacuum looks at "global xmin".
> So something else needs to hold up global xmin (see prev post).

I think to make this flashback stuff fly, you'd need to know the
earliest xmin that you can still flashback too. Vacuum would advance
that xmin, as soon as it starts working. So the case you'd need to
protect against would be a race condition when you start a vacuum
and a flashback transaction at the same time. But for that, some simple
semaphore should suffice, and a well-thought-out ordering of the actions
taken.

In the long run, you'd probably want to store the commit-times of
transactions somewhere, and add some guc that makes a vacuum assume
that recently comitted transaction (say, in the last hour) are still
considered active. That allow the dba to guarantee that he can always
flashback at least a hour.

greetings, Florian Pflug

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

From: Gregory Stark on 19 Feb 2007 10:18

"Zeugswetter Andreas ADI SD" <ZeugswetterA(a)spardat.at> writes:

>> First we must run the query in serializable mode and replace
>> the snapshot with a synthetic one, which defines visibility
>> at the start of the desired transaction
>
> We could use something that controls "global xmin".
> It would ensure, that global xmin does not advance bejond
> what still needs to be visible. This would probably be a
> sliding time window, or a fixed point in time that is
> released by the dba/user.

Well there's another detail you have to cover aside from rolling back your
xmin. You have to find the rest of the snapshot including knowing what other
transactions were in-progress at the time you want to flash back to.

If you just roll back xmin and set xmax to the same value you'll get a
consistent view of the database but it may not match a view that was ever
current. That is, some of the transactions after the target xmin may have
committed before that xmin. So there was never a time in the database when
they were invisible but your new xmin was visible.

I think to do this you'll need to periodically record a snapshot and then
later restore one of those saved snapshots. Not sure where would be a good
place to record them. The WAL seems like a handy place but digging through the
WAL would be annoying.

Incidentally this is one of the things that would be useful for read-only
access to PITR warm standby machines.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

From: tomas on 19 Feb 2007 11:18

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Mon, Feb 19, 2007 at 04:00:09PM +0100, Florian G. Pflug wrote:
[...]
> In the long run, you'd probably want to store the commit-times of
> transactions somewhere, and add some guc that makes a vacuum assume
> that recently comitted transaction (say, in the last hour) are still
> considered active [...]

Funny how some things recur:

<http://archives.postgresql.org/pgsql-hackers/2007-01/msg01301.php>

(says I and seeks shelter beneath a big rock ;-)

Regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFF2c3RBcgs9XrR2kYRAh1PAJ442IXzr0CjN0w5a3BpwBrKgVGvsgCcCmyh
mnM5AUTHo4uIZ/WCnWxLVM0=
=1aUG
-----END PGP SIGNATURE-----

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8
Prev: Piggybacking vacuum I/O
Next: pg_proc without oid?