Exposing the Xact commit order to the user [PgSql]

From: Nicolas Barbier on 25 May 2010 15:53

2010/5/25 Florian Pflug <fgp(a)phlo.org>:

> Hm, but for there to be an actual problem (and not a false positive), an
> actual dangerous circle has to exist in the dependency graph. The
> existence of a dangerous structure is just a necessary (but not
> sufficient) and easily checked-for condition for that, right? Now, if a
> read-only transaction only ever has outgoing edges, it cannot be part
> of a (dangerous or not) circle, and hence any dangerous structure it is
> part of is a false positive.
>
> I guess my line of reasoning is flawed somehow, but I cannot figure out why...

In the general case, "wr" dependencies also create "must be serialized
before" edges. It seems that those edges can be discarded when finding
a pivot, but if you want to go "back to basics":

("<" means "must be serialized before".)

* T1 < T2, because T1 reads a version of a data element for which T2
later creates a newer version (rw between T1 and T2).
* T3 < T1, because T3 reads a version of a data element for which T1
later creates a newer version (rw between T3 and T1).
* T2 < T3, because T2 creates a version of a data element, which is
then read by T3 (wr between T2 and T3).

(As you can see, those 3 edges form a cycle.)
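To make the point above concrete, here is a minimal Python sketch that treats the three "must be serialized before" relations as directed edges and searches for a cycle with a depth-first traversal. The names `EDGES` and `find_cycle` are invented for this illustration and do not come from PostgreSQL or any SSI implementation.

```python
from collections import defaultdict

# Each entry (a, b, kind) means "a must be serialized before b".
# The three edges are the ones listed above; `kind` is the dependency type.
EDGES = [
    ("T1", "T2", "rw"),
    ("T3", "T1", "rw"),
    ("T2", "T3", "wr"),
]


def find_cycle(edges):
    """Return one cycle as a list of nodes, or None if the graph is acyclic."""
    graph = defaultdict(list)
    for src, dst, _kind in edges:
        graph[src].append(dst)

    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    colour = defaultdict(int)
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                # nxt is already on the current path, so we closed a cycle
                return path[path.index(nxt):] + [nxt]
            if colour[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for start in list(graph):
        if colour[start] == WHITE:
            found = visit(start)
            if found:
                return found
    return None


print(find_cycle(EDGES))
print(find_cycle([e for e in EDGES if e[2] == "rw"]))
```

With all three edges the search reports the cycle T1, T2, T3, T1. With only the two rw edges it finds none, which is why the wr edge cannot be ignored when deciding whether the anomaly is real.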

Nicolas

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: "Kevin Grittner" on 25 May 2010 15:57

Florian Pflug <fgp(a)phlo.org> wrote:

> Hm, but for there to be an actual problem (and not a false
> positive), an actual dangerous circle has to exist in the
> dependency graph. The existence of a dangerous structure is just a
> necessary (but not sufficient) and easily checked-for condition
> for that, right? Now, if a read-only transaction only ever has
> outgoing edges, it cannot be part of a (dangerous or not) circle,
> and hence any dangerous structure it is part of is a false
> positive.
>
> I guess my line of reasoning is flawed somehow, but I cannot
> figure out why...

Here's why:

We're tracking rw-dependencies, where the "time-arrow" showing
effective order of execution points from the reader to the writer
(since the reader sees a state prior to the write, it effectively
executes before it). These are important because there have to be
two such dependencies, one into the pivot and one out of the
pivot, for a problem to exist. (See various works by Dr. Alan
Fekete et al. for details.) But other dependencies can also imply an
order of execution. In particular, a wr-dependency, where a
transaction *can* see data committed by another transaction, implies
that the *writer* came first in the order of execution. In this
example, the transaction which lists the receipts successfully reads
the control table update, but is not able to read the receipt
insert. This completes the cycle, making it a real anomaly and not
a false positive.
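As a rough illustration of the "one in, one out" condition described above, the following Python sketch finds every transaction that has both an incoming and an outgoing rw-dependency. It is a deliberately simplified model, not the way any real SSI implementation tracks conflicts, and it ignores commit order entirely. The function name `find_pivots` and the transaction labels are invented for this example; in particular, the rw edge from the receipt insert to the control table update is an assumption about the scenario, since this message only spells out what the listing transaction reads.

```python
def find_pivots(rw_edges):
    """Return transactions with both an incoming and an outgoing rw edge.

    rw_edges is an iterable of (reader, writer) pairs. Each pair means the
    reader saw a state prior to the writer's change, so the reader
    effectively executes before the writer.
    """
    readers = {reader for reader, _writer in rw_edges}   # outgoing rw edge
    writers = {writer for _reader, writer in rw_edges}   # incoming rw edge
    return sorted(readers & writers)


# Hypothetical labels for the transactions in the example:
#   list_receipts  -> insert_receipt : the listing cannot see the receipt insert
#   insert_receipt -> update_control : assumed; the insert read the control
#                                      table before it was updated
rw_edges = [
    ("list_receipts", "insert_receipt"),
    ("insert_receipt", "update_control"),
]

print(find_pivots(rw_edges))
```

Under these assumptions the receipt insert is the pivot. The wr-dependency from the control table update to the listing transaction never appears in `rw_edges`, yet it is what closes the cycle, which is the point made above.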

Note that the wr-dependency can actually exist outside the database,
making it pretty much impossible to accurately tell a false positive
from a true anomaly when the pivot exists and the transaction
writing data which the pivot can't read commits first. For example,
let's say that the update to the control table is committed from an
application which, seeing that its update came back without error,
proceeds to list the receipts for the old date in a subsequent
transaction. You have a wr-dependency which is, in reality, quite
real and solid with no way to notice it within the database engine.
That's why the techniques used in SSI are pretty hard to improve
upon beyond more detailed and accurate tracking of rw-conflicts.

-Kevin

From: Jan Wieck on 25 May 2010 15:58

On 5/24/2010 9:30 AM, Greg Sabino Mullane wrote:
>> In light of the proposed purging scheme, how would it be able to distinguish
>> between those two cases (nothing there yet vs. was there but purged)?
>
>> There is a difference between an empty result set and an exception.
>
> No, I meant how will the *function* know, if a superuser and/or some
> background process can purge records at any time?

The data contains timestamps which are supposedly taken in commit order.
Checking the age of the last entry in the file should be simple enough
to determine if the segment matches the "max age" configuration (if
set). If a superuser decides what to purge, he would just call a
function with a serial number identifying the obsolete segments.
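A minimal Python sketch of the age check described above follows. The function name `segment_expired` and its parameters are hypothetical, and the on-disk format of a segment is not modelled at all. Only the comparison between the last entry's timestamp and a configured maximum age is shown.

```python
from datetime import datetime, timedelta


def segment_expired(last_entry_time, max_age, now):
    """Return True if the newest entry in a segment is older than max_age.

    max_age is a timedelta, or None when no "max age" is configured,
    in which case nothing is ever considered expired by age.
    """
    if max_age is None:
        return False
    return now - last_entry_time > max_age


now = datetime(2010, 5, 25, 16, 0, 0)
last_entry = datetime(2010, 5, 24, 9, 30, 0)

print(segment_expired(last_entry, timedelta(hours=12), now))
print(segment_expired(last_entry, timedelta(days=7), now))
print(segment_expired(last_entry, None, now))
```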

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

From: Tom Lane on 25 May 2010 16:16

Jan Wieck <JanWieck(a)Yahoo.com> writes:
>> No, I meant how will the *function* know, if a superuser and/or some
>> background process can purge records at any time?

> The data contains timestamps which are supposedly taken in commit order.

You can *not* rely on the commit timestamps to be in exact order.
(Perhaps approximate ordering is good enough for what you want here,
but just be careful to not fall into the trap of assuming that they're
exactly ordered.)
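To illustrate the caution above, here is a small Python sketch that scans a list of entries, already sorted by true commit order, and reports adjacent pairs whose timestamps run backwards. The function name `timestamp_inversions` and the sample values are made up for this example. They are not measurements from a real server.

```python
def timestamp_inversions(entries):
    """Return (earlier_seq, later_seq) pairs whose timestamps are inverted.

    entries is a list of (commit_seq, timestamp) tuples sorted by commit_seq,
    where commit_seq stands for the true commit order.
    """
    return [
        (a_seq, b_seq)
        for (a_seq, a_ts), (b_seq, b_ts) in zip(entries, entries[1:])
        if b_ts < a_ts
    ]


# Invented data: the third commit carries a slightly earlier timestamp
# than the second one, although it committed later.
entries = [(1, 100.0), (2, 100.5), (3, 100.4), (4, 101.0)]

print(timestamp_inversions(entries))
```

Any code that sorts by timestamp alone would swap commits 2 and 3 here, so logic that depends on exact commit order needs something other than the timestamp to break such ties.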

regards, tom lane

From: "Kevin Grittner" on 25 May 2010 16:17

Robert Haas <robertmhaas(a)gmail.com> wrote:

> maybe we should get serializable working and committed on one
> node first and then worry about how to distribute it. I think
> there might be other approaches to this problem

Well, I've got two or three other ideas on how we can manage this
for HS, but since I now realize that I've totally misunderstood the
main use case for this (which is to support trigger-based
replication), I'd like to be clear on something before letting it
drop. The big question is, do such replicas need to support
serializable access to the data modified by serializable
transactions in the source database? That is, is there a need for
such replicas to only see states which are possible in some serial
order of execution of serializable transactions on the source
database? Or to phrase the same question a third way, should there
be a way to run queries on such replicas with confidence that what
is viewed is consistent with user-defined constraints and business
rules?

If not, there's no intersection between this feature and SSI. If
there is, I think we should think through at least a general
strategy sooner, rather than later.

-Kevin
