From: Robert Haas on
On Mon, May 24, 2010 at 12:51 PM, Kevin Grittner
<Kevin.Grittner(a)wicourts.gov> wrote:
> Robert Haas  wrote:
>> I think you're confusing two subtly different things.
>
> The only thing I'm confused about is what benefit anyone expects to
> get from looking at data between commits in some way other than our
> current snapshot mechanism.  Can someone explain a use case where
> what Jan is proposing is better than snapshot isolation?  It doesn't
> provide any additional integrity guarantees that I can see.

It's a tool for replication solutions to use.

>> But the commit order is still the order the effects of those
>> transactions have become visible - if we inserted a new read-only
>> transaction into the stream at some arbitrary point in time, it
>> would see all the transactions which committed before it and none
>> of those that committed afterward.
>
> Isn't that what a snapshot does already?

Yes, for a particular transaction. But this is to allow transactions
to be replayed (in order) on another node.

>> your proposed fix sounds like it would be prohibitively expensive
>> for many users. But can this actually happen?
>
> How so?  The transaction start/end logging, or looking at that data
> when building a snapshot?

I guess what I'm asking is - if the reconstructed transaction order
inferred by SSI doesn't match the actual commit order, can we get a
serialization anomaly on the standby by replaying transactions there
in commit order? Can you give an example and explain how your
proposal would solve it?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Alvaro Herrera on
Excerpts from Robert Haas's message of Sun May 23 20:38:14 -0400 2010:
> On Sun, May 23, 2010 at 4:21 PM, Jan Wieck <JanWieck(a)yahoo.com> wrote:
> > The system will have postgresql.conf options for enabling/disabling the
> > whole shebang, how many shared buffers to allocate for managing access
> > to the data and to define the retention period of the data based on data
> > volume and/or age of the commit records.
>
> It would be nice if this could just be managed out of shared_buffers
> rather than needing to configure a separate pool just for this
> feature.

FWIW we've talked about this for years -- see old discussions about how
pg_subtrans becomes a bottleneck in certain cases and you want to
enlarge the number of buffers allocated to it (probably easy to find by
searching posts from Jignesh). I'm guessing the new notify code would
benefit from this as well.

It'd be nice to have as a side effect, but if not, IMHO this proposal
could simply use a fixed buffer pool like all other slru.c callers until
someone gets around to fixing that. Adding more GUC switches for this
strikes me as overkill.

--
Álvaro Herrera <alvherre(a)alvh.no-ip.org>


From: Robert Haas on
On Mon, May 24, 2010 at 3:07 PM, Alvaro Herrera <alvherre(a)alvh.no-ip.org> wrote:
> It'd be nice to have as a side effect, but if not, IMHO this proposal
> could simply use a fixed buffer pool like all other slru.c callers until
> someone gets around to fixing that.  Adding more GUC switches for this
> strikes me as overkill.

I agree.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Dan Ports on
On Sun, May 23, 2010 at 04:21:58PM -0400, Jan Wieck wrote:
> In some systems (data warehousing, replication), the order of commits is
> important, since that is the order in which changes have become visible.
> This information could theoretically be extracted from the WAL, but
> scanning the entire WAL just to extract this tidbit of information would
> be excruciatingly painful.

This is very interesting to me as I've been doing some (research --
nowhere near production-level) work on building a transactional
application-level (i.e. memcached-like) cache atop Postgres. One of the
features I needed to support it was basically what you describe.

Without getting too far into the details of what I'm doing, I needed to
make it clear to a higher layer which commits were visible to a given
query. That is, I wanted to know both the order of commits and where
particular snapshots fit into this ordering. (A SnapshotData struct
obviously contains the visibility information, but a representation in
terms of the commit ordering is both more succinct and allows for easy
ordering comparisons).

Something you might want to consider, then, is adding an interface to
find out the timestamp of the current transaction's snapshot, i.e. the
timestamp of the most recent committed transaction visible to it. I
wouldn't expect this to be difficult to implement as transaction
completion/visibility is already synchronized via ProcArrayLock.

> Each record of the Transaction Commit Info consists of
>
> txid xci_transaction_id
> timestamptz xci_begin_timestamp
> timestamptz xci_commit_timestamp
> int64 xci_total_rowcount

Another piece of information that seems useful to provide here would be
the logical timestamp of the transaction, i.e. a counter that's
incremented by one for each transaction. But maybe that's implicit in
the log ordering?

I'm not clear on why the total rowcount is useful, but perhaps I'm
missing something obvious.


I've actually implemented some semblance of this on Postgres 8.2, but it
sounds like what you're interested in is more sophisticated. In
particular, I wasn't at all concerned with durability or WAL stuff, and
I had some specific requirements about when it was OK to purge the
data. Because of this (and very limited development time), I just
threw something together with a simple shared buffer.

I don't think I have any useful code to offer, but let me know if
there's some way I can help out.

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/


From: "Kevin Grittner" on
Robert Haas wrote:

> It's a tool for replication solutions to use.

I was thrown by the original post referencing "data warehousing".
For replication I definitely see that it would be good to provide
some facility to grab a coherent snapshot out of the transaction
stream, but I'm still not clear on a use case where other solutions
aren't better. If you want a *particular* past snapshot, something
akin to the transactional caching that Dan Ports mentioned seems
best. If you just want a coherent snapshot like snapshot isolation,
the current mechanisms seem to work (unless I'm missing something?).
If you want solid data integrity querying the most recent replicated
data, the proposal I posted earlier in the thread is the best I can
see, so far.

> if the reconstructed transaction order inferred by SSI doesn't
> match the actual commit order, can we get a serialization anomaly
> on the standby by replaying transactions there in commit order?

Yes. If we don't do *something* to address it, the replicas
(slaves) will provide only read-only snapshot isolation, not true
serializable behavior.

> Can you give an example and explain how your proposal would solve
> it?

I gave an example (without rigorous proof accompanying it, granted)
earlier in the thread. In that example, if you allow a selection
against a snapshot which includes the earlier commit (the update of
the control table) but precedes the later commits (the receipts which
used the old deposit date), you have exactly the kind of
serialization anomaly which the work in progress prevents on the
source (master) database -- the receipts *appear* to run in earlier
transactions because they see the pre-update deposit date, but they
show up out of order. As far as I'm concerned this is only a
problem if the user *requested* serializable behavior for all
transactions involved.

If we send the information I suggested in the WAL stream, then any
slave using the WAL stream could build a snapshot for a serializable
transaction which excluded serializable transactions from the source
which overlap with still-pending serializable transactions on the
source. In this example, the update of the control table would not
be visible to a serializable transaction on the slave until any
overlapping serializable transactions (which would include any
receipts using the old date) had also committed, so you could never
see the writes out of order.

I don't think that passing detailed predicate locking information
would be feasible from a performance perspective, but since the
slaves are read-only, I think it is fine to pass just the minimal
transaction-level information I described.

-Kevin


