From: Alvaro Herrera on 30 Jun 2010 17:05

Excerpts from Devrim GÜNDÜZ's message of Wed Jun 30 14:54:06 -0400 2010:

> One of the things that interested me was the parallel recovery
> feature. They said that they are keeping separate xlogs for each
> database, which speeds up recovery in case of a crash. It also would
> increase performance, since we could write xlogs to separate disks.

I'm not sure about this. You'd need to have one extra WAL stream for
shared catalogs; and what would you do with a transaction that touches
both shared catalogs and local objects? You'd have to split its WAL
entries between those two WAL streams.

I think you could try to solve this by having yet another WAL stream
for transaction commits, and have the database-specific streams
reference that one. Operations touching shared catalogs would act as
barriers: all other databases' WAL streams would have to be
synchronized to that one. This would still allow some concurrency
because, presumably, operations on shared catalogs are rare.
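A minimal sketch of that routing rule in C, with invented names
throughout (WalStreamId, route_wal_record, and stream_for_database are
illustrations, not PostgreSQL code):

    /* Hypothetical stream identifiers; nothing here exists in PostgreSQL. */
    typedef enum WalStreamId
    {
        WAL_STREAM_SHARED = 0,      /* shared catalogs (pg_database, ...) */
        WAL_STREAM_COMMIT = 1,      /* global stream ordering commits */
        WAL_STREAM_DB_BASE = 2      /* per-database streams start here */
    } WalStreamId;

    /* Toy mapping from a database OID to its stream. */
    static int
    stream_for_database(unsigned int dboid)
    {
        return WAL_STREAM_DB_BASE + (int) (dboid % 16);
    }

    /*
     * Route a WAL record to a stream.  A record touching shared catalogs
     * goes to the shared stream and acts as a barrier: per-database
     * streams must synchronize to the commit stream before replaying
     * past it.
     */
    static int
    route_wal_record(unsigned int dboid, int touches_shared)
    {
        if (touches_shared)
            return WAL_STREAM_SHARED;
        return stream_for_database(dboid);
    }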
From: Robert Haas on 30 Jun 2010 18:55

2010/6/30 Alvaro Herrera <alvherre(a)commandprompt.com>:
> Excerpts from Devrim GÜNDÜZ's message of Wed Jun 30 14:54:06 -0400 2010:
>
>> One of the things that interested me was the parallel recovery
>> feature. They said that they are keeping separate xlogs for each
>> database, which speeds up recovery in case of a crash. It also would
>> increase performance, since we could write xlogs to separate disks.
>
> I'm not sure about this. You'd need to have one extra WAL stream for
> shared catalogs; and what would you do with a transaction that touches
> both shared catalogs and local objects? You'd have to split its WAL
> entries between those two WAL streams.
>
> I think you could try to solve this by having yet another WAL stream
> for transaction commits, and have the database-specific streams
> reference that one. Operations touching shared catalogs would act as
> barriers: all other databases' WAL streams would have to be
> synchronized to that one. This would still allow some concurrency
> because, presumably, operations on shared catalogs are rare.

I think one stream per database and one extra one for the shared
catalogs would be enough. Most transactions would touch either just
the database or just the shared catalogs, so you'd write the commit
record in whichever stream was appropriate. If you had a transaction
that touched both, you'd write the commit record in both places, and
include in each stream a reference to the other stream. On replay,
when you reach a commit record that references another stream, you
pause until the referenced stream also reaches the matching commit
record. If you reach the end of that WAL stream without finding the
commit record, then, in archive recovery, you just keep waiting for
more of the stream to arrive; and, in crash recovery, you write a
matching commit record at the end of WAL.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
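That replay rule might look roughly like this; every type and helper
below is invented for the sketch (none of it is a PostgreSQL API):

    /* Hypothetical helpers, assumed to exist for this sketch. */
    extern int  stream_replayed_commit(int stream, unsigned int xid);
    extern int  stream_at_end(int stream);
    extern void wait_for_more_wal(int stream);
    extern void append_matching_commit(int stream, unsigned int xid);
    extern void apply_commit(unsigned int xid);

    typedef struct CrossCommit
    {
        unsigned int xid;           /* committing transaction */
        int          other_stream;  /* stream holding the twin record */
    } CrossCommit;

    static void
    replay_cross_commit(CrossCommit *rec, int archive_recovery)
    {
        /* Pause until the referenced stream replays the matching record. */
        while (!stream_replayed_commit(rec->other_stream, rec->xid))
        {
            if (!stream_at_end(rec->other_stream))
                continue;           /* twin record is still ahead of us */

            if (archive_recovery)
                wait_for_more_wal(rec->other_stream);
            else
            {
                /*
                 * Crash recovery: the twin record never made it to disk.
                 * As proposed here, synthesize one at the end of the other
                 * stream; Tom argues below that aborting is the safe choice.
                 */
                append_matching_commit(rec->other_stream, rec->xid);
                break;
            }
        }
        apply_commit(rec->xid);
    }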
From: Tom Lane on 30 Jun 2010 19:05

Robert Haas <robertmhaas(a)gmail.com> writes:
> I think one stream per database and one extra one for the shared
> catalogs would be enough. Most transactions would touch either just
> the database or just the shared catalogs, so you'd write the commit
> record in whichever stream was appropriate. If you had a transaction
> that touched both, you'd write the commit record in both places, and
> include in each stream a reference to the other stream. On replay,
> when you reach a commit record that references another stream, you
> pause until the referenced stream also reaches the matching commit
> record. If you reach the end of that WAL stream without finding the
> commit record, then, in archive recovery, you just keep waiting for
> more of the stream to arrive; and, in crash recovery, you write a
> matching commit record at the end of WAL.

Surely you'd have to roll back, not commit, in that situation. You
have no excuse for assuming that you've replayed all effects of the
transaction.

			regards, tom lane
From: Robert Haas on 30 Jun 2010 19:17

2010/6/30 Tom Lane <tgl(a)sss.pgh.pa.us>:
> Robert Haas <robertmhaas(a)gmail.com> writes:
>> I think one stream per database and one extra one for the shared
>> catalogs would be enough. Most transactions would touch either just
>> the database or just the shared catalogs, so you'd write the commit
>> record in whichever stream was appropriate. If you had a transaction
>> that touched both, you'd write the commit record in both places, and
>> include in each stream a reference to the other stream. On replay,
>> when you reach a commit record that references another stream, you
>> pause until the referenced stream also reaches the matching commit
>> record. If you reach the end of that WAL stream without finding the
>> commit record, then, in archive recovery, you just keep waiting for
>> more of the stream to arrive; and, in crash recovery, you write a
>> matching commit record at the end of WAL.
>
> Surely you'd have to roll back, not commit, in that situation. You
> have no excuse for assuming that you've replayed all effects of the
> transaction.

Hmm, good point. But you could make it work either way, I think. If
you flush WAL stream A, write the commit record to WAL stream B, flush
WAL stream B, and then write the commit record to WAL stream A, then
commit is correct. If you write the commit record to A, flush A, write
the commit record to B, and flush B, then abort is correct.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
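The two orderings, spelled out with hypothetical helpers (flush_stream
and write_commit_record are stand-ins, not real functions):

    /* Hypothetical helpers, assumed for this sketch. */
    extern void flush_stream(int stream);
    extern void write_commit_record(int stream, unsigned int xid);

    #define STREAM_A 0
    #define STREAM_B 1

    /*
     * Ordering 1: if a crash leaves only B's commit record durable, A's
     * data records are already flushed, so replay may treat the
     * transaction as committed.
     */
    static void
    order_presume_commit(unsigned int xid)
    {
        flush_stream(STREAM_A);             /* A's data records durable */
        write_commit_record(STREAM_B, xid);
        flush_stream(STREAM_B);             /* decision point: committed */
        write_commit_record(STREAM_A, xid); /* marker only; may be lost */
    }

    /*
     * Ordering 2: B's flush comes last, so a commit record found on only
     * one stream proves nothing; replay must treat the transaction as
     * aborted.
     */
    static void
    order_presume_abort(unsigned int xid)
    {
        write_commit_record(STREAM_A, xid);
        flush_stream(STREAM_A);
        write_commit_record(STREAM_B, xid);
        flush_stream(STREAM_B);             /* only now is it committed */
    }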
From: Tom Lane on 30 Jun 2010 19:40

Robert Haas <robertmhaas(a)gmail.com> writes:
> 2010/6/30 Tom Lane <tgl(a)sss.pgh.pa.us>:
>> Surely you'd have to roll back, not commit, in that situation. You
>> have no excuse for assuming that you've replayed all effects of the
>> transaction.

> Hmm, good point. But you could make it work either way, I think. If
> you flush WAL stream A, write the commit record to WAL stream B, flush
> WAL stream B, and then write the commit record to WAL stream A, then
> commit is correct.

I don't think so. "I flushed this" is not equivalent to "it is certain
that it will be possible to read this again". In particular, corruption
of WAL stream A leaves you in trouble if you take the commit on B as a
certificate for stream A being complete.

(thinks for a bit...) Maybe if the commit record on B included a
minimum stopping point for stream A, it'd be all right. This wouldn't
be exactly the expected LSN of the A commit record, mind you, because
you don't want to block insertions into the A stream while you're
flushing B. But it would say that all non-commit records for the xact
on stream A are known to be before that point. If you've replayed A
that far, then you can take the transaction as being committable.

(thinks some more...) No, you still lose, because a commit record isn't
just a single bit. What about subtransactions, for example? I guess
maybe the commit record written/flushed first is the real commit record
with all the auxiliary data, and the one written second isn't so much a
commit record as a fencepoint record to prevent advancing beyond that
point in stream A before you've processed the relevant commit from B.

(thinks some more...) Maybe you don't even need the fencepoint record
per se. I think all it's doing for you is making sure you don't process
commit records on different streams out of order. There might be some
other, more direct way to do that.

(thinks yet more...) Actually, the weak point in this scheme is that it
wouldn't serialize transactions that occur in different databases and
don't touch any shared catalogs. It'd be entirely possible for T1 in
DB1 to be reported committed, then T2 in DB2 to be reported committed,
then a crash to occur after which T2 is seen committed and T1 not.
While this would be all right if the clients for T1 and T2 couldn't
communicate, that isn't the real world.

			regards, tom lane
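The "minimum stopping point" refinement might be encoded along these
lines; all names below are invented for illustration:

    /* Stand-in for an LSN; not a PostgreSQL type. */
    typedef unsigned long long XLogPos;

    typedef struct CommitWithStop
    {
        unsigned int xid;
        XLogPos      min_stop_a;    /* all of this xact's non-commit
                                     * records on stream A are known to
                                     * lie before this point */
    } CommitWithStop;

    /*
     * During replay of stream B: once stream A has been replayed past
     * min_stop_a, every data record of the transaction on A has been
     * applied, so the commit on B can be honored even if A's fencepoint
     * record was lost or never written.
     */
    static int
    xact_committable(const CommitWithStop *rec, XLogPos replayed_upto_a)
    {
        return replayed_upto_a >= rec->min_stop_a;
    }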