From: Fujii Masao on
On Tue, Jul 27, 2010 at 8:48 PM, Yeb Havinga <yebhavinga(a)gmail.com> wrote:
> Is there a reason not to send the signal in XlogFlush itself, so it would be
> called at
>
> CreateCheckPoint(), EndPrepare(), FlushBuffer(),
> RecordTransactionAbortPrepared(), RecordTransactionCommit(),
> RecordTransactionCommitPrepared(), RelationTruncate(),
> SlruPhysicalWritePage(), write_relmap_file(), WriteTruncateXlogRec(), and
> xact_redo_commit().

Yes. WAL needs to be sent immediately only from the following
functions:

* EndPrepare()
* RecordTransactionAbortPrepared()
* RecordTransactionCommit()
* RecordTransactionCommitPrepared()

The other functions call XLogFlush() only to follow the basic WAL rule.
The standby enforces that rule by itself: it always flushes WAL records
to disk before making any corresponding data-file change. So we don't
need to replicate the result of those XLogFlush() calls immediately.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Fujii Masao on
On Tue, Jul 27, 2010 at 10:12 PM, Joshua Tolley <eggyknap(a)gmail.com> wrote:
> I don't think it can support the case you're interested in, though I'm not
> terribly expert on it. I'm definitely not arguing for the syntax Oracle uses,
> or something similar; I much prefer the flexibility we're proposing, and agree
> with Yeb Havinga in another email who suggests we spell out in documentation
> some recipes for achieving various possible scenarios given whatever GUCs we
> settle on.

Agreed. I'll add it to my TODO list.

> My concern is that in a quorum system, if the quorum number is less than the
> total number of replicas, there's no way to know *which* replicas composed the
> quorum for any given transaction, so we can't know which servers to fail to if
> the master dies.

What about checking the current WAL receive location of each standby by
using pg_last_xlog_receive_location()? The standby which has the newest
location should be failed over to.
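
A minimal sketch of that check, assuming the receive locations have
already been collected from each standby as the text value that
pg_last_xlog_receive_location() returns (for example "0/3000158"). The
helper names and the standby names are hypothetical, not part of any
patch:

```python
def parse_lsn(text):
    """Turn an 'hi/lo' hexadecimal WAL location string into a comparable tuple."""
    hi, lo = text.strip().split("/")
    return (int(hi, 16), int(lo, 16))


def pick_failover_target(receive_locations):
    """Return the name of the standby with the newest receive location.

    receive_locations maps a standby name (hypothetical) to the text
    returned by pg_last_xlog_receive_location() on that standby, or to
    None if that standby has not received any WAL.
    """
    candidates = {
        name: parse_lsn(loc)
        for name, loc in receive_locations.items()
        if loc is not None
    }
    if not candidates:
        raise ValueError("no standby has reported a receive location")
    return max(candidates, key=candidates.get)


locations = {"X": "0/3000158", "Y": "0/30001A0", "Z": "0/2FFFF00"}
print(pick_failover_target(locations))  # Y
```

The locations are compared as (high, low) integer pairs rather than as
strings, because a plain string comparison would order hexadecimal
values of different lengths incorrectly.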

> This isn't different from Oracle, where it looks like
> essentially the "quorum" value is always 1. Your scenario shows that all
> replicas are not created equal, and that sometimes we'll be interested in WAL
> getting committed on a specific subset of the available servers. If I had two
> nearby replicas called X and Y, and one at a remote site called Z, for
> instance, I'd set quorum to 2, but really I'd want to say "wait for server X
> and Y before committing, but don't worry about Z".
>
> I have no idea how to set up our GUCs to encode a situation like that :)

Yeah, quorum commit alone cannot cover that situation. I think that the
current approach (i.e., quorum commit plus replication mode per standby)
would cover that. In your example, you can choose "recv", "fsync" or
"replay" as replication_mode in X and Y, and choose "async" in Z.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

From: Joshua Tolley on
On Tue, Jul 27, 2010 at 10:53:45PM +0900, Fujii Masao wrote:
> On Tue, Jul 27, 2010 at 10:12 PM, Joshua Tolley <eggyknap(a)gmail.com> wrote:
> > My concern is that in a quorum system, if the quorum number is less than the
> > total number of replicas, there's no way to know *which* replicas composed the
> > quorum for any given transaction, so we can't know which servers to fail to if
> > the master dies.
>
> What about checking the current WAL receive location of each standby by
> using pg_last_xlog_receive_location()? The standby which has the newest
> location should be failed over to.

That makes sense. Thanks.

> > This isn't different from Oracle, where it looks like
> > essentially the "quorum" value is always 1. Your scenario shows that all
> > replicas are not created equal, and that sometimes we'll be interested in WAL
> > getting committed on a specific subset of the available servers. If I had two
> > nearby replicas called X and Y, and one at a remote site called Z, for
> > instance, I'd set quorum to 2, but really I'd want to say "wait for server X
> > and Y before committing, but don't worry about Z".
> >
> > I have no idea how to set up our GUCs to encode a situation like that :)
>
> Yeah, quorum commit alone cannot cover that situation. I think that the
> current approach (i.e., quorum commit plus replication mode per standby)
> would cover that. In your example, you can choose "recv", "fsync" or
> "replay" as replication_mode in X and Y, and choose "async" in Z.

Clearly I need to read through the GUCs and docs better. I'll try to keep
quiet until that's finished :)


--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

From: Dimitri Fontaine on
On 27 Jul 2010, at 15:12, Joshua Tolley <eggyknap(a)gmail.com> wrote:
> My concern is that in a quorum system, if the quorum number is less than the
> total number of replicas, there's no way to know *which* replicas composed the
> quorum for any given transaction, so we can't know which servers to fail to if
> the master dies. This isn't different from Oracle, where it looks like
> essentially the "quorum" value is always 1. Your scenario shows that all
> replicas are not created equal, and that sometimes we'll be interested in WAL
> getting committed on a specific subset of the available servers. If I had two
> nearby replicas called X and Y, and one at a remote site called Z, for
> instance, I'd set quorum to 2, but really I'd want to say "wait for server X
> and Y before committing, but don't worry about Z".
>
> I have no idea how to set up our GUCs to encode a situation like that :)

You make it so that Z does not take a vote, by setting it async.
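
As a rough illustration of that rule, here is a sketch of the commit
check, assuming the per-standby replication_mode values discussed in
this thread ("async", "recv", "fsync", "replay"). The function and
parameter names are hypothetical and only model the idea; they are not
the proposed implementation:

```python
# Modes in which a standby takes part in the vote; "async" does not.
VOTING_MODES = {"recv", "fsync", "replay"}


def quorum_reached(replication_mode, acked, quorum):
    """Return True once enough voting standbys have acknowledged the commit.

    replication_mode maps standby name -> mode string.
    acked is the set of standby names that have acknowledged.
    quorum is the number of voting acknowledgements required.
    """
    voters = {name for name, mode in replication_mode.items()
              if mode in VOTING_MODES}
    return len(voters & set(acked)) >= quorum


modes = {"X": "fsync", "Y": "fsync", "Z": "async"}
print(quorum_reached(modes, {"X", "Z"}, 2))  # False: Z's ack does not count
print(quorum_reached(modes, {"X", "Y"}, 2))  # True: both nearby standbys acked
```

With quorum set to 2 and only X and Y voting, the commit effectively
waits for X and Y and ignores Z.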

Regards,
--
dim

From: Heikki Linnakangas on
On 27/07/10 16:12, Joshua Tolley wrote:
> My concern is that in a quorum system, if the quorum number is less than the
> total number of replicas, there's no way to know *which* replicas composed the
> quorum for any given transaction, so we can't know which servers to fail to if
> the master dies.

In fact, it's possible for one standby to sync up to X, then disconnect
and reconnect, and have the master count it a second time in the quorum,
especially if the master doesn't notice that the standby disconnected,
e.g. because of a network problem.

I don't think any of this quorum stuff makes much sense without
explicitly registering standbys in the master.
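
To make the double-counting problem concrete, here is a small sketch
that compares counting acknowledgements per connection with counting
them per registered standby name. The data layout, the function names
and the standby names are assumptions made up for this illustration:

```python
def count_by_connection(acks):
    """Count every acknowledging connection as one vote."""
    return len(acks)


def count_by_registered_standby(acks, registered):
    """Count at most one vote per registered standby name."""
    return len({name for name, _conn_id in acks if name in registered})


# The same standby acknowledged twice: once on a stale connection the
# master has not yet noticed is dead, once after reconnecting.
acks = [("standby1", 101), ("standby1", 102)]
registered = {"standby1", "standby2"}

print(count_by_connection(acks))                      # 2
print(count_by_registered_standby(acks, registered))  # 1
```

With a quorum of 2, the per-connection count would let the commit
return even though only one standby holds the WAL; the per-name count
would not.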

That would also solve the fuzziness with wal_keep_segments - if the
master knew what standbys exist, it could keep track of how far each
standby has received WAL, and keep just enough WAL for each standby to
catch up.
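
A sketch of that bookkeeping, assuming the master records the last
receive location reported by each registered standby as "hi/lo"
hexadecimal text. The names used here are hypothetical:

```python
def parse_lsn(text):
    """Turn an 'hi/lo' hexadecimal WAL location string into a comparable tuple."""
    hi, lo = text.strip().split("/")
    return (int(hi, 16), int(lo, 16))


def wal_keep_point(received_by_standby):
    """Return the oldest receive location among the registered standbys.

    WAL older than this location has reached every registered standby,
    so the master would not need to keep it for any of them. Returns
    None when no standby is registered.
    """
    if not received_by_standby:
        return None
    return min(received_by_standby.values(), key=parse_lsn)


received = {"X": "0/5000000", "Y": "0/4800000", "Z": "0/3000000"}
print(wal_keep_point(received))  # 0/3000000
```

In this example the slowest standby, Z, determines how much WAL has to
be kept, instead of a fixed wal_keep_segments value.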

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
