Synchronous replication [PgSql]

Prev: standard_conforming_strings
Next: Per-column collation, proof of concept

From: Yeb Havinga on 27 Jul 2010 06:39

Fujii Masao wrote:
> The attached patch changes the backend so that it signals walsender to
> wake up from the sleep and send WAL immediately. It doesn't include any
> other synchronous replication stuff.
>
Hello Fujii,

I noted the changes in XlogSend where instead of *caughtup = true/false
it now returns !MyWalSnd->sndrqst. That value is initialized to false in
that procedure and it cannot be changed to true during execution of that
procedure, or can it?

regards,
Yeb Havinga

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Fujii Masao on 27 Jul 2010 07:29

On Tue, Jul 27, 2010 at 7:39 PM, Yeb Havinga <yebhavinga(a)gmail.com> wrote:
> Fujii Masao wrote:
>>
>> The attached patch changes the backend so that it signals walsender to
>> wake up from the sleep and send WAL immediately. It doesn't include any
>> other synchronous replication stuff.
>>
>
> Hello Fujii,

Thanks for the review!

> I noted the changes in XlogSend where instead of *caughtup = true/false it
> now returns !MyWalSnd->sndrqst. That value is initialized to false in that
> procedure and it cannot be changed to true during execution of that
> procedure, or can it?

That value is set to true in WalSndWakeup(). If WalSndWakeup() is called
after initialization of that value in XLogSend(), *caughtup is set to false.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Fujii Masao on 27 Jul 2010 07:42

On Tue, Jul 27, 2010 at 5:42 PM, Yeb Havinga <yebhavinga(a)gmail.com> wrote:
> I'd like to bring forward another suggestion (please tell me when it is
> becoming spam). My feeling about replication_mode as is, is that is says in
> the same parameter something about async or sync, as well as, if sync, which
> method of feedback to the master. OTOH having two parameters would need
> documentation that the feedback method may only be set if the
> replication_mode was sync, as well as checks. So it is actually good to have
> it all in one parameter
>
> But somehow the shoe pinches, because async feels different from the other
> three parameters. There is a way to move async out of the enumeration:
>
> synchronous_replication_mode = off | recv | fsync | replay

ISTM that we need to get more feedback from users to determine which
is the best. So, how about leaving the parameter as it is and revisiting
this topic later? Since it's not difficult to change the parameter later,
we will not regret even if we delay that determination.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Yeb Havinga on 27 Jul 2010 07:48

Fujii Masao wrote:
>> I noted the changes in XlogSend where instead of *caughtup = true/false it
>> now returns !MyWalSnd->sndrqst. That value is initialized to false in that
>> procedure and it cannot be changed to true during execution of that
>> procedure, or can it?
>>
>
> That value is set to true in WalSndWakeup(). If WalSndWakeup() is called
> after initialization of that value in XLogSend(), *caughtup is set to false.
>
Ah, so it can be changed by another backend process.

Another question:

Is there a reason not to send the signal in XlogFlush itself, so it
would be called at

CreateCheckPoint(), EndPrepare(), FlushBuffer(),
RecordTransactionAbortPrepared(), RecordTransactionCommit(),
RecordTransactionCommitPrepared(), RelationTruncate(),
SlruPhysicalWritePage(), write_relmap_file(), WriteTruncateXlogRec(),
and xact_redo_commit().

regards,
Yeb Havinga

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Joshua Tolley on 27 Jul 2010 09:12

On Tue, Jul 27, 2010 at 01:41:10PM +0900, Fujii Masao wrote:
> On Tue, Jul 27, 2010 at 12:36 PM, Joshua Tolley <eggyknap(a)gmail.com> wrote:
> > Perhaps I'm hijacking the wrong thread for this, but I wonder if the quorum
> > idea is really the best thing for us. I've been thinking about Oracle's way of
> > doing things[1]. In short, there are three different modes: availability,
> > performance, and protection. "Protection" appears to mean that at least one
> > standby has applied the log; "availability" means at least one standby has
> > received the log info (it doesn't specify whether that info has been fsynced
> > or applied, but presumably does not mean "applied", since it's distinct from
> > "protection" mode); "performance" means replication is asynchronous. I'm not
> > sure this method is perfect, but it might be simpler than the quorum behavior
> > that has been considered, and adequate for actual use cases.
>
> In my case, I'd like to set up one synchronous standby on the near rack for
> high-availability, and one asynchronous standby on the remote site for disaster
> recovery. Can Oracle's way cover the case?

I don't think it can support the case you're interested in, though I'm not
terribly expert on it. I'm definitely not arguing for the syntax Oracle uses,
or something similar; I much prefer the flexibility we're proposing, and agree
with Yeb Havinga in another email who suggests we spell out in documentation
some recipes for achieving various possible scenarios given whatever GUCs we
settle on.

> "availability" mode with two standbys might create a sort of similar situation.
> That is, since the ACK from the near standby arrives in first, the near standby
> acts synchronous and the remote one does asynchronous. But the ACK from the
> remote standby can arrive in first, so it's not guaranteed that the near standby
> has received the log info before transaction commit returns a "success" to the
> client. In this case, we have to failover to the remote standby even if it's not
> under control of a clusterware. This is a problem for me.

My concern is that in a quorum system, if the quorum number is less than the
total number of replicas, there's no way to know *which* replicas composed the
quorum for any given transaction, so we can't know which servers to fail to if
the master dies. This isn't different from Oracle, where it looks like
essentially the "quorum" value is always 1. Your scenario shows that all
replicas are not created equal, and that sometimes we'll be interested in WAL
getting committed on a specific subset of the available servers. If I had two
nearby replicas called X and Y, and one at a remote site called Z, for
instance, I'd set quorum to 2, but really I'd want to say "wait for server X
and Y before committing, but don't worry about Z".

I have no idea how to set up our GUCs to encode a situation like that :)

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10 11
Prev: standard_conforming_strings
Next: Per-column collation, proof of concept