From: Tom Lane on 16 Jun 2010 16:56

Robert Haas <robertmhaas(a)gmail.com> writes:
> The first problem I noticed is that the slave never seems to realize
> that the master has gone away. Every time I crashed the master, I had
> to kill the wal receiver process on the slave to get it to reconnect;
> otherwise it just sat there waiting, either forever or at least for
> longer than I was willing to wait.

TCP timeout is the answer there.

> More seriously, I was able to demonstrate that the problem linked in
> the thread above is real: if the master crashes after streaming WAL
> that it hasn't yet fsync'd, then on recovery the slave's xlog position
> is ahead of the master.

So indeed we'd better change walsender to not get ahead of the fsync'd
position. And probably also warn people to not disable fsync on the
master, unless they're willing to write it off and fail over at any
system crash.

> I don't know what to do about this, but I'm pretty sure we can't ship it as-is.

Doesn't seem tremendously insoluble from here ...

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
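The rule Tom proposes (walsender must not get ahead of the fsync'd position) can be illustrated with a small sketch. This is not the walsender code: the function name `sendable_upto` and the use of plain integers for xlog positions are assumptions made only for illustration.

```python
def sendable_upto(sent_upto: int, written_upto: int, flushed_upto: int) -> int:
    """Return the WAL position the sender may stream up to.

    Hypothetical helper: positions are plain integers standing in for
    xlog positions, which is a simplification for this sketch.
    """
    # Never stream past what has been fsync'd, even if more WAL has
    # already been written to the OS cache.
    limit = min(written_upto, flushed_upto)
    # Never move backwards: if nothing new is durable yet, the result
    # equals sent_upto and there is nothing to send.
    return max(sent_upto, limit)


if __name__ == "__main__":
    # 250 bytes written, only 180 fsync'd: stream no further than 180.
    print(sendable_upto(sent_upto=100, written_upto=250, flushed_upto=180))
```

With fsync disabled on the master there is no meaningful flushed position to cap against, which is presumably why the post pairs this rule with a warning about fsync=off.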
From: Josh Berkus on 16 Jun 2010 18:03

On 6/16/10 1:26 PM, Robert Haas wrote:
> Similarly with synchronous_commit=off, I believe
> that the next checkpoint will still fsync WAL, but the lag might be
> long.

That's not a showstopper. Just tell people that having synch_commit=off
on the master might increase the lag to the slave, and leave it alone.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
From: "Pierre C" on 16 Jun 2010 18:21

> The real problem here is that we're sending records to the slave which
> might cease to exist on the master if it unexpectedly reboots. I
> believe that what we need to do is make sure that the master only
> sends WAL it has already fsync'd

How about this:

- pg records somewhere the xlog position of the last record synced to
  disk. I don't remember the variable name, let's just say
  xlog_synced_recptr
- pg always writes the xlog first, i.e. before writing any page it checks
  that the page's xlog recptr < xlog_synced_recptr, and if that's not the
  case it has to wait before it can write the page.

Now:

- master sends messages to slave with the xlog_synced_recptr after each
  fsync
- slave gets these messages and records the master_xlog_synced_recptr
- slave doesn't write any page to disk until BOTH the slave's local WAL
  copy AND the master's WAL have reached the recptr of this page

If a master crashes or the slave loses connection, then the in-memory
pages of the slave could be in a state that is "in the future" compared to
the master's state when it comes up.

Therefore when a slave detects that the master has crashed, it could shoot
itself and recover from WAL, at which point the slave will not be "in the
future" anymore from the master, rather it would be in the past, which is
a lot less problematic...

Of course this wouldn't speed up the failover process!

> I think we should also change the slave to panic and shut down
> immediately if its xlog position is ahead of the master. That can
> never be a watertight solution because you can always advance the xlog
> position on the master and mask the problem. But I think we should
> do it anyway, so that we at least have a chance of noticing that we're
> hosed. I wish I could think of something a little more watertight...

If a slave is "in the future" relative to the master, then the only way to
keep using this slave could be to make it the new master...
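The gating rule in the list above can be sketched as a small state object on the slave. Everything here is hypothetical: `StandbyState` and its method names are invented for illustration, positions are plain integers, and whether the comparison should be strict (`<`) or inclusive (`<=`) is an assumption, since the post uses "<" in one place and "reached" in another.

```python
from dataclasses import dataclass


@dataclass
class StandbyState:
    """Hypothetical bookkeeping for the proposal; not PostgreSQL code."""
    local_synced_recptr: int = 0          # slave's own WAL, fsync'd up to here
    master_xlog_synced_recptr: int = 0    # last sync position reported by the master

    def on_local_fsync(self, recptr: int) -> None:
        # Sync positions only ever move forward.
        self.local_synced_recptr = max(self.local_synced_recptr, recptr)

    def on_master_sync_message(self, recptr: int) -> None:
        # The master sends its xlog_synced_recptr after each fsync.
        self.master_xlog_synced_recptr = max(self.master_xlog_synced_recptr, recptr)

    def can_write_page(self, page_recptr: int) -> bool:
        # A page may go to disk only once BOTH the local WAL copy and the
        # master's WAL are durable up to the page's recptr (inclusive
        # comparison is an assumption of this sketch).
        durable_everywhere = min(self.local_synced_recptr,
                                 self.master_xlog_synced_recptr)
        return page_recptr <= durable_everywhere


if __name__ == "__main__":
    s = StandbyState()
    s.on_local_fsync(500)
    s.on_master_sync_message(300)
    print(s.can_write_page(250), s.can_write_page(400))
```

In this model a page with recptr 400 stays in memory until the master reports a sync position of at least 400, so anything the slave has on disk is never "in the future" relative to the master's durable WAL.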
From: Greg Stark on 16 Jun 2010 19:08

On Wed, Jun 16, 2010 at 9:56 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas(a)gmail.com> writes:
>> The first problem I noticed is that the slave never seems to realize
>> that the master has gone away. Every time I crashed the master, I had
>> to kill the wal receiver process on the slave to get it to reconnect;
>> otherwise it just sat there waiting, either forever or at least for
>> longer than I was willing to wait.
>
> TCP timeout is the answer there.

If you mean TCP Keepalives, I disagree quite strongly. If you want the
application to guarantee any particular timing constraints then you
have to implement that in the application using timers and data
packets. TCP keepalives are for detecting broken network connections,
not enforcing application rules. Using TCP timeouts would have a
number of problems: on many systems they are impossible or difficult
to adjust and, worse, it would make it impossible to distinguish a
postgres master crash from a transient or permanent network outage.

>> More seriously, I was able to demonstrate that the problem linked in
>> the thread above is real: if the master crashes after streaming WAL
>> that it hasn't yet fsync'd, then on recovery the slave's xlog position
>> is ahead of the master.
>
> So indeed we'd better change walsender to not get ahead of the fsync'd
> position. And probably also warn people to not disable fsync on the
> master, unless they're willing to write it off and fail over at any
> system crash.
>
>> I don't know what to do about this, but I'm pretty sure we can't ship it as-is.
>
> Doesn't seem tremendously insoluble from here ...

For the case of fsync=off I can't get terribly excited about the slave
being ahead of the master after a crash. After all the master is toast
anyways. It seems to me in this situation the slave should detect that
the master has failed and automatically come up in master mode. Or
perhaps it should just shut down and then refuse to come up as a slave
again on the basis that it would be unsafe precisely because it might
be ahead of the (corrupt) master. At some point we should consider
having a server set to fsync=off refuse to come back up unless it was
shut down cleanly anyways. Perhaps we should put a strongly worded
warning now.

For the case of fsync=on it does seem to me to be terribly obvious
that the master should never send records to the slave that aren't
fsynced on the master. For 9.1 the other option proposed would work as
well but would be more complex -- to send and store records
immediately but not replay them on the slave until they're either
fsynced on the master or failover occurs.

--
greg
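What Greg means by implementing the timing constraint "in the application using timers and data packets" could look roughly like the sketch below. It is only an illustration: the class `HeartbeatMonitor`, its methods, and the injected clock are hypothetical names, and nothing here describes how the WAL receiver actually behaves.

```python
import time
from typing import Callable


class HeartbeatMonitor:
    """Hypothetical application-level liveness check.

    The peer is expected to send some message (WAL data or an explicit
    heartbeat packet) at least every `timeout_s` seconds.
    """

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self.clock = clock              # injectable so the logic can be tested
        self.last_seen = clock()

    def on_message(self) -> None:
        # Call this whenever anything arrives from the peer.
        self.last_seen = self.clock()

    def peer_presumed_dead(self) -> bool:
        # True once the peer has been silent for longer than the timeout.
        return self.clock() - self.last_seen > self.timeout_s


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_s=10.0)
    monitor.on_message()
    print(monitor.peer_presumed_dead())
```

The timeout here is an application setting rather than an OS-level TCP parameter, which sidesteps the adjustability problem Greg raises. On its own it still cannot tell a crashed master from a network outage; it only bounds how long the slave waits before reacting.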
From: "Kevin Grittner" on 16 Jun 2010 19:16

Greg Stark <gsstark(a)mit.edu> wrote:

> TCP keepalives are for detecting broken network connections

Yeah. That seems like what we have here. If you shoot the OS in the
head, the network connection is broken rather abruptly, without the
normal packets exchanged to close the TCP connection. It sounds like
it behaves just fine except for not detecting a broken connection.

-Kevin
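For reference, enabling TCP keepalives on a client socket looks roughly like this. The helper name `enable_keepalive` and the default values are assumptions for illustration. `SO_KEEPALIVE` is widely available, while the tuning options (`TCP_KEEPIDLE`, `TCP_KEEPINTVL`, `TCP_KEEPCNT`) are platform-dependent, which is the adjustability concern raised earlier in the thread.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_s: int = 60,
                     interval_s: int = 10, probes: int = 5) -> None:
    """Turn on TCP keepalives and tune them where the platform allows.

    Hypothetical helper; the default timings are arbitrary examples.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning knobs are not defined on every platform, so only set
    # the ones this Python build exposes.
    for name, value in (("TCP_KEEPIDLE", idle_s),
                        ("TCP_KEEPINTVL", interval_s),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

With settings like these, a peer that vanishes without closing the connection would be noticed after roughly the idle time plus the probe interval times the probe count, instead of the much longer system defaults.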