From: Greg Stark on 16 Jun 2010 19:32

On Thu, Jun 17, 2010 at 12:22 AM, Kevin Grittner
<Kevin.Grittner(a)wicourts.gov> wrote:
> "Kevin Grittner" <Kevin.Grittner(a)wicourts.gov> wrote:
>
>> It sounds like it behaves just fine except for not detecting a
>> broken connection.
>
> Of course I meant in terms of the slave's attempts at retrieving
> more WAL, not in terms of it applying a second time line. TCP
> keepalive timeouts don't help with that part of it, just the failure
> to recognize the broken connection. I suppose someone could argue
> that's a *feature*, since it gives you two hours to manually
> intervene before it does something stupid, but that hardly seems
> like a solution....

It's certainly a design goal of TCP that you should be able to
disconnect the network and reconnect it, and everything should recover.
If no data was sent, the connection should be able to withstand
arbitrarily long disconnections. TCP keepalives break that, but they
should only break it in the case where the network connection has
definitely exceeded the retry timeouts, not when it merely hasn't
responded fast enough for the application's requirements.

--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
From: Greg Stark on 16 Jun 2010 19:40

On Thu, Jun 17, 2010 at 12:16 AM, Kevin Grittner
<Kevin.Grittner(a)wicourts.gov> wrote:
> Greg Stark <gsstark(a)mit.edu> wrote:
>
>> TCP keepalives are for detecting broken network connections
>
> Yeah. That seems like what we have here. If you shoot the OS in
> the head, the network connection is broken rather abruptly, without
> the normal packets exchanged to close the TCP connection. It sounds
> like it behaves just fine except for not detecting a broken
> connection.

So I think there are two things happening here. If you shut down the
master and don't replace it, then you'll get no network errors until
TCP gives up entirely. Similarly, if you pull the network cable, or your
switch powers off, or your routing table becomes messed up, or anything
else occurs which prevents packets from getting through, then you'll
see similar breakage. You wouldn't want your database to suddenly come
up as master in such circumstances, though: you'll have to fix the
problem anyway, and promoting the slave won't solve anything, it would
just create a second problem.

But there's a second case. The Postgres master just stops responding:
perhaps it starts seeing disk errors and becomes stuck in disk-wait, or
the machine just becomes very heavily loaded and Postgres can't get any
cycles, or someone attaches to it with gdb, or any one of a number of
things happens which causes it to stop sending data. In that case
replication will not see any data from the master, but TCP will never
time out because the network is just fine. That's why there needs to be
an application-level health check if you want to have timeouts. You
can't depend on the network layer to detect problems between the
applications.

--
greg
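To make the application-level health check idea concrete, here is a
minimal sketch of the receiving side. It is not taken from any patch in
this thread; the class name HeartbeatMonitor, its methods, and the
injected clock are all hypothetical, chosen only to show the logic of
"no message from the peer within N seconds means treat the peer as
unresponsive", which works even when the network itself is healthy.

```python
import time


class HeartbeatMonitor:
    """Hypothetical helper: tracks when the peer was last heard from."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s      # how long silence is tolerated
        self.clock = clock              # injectable so it can be tested
        self.last_seen = clock()        # start counting from creation

    def message_received(self):
        """Call for every data or heartbeat message from the peer."""
        self.last_seen = self.clock()

    def peer_timed_out(self):
        """True once the peer has been silent longer than timeout_s."""
        return self.clock() - self.last_seen > self.timeout_s
```

A receiver loop would call message_received() on every message and
check peer_timed_out() periodically. What to do on a timeout (reconnect,
alert, or just wait) is a separate policy decision.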
From: Fujii Masao on 17 Jun 2010 01:57

On Thu, Jun 17, 2010 at 5:26 AM, Robert Haas <robertmhaas(a)gmail.com> wrote:
> On Wed, Jun 16, 2010 at 4:14 PM, Josh Berkus <josh(a)agliodbs.com> wrote:
>>> The first problem I noticed is that the slave never seems to realize
>>> that the master has gone away. Every time I crashed the master, I had
>>> to kill the wal receiver process on the slave to get it to reconnect;
>>> otherwise it just sat there waiting, either forever or at least for
>>> longer than I was willing to wait.
>>
>> Yes, I've noticed this. That was the reason for forcing walreceiver to
>> shut down on a restart per prior discussion and patches. This needs to
>> be on the open items list ... possibly it'll be fixed by Simon's
>> keepalive patch? Or is it just a tcp_keepalive issue?
>
> I think a TCP keepalive might be enough, but I have not tried to code
> or test it.

The "keepalive on libpq" patch would help.
https://commitfest.postgresql.org/action/patch_view?id=281

>>> and this just
>>> makes it more likely. After the most recent crash, the master thought
>>> pg_current_xlog_location() was 1/86CD4000; the slave thought
>>> pg_last_xlog_receive_location() was 1/8733C000. After reconnecting to
>>> the master, the slave then thought that
>>> pg_last_xlog_receive_location() was 1/87000000.
>>
>> So, *in this case*, detecting out-of-sequence xlogs (and PANICing) would
>> have actually prevented the slave from being corrupted.
>>
>> My question, though, is detecting out-of-sequence xlogs *enough*? Are
>> there any crash conditions on the master which would cause the master to
>> reuse the same locations for different records, for example? I don't
>> think so, but I'd like to be certain.
>
> The real problem here is that we're sending records to the slave which
> might cease to exist on the master if it unexpectedly reboots. I
> believe that what we need to do is make sure that the master only
> sends WAL it has already fsync'd (Tom suggested on another thread that
> this might be necessary, and I think it's now clear that it is 100%
> necessary).

The attached patch changes walsender so that it always sends WAL up to
LogwrtResult.Flush instead of LogwrtResult.Write.

> But I'm not sure how this will play with fsync=off - if
> we never fsync, then we can't ever really send any WAL without risking
> this failure mode. Similarly with synchronous_commit=off, I believe
> that the next checkpoint will still fsync WAL, but the lag might be
> long.

First of all, we should not restart the master after a crash in the
fsync=off case. That would cause corruption of the master database
itself.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
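The three xlog locations quoted above can be compared numerically to
see how far ahead of the master the slave had got. The sketch below
assumes each location is a pair of hexadecimal numbers (high part,
low part) that can be combined into one 64-bit value for ordering; the
helper name parse_xlog_location is hypothetical. All three values share
the same high part, so the differences are plain byte counts.

```python
def parse_xlog_location(text):
    """Turn 'hi/lo' (both hexadecimal) into one comparable integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


# Values reported in the quoted message.
master_after_crash = parse_xlog_location("1/86CD4000")
slave_before_reconnect = parse_xlog_location("1/8733C000")
slave_after_reconnect = parse_xlog_location("1/87000000")

# The slave had received WAL beyond the master's post-crash position.
print(slave_before_reconnect - master_after_crash)   # 6717440
print(slave_after_reconnect - master_after_crash)    # 3325952
```

The first difference is the amount of WAL the slave held that no longer
existed on the master after the crash. Sending only up to the flushed
position, as the patch described above does, is meant to keep the slave
from ever holding such WAL.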
From: Heikki Linnakangas on 17 Jun 2010 02:09

On 17/06/10 02:40, Greg Stark wrote:
> On Thu, Jun 17, 2010 at 12:16 AM, Kevin Grittner
> <Kevin.Grittner(a)wicourts.gov> wrote:
>> Greg Stark<gsstark(a)mit.edu> wrote:
>>
>>> TCP keepalives are for detecting broken network connections
>>
>> Yeah. That seems like what we have here. If you shoot the OS in
>> the head, the network connection is broken rather abruptly, without
>> the normal packets exchanged to close the TCP connection. It sounds
>> like it behaves just fine except for not detecting a broken
>> connection.
>
> So I think there are two things happening here. If you shut down the
> master and don't replace it then you'll get no network errors until
> TCP gives up entirely. Similarly if you pull the network cable or your
> switch powers off or your routing table becomes messed up, or anything
> else occurs which prevents packets from getting through then you'll
> see similar breakage. You wouldn't want your database to suddenly come
> up as master in such circumstances though when you'll have to fix the
> problem anyways, doing so won't solve any problems it would just
> create a second problem.

We're not talking about a timeout for promoting standby to master. The
problem is that the standby doesn't notice that, from the master's point
of view, the connection has been broken. Whether it's because of a
network error or because the master server crashed doesn't matter; the
standby should reconnect in any case. TCP keepalives are a perfect fit,
as long as you can tune the keepalive time short enough, where "short
enough" is up to the admin to decide depending on the application.

Having said that, it would probably make life easier if we implemented
an application-level heartbeat anyway. Not all OSes allow tuning
keepalives.

> But there's a second case. The Postgres master just stops responding
> -- perhaps it starts seeing disk errors and becomes stuck in disk-wait
> or the machine just becomes very heavily loaded and Postgres can't
> get any cycles, or someone attaches to it with gdb, or one of any
> number of things happen which cause it to stop sending data. In that
> case replication will not see any data from the master but TCP will
> never time out because the network is just fine. That's why there
> needs to be an application level health check if you want to have
> timeouts. You can't depend on the network layer to detect problems
> between the application.

If the PostgreSQL master stops responding, it's OK for the slave to sit
and wait for the master to recover. Reconnecting wouldn't help.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
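As an illustration of per-connection keepalive tuning, here is a
minimal sketch using the standard socket options. It is not the libpq
patch mentioned earlier in the thread; the function name
enable_keepalive and its default values are hypothetical. The three
TCP-level options are not available on every platform, which is the
portability concern raised above, so the sketch applies only those the
platform exposes.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=3, probes=3):
    """Turn on TCP keepalive for one socket and tune it where possible."""
    # SO_KEEPALIVE is the portable on/off switch for this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The tuning knobs are platform-specific; skip any that are missing.
    tuning = (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

Where the tuning options are missing, the socket still has keepalive
enabled but falls back to the system-wide timings.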
From: Rafael Martinez on 17 Jun 2010 03:02

Heikki Linnakangas wrote:
>
> We're not talking about a timeout for promoting standby to master. The
> problem is that the standby doesn't notice that from the master's point
> of view, the connection has been broken. Whether it's because of a
> network error or because the master server crashed doesn't matter, the
> standby should reconnect in any case. TCP keepalives are a perfect fit,
> as long as you can tune the keepalive time short enough. Where "Short
> enough" is up to the admin to decide depending on the application.
>

I tested this yesterday and I could not get any reaction from the wal
receiver, even after using minimal values compared to the default
values.

The default values in Linux for tcp_keepalive_time, tcp_keepalive_intvl
and tcp_keepalive_probes are 7200, 75 and 9. I reduced these values to
60, 3, 3 and nothing happened; the connection stays in status
ESTABLISHED after 60+3*3 seconds.

I did not restart the network after I changed these values on the fly
via /proc. I wonder if this is the reason the connection didn't die,
even with the new keepalive values, after the connection was broken. I
will check this later today.

regards,
--
Rafael Martinez, <r.m.guerrero(a)usit.uio.no>
Center for Information Technology Services
University of Oslo, Norway

PGP Public Key: http://folk.uio.no/rafael/
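The 60+3*3 figure above follows from how the three settings combine:
the connection must be idle for tcp_keepalive_time seconds before the
first probe, and is declared dead after tcp_keepalive_probes unanswered
probes sent tcp_keepalive_intvl seconds apart. A small sketch of that
arithmetic, with a hypothetical function name, shows both the default
and the reduced settings. Note that these system-wide values only
affect sockets that have SO_KEEPALIVE enabled; whether the connection
tested here had it enabled is not established in this message.

```python
def keepalive_detection_seconds(time_s, intvl_s, probes):
    """Approximate seconds of silence before a dead peer is detected."""
    return time_s + intvl_s * probes


print(keepalive_detection_seconds(7200, 75, 9))  # Linux defaults: 7875
print(keepalive_detection_seconds(60, 3, 3))     # reduced values: 69
```

With the defaults, detection takes a little over two hours, which
matches the "two hours to manually intervene" remark earlier in the
thread.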