From: Stefan Kaltenbrunner on 19 Jun 2010 15:49

On 06/19/2010 09:13 PM, Tom Lane wrote:
> Robert Haas <robertmhaas(a)gmail.com> writes:
>> Right now, if the SR master reboots unexpectedly (say, power plug pull
>> and restart), the slave never notices. It just sits there forever
>> waiting for the next byte of data from the master to arrive (which it
>> never will).
>
> This is nonsense --- the slave's kernel *will* eventually notice that
> the TCP connection is dead, and tell walreceiver so. I don't doubt
> that the standard TCP timeout is longer than people want to wait for
> that, but claiming that it will never happen is simply wrong.
>
> I think that enabling slave-side TCP keepalives and control of the
> keepalive timeout parameters is probably sufficient for 9.0 here.

Yeah, I would agree. We have had TCP keepalive code in the backend for a while now, and adding it to libpq as well seems like an easy enough fix at this point in the release cycle.

Stefan

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Florian Pflug on 19 Jun 2010 19:11

On Jun 19, 2010, at 21:13 , Tom Lane wrote:
> Robert Haas <robertmhaas(a)gmail.com> writes:
>> Right now, if the SR master reboots unexpectedly (say, power plug pull
>> and restart), the slave never notices. It just sits there forever
>> waiting for the next byte of data from the master to arrive (which it
>> never will).
>
> This is nonsense --- the slave's kernel *will* eventually notice that
> the TCP connection is dead, and tell walreceiver so. I don't doubt
> that the standard TCP timeout is longer than people want to wait for
> that, but claiming that it will never happen is simply wrong.

No, Robert is correct AFAIK. If you're *waiting* for data, TCP generates no traffic (except with keepalive enabled). From the slave's kernel POV, a dead master is therefore indistinguishable from an inactive master.

Things are different from a sender's POV, though. Since sent data is ACK'ed by the receiving end, the TCP stack can (and does) detect a broken connection.

best regards,
Florian Pflug
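
To illustrate the receiver-side point, here is a minimal sketch of enabling TCP keepalive on a socket, so that an otherwise idle receiver generates probe traffic and its kernel can notice a dead peer. It is not libpq or walreceiver code. The function name `enable_keepalive` and the timer values are assumptions chosen for illustration, and the `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` options are Linux names that other platforms may not expose.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive and tune its timers where the platform allows.

    idle:     seconds of silence before the first probe is sent
    interval: seconds between unanswered probes
    count:    unanswered probes before the kernel reports the connection dead
    (All three defaults are illustrative assumptions.)
    """
    # SO_KEEPALIVE is off by default on a new socket; switch it on.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket timer options are platform specific, so set only those
    # that this Python build actually exposes.
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

With this in place, a blocking `recv()` on a connection whose peer has vanished eventually fails with an error instead of waiting indefinitely.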

From: Simon Riggs on 19 Jun 2010 20:19

On Sat, 2010-06-19 at 14:53 -0400, Robert Haas wrote:
> On Sat, Jun 19, 2010 at 2:46 PM, Greg Stark <gsstark(a)mit.edu> wrote:
> > On Sat, Jun 19, 2010 at 2:43 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
> >> 4. Streaming Replication needs to detect death of master. We need
> >> some sort of keep-alive, here. Whether it's at the TCP level (as
> >> advocated by Tom Lane and others) or at the protocol level (as
> >> advocated by Greg Stark) is something that we have yet to decide; once
> >> it's decided, someone will need to do it...
> >
> > This sounds like a useful feature but I don't see why it's not 9.1
> > material. The status quo is that the expected usage pattern is manual
> > failover. As long as the slave responds to manual intervention when in
> > this state I don't think this is a blocking issue. Monitoring and
> > automatic failover are clearly things we plan to add features to
> > handle better in the future.
>
> Right now, if the SR master reboots unexpectedly (say, power plug pull
> and restart), the slave never notices. It just sits there forever
> waiting for the next byte of data from the master to arrive (which it
> never will). You have to manually restart the server or hit
> walreceiver with a SIGTERM to get it to start streaming again. I
> guess we could decide we're just not going to deal with that, but it
> seems like a fairly large misfeature to me.

Are you saying it doesn't respond to a trigger file at any point? That would be a problem.

Sounds like we should have a pg_restart_walreceiver() function. We shouldn't be encouraging people to send signals to backends; it's too easy to get wrong.

--
 Simon Riggs www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services

From: Tom Lane on 20 Jun 2010 01:18

Florian Pflug <fgp(a)phlo.org> writes:
> On Jun 19, 2010, at 21:13 , Tom Lane wrote:
>> This is nonsense --- the slave's kernel *will* eventually notice that
>> the TCP connection is dead, and tell walreceiver so. I don't doubt
>> that the standard TCP timeout is longer than people want to wait for
>> that, but claiming that it will never happen is simply wrong.

> No, Robert is correct AFAIK. If you're *waiting* for data, TCP
> generates no traffic (except with keepalive enabled).

Mph. I was thinking that keepalive was on by default with a very long interval, but I see this isn't so. However, if we enable keepalive, then it's irrelevant to the point anyway. Nobody's produced any evidence that keepalive is an unsuitable solution.

regards, tom lane
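
For a sense of why control over the keepalive timeout parameters matters, the following sketch estimates how long an idle receiver waits before its kernel reports a dead peer once keepalive is enabled. The formula is an approximation (idle time plus one interval per unanswered probe), and the helper name is hypothetical. The first set of values is the Linux system default (`tcp_keepalive_time` 7200, `tcp_keepalive_intvl` 75, `tcp_keepalive_probes` 9); the second set is an arbitrary tuned example.

```python
def keepalive_detection_seconds(idle, interval, count):
    """Approximate worst-case time for an idle connection to be declared dead.

    The kernel waits `idle` seconds before the first probe, then sends up to
    `count` probes spaced `interval` seconds apart before giving up.
    """
    return idle + interval * count


# Linux system defaults once SO_KEEPALIVE is switched on.
linux_default = keepalive_detection_seconds(idle=7200, interval=75, count=9)
print(linux_default)  # 7875 seconds, a little over two hours

# An arbitrary tuned setting, chosen only for illustration.
tuned = keepalive_detection_seconds(idle=60, interval=10, count=5)
print(tuned)  # 110 seconds
```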
From: Andres Freund on 20 Jun 2010 05:41

On Saturday 19 June 2010 18:05:34 Joshua D. Drake wrote:
> On Sat, 2010-06-19 at 09:43 -0400, Robert Haas wrote:
> > 4. Streaming Replication needs to detect death of master. We need
> > some sort of keep-alive, here. Whether it's at the TCP level (as
> > advocated by Tom Lane and others) or at the protocol level (as
> > advocated by Greg Stark) is something that we have yet to decide; once
> > it's decided, someone will need to do it...
>
> TCP involves unknowns, such as firewalls, vpn routers and ssh tunnels. I
> humbly suggest we *not* be pedantic and implement something practical
> and less prone to variables outside the control of Pg.
>
> Sincerely,
>
> Joshua D. Drake
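
For comparison, the protocol-level alternative mentioned in the thread could be sketched as a receiver that treats prolonged silence as failure, on the assumption that the sender emits either data or a small heartbeat message at regular intervals. This is an illustration only, not the PostgreSQL replication protocol; the function name `wait_for_message`, the buffer size and the timeout handling are all hypothetical.

```python
import socket


def wait_for_message(sock, timeout_s):
    """Wait up to timeout_s seconds for data or a heartbeat from the peer.

    Returns the received bytes, or None if nothing arrived in time, which the
    caller can treat as "peer presumed dead" and use to trigger a reconnect.
    An empty bytes object means the peer closed the connection cleanly.
    """
    sock.settimeout(timeout_s)
    try:
        return sock.recv(4096)
    except socket.timeout:
        return None
```

Because the timeout is enforced by the application rather than the kernel, it does not depend on how intermediate firewalls or tunnels treat TCP keepalive probes, at the cost of requiring the sender to cooperate by sending heartbeats.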