From: "Joshua D. Drake" on 19 Jun 2010 12:05

On Sat, 2010-06-19 at 09:43 -0400, Robert Haas wrote:
> 4. Streaming Replication needs to detect death of master. We need
> some sort of keep-alive, here. Whether it's at the TCP level (as
> advocated by Tom Lane and others) or at the protocol level (as
> advocated by Greg Stark) is something that we have yet to decide; once
> it's decided, someone will need to do it...

TCP involves unknowns, such as firewalls, vpn routers and ssh tunnels. I
humbly suggest we *not* be pedantic and implement something practical
and less prone to variables outside the control of Pg.

Sincerely,

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Greg Stark on 19 Jun 2010 14:46

On Sat, Jun 19, 2010 at 2:43 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
> 4. Streaming Replication needs to detect death of master. We need
> some sort of keep-alive, here. Whether it's at the TCP level (as
> advocated by Tom Lane and others) or at the protocol level (as
> advocated by Greg Stark) is something that we have yet to decide; once
> it's decided, someone will need to do it...

This sounds like a useful feature but I don't see why it's not 9.1
material. The status quo is that the expected usage pattern is manual
failover. As long as the slave responds to manual intervention when in
this state I don't think this is a blocking issue. Monitoring and
automatic failover are clearly things we plan to add features to
handle better in the future.

--
greg

From: Robert Haas on 19 Jun 2010 14:53

On Sat, Jun 19, 2010 at 2:46 PM, Greg Stark <gsstark(a)mit.edu> wrote:
> On Sat, Jun 19, 2010 at 2:43 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
>> 4. Streaming Replication needs to detect death of master. We need
>> some sort of keep-alive, here. Whether it's at the TCP level (as
>> advocated by Tom Lane and others) or at the protocol level (as
>> advocated by Greg Stark) is something that we have yet to decide; once
>> it's decided, someone will need to do it...
>
> This sounds like a useful feature but I don't see why it's not 9.1
> material. The status quo is that the expected usage pattern is manual
> failover. As long as the slave responds to manual intervention when in
> this state I don't think this is a blocking issue. Monitoring and
> automatic failover are clearly things we plan to add features to
> handle better in the future.

Right now, if the SR master reboots unexpectedly (say, power plug pull
and restart), the slave never notices. It just sits there forever
waiting for the next byte of data from the master to arrive (which it
never will). You have to manually restart the server or hit
walreceiver with a SIGTERM to get it to start streaming again. I guess
we could decide we're just not going to deal with that, but it seems
like a fairly large misfeature to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

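For illustration only, a protocol-level approach of the kind mentioned in the quoted item would have the receiving side treat prolonged silence as a dead connection instead of blocking indefinitely. The sketch below is not walreceiver's actual code: the function name `recv_with_idle_timeout` and the `RECV_TIMED_OUT` constant are hypothetical, and it assumes the sender emits some periodic heartbeat, since otherwise an idle but healthy connection would also time out.

```c
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical return code: nothing arrived within the timeout. */
#define RECV_TIMED_OUT (-2)

/*
 * Wait up to timeout_ms milliseconds for data on fd, then read it.
 *
 * Returns the number of bytes read (> 0), 0 if the peer closed the
 * connection, -1 on error (errno is set, including EINTR from poll),
 * or RECV_TIMED_OUT if no data arrived in time.
 */
ssize_t
recv_with_idle_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready = poll(&pfd, 1, timeout_ms);

    if (ready < 0)
        return -1;              /* poll itself failed */
    if (ready == 0)
        return RECV_TIMED_OUT;  /* peer has been silent too long */

    /* Data, EOF, or a socket error is pending; recv reports which. */
    return recv(fd, buf, len, 0);
}
```

A caller would typically close the connection and reconnect when it sees `RECV_TIMED_OUT`, with the timeout chosen comfortably longer than the assumed heartbeat interval.
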
From: Tom Lane on 19 Jun 2010 15:13

Robert Haas <robertmhaas(a)gmail.com> writes:
> Right now, if the SR master reboots unexpectedly (say, power plug pull
> and restart), the slave never notices. It just sits there forever
> waiting for the next byte of data from the master to arrive (which it
> never will).

This is nonsense --- the slave's kernel *will* eventually notice that
the TCP connection is dead, and tell walreceiver so. I don't doubt
that the standard TCP timeout is longer than people want to wait for
that, but claiming that it will never happen is simply wrong.

I think that enabling slave-side TCP keepalives and control of the
keepalive timeout parameters is probably sufficient for 9.0 here.

regards, tom lane

From: Andres Freund on 19 Jun 2010 15:15

On Saturday 19 June 2010 18:05:34 Joshua D. Drake wrote:
> On Sat, 2010-06-19 at 09:43 -0400, Robert Haas wrote:
> > 4. Streaming Replication needs to detect death of master. We need
> > some sort of keep-alive, here. Whether it's at the TCP level (as
> > advocated by Tom Lane and others) or at the protocol level (as
> > advocated by Greg Stark) is something that we have yet to decide; once
> > it's decided, someone will need to do it...
>
> TCP involves unknowns, such as firewalls, vpn routers and ssh tunnels. I
> humbly suggest we *not* be pedantic and implement something practical
> and less prone to variables outside the control of Pg.

And has the huge advantage of being implementable in about 5 lines of C
(setsockopt + error checking). Considering what time in the release
cycle this is...

Andres
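
The setsockopt approach referred to above can be sketched roughly as follows. This is a minimal illustration and not the change that was eventually made to PostgreSQL: the function name `enable_tcp_keepalive` and its parameters are hypothetical, and `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` are the Linux option names, so they are guarded with `#ifdef` for platforms that lack them.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Enable TCP keepalives on a socket and, where the platform allows,
 * tune their timing.
 *
 *   idle_secs      seconds of inactivity before the first probe
 *   interval_secs  seconds between unanswered probes
 *   probe_count    unanswered probes before the connection is dropped
 *
 * Returns 0 on success, -1 on failure (errno is set by setsockopt).
 */
int
enable_tcp_keepalive(int fd, int idle_secs, int interval_secs, int probe_count)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;

#ifdef TCP_KEEPIDLE
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) < 0)
        return -1;
#endif
#ifdef TCP_KEEPINTVL
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval_secs, sizeof(interval_secs)) < 0)
        return -1;
#endif
#ifdef TCP_KEEPCNT
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                   &probe_count, sizeof(probe_count)) < 0)
        return -1;
#endif

    return 0;
}
```

With example values of 30, 10 and 3, a dead peer would be noticed after roughly 30 + 10 * 3 = 60 seconds of silence, and the kernel would then report an error on the blocked read. Whether probes survive the firewalls and tunnels raised earlier in the thread depends on the network path.
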