TCP keepalive support for libpq [PgSql]

From: Greg Stark on 24 Jun 2010 07:54

On Tue, Jun 22, 2010 at 6:04 PM, Kevin Grittner
<Kevin.Grittner(a)wicourts.gov> wrote:
> Robert Haas <robertmhaas(a)gmail.com> wrote:
>
>> What does bother me is the fact that we are engineering a critical
>> aspect of our system reliability around vendor-specific
>> implementation details of the TCP stack, and that if any version
>> of any operating system that we support (or ever wish to support
>> in the future) fails to have a reliable implementation of this
>> feature AND configurable knobs that we can tune to suit our needs,
>> then we're screwed. Does anyone want to argue that this is NOT a
>> house of cards?
>
> [/me raises hand]
>
> TCP keepalive has been available and a useful part of my reliability
> solutions since I had to find a way to clean up zombie database
> connections caused by clients powering down their workstations
> without closing their apps -- that was in OS/2 circa 1990.

I think the problem is that the above is precisely what TCP keepalives
were designed for -- to prevent connections that are definitely dead
from living on forever. Even then they're controversial and mean
sacrificing a feature that's quite desirable for TCP -- namely that
idle connections don't die unnecessarily in the face of transient
failures and can function fine when the link returns.

The proposed use is for detecting connections which aren't responding
quickly enough for our tastes, which might mean much sooner than TCP
timeouts would fire. Because we have a backup plan, the conservative
option in our case is to kill the connection as soon as there's any
doubt about its validity so we can try a new connection. That's just
not how TCP
is designed -- the conservative option is assumed to be to keep the
connection open until there's no doubt the connection is dead.

I think it's going to be an uphill battle convincing TCP that we know
better than the TCP spec about how aggressive it should be about
throwing errors and killing connections. Once we have TCP keepalives
set low enough -- assuming the OS will allow it to be set much lower
-- we'll find that other timeouts are longer than we expect too. TCP
Keepalives won't come into it at all if there is any unacked data
pending -- TCP *will* detect that case but it might take longer than
you want too and you won't be able to lower it.

--
greg
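
For readers who want to see the knobs under discussion, here is a minimal sketch of enabling keepalive on a socket and tuning its timers. It assumes a platform that exposes the Linux-style per-socket options TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT; the function name enable_keepalive and the values 60/10/5 are hypothetical, chosen only for illustration. As noted above, these timers only matter on an idle connection; they do not shorten the retransmission timeout that applies when unacked data is pending.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive and tune the timers where the platform allows.

    Returns a dict of the options that were actually applied, so the
    caller can see which knobs this OS really offers.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = {"SO_KEEPALIVE": 1}

    # The per-socket timer options are not portable; skip any that this
    # platform's socket module does not define.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if opt is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
        except OSError:
            continue  # the OS refused this option or value
        applied[name] = value
    return applied


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print(enable_keepalive(s))
    s.close()
```

Returning the set of applied options, rather than failing, reflects the portability concern raised in this thread: which knobs exist, and how low they can go, depends on the OS.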

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: "Kevin Grittner" on 24 Jun 2010 08:31

Greg Stark wrote:

> we'll find that other timeouts are longer than we expect too. TCP
> Keepalives won't come into it at all if there is any unacked data
> pending -- TCP *will* detect that case but it might take longer
> than you want too and you won't be able to lower it.

If memory serves after twenty years, and the standard hasn't
changed, if you add up all the delays, it can take about nine minutes
maximum for a connection to break due to a wait for unacked data.
That's longer than the values Robert showed (which I think was
between one and two minutes -- can't find the post at the moment),
but quite a bit less than the two hours and ten minutes you get with
the defaults for keepalive.

-Kevin
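
The keepalive figure can be checked with a little arithmetic. This sketch assumes the commonly cited Linux defaults (7200 seconds idle before the first probe, 75 seconds between probes, 9 unanswered probes before the connection is dropped); other systems use different values, and the function name is hypothetical.

```python
def keepalive_detection_seconds(idle, interval, count):
    # Idle time before the first probe, plus one interval for each
    # probe that goes unanswered before the connection is declared dead.
    return idle + interval * count


# Assumed Linux defaults: tcp_keepalive_time, tcp_keepalive_intvl,
# tcp_keepalive_probes.
total = keepalive_detection_seconds(7200, 75, 9)
hours, rem = divmod(total, 3600)
minutes, seconds = divmod(rem, 60)
print(f"{total} s = {hours}h {minutes}m {seconds}s")  # 7875 s = 2h 11m 15s
```

Under those assumptions the total comes to a little over two hours and eleven minutes, in line with the rough figure quoted above.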


From: Tom Lane on 24 Jun 2010 10:40

Greg Stark <gsstark(a)mit.edu> writes:
> I think it's going to be an uphill battle convincing TCP that we know
> better than the TCP spec about how aggressive it should be about
> throwing errors and killing connections. Once we have TCP keepalives
> set low enough -- assuming the OS will allow it to be set much lower
> -- we'll find that other timeouts are longer than we expect too. TCP
> Keepalives won't come into it at all if there is any unacked data
> pending -- TCP *will* detect that case but it might take longer than
> you want too and you won't be able to lower it.

So it's a good thing that walreceiver never has to send anything after
the initial handshake ...

regards, tom lane

