From: Rick Jones on 15 Jan 2008 14:17

Arkadiy <vertleyb(a)gmail.com> wrote:
> On Jan 14, 7:10 pm, Rick Jones <rick.jon...(a)hp.com> wrote:
> > The server's host rebooting will cause an RST to come back to the
> > client end in response to the first segment the client sends to
> > the server after it reboots because the server's host TCP will
> > have no knowledge of the connection.

> This will take really long time.

Yes. Perhaps even longer than TCP might wait before giving up on retransmitting unACKnowledged data, perhaps not.

> Also, what if the server host crashed and never got rebooted? Or
> got disconnected?

If you have no unACKnowledged data outstanding you need an application-level keepalive (which effectively creates unACKed data), or limp along with SO_KEEPALIVE.

rick jones
--
Wisdom Teeth are impacted, people are affected by the effects of events.
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
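For illustration only, here is a minimal Python sketch of the two options mentioned above. The function names (`enable_keepalive`, `needs_probe`) and all timer values are assumptions made for this example, not anything from the thread. The per-socket tuning options (`TCP_KEEPIDLE` and friends) are platform-specific, so the sketch only sets the ones the local Python build exposes.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on SO_KEEPALIVE and tune its timers where the platform allows it."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These option names are not portable; skip any that are missing or rejected.
    tuning = (("TCP_KEEPIDLE", idle_s),
              ("TCP_KEEPINTVL", interval_s),
              ("TCP_KEEPCNT", probes))
    for name, value in tuning:
        opt = getattr(socket, name, None)
        if opt is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
        except OSError:
            pass  # the stack refused this knob; the system default stays in effect


def needs_probe(last_activity, now, idle_limit_s):
    """Application-level keepalive: probe once the link has been quiet too long."""
    return (now - last_activity) >= idle_limit_s
```

An application-level probe (any small request the server answers) puts unACKed data on the wire, so a dead peer is noticed on TCP's retransmission schedule or on the application's own reply timeout, whichever the application chooses to enforce.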
From: David Schwartz on 15 Jan 2008 15:56

On Jan 15, 10:15 am, Arkadiy <vertl...(a)gmail.com> wrote:
> On Jan 15, 12:10 pm, David Schwartz <dav...(a)webmaster.com> wrote:
> > You really want to backoff and retry though. And I'm not sure you want
> > to tear down the connection at the first sign of trouble.

> My problem is I can't understand the purpose of this "retry".

If you don't retry, then one failure would be the end of the world. By retry, I don't mean the same operation, I mean retry getting the server to work.

> If my
> timeout is 1 sec, and the first request timed out, and I retry it, why
> not to set the timeout to 2 sec in the first place?

Because if that happens, it will take you 2 seconds to detect a server failure rather than 1. The first operation will timeout in a second, and then your connect will fail. You won't have to wait another second.

Note that "retry it" doesn't mean you send the same request or even any request. It simply means that you try again. That might entail making a connection, sending a 'VERSION' as a probe, or sending a different request. But you try (to reach the server) again.

What you don't do is try a thousand concurrent requests just because you got a thousand concurrent requests when you *know* the server is likely overloaded. Because if you do that, the server will *never* catch back up because it will be too busy handling the backlog of queries you've already decided to ignore.

DS
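A rough Python sketch of "back off and retry with a single probe" might look like the following. Everything here is an assumption made for illustration: the names `backoff_delays` and `wait_for_server`, the doubling schedule, and the idea that `probe` is a caller-supplied function returning True when the server answers (for example a reconnect followed by one cheap request). The point it tries to show is that only one probe is in flight at a time, rather than the whole backlog of pending requests.

```python
import time


def backoff_delays(base_s=1.0, factor=2.0, cap_s=30.0, attempts=5):
    """Yield a capped, geometrically growing series of waits."""
    delay = base_s
    for _ in range(attempts):
        yield min(delay, cap_s)
        delay *= factor


def wait_for_server(probe, delays, sleep=time.sleep):
    """Probe once, then once more after each delay, until a probe succeeds.

    Returns True as soon as probe() succeeds, False if every probe failed.
    """
    for delay in delays:
        if probe():
            return True
        sleep(delay)  # give an overloaded server room to drain its backlog
    return probe()
```

While `wait_for_server` is running, new client requests would be failed fast or queued locally rather than forwarded, so the server is not handed more work it cannot finish.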
From: Arkadiy on 15 Jan 2008 17:00

On Jan 15, 3:56 pm, David Schwartz <dav...(a)webmaster.com> wrote:
> On Jan 15, 10:15 am, Arkadiy <vertl...(a)gmail.com> wrote:
>
> > On Jan 15, 12:10 pm, David Schwartz <dav...(a)webmaster.com> wrote:
> > > You really want to backoff and retry though. And I'm not sure you want
> > > to tear down the connection at the first sign of trouble.
> > My problem is I can't understand the purpose of this "retry".
>
> If you don't retry, then one failure would be the end of the world. By
> retry, I don't mean the same operation, I mean retry getting the
> server to work.

OK, I think I understand now.

> > If my
> > timeout is 1 sec, and the first request timed out, and I retry it, why
> > not to set the timeout to 2 sec in the first place?
>
> Because if that happens, it will take you 2 seconds to detect a server
> failure rather than 1. The first operation will timeout in a second,
> and then your connect will fail. You won't have to wait another
> second.

Do you mean that, although the request times out, the connect fails right away?

If the server is congested, wouldn't it fail to accept the connection in reasonable amount of time? It seems to me that it's impossible to tell congested server from unreachable by using connect the same way as it's impossible to do by using timeout on a regular request... Am I wrong?

Regards,
Arkadiy
From: Rick Jones on 15 Jan 2008 19:24

Arkadiy <vertleyb(a)gmail.com> wrote:
> If the server is congested, wouldn't it fail to accept the connection
> in reasonable amount of time?

Depending on one's definition of reasonable, not necessarily. What happens depends on the TCP stack on the server. If we are talking Unix/Unixlike then once the server application's listen queue fills, subsequent attempts to connect will have the TCP SYNchronize segment silently dropped. It will then be up to the client TCP stack's behaviour on the connect() call and/or if the client _application_ has done a non-blocking connect() and set its own timeout. (Or I suppose arranged for a signal to be delivered to get it out of the blocking connect() call...)

IIRC only the Windows TCP stack will "fail" a TCP SYN to a full listen queue with an RST reply. And even then, there is no guarantee that the RST will make it back to the client TCP stack, which brings us right back to the Unix/Unixlike case...

> It seems to me that it's impossible to tell congested server from
> unreachable by using connect the same way as it's impossible to do
> by using timeout on a regular request... Am I wrong?

Depends on whether or not the server's TCP stack actively responds to SYNs to a full listen queue with an RST and if those RSTs arrive at the client.

Also, the server may actually be "congested" before the listen queue fills, although indeed one could consider a full listen queue as a sign of a congested server. I.e., a full queue is sufficient, but not necessary.

rick jones
--
firebug n, the idiot who tosses a lit cigarette out his car window
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
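To make the client-side view concrete, here is a small Python sketch of a connect with an application-chosen timeout that reports which outcome was observed. The function name `try_connect` and the outcome labels are assumptions made for this example. As discussed above, "timeout" cannot distinguish a silently dropped SYN (full listen queue) from an unreachable host, and "refused" only says that an RST came back, which may mean a closed port or, on some stacks, a full queue.

```python
import socket


def try_connect(host, port, timeout_s=1.0):
    """Attempt a TCP connect and classify the result.

    Returns (outcome, sock), where outcome is one of
    'connected', 'refused', 'timeout' or 'error'.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout_s)
        return "connected", sock
    except ConnectionRefusedError:
        # An RST arrived in answer to our SYN.
        return "refused", None
    except socket.timeout:
        # No answer within our own limit: SYN dropped, or the host is unreachable.
        return "timeout", None
    except OSError:
        # Anything else, e.g. no route to host or a name resolution failure.
        return "error", None
```

The timeout here is the application's own, so it can be much shorter than the client TCP stack's default SYN retransmission limit.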
From: Rainer Weikusat on 16 Jan 2008 02:19

Arkadiy <vertleyb(a)gmail.com> writes:
> On Jan 15, 3:56 pm, David Schwartz <dav...(a)webmaster.com> wrote:

[...]

>> The first operation will timeout in a second,
>> and then your connect will fail. You won't have to wait another
>> second.
>
> Do you mean that, although the request times out, the connect fails
> right away?

Usually not.

> If the server is congested, wouldn't it fail to accept the connection
> in reasonable amount of time? It seems to me that it's impossible to
> tell congested server from unreachable by using connect the same way
> as it's impossible to do by using timeout on a regular request...

Connect asynchronously and wait 'some time'. If the connection isn't available by then, do something else. At the next request, if the connect is still in progress, do the same and so forth until it fails. Retry connecting for the next request.

If you just drop the connection at the first request timeout, subsequent replies from the server should elicit an RST, which should help to get rid of an eventual backlog.

Preferably, don't do any of this unless it is certain that you are working around an actual problem.
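The "connect asynchronously and check again at the next request" idea could be sketched in Python roughly as follows. The class name `PendingConnect`, its `poll` method, and the state labels are hypothetical and exist only for this example. The sketch assumes a Unix-like stack, IPv4, and a numeric address (a host name would make the initial call block on name resolution).

```python
import errno
import select
import socket


class PendingConnect:
    """A non-blocking connect whose progress can be checked once per request."""

    def __init__(self, ip, port):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.sock.setblocking(False)
        err = self.sock.connect_ex((ip, port))
        if err == 0:
            self.state = "connected"
        elif err in (errno.EINPROGRESS, errno.EWOULDBLOCK):
            self.state = "pending"  # the normal answer for a non-blocking connect
        else:
            self._fail()

    def _fail(self):
        self.state = "failed"
        self.sock.close()

    def poll(self, wait_s):
        """Wait up to wait_s seconds; return 'connected', 'pending' or 'failed'."""
        if self.state != "pending":
            return self.state
        # The socket becomes writable once the connect has finished either way.
        _, writable, _ = select.select([], [self.sock], [], wait_s)
        if writable:
            err = self.sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
            if err == 0:
                self.state = "connected"
            else:
                self._fail()
        return self.state
```

A caller would invoke `poll` with a short wait at each incoming request, do "something else" while the state is still "pending", and create a fresh `PendingConnect` for the next request once the state is "failed".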