From: WC on 1 Feb 2010 20:29

I've written a TCL app that receives data from a single TCP source and distributes this data to multiple TCP receivers using a very simple ASCII protocol. The server is non-blocking using TCL's event loop. Most of the receivers are not under my control and sometimes behave poorly. This means I don't have access to the code/application and, in some cases, to the owner of those applications.

Here is my problem.

TCL has called my writable handler indicating that a channel is ready for data. I write data to the channel, but the client stops reading data at some point without closing the connection. TCP's flow control kicks in and data ends up being buffered in the receiver's TCP input buffer, my host's TCP output buffer and finally my application's TCL channel output buffer.

If at this point I connect to another port and issue a command for my application to shut down, it hangs. I forced a core dump and noticed that it's hanging in send(). The man page for TCL's close indicates that TCL will put the channel into blocking mode and attempt to flush the channel of any remaining data; the interpreter does this for each open channel when exit is called. However, if the TCP stack is not accepting data, the application will never be able to exit or, for that matter, close the channel without exiting. This appears to be a pretty serious bug. I need to 'kill -9' in order to force an exit... very ugly. It seems like what is needed is an option to the close command to discard any data buffered in the TCL channel's output buffer and close the channel.

I coded a small extension in C that closes the OS-specific handle for the channel and then unregisters the channel from the interpreter. This causes send() to return -1, but the interpreter doesn't care at that point and shutdown continues successfully.

Anyone else run into this? Am I totally missing something here?

BTW I'm using TCL 8.4 on Linux and HP-UX, but from a review of the current 8.5 API it seems like this deadlock could still exist.

Any input/ideas are greatly appreciated,
Wayne
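The writable-handler arrangement described above can be sketched as follows. This is only an illustration under stated assumptions, not the poster's actual code: the names queue, maxQueued, enqueue and drain are hypothetical, messages are assumed to be newline-terminated, and it assumes Tcl does not fire the writable handler again until its own channel buffer has been flushed. The idea is to keep the backlog for each receiver in a script-level list and hand the channel only one message per writable event, so that a stalled receiver leaves at most one message in the channel's output buffer. It does not remove the hang on close; it only bounds how much data can be stuck there.

```tcl
# Hypothetical names throughout: queue(), maxQueued, enqueue, drain.
array set queue {}
set maxQueued 100   ;# assumed cap on pending messages per receiver

# Queue one message for a receiver.
# Returns 1 if queued, 0 if the receiver is already at the cap.
proc enqueue {chan msg} {
    global queue maxQueued
    if {![info exists queue($chan)]} {
        set queue($chan) {}
    }
    if {[llength $queue($chan)] >= $maxQueued} {
        return 0
    }
    lappend queue($chan) $msg
    # Arm the writable handler only while there is something to send.
    fileevent $chan writable [list drain $chan]
    return 1
}

# Writable handler: hand exactly one message to the channel per event.
proc drain {chan} {
    global queue
    if {![info exists queue($chan)] || [llength $queue($chan)] == 0} {
        # Nothing left to send: disarm the handler.
        fileevent $chan writable {}
        return
    }
    set msg [lindex $queue($chan) 0]
    set queue($chan) [lrange $queue($chan) 1 end]
    if {[catch {puts $chan $msg; flush $chan}]} {
        # Write failed: forget this receiver's backlog and drop the channel.
        catch {fileevent $chan writable {}}
        unset queue($chan)
        catch {close $chan}
    }
}
```

A caller would put each accepted socket into non-blocking mode with fconfigure and then call enqueue for every message to distribute; a return value of 0 is the signal that the receiver has fallen too far behind.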
From: tom.rmadilo on 1 Feb 2010 21:51

On Feb 1, 5:29 pm, WC <wcu...(a)cox.net> wrote:
> I've written a TCL app that receives data from a single TCP source and
> distributes this data to multiple TCP receivers using a very simple
> ASCII protocol. The server is non-blocking using TCL's event loop. Most
> of the receivers are not under my control and sometimes behave poorly.
> This means I don't have access to the code/application and, in some
> cases, to the owner of those applications.
>
> Here is my problem.
>
> TCL has called my writable handler indicating that a channel is ready
> for data. I write data to the channel, but the client stops reading data
> at some point without closing the connection. TCP's flow control
> kicks in and data ends up being buffered in the receiver's TCP input
> buffer, my host's TCP output buffer and finally my application's TCL
> channel output buffer.
>
> If at this point I connect to another port and issue a command for my
> application to shut down, it hangs. I forced a core dump and noticed that
> it's hanging in send(). The man page for TCL's close indicates that TCL
> will put the channel into blocking mode and attempt to flush the channel
> of any remaining data; the interpreter does this for each open channel
> when exit is called. However, if the TCP stack is not accepting data, the
> application will never be able to exit or, for that matter, close the
> channel without exiting. This appears to be a pretty serious bug. I need to
> 'kill -9' in order to force an exit... very ugly. It seems like what is
> needed is an option to the close command to discard any data buffered in
> the TCL channel's output buffer and close the channel.
>
> I coded a small extension in C that closes the OS-specific handle for
> the channel and then unregisters the channel from the interpreter. This
> causes send() to return -1, but the interpreter doesn't care at that
> point and shutdown continues successfully.
>
> Anyone else run into this? Am I totally missing something here?
>
> BTW I'm using TCL 8.4 on Linux and HP-UX, but from a review of the
> current 8.5 API it seems like this deadlock could still exist.
>
> Any input/ideas are greatly appreciated,
> Wayne

Right, so it sounds like you wrote an application which gets stuck...probably due to poor coding. It also sounds like you ran it in the background so you couldn't control it except via signals. The TCP connection should still time out if you let it sit long enough.

BTW, a channel becomes readable/writable if an error occurs; it is something of a blunt indicator. In this case it sounds like the application is simply waiting around to send or receive data. I'm not sure how this adds up to a bug.
From: WC on 2 Feb 2010 01:56

tom.rmadilo wrote:
> On Feb 1, 5:29 pm, WC <wcu...(a)cox.net> wrote:
>
> Right, so it sounds like you wrote an application which gets
> stuck...probably due to poor coding. It also sounds like you ran it in
> the background so you couldn't control it except via signals. The TCP
> connection should still time out if you let it sit long enough.
>
> BTW, a channel becomes readable/writable if an error occurs; it is
> something of a blunt indicator. In this case it sounds like the
> application is simply waiting around to send or receive data. I'm not
> sure how this adds up to a bug.

Did you even read my post or were you just looking for someone to criticize?

1) Backgrounding does not imply that an application can only be controlled via signals. In fact I'm using a control socket on another port, as stated in my message, to send the app a stop message. But this is beside the point; I'm not sure why you brought it up.

2) You need to go back and study blocking sockets. If the remote end stops reading data, the TCP buffers on both ends are full, and you attempt to write more data, the write end will block until the remote end either begins to read data, thus clearing the buffers, or closes the connection. Neither of these is happening. There is no timeout to wait for; TCP is operating as designed in this case.

3) I know about read/write handlers; I have both installed on these channels. The write handler is not getting called because the remote end is not reading, and the read handler is not getting called because the remote end is neither closing the socket nor sending my application any data. I know this because on my host system netstat shows around 40K in the TCP write queue and the connection is in the ESTABLISHED state.

Perhaps "bug" is a strong word. It appears that TCL is operating as designed, but there should be a way to close an output channel and instruct TCL to just discard any data that it has left and not attempt to send it, for the exact reason cited above. It does not sound like a good design if a remote machine can cause my application to hang while attempting to close a channel or exit the application, simply because the interpreter mandates that it must flush all data from its queues.
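Because neither handler fires for a receiver that has silently stopped reading, the stall can only be inferred from a lack of progress. The following is a minimal sketch of such a check, not anything taken from the application discussed here: the names pending, lastProgress, stallLimit, noteQueued, noteProgress and stalledChannels are all hypothetical, and the 30-second limit is an arbitrary assumption. It only identifies stalled receivers; what to do with them (stop queuing, log, or force the descriptor closed from C) is a separate decision.

```tcl
# Hypothetical names throughout: pending(), lastProgress(), stallLimit.
array set pending {}
array set lastProgress {}
set stallLimit 30   ;# assumed: seconds without progress before flagging

# Call whenever a message is queued for a receiver.
proc noteQueued {chan} {
    global pending lastProgress
    if {![info exists pending($chan)] || $pending($chan) == 0} {
        # A new backlog starts now, so start the clock for this receiver.
        set pending($chan) 0
        set lastProgress($chan) [clock seconds]
    }
    incr pending($chan)
}

# Call from the writable handler each time a message is handed to the channel.
proc noteProgress {chan} {
    global pending lastProgress
    if {[info exists pending($chan)] && $pending($chan) > 0} {
        incr pending($chan) -1
    }
    set lastProgress($chan) [clock seconds]
}

# Return the channels that have a backlog and have made no progress
# for at least stallLimit seconds. 'now' can be supplied for testing.
proc stalledChannels {{now ""}} {
    global pending lastProgress stallLimit
    if {$now eq ""} {
        set now [clock seconds]
    }
    set result {}
    foreach chan [array names pending] {
        if {$pending($chan) > 0 && ($now - $lastProgress($chan)) >= $stallLimit} {
            lappend result $chan
        }
    }
    return $result
}
```

A periodic timer (for example an after script that reschedules itself) could call stalledChannels and report the result over the control socket.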
From: David Gravereaux on 2 Feb 2010 02:34

Can't you just close them manually? Offhand:

    foreach sock [chan names sock*] {
        # enables dump on close
        fconfigure $sock -blocking no
        close $sock
    }
From: WC on 2 Feb 2010 02:52

David Gravereaux wrote:
> Can't you just close them manually? Offhand:
>
>     foreach sock [chan names sock*] {
>         # enables dump on close
>         fconfigure $sock -blocking no
>         close $sock
>     }

Unfortunately not. If I do this while the application is running and leave the socket non-blocking, TCL will return from close immediately and try to flush the data in the background. So the script layer "thinks" it's closed, but a file descriptor is forever allocated to the interpreter. Many opens and closes with the bad server eventually cause file descriptor starvation in the process. When the application finally attempts to exit, it hangs, since it is the interpreter's policy to flush and close all open channels before it exits. So all those background tasks prevent it from exiting.

If I put the channel in blocking mode as you suggest above, I don't even get the benefit of the interp attempting to close the channel in the background. It hangs on the close until the other side reads the data or terminates the connection. This means that none of my other socket handlers are being serviced as they are in the non-blocking scenario. Essentially the application gives the impression that it is locked up at this point.

I appreciate the suggestion though! Thanks.