From: WC on 1 Feb 2010 20:07

I've written a TCL app that receives data from a single TCP source and distributes this data to multiple TCP receivers using a very simple ASCII protocol. The server is non-blocking and uses TCL's event loop. Most of the receivers are not under my control and sometimes behave poorly. This means I don't have access to the code or application, and in some cases not even to the owner of those applications.

Here is my problem. TCL has called my writable handler, indicating that a channel is ready for data. I write data to the channel, but at some point the client stops reading without closing the connection. TCP's flow control kicks in and data ends up being buffered in the receiver's TCP input buffer, my host's TCP output buffer and finally my application's TCL channel output buffer.

If at this point I connect to another port and issue a command for my application to shut down, it hangs. I forced a core dump and noticed that it's hanging in send(). The man page for TCL's close indicates that TCL will put the channel into blocking mode and attempt to flush the channel of any remaining data; the interpreter does this for each open channel when exit is called. However, if the TCP stack is not accepting data, the application will never be able to exit or, for that matter, close channels without exiting. This appears to be a pretty serious bug. I need to 'kill -9' in order to force an exit... very ugly.

Seems like what is needed is an option to the close command to discard any data buffered in the TCL channel's output buffer and close the channel. I coded a small extension in C that closes the OS-specific handle for the channel and then unregisters the channel from the interpreter. This causes send() to return -1, but the interpreter doesn't care at that point and shutdown continues successfully.

Anyone else run into this? Am I totally missing something here?

BTW I'm using TCL 8.4 on Linux and HP-UX, but from a review of the current 8.5 API it seems like this deadlock could still exist.

Any input/ideas are greatly appreciated,
Wayne
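
For anyone who wants to reproduce the underlying OS behaviour without a misbehaving client, here is a minimal C sketch. It is not the extension described above and it does not touch the TCL API at all. It assumes a POSIX system and uses a UNIX-domain socketpair() as a stand-in for the TCP connection; the helper names (set_nonblocking, fill_until_full, demo_stalled_peer) are made up for illustration. One end never reads, so the non-blocking sender eventually gets EAGAIN/EWOULDBLOCK from send(). A blocking send() at that same point would wait indefinitely, which is roughly the situation the flush at exit ends up in.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Keep sending on a non-blocking socket until the kernel refuses more data.
 * Returns the number of bytes the kernel accepted, or -1 on an unexpected error.
 */
static long fill_until_full(int fd)
{
    char chunk[4096];
    long total = 0;

    memset(chunk, 'x', sizeof chunk);
    for (;;) {
        ssize_t n = send(fd, chunk, sizeof chunk, 0);
        if (n > 0) {
            total += n;                 /* kernel buffered some or all of it */
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;                   /* interrupted by a signal, retry */
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return total;               /* buffers are full: peer is not reading */
        return -1;                      /* anything else is unexpected here */
    }
}

/*
 * Simulate a receiver that stops reading but keeps the connection open.
 * Returns the number of bytes queued before send() would have blocked,
 * or -1 on error.
 */
static long demo_stalled_peer(void)
{
    int sv[2];
    long queued;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    if (set_nonblocking(sv[0]) == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    queued = fill_until_full(sv[0]);    /* sv[1] never calls recv() */

    /*
     * A blocking send() on sv[0] at this point would hang until the peer
     * reads. close() does not wait for the peer to drain anything, so the
     * process can move on.
     */
    close(sv[0]);
    close(sv[1]);
    return queued;
}
```

Calling demo_stalled_peer() from a small main() and printing the result shows how much data the kernel will absorb before the writer stalls; the exact number depends on the platform's socket buffer sizes. On a real TCP socket the buffering is spread across both hosts as described in the post, but the stall on the sending side looks the same.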