socket flushing/buffering problem, app hangs on close [TCL]

Prev: socket flushing/buffering problem, app hangs on close
Next: socket flushing/buffering problem, app hangs on close

From: WC on 1 Feb 2010 20:06

I've written a TCL app that receives data from a single TCP source and
distributes this data to multiple TCP receivers using a very simple
ASCII protocol. The server is non-blocking using TCL's event loop. Most
of the receivers are not under my control and sometimes behave poorly.
This means I don't have access to code/application and in some cases the
owner of those applications.

Here is my problem.

TCL has called my writable handler indicating that a channel is ready
for data. I write data to the channel but the client stops reading data
at some point, but does not close the connection. TCP's flow control
kicks in and data ends up being buffered in the receivers TCP input
buffer, my hosts TCP output buffer and finally my application's TCL
channel output buffer.

If at this point I connect to another port and issue a command for my
application to shutdown it hangs. I forced a core dump and noticed that
it's hanging in send(). The man page for TCL's close indicates that TCL
will put the channel into blocking mode and attempt to flush the channel
of any remaining data, the interpreter does this for each open channel
when exit is called. However if the TCP stack is not accepting data the
application will never be able to exit or close channels without exiting
for that matter. This appears to be a pretty serious bug. I need to
'kill -9' in order to force an exit... very ugly. Seems like what is
needed is an option to the close command to discard any data buffered in
the TCL channel's output buffer and close the channel.

I coded a small extension in C that closes the OS specific handle for
the channel and the unregisters the channel from the interpreter. This
causes send() to return -1 but the interpreter doesn't care at that
point and shutdown continues successfully.

Anyone else run into this? I'm I totally missing something here?

BTW I'm using TCL 8.4 on Linux and HP-UX but a review of the current 8.5
API it seems like this deadlock could still exist.

Any input/ideas are greatly appreciated,
Wayne

|
Pages: 1
Prev: socket flushing/buffering problem, app hangs on close
Next: socket flushing/buffering problem, app hangs on close