From: Urs Thuermann on 11 Jun 2010 05:39

I am developing a client-server application using TCP sockets. After connection establishment the server application first writes a variably-sized small header (typically around 40-60 bytes), followed by many writes of 256 bytes of data (everything using write(2) on the socket). The socket doesn't have the O_NONBLOCK flag set and, AFAICS, no signals are being sent to the process. In case it does matter, the server has many threads running and serves several clients at the same time (a couple of threads producing data, one thread listening on the socket, and one thread per connected client).

I am surprised that sometimes the write system call on the socket returns with less than 256 bytes written. My understanding is that according to POSIX, this shouldn't happen:

    When attempting to write to a file descriptor (other than a pipe or
    FIFO) that supports non-blocking writes and cannot accept the data
    immediately:

    * If the O_NONBLOCK flag is clear, write() shall block the calling
      thread until the data can be accepted.

    * If the O_NONBLOCK flag is set, write() shall not block the thread.
      If some data can be written without blocking the thread, write()
      shall write what it can and return the number of bytes written.
      Otherwise, it shall return -1 and set errno to [EAGAIN].

Therefore, I expected the write(2) system call to return immediately with 256 if the send buffer has enough space, or to block until 256 bytes can be written to the send buffer and then also return with 256.

urs
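[A quick sanity check on the premise above: whether O_NONBLOCK is actually clear on the socket can be read back with fcntl(2). This is a minimal sketch, not code from the post; the function name is illustrative.]

```c
#include <fcntl.h>

/* Return 1 if the descriptor has O_NONBLOCK set, 0 if it is clear,
 * -1 on error (errno set by fcntl). */
int fd_is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}
```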
From: Rick Jones on 11 Jun 2010 13:19

Perhaps your platform's stack is slightly buggy, or not strictly conformant to POSIX. While I don't know that there is one in this case, POSIX has been known to have "loopholes."

Drifting... if ever so slightly... How many of these 256-byte writes does your application make? Is there a specific reason it is a stream of comparatively tiny 256-byte writes rather than larger writes? Are they "spread out" in time or do they get sent "back-to-back"?

rick jones
--
oxymoron n, Hummer H2 with California Save Our Coasts and Oceans plates
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
From: Urs Thuermann on 11 Jun 2010 20:39

Rick Jones <rick.jones2(a)hp.com> writes:

> Perhaps your platform's stack is slightly buggy.

It's a Debian testing Linux with kernel 2.6.32:

    urs(a)ha:~$ uname -a
    Linux ha 2.6.32-3-686 #1 SMP Thu Feb 25 06:14:20 UTC 2010 i686 GNU/Linux

> How many of these 256 byte writes does your application make? Is
> there a specific reason it is a stream of comparatively tiny 256 byte
> writes rather than larger writes? Are they "spread-out" in time or do
> they get sent "back-to-back?"

There are millions of these small writes. The first few thousand are written very fast, i.e. as fast as the network connection allows; then the rate is roughly 24 writes per second, i.e. 48 kbit/s.

In the server, several threads are producing data, each into its own FIFO buffer, at roughly 48 kbit/s:

    char b[4096];

    while (1) {
        produce_data(b, sizeof(b));
        write_to_fifo_buffer(b, sizeof(b));
    }

For each client connected to the server, I start a thread that selects one of these buffers and then sends data from it:

    char b[256];

    select_buffer();
    write(sock, small_header, header_size);
    while (1) {
        get_from_fifo_buffer(b, sizeof(b));
        n = write(sock, b, sizeof(b));
        if (n < 0) {
            perror("write");
            break;
        } else if (n < sizeof(b)) {
            /* This is where I didn't expect to get into
             * and I currently don't handle this case properly. */
            fprintf(stderr, "Incomplete write...", ...);
        }
    }

When a client connects, the buffer typically already holds some megabytes of data, so the while loop can send as fast as the network connection allows. After that, the thread waits in each loop iteration in get_from_fifo_buffer() on a pthread_mutex and is limited to the 48 kbit/s at which the buffer is filled.

The size of 256 was selected somewhat arbitrarily; I chose a small value to get less bursty behavior from the thread. But the observed behavior of the write(2) system call means that I get incomplete writes and have to change the code to handle the unwritten data in some way. This would probably also be necessary with a larger buffer b of, say, 4096 bytes.

I see the incomplete writes very seldom, but when they occur it is mostly after only a couple of writes on the socket after connection establishment, when the rate of writes is still very high.

urs
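[For concreteness, the FIFO buffer with a pthread_mutex that the sender threads block on could look roughly like the following. This is a sketch under my own assumptions (fixed capacity, fully blocking put/get, condition variables for flow control), not Urs's actual code; all names except write_to_fifo_buffer/get_from_fifo_buffer are invented.]

```c
#include <pthread.h>
#include <string.h>

#define FIFO_CAP 8192   /* arbitrary capacity for the sketch */

struct fifo {
    char            buf[FIFO_CAP];
    size_t          head, tail, fill;   /* ring indices and byte count */
    pthread_mutex_t lock;
    pthread_cond_t  nonempty, nonfull;
};

void fifo_init(struct fifo *f)
{
    memset(f, 0, sizeof(*f));
    pthread_mutex_init(&f->lock, NULL);
    pthread_cond_init(&f->nonempty, NULL);
    pthread_cond_init(&f->nonfull, NULL);
}

/* Producer side: block until each byte fits, then append it. */
void write_to_fifo_buffer(struct fifo *f, const char *b, size_t len)
{
    pthread_mutex_lock(&f->lock);
    for (size_t i = 0; i < len; i++) {
        while (f->fill == FIFO_CAP)
            pthread_cond_wait(&f->nonfull, &f->lock);
        f->buf[f->head] = b[i];
        f->head = (f->head + 1) % FIFO_CAP;
        f->fill++;
        pthread_cond_signal(&f->nonempty);  /* wake a blocked reader */
    }
    pthread_mutex_unlock(&f->lock);
}

/* Consumer side: block until each byte is available, then remove it.
 * This wait is where the per-client sender thread spends its time
 * once it has caught up with the 48 kbit/s producer. */
void get_from_fifo_buffer(struct fifo *f, char *b, size_t len)
{
    pthread_mutex_lock(&f->lock);
    for (size_t i = 0; i < len; i++) {
        while (f->fill == 0)
            pthread_cond_wait(&f->nonempty, &f->lock);
        b[i] = f->buf[f->tail];
        f->tail = (f->tail + 1) % FIFO_CAP;
        f->fill--;
        pthread_cond_signal(&f->nonfull);   /* wake a blocked writer */
    }
    pthread_mutex_unlock(&f->lock);
}
```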
From: Rick Jones on 11 Jun 2010 21:07

Urs Thuermann <urs(a)isnogud.escape.de> wrote:
> It's a Debian testing Linux with kernel 2.6.32:

I cannot say that I do many tests with netperf writing 256 bytes at a time, but you could try a netperf TCP_STREAM test and/or some "burst mode" TCP_RR tests with 256-byte sends/requests/responses to see if you can make it happen with other code.

> There are millions of these small writes. The first few thousand are
> written very fast, i.e. as fast as the network connection allows; then
> the rate is roughly 24 writes per second, i.e. 48 kbit/s.

The reason I ask is that sending bulk data 256 bytes at a time isn't terribly efficient...

> The size of 256 was selected somewhat arbitrarily; I chose a small
> value to get less bursty behavior from the thread.

Are you also disabling Nagle? That burst of 256-byte sends at the beginning may get chunked-up by a combination of Nagle and TSO. You can see the difference in stack efficiency with netperf TCP_STREAM tests:

    netperf -H <remote> -c -C -l 30 -- -m 256

vs

    netperf -H <remote> -c -C -l 30 -- -m <something much larger>

If you are setting TCP_NODELAY in your application, add a -D option to the end of the netperf command lines.

> But the observed behavior of the write(2) system call means that I
> get incomplete writes and have to change the code to handle the
> unwritten data in some way.
> I see the incomplete writes very seldom, but when they occur it is
> mostly after only a couple of writes on the socket after connection
> establishment, when the rate of writes is still very high.

That would also be when the Linux stack is still "autotuning" the socket buffer size - at least if you haven't made an explicit setsockopt(SO_SNDBUF) call beforehand. Sending that big burst at the beginning will probably result in a rather larger than necessary SO_SNDBUF with autotuning - and similarly for the SO_RCVBUF at the receiver if it recv()s the data as fast as it arrives. If you want explicit socket buffer sizes with netperf, tack on a -s <size> at the end for the local end and a -S <size> for the remote.

rick jones
--
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
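[Rick's two knobs - disabling Nagle and pinning the send buffer before autotuning takes over - translate into setsockopt(2) calls along these lines. A sketch with an invented function name and an arbitrary buffer size, not code from the thread.]

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable Nagle and request a fixed send buffer. Both calls should come
 * before the burst of small writes, and SO_SNDBUF ideally before
 * connect(2), so the kernel never starts autotuning the buffer.
 * Returns 0 on success, -1 on error (errno set by setsockopt). */
int tune_sender(int sock, int sndbuf_bytes)
{
    int one = 1;

    if (setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)) < 0)
        return -1;
    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF,
                   &sndbuf_bytes, sizeof(sndbuf_bytes)) < 0)
        return -1;
    return 0;
}
```

Note that on Linux, getsockopt(SO_SNDBUF) reports roughly double the requested value, since the kernel reserves extra space for bookkeeping.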
From: David Schwartz on 11 Jun 2010 22:40

On Jun 11, 2:39 am, Urs Thuermann <u...(a)isnogud.escape.de> wrote:
> I am surprised that sometimes the write system call on the socket
> returns with less than 256 bytes written.
> ...
> Therefore, I expected the write(2) system call to return immediately
> with 256 if the send buffer has enough space, or to block until 256
> bytes can be written to the send buffer and then also return with 256.

What happens when you immediately follow up with a 'write' call for the remaining bytes? Does it succeed? If so, the solution is pretty obvious (though it's not clear why you should need it): just call 'write' again. You should be doing that anyway.

DS
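[David's "just call 'write' again" is conventionally wrapped in a small helper that loops until everything is sent. A generic sketch - the name write_all is mine, not from the thread - which also retries when write(2) is interrupted by a signal before transferring any data:]

```c
#include <errno.h>
#include <unistd.h>

/* Write all len bytes to fd, retrying after short writes and EINTR.
 * Returns len on success, -1 on error (errno set by write). */
ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any byte was written */
            return -1;      /* real error */
        }
        p += n;             /* short write: advance and try the rest */
        left -= (size_t)n;
    }
    return (ssize_t)len;
}
```

With this in place, the sender loop in the earlier post can simply replace its bare write() call and drop the "incomplete write" branch.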