From: Ed W on 14 Jul 2010 19:00

>> Although section 3 of RFC 5681 is a great text, it does not say at all
>> that increasing the initial CWND would lead to fairness issues.
>
> Because that is only one side of the coin; conservatively probing the
> available link capacity in conjunction with n simultaneously probing
> TCP/SCTP/DCCP instances is the other.

So let's define the problem more succinctly:

- New TCP connections are assumed to have no knowledge of current network conditions (bah)
- We desire the connection to consume the maximum amount of bandwidth possible, while staying ever so fractionally under the maximum link bandwidth

> Currently I know of no working link capacity probing approach, without
> active network feedback, for conservatively probing the available link
> capacity with a high CWND. I am curious about any future trends.

Sounds like smarter people than I have played this game, but just to chuck out one idea: how about attacking the assumption that we have no knowledge of network conditions? After all, we have a fair amount of information:

1) Very good information about the size of the link to the first hop (e.g. the modem/network card reported rate).
2) Often a reasonably good idea of the bandwidth to the first "restrictive" router along our default path (i.e. usually there is a pool of high-speed network locally, then more limited connectivity between our network and other networks; we can look at the maximum flows through our network device to outside our subnet and infer an approximate link speed from that).
3) Often moderate-quality information about the size of the link between us and a specific destination IP.

So here goes: the heuristic could be to examine current flows through our interface, use this to offer hints to the remote end during the SYN handshake as to a recommended starting size, and additionally the client side can examine the implied RTT of the SYN/ACK to further fine-tune the initial cwnd.

In practice this could be implemented in other ways, such as examining recent TCP congestion windows and using some heuristic to start "near" those, or remembering congestion windows recently used for popular destinations. We can also benefit the receiver of our data: if we see some app open up 16 HTTP connections to some poor server, then some of those connections will NOT be given a large initial cwnd.

Essentially, perhaps we can refine our initial cwnd heuristic somewhat if we assume better than zero knowledge about the network link?

Out of curiosity, why has it taken so long for active feedback to appear? If every router simply added a hint to the packet as to the max bandwidth it can offer, then we would appear to be able to make massively better decisions on window sizes. Furthermore, routers have the ability to put backpressure on classes of traffic as appropriate. I guess the speed at which ECN has been adopted answers the question of why nothing more exotic has appeared?

>> But for all we know this side discussion about initial CWND settings
>> could have nothing to do with the issue being reported at the start of
>> this thread. :-)

Actually the original question was mine, and it was literally: can I adjust the initial cwnd for users of my very specific satellite network, which has a high RTT?
I believe Stephen Hemminger has recently been kind enough to add the facility to experiment with this to the ip utility, so I am now in a position to go and do some testing - thanks Stephen.

Cheers

Ed W
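As an aside, a rough userspace illustration of the "implied RTT of the SYN/ACK" idea above is simply to time the connect() call and turn an assumed bottleneck bandwidth into a segment count via the bandwidth-delay product. This is only a sketch: the 10 Mbit/s figure, the destination address and the MSS are placeholder assumptions, not anything the kernel reports. (The ip utility knob Ed refers to is presumably the initcwnd parameter of ip route, but check your iproute2 version.)

/* Illustrative sketch only: estimate a plausible initial cwnd from the
 * SYN/ACK round-trip time and an assumed bottleneck bandwidth.  The
 * bandwidth value is a guess supplied by the caller, not something the
 * kernel exposes. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)
{
    const double assumed_bw_bps = 10e6;   /* assumed bottleneck: 10 Mbit/s */
    const int mss = 1460;                 /* typical Ethernet MSS */

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in dst;
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(80);
    inet_pton(AF_INET, "192.0.2.1", &dst.sin_addr);  /* placeholder address */

    struct timeval t0, t1;
    gettimeofday(&t0, NULL);
    if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0) {
        perror("connect");
        return 1;
    }
    gettimeofday(&t1, NULL);

    /* connect() returns once the SYN/ACK has arrived, so this is a
     * reasonable first RTT sample. */
    double rtt = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    double bdp_bytes = assumed_bw_bps / 8.0 * rtt;
    printf("SYN/ACK RTT ~%.1f ms, BDP ~%.0f bytes, ~%d segments\n",
           rtt * 1e3, bdp_bytes, (int)(bdp_bytes / mss));
    close(fd);
    return 0;
}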
From: Ed W on 14 Jul 2010 19:10

> Do you cite "An Argument for Increasing TCP's Initial Congestion Window"?
> People at Google stated that a CWND of 10 seems to be fair in their
> measurements. 10 because the test setup was equipped with a reasonably
> large link capacity? Do they analyse their modification in environments
> with a small BDP (e.g. a multihop MANET setup, ...)? I am curious, but we
> will see what happens if TCPM adopts this.

Well, I personally would shoot for starting from the position of assuming better than zero knowledge about our link and incorporating that into the initial cwnd estimate... We know something about the RTT from the SYN/ACK times and the speed of the local link, we will quickly learn about median window sizes to other destinations, and additionally the kernel has some knowledge of other connections currently in progress. With all that information perhaps we can make a more informed choice than just a hard-coded magic number? (Oh, and let's make the option pluggable so that we can soon have 10 different kernel options...)

Seems like there is evidence that networks are starting to cluster into groups that would benefit from a range of cwnd options (higher/lower) - perhaps there is some way to choose a reasonable heuristic to cluster these and pick a better starting option?

Cheers

Ed W
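To picture the "remember what worked recently for this destination" part of that idea, here is a toy userspace sketch of the bookkeeping: a small cache keyed by destination address that stores the last congestion window observed and hands it back as a hint for the next connection. The table size, hash and aging policy are arbitrary assumptions; the kernel's own per-destination metrics are a separate, more involved mechanism.

/* Toy sketch of the "remember recent windows per destination" idea.
 * A real implementation would live in the kernel's destination metrics;
 * the table size and (non-existent) aging policy here are assumptions. */
#include <stdio.h>
#include <stdint.h>

#define CACHE_SLOTS 256

struct cwnd_hint {
    uint32_t daddr;      /* destination IPv4 address */
    uint32_t cwnd;       /* last congestion window seen, in segments */
};

static struct cwnd_hint cache[CACHE_SLOTS];

static unsigned slot_for(uint32_t daddr)
{
    return ((daddr * 2654435761u) >> 24) % CACHE_SLOTS;  /* simple hash */
}

/* Record the window a closing connection ended up with. */
static void remember_cwnd(uint32_t daddr, uint32_t cwnd)
{
    unsigned s = slot_for(daddr);
    cache[s].daddr = daddr;
    cache[s].cwnd = cwnd;
}

/* Suggest an initial window: the cached value if we have one, else a default. */
static uint32_t suggest_initial_cwnd(uint32_t daddr, uint32_t fallback)
{
    unsigned s = slot_for(daddr);
    if (cache[s].daddr == daddr && cache[s].cwnd)
        return cache[s].cwnd;
    return fallback;
}

int main(void)
{
    uint32_t dst = 0xC0000201;            /* 192.0.2.1, as an example */
    remember_cwnd(dst, 18);
    printf("hint for next connection: %u segments\n",
           suggest_initial_cwnd(dst, 3));
    return 0;
}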
From: Ed W on 14 Jul 2010 19:10

On 15/07/2010 00:01, Hagen Paul Pfeifer wrote:
> It is quite late here so I will quickly write two sentences about ECN: one
> month ago Lars Eggers posted a link on the tcpm mailing list where Google
> (not really sure if it was Google) analysed the deployment of ECN - the
> usage was really low. Search for the PDF, it is quite an interesting one.

I would speculate that this is because there is a big warning on ECN saying that it may cause you to lose customers who can't connect to you... Businesses are driven by needing to support the most common case, not the most optimal (witness the pain of HTML development and needing to consider IE6...)

What would be more useful is for Google to survey how many devices are unable to interoperate with ECN; if that number turned out to be extremely low, and that fact were advertised, then I suspect we might see a mass increase in its deployment. I know I have it turned off on all my servers because I worry more about losing one customer than about improving the experience for all customers...

Cheers

Ed W
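For the kind of interoperability survey Ed suggests, one measurable signal on Linux is whether a given connection actually ended up negotiating ECN, which is visible through TCP_INFO. A minimal sketch, assuming net.ipv4.tcp_ecn is enabled locally and using a placeholder destination address:

/* Sketch: check whether ECN was negotiated on a connected TCP socket.
 * Assumes net.ipv4.tcp_ecn=1 locally; 192.0.2.1:80 is a placeholder. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <arpa/inet.h>

#ifndef TCPI_OPT_ECN
#define TCPI_OPT_ECN 8          /* tcpi_options bit, from linux/tcp.h */
#endif

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in dst;
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(80);
    inet_pton(AF_INET, "192.0.2.1", &dst.sin_addr);

    if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0) {
        perror("connect");
        return 1;
    }

    struct tcp_info info;
    socklen_t len = sizeof(info);
    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) < 0) {
        perror("getsockopt");
        return 1;
    }
    printf("ECN %s negotiated\n",
           (info.tcpi_options & TCPI_OPT_ECN) ? "was" : "was not");
    close(fd);
    return 0;
}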
From: Hagen Paul Pfeifer on 14 Jul 2010 19:10

* Ed W | 2010-07-14 23:52:02 [+0100]:

> Out of curiosity, why has it taken so long for active feedback to
> appear? If every router simply added a hint to the packet as to the
> max bandwidth it can offer then we would appear to be able to make
> massively better decisions on window sizes. Furthermore routers have
> the ability to put backpressure on classes of traffic as appropriate.
> I guess the speed at which ECN has been adopted answers the question
> of why nothing more exotic has appeared?

It is quite late here so I will quickly write two sentences about ECN: one month ago Lars Eggers posted a link on the tcpm mailing list where Google (not really sure if it was Google) analysed the deployment of ECN - the usage was really low. Search for the PDF, it is quite an interesting one.

Hagen
From: Bill Fink on 14 Jul 2010 23:00
On Wed, 14 Jul 2010, David Miller wrote:

> From: Bill Davidsen <davidsen(a)tmr.com>
> Date: Wed, 14 Jul 2010 11:21:15 -0400
>
> > You may have to go into /proc/sys/net/core and crank up the
> > rmem_* settings, depending on your distribution.
>
> You should never, ever, have to touch the various networking sysctl
> values to get good performance in any normal setup. If you do, it's a
> bug, report it so we can fix it.
>
> I cringe every time someone says to do this, so please do me a favor
> and don't spread this further. :-)
>
> For one thing, TCP dynamically adjusts the socket buffer sizes based
> upon the behavior of traffic on the connection.
>
> And the TCP memory limit sysctls (not the core socket ones) are sized
> based upon available memory. They are there to protect you from
> situations such as having so much memory dedicated to socket buffers
> that there is none left to do other things effectively. It's a
> protective limit, rather than a setting meant to increase or improve
> performance. So like the others, leave these alone too.

What's normal? :-)

netem1% cat /proc/version
Linux version 2.6.30.10-105.2.23.fc11.x86_64 (mockbuild(a)x86-01.phx2.fedoraproject.org) (gcc version 4.4.1 20090725 (Red Hat 4.4.1-2) (GCC) ) #1 SMP Thu Feb 11 07:06:34 UTC 2010

Linux TCP autotuning across an 80 ms RTT cross-country network path:

netem1% nuttcp -T10 -i1 192.168.1.18
   14.1875 MB /  1.00 sec =  119.0115 Mbps  0 retrans
  558.0000 MB /  1.00 sec = 4680.7169 Mbps  0 retrans
  872.8750 MB /  1.00 sec = 7322.3527 Mbps  0 retrans
  869.6875 MB /  1.00 sec = 7295.5478 Mbps  0 retrans
  858.4375 MB /  1.00 sec = 7201.0165 Mbps  0 retrans
  857.3750 MB /  1.00 sec = 7192.2116 Mbps  0 retrans
  865.5625 MB /  1.00 sec = 7260.7193 Mbps  0 retrans
  872.3750 MB /  1.00 sec = 7318.2095 Mbps  0 retrans
  862.7500 MB /  1.00 sec = 7237.2571 Mbps  0 retrans
  857.6250 MB /  1.00 sec = 7194.1864 Mbps  0 retrans

 7504.2771 MB / 10.09 sec = 6236.5068 Mbps 11 %TX 25 %RX 0 retrans 80.59 msRTT

Manually specified 100 MB TCP socket buffer on the same path:

netem1% nuttcp -T10 -i1 -w100m 192.168.1.18
  106.8125 MB /  1.00 sec =  895.9598 Mbps  0 retrans
 1092.0625 MB /  1.00 sec = 9160.3254 Mbps  0 retrans
 1111.2500 MB /  1.00 sec = 9322.6424 Mbps  0 retrans
 1115.4375 MB /  1.00 sec = 9356.2569 Mbps  0 retrans
 1116.4375 MB /  1.00 sec = 9365.6937 Mbps  0 retrans
 1115.3125 MB /  1.00 sec = 9356.2749 Mbps  0 retrans
 1121.2500 MB /  1.00 sec = 9405.6233 Mbps  0 retrans
 1125.5625 MB /  1.00 sec = 9441.6949 Mbps  0 retrans
 1130.0000 MB /  1.00 sec = 9478.7479 Mbps  0 retrans
 1139.0625 MB /  1.00 sec = 9555.8559 Mbps  0 retrans

10258.5120 MB / 10.20 sec = 8440.3558 Mbps 15 %TX 40 %RX 0 retrans 80.59 msRTT

The manually selected TCP socket buffer size both ramps up quicker and achieves a much higher steady-state rate.

-Bill
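For reference, the -w100m above corresponds to the application requesting its own socket buffers before the transfer starts. A minimal sketch of that request follows; the 100 MB figure simply mirrors the test, the kernel caps the request at net.core.rmem_max/wmem_max, and explicitly setting SO_RCVBUF disables receive-buffer autotuning for that socket, which is exactly the trade-off being debated here.

/* Sketch: request large socket buffers by hand, roughly what nuttcp -w does.
 * The kernel clamps the request to net.core.{r,w}mem_max and reports back
 * (roughly double) the value it actually allocated; setting SO_RCVBUF also
 * turns off receive-buffer autotuning for this socket. */
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int req = 100 * 1024 * 1024;          /* mirror the -w100m test above */
    socklen_t len = sizeof(req);

    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &req, sizeof(req)) < 0)
        perror("SO_SNDBUF");
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &req, sizeof(req)) < 0)
        perror("SO_RCVBUF");

    int got = 0;
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len);
    printf("receive buffer actually granted: %d bytes\n", got);
    close(fd);
    return 0;
}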