From: Anoop on 31 Mar 2010 20:30

Hey All,

I'm having trouble with NFS write performance.

The NFS server is a 35 TB fileserver with a 10 GbE Myricom card. The
filesystem is ZFS, with the ZIL disabled for the time being.

Local writes to disk run at 595 MBps. The network speed (with proper
TCP tuning and jumbo frames) is 9.8 Gbps (full capacity) in both
directions, benchmarked with nuttcp.

However, when the filesystem is exported over NFS on the 10 GbE card,
NFS write performance drops to 90 MBps, less than 16% of peak. I'm at
a loss on how to tune it.

My benchmarking tool is pretty rudimentary: I'm using "dd" with a 128k
block size to write files.

Any pointers on how to get more out of the NFS system?

Thanks,
-Anoop
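A write test of the kind described would look something like this (the
mount point /mnt/nfs and the ~10 GB file size are illustrative
stand-ins, not details given in the post):

    # write ~10 GB of zeros through the NFS mount in 128k blocks;
    # timing the whole run gives an effective MBps figure
    time dd if=/dev/zero of=/mnt/nfs/testfile bs=128k count=80000

Without an explicit sync the figure is only approximate, but a drop
from 595 MBps locally to 90 MBps over NFS is far too large to be a
measurement artifact.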
From: Rick Jones on 1 Apr 2010 13:37

Anoop <anoop.rajendra(a)gmail.com> wrote:
> I'm having trouble with NFS write performance.
>
> The NFS server is a 35 TB fileserver with a 10 GbE Myricom card. The
> filesystem is ZFS, with the ZIL disabled for the time being.
>
> Local writes to disk run at 595 MBps. The network speed (with proper
> TCP tuning and jumbo frames) is 9.8 Gbps (full capacity) in both
> directions, benchmarked with nuttcp.
>
> However, when the filesystem is exported over NFS on the 10 GbE card,
> NFS write performance drops to 90 MBps, less than 16% of peak. I'm at
> a loss on how to tune it.

NFS is a request/response protocol. It is for that reason, among many
others, that netperf includes request/response tests. A bulk,
unidirectional bandwidth measurement like the one above does not give
you enough of the picture, particularly if there is aggressive
interrupt coalescing/avoidance going on on either side. That is
sometimes done in the name of reducing CPU overhead for bulk transfers,
to enable higher bulk transfer rates, but if not done well it will
trash latency.

    netperf -t TCP_RR -H <remote>

is a good way to see whether such aggressive interrupt coalescing is
going on.

> My benchmarking tool is pretty rudimentary: I'm using "dd" with a 128k
> block size to write files.
>
> Any pointers on how to get more out of the NFS system?

You need to know how many writes your client system will have
outstanding at one time. Then, to see how many you need or want to have
outstanding at one time, I would probably:

    download netperf from http://www.netperf.org/
    unpack the source on the server
    unpack the source on the client

then, in each tree:

    ./configure --enable-burst --prefix=<where you want make to stick it>
    make install

On the server:

    netserver

On the client:

    NO_HDR="-P 1"
    for b in 0 1 2 3   # ...and so on, as far as you care to go
    do
        netperf -H <server> -t TCP_RR -f m -c -C -l 20 $NO_HDR -- \
            -r 32K,256 -D -b $b
        NO_HDR="-P 0"
    done

I took a guess at the write size used on the mount (32K, i.e. 32768
bytes) but did not include the NFS header; if I had, netperf would have
counted it as goodput. I also guessed that the write replies would be
~256 bytes. Adjust as you see fit.

There should probably also be "test-specific" -s and -S options (after
the "--") to set the socket buffer sizes and thus the TCP window size.
I don't know what Sun's NFS stack uses; I would start with a guess of
1M just for grins: "-s 1M -S 1M". Whatever you do, you don't want the
request size (32K) times the value of $b (the number of additional
transactions in flight at one time) to be larger than the socket
buffer; given the way --enable-burst abuses the netperf TCP_RR test,
that will lead to test deadlock :)

happy benchmarking,

rick jones

BTW, -c and -C cause netperf to report local (netperf side) and remote
(netserver side) CPU utilization. Netperf will also calculate a
"service demand", a measure of how much active CPU time was consumed
per unit of work performed; smaller is better. The -l option tells
netperf to run for 20 seconds. The manual is at:

http://www.netperf.org/svn/netperf2/tags/netperf-2.4.5/doc/netperf.html

--
oxymoron n, commuter in a gas-guzzling luxury SUV with an American flag
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
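Putting the suggestions above together, the client-side sweep with the
socket-buffer guesses filled in would look roughly like this (the host
name "nfs-server" is a placeholder, and the 1M buffers and the
32K/256-byte request/reply sizes are the guesses from the post, not
verified values):

    #!/bin/sh
    # sweep the number of additional in-flight 32K "write" transactions;
    # with 1M socket buffers, 16 * 32K = 512K stays safely below the
    # buffer size, avoiding the --enable-burst deadlock mentioned above
    NO_HDR="-P 1"
    for b in 0 1 2 4 8 16
    do
        netperf -H nfs-server -t TCP_RR -f m -c -C -l 20 $NO_HDR -- \
            -s 1M -S 1M -r 32K,256 -D -b $b
        NO_HDR="-P 0"
    done

If the reported throughput keeps climbing as $b grows, that would
suggest the bottleneck is how many writes the NFS client keeps
outstanding, not the network path itself.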