From: J G Miller on 12 Jul 2010 11:32

On Monday, July 12th, 2010 at 10:09:04h -0500, Ignoramus15939 wrote:

> Any ideas what might cause this?

Only a couple of suggestions to consider:

1) What is the disk usage for the NFS mounted disk on Server A.
   If it is getting full, then that could cause a slow down for writing.

2) Is there still a process writing a big file to Server A
   in progress?

3) Have you checked that all of the necessary daemons are running properly?
   You do not indicate if this is NFSv3 (needs statd) or NFSv4
   (needs idmapd), so associated required daemons will be different.
   If Kerberos is involved additional daemons are required.

As to a quick fix, restart the NFS daemons on Server A (in worst case
reboot the machine) when all of the users have gone home and nobody
is reading/writing to Server A.
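
For suggestion 1, the disk usage check can be scripted. What follows is only a minimal sketch, assuming Python 3 is available on Server A and that it is run there against the exported directory; the function name `percent_used` is made up for illustration. The figure may differ slightly from the Use% column of `df`, which leaves root-reserved blocks out of its calculation.

```python
import shutil


def percent_used(path):
    """Return how full the file system holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        # Some pseudo file systems report a size of zero; avoid dividing by it.
        return 0.0
    return 100.0 * usage.used / usage.total
```

Calling `percent_used()` with the directory that Server A exports would give a number comparable to the 50% reported later in the thread.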

From: Ignoramus15939 on 12 Jul 2010 11:44

On 2010-07-12, J G Miller <miller(a)yoyo.ORG> wrote:
> On Monday, July 12th, 2010 at 10:09:04h -0500, Ignoramus15939 wrote:
>
>> Any ideas what might cause this?
>
> Only a couple of suggestions to consider:
>
> 1) What is the disk usage for the NFS mounted disk on Server A.
>    If it is getting full, then that could cause a slow down for writing.

50%

> 2) Is there still a process writing a big file to Server A
>    in progress?

I think not, not that I could find, but I will look. Here's the output
of 'top', which looks weird, considering that load average is 4 and
nothing is really running:

top - 10:42:41 up 484 days, 22:50,  1 user,  load average: 3.78, 3.27, 3.12
Tasks:  91 total,   1 running,  90 sleeping,   0 stopped,   0 zombie
Cpu(s):100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   2059764k total,  2043948k used,    15816k free,   152092k buffers
Swap:   409616k total,       48k used,   409568k free,  1611376k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
    1 root      15   0  6120  688  564 S    0  0.0   0:15.07 init
    2 root      RT   0     0    0    0 S    0  0.0   0:01.21 migration/0
    3 root      34  19     0    0    0 S    0  0.0   0:00.01 ksoftirqd/0
    4 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/0
    5 root      RT   0     0    0    0 S    0  0.0   0:00.16 migration/1
    6 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/1
    7 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/1
    8 root      10  -5     0    0    0 S    0  0.0   0:00.07 events/0
    9 root      10  -5     0    0    0 S    0  0.0   0:00.06 events/1
   10 root      10  -5     0    0    0 S    0  0.0   0:00.00 khelper
   11 root      10  -5     0    0    0 S    0  0.0   0:00.00 kthread
   16 root      10  -5     0    0    0 S    0  0.0   1:45.50 kblockd/0
   17 root      10  -5     0    0    0 S    0  0.0   0:02.04 kblockd/1
   18 root      15  -5     0    0    0 S    0  0.0   0:00.00 kacpid
  150 root      10  -5     0    0    0 S    0  0.0   0:00.00 khubd
  152 root      10  -5     0    0    0 S    0  0.0   0:00.00 kseriod
  203 root      15   0     0    0    0 S    0  0.0   0:10.27 pdflush
  204 root      15   0     0    0    0 S    0  0.0   1:17.88 pdflush
  205 root      10  -5     0    0    0 S    0  0.0  96:37.46 kswapd0
  206 root      15  -5     0    0    0 S    0  0.0   0:00.00 aio/0
  207 root      15  -5     0    0    0 S    0  0.0   0:00.00 aio/1
 1076 root      10  -5     0    0    0 S    0  0.0   1:16.26 kjournald
 1266 root      11  -4 10596  708  328 S    0  0.0   0:00.08 udevd
 1659 root      16  -5     0    0    0 S    0  0.0   0:00.00 kpsmoused
 1879 root      13  -5     0    0    0 S    0  0.0   0:00.00 kmirrord
 1962 daemon    15   0  4820  496  376 S    0  0.0   0:01.41 portmap
 2198 root      15   0  3728  668  508 S    0  0.0  22:52.14 syslogd
 2204 root      15   0  2660  396  308 S    0  0.0   0:00.00 klogd
 2279 root      18   0  2652  584  476 S    0  0.0   0:00.00 acpid
 2322 Debian-e  15   0 23324 1116  736 S    0  0.1   0:00.24 exim4
linux-nfs(a)vger.kernel.org

> 3) Have you checked that all of the necessary daemons are running properly?
>    You do not indicate if this is NFSv3 (needs statd) or NFSv4
>    (needs idmapd), so associated required daemons will be different.
>    If Kerberos is involved additional daemons are required.

All fstab lines say 'nfs', it may be nfsv4.

> As to a quick fix, restart the NFS daemons on Server A (in worst case
> reboot the machine) when all of the users have gone home and nobody
> is reading/writing to Server A.

Yes, that's the plan for later today, restart nfs daemon on A and see
if it helps.

Thanks JG.

i
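
An fstab type of 'nfs' does not by itself settle which protocol version is in use. On a Linux client, the kernel's own mount table in /proc/mounts usually records it, either as a file system type of `nfs4` or as a `vers=` entry among the mount options. The following is a rough sketch of reading that table, assuming a Linux client with Python 3; the function name `nfs_mounts` and the sample host and paths are invented for illustration.

```python
def nfs_mounts(mounts_text):
    """Return (device, mountpoint, version) for every NFS entry.

    `mounts_text` is expected in /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    results = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or malformed line
        device, mountpoint, fstype, options = fields[:4]
        if fstype not in ("nfs", "nfs4"):
            continue
        # Type "nfs4" implies version 4; otherwise look for an explicit option.
        version = "4" if fstype == "nfs4" else "unknown"
        for opt in options.split(","):
            if opt.startswith(("vers=", "nfsvers=")):
                version = opt.split("=", 1)[1]
        results.append((device, mountpoint, version))
    return results


# Hypothetical sample lines, not taken from the machines in this thread.
SAMPLE = """\
/dev/sda1 / ext3 rw,errors=remount-ro 0 0
serverA:/export /mnt/a nfs rw,vers=3,proto=tcp 0 0
serverA:/ /mnt/b nfs4 rw,proto=tcp 0 0
"""
```

On a real client the input would come from `open("/proc/mounts").read()` rather than from `SAMPLE`. A result of "unknown" only means the options did not name a version.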

From: Hadron on 12 Jul 2010 11:50

J G Miller <miller(a)yoyo.ORG> writes:

> On Monday, July 12th, 2010 at 10:09:04h -0500, Ignoramus15939 wrote:
>
>> Any ideas what might cause this?
>
> Only a couple of suggestions to consider:
>
> 1) What is the disk usage for the NFS mounted disk on Server A.
>    If it is getting full, then that could cause a slow down for writing.
>
> 2) Is there still a process writing a big file to Server A
>    in progress?
>
> 3) Have you checked that all of the necessary daemons are running properly?
>    You do not indicate if this is NFSv3 (needs statd) or NFSv4
>    (needs idmapd), so associated required daemons will be different.
>    If Kerberos is involved additional daemons are required.
>
> As to a quick fix, restart the NFS daemons on Server A (in worst case
> reboot the machine) when all of the users have gone home and nobody
> is reading/writing to Server A.

Compare the /etc/hosts file on both and, in addition, check they are
using the same DNS.

From: Stan Bischof on 12 Jul 2010 12:00

In comp.os.linux.misc Ignoramus15939 <ignoramus15939(a)nospam.15939.invalid> wrote:
> I think not, not that I could find, but I will look. Here's the output
> of 'top', which looks weird, considering that load average is 4 and
> nothing is really running:
>
> top - 10:42:41 up 484 days, 22:50, 1 user, load average: 3.78, 3.27, 3.12

That is a little weird. I've seen nfs go wild and start spawning
daemons til the system chokes, and I've seen processes hung in IO
that suck down CPU time.

You might look to see how many NFS daemons you have running.

In any case sounds like time to restart NFS- and any other process(es)
that could be hung.

Stan

From: Ignoramus20495 on 12 Jul 2010 13:47

On 2010-07-12, Stan Bischof <stan(a)newserve.worldbadminton.com> wrote:
> In comp.os.linux.misc Ignoramus15939 <ignoramus15939(a)nospam.15939.invalid> wrote:
>> I think not, not that I could find, but I will look. Here's the output
>> of 'top', which looks weird, considering that load average is 4 and
>> nothing is really running:
>>
>> top - 10:42:41 up 484 days, 22:50, 1 user, load average: 3.78, 3.27, 3.12
>
> That is a little weird. I've seen nfs go wild and start spawning
> daemons til the system chokes, and I've seen processes hung in IO
> that suck down CPU time.
>
> You might look to see how many NFS daemons you have running.
>
> In any case sounds like time to restart NFS- and any other process(es)
> that could be hung.

Server A:~# ps auxw | grep nfsd
root      2998  0.0  0.0      0     0 ?        S<    2009    0:00 [nfsd4]
root      2999  0.1  0.0      0     0 ?        S     2009 1317:25 [nfsd]
root      3000  0.1  0.0      0     0 ?        S     2009 1311:14 [nfsd]
root      3001  0.1  0.0      0     0 ?        S     2009 1299:59 [nfsd]
root      3002  0.1  0.0      0     0 ?        S     2009 1306:12 [nfsd]
root      3003  0.1  0.0      0     0 ?        S     2009 1305:07 [nfsd]
root      3004  0.1  0.0      0     0 ?        S     2009 1302:03 [nfsd]
root      3005  0.1  0.0      0     0 ?        D     2009 1287:22 [nfsd]
root      3006  0.1  0.0      0     0 ?        S     2009 1296:57 [nfsd]
root     25666  0.0  0.0   3936   716 pts/0    S+   12:24    0:00 grep nfsd
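
One detail in the listing above bears on the load average question: PID 3005 is in state D (uninterruptible sleep, normally waiting on I/O). Linux counts tasks in that state toward the load average along with runnable ones, so hung I/O can raise the load while the process list looks idle. A single D-state thread would not by itself account for a load near 4, so this is a pointer for further checking, not a diagnosis. The sketch below picks such processes out of `ps auxw` output; it assumes Python 3 and the procps column layout shown above, and the function name `uninterruptible` is invented for illustration.

```python
def uninterruptible(ps_text):
    """Return (pid, command) for each process whose STAT starts with 'D'.

    Assumes `ps auxw` columns:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    hung = []
    for line in ps_text.splitlines():
        # Split into at most 11 fields so COMMAND keeps its embedded spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header, blank, or malformed line
        pid, stat, command = fields[1], fields[7], fields[10]
        if stat.startswith("D"):
            hung.append((int(pid), command))
    return hung


# Three rows copied from the ps output posted above.
SAMPLE = """\
root 2998 0.0 0.0 0 0 ? S< 2009 0:00 [nfsd4]
root 3005 0.1 0.0 0 0 ? D 2009 1287:22 [nfsd]
root 25666 0.0 0.0 3936 716 pts/0 S+ 12:24 0:00 grep nfsd
"""
```

Feeding it the full output of `ps auxw`, without the `grep nfsd` filter, would also show whether any non-NFS process is stuck in the same state.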