From: nash_rack1 on 21 Aug 2006 16:56

We have this problem on our database machine where the load average shown by 'top' goes very high ( > 100) at random times and the database becomes really slow. We're trying to find out which process could be causing this load. For some reason, top does not show any processes that could be suspects. It shows only 2 running processes using some CPU. Other processes are not using any CPU. How can we find out what is causing the load average to be so high?

11:18:06 up 163 days, 5:56, 7 users, load average: 101.27, 102.72, 69.93
Tasks: 443 total, 2 running, 441 sleeping, 0 stopped, 0 zombie
Cpu(s): 7.3% us, 1.4% sy, 0.0% ni, 88.3% id, 3.0% wa, 0.1% hi, 0.0% si
Mem: 16634608k total, 15611264k used, 1023344k free, 3260k buffers
Swap: 32796688k total, 119724k used, 32676964k free, 14511340k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+  COMMAND
28419 oracle    16   0 1493m 1.1g 1.1g R 55.3  6.9  19:59.11  oracle
22906 oracle    15   0 1494m 1.2g 1.2g D 17.2  7.4   0:39.87  oracle
31900 dba01     15   0  2156 1096  692 R  5.7  0.0   0:00.04  top
    8 root      RT   0     0    0    0 S  1.9  0.0   0:37.66  migration/3
 3198 root      15   0     0    0    0 S  1.9  0.0 196:38.27  vxiod
    1 root      16   0  2760  608  520 S  0.0  0.0   1:50.83  init
    2 root      RT   0     0    0    0 S  0.0  0.0   0:24.24  migration/0
    3 root      34  19     0    0    0 S  0.0  0.0   1:00.85  ksoftirqd/0
    4 root      RT   0     0    0    0 S  0.0  0.0   0:34.51  migration/1
    5 root      34  19     0    0    0 S  0.0  0.0   1:13.25  ksoftirqd/1
    6 root      RT   0     0    0    0 S  0.0  0.0   0:25.12  migration/2
    7 root      34  19     0    0    0 S  0.0  0.0   1:04.21  ksoftirqd/2
    9 root      34  19     0    0    0 S  0.0  0.0   1:12.07  ksoftirqd/3
   10 root      RT   0     0    0    0 S  0.0  0.0   0:26.34  migration/4
   11 root      34  19     0    0    0 S  0.0  0.0   1:02.59  ksoftirqd/4
   12 root      RT   0     0    0    0 S  0.0  0.0   0:32.50  migration/5
   13 root      34  19     0    0    0 S  0.0  0.0   1:11.32  ksoftirqd/5
   14 root      RT   0     0    0    0 S  0.0  0.0   0:32.99  migration/6
   15 root      34  19     0    0    0 S  0.0  0.0   1:04.03  ksoftirqd/6
   16 root      RT   0     0    0    0 S  0.0  0.0   0:33.00  migration/7
   17 root      34  19     0    0    0 S  0.0  0.0   1:08.40  ksoftirqd/7
   18 root       5 -10     0    0    0 S  0.0  0.0   0:00.58  events/0
   19 root       5 -10     0    0    0 S  0.0  0.0   0:00.58  events/1
   20 root       5 -10     0    0    0 S  0.0  0.0   0:00.54  events/2
   21 root       5 -10     0    0    0 S  0.0  0.0   0:00.50  events/3
   22 root       5 -10     0    0    0 S  0.0  0.0   0:00.56  events/4
   23 root       5 -10     0    0    0 S  0.0  0.0   0:00.55  events/5
   24 root       5 -10     0    0    0 S  0.0  0.0   0:00.56  events/6
   25 root       5 -10     0    0    0 S  0.0  0.0   0:00.51  events/7
   26 root       5 -10     0    0    0 S  0.0  0.0   0:00.01  khelper
   27 root      13 -10     0    0    0 S  0.0  0.0   0:00.00  kacpid
  124 root       5 -10     0    0    0 S  0.0  0.0   0:00.00  kblockd/0

$ uname -a
Linux db01 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:54:53 EST 2006 i686 i686 i386 GNU/Linux

This is a 4 CPU machine. If you notice anything unusual in the above output or if there's another command we can use, please let me know.

Thanks,
Nash.

From: Patrick on 21 Aug 2006 17:27

<nash_rack1(a)yahoo.com> wrote in message
news:1156193776.060140.111970(a)m79g2000cwm.googlegroups.com
> We have this problem on our database machine where the load average
> shown by 'top' goes very high ( > 100) at random times and the
> database becomes really slow. We're trying to find out which process
> could be causing this load.
...
> 11:18:06 up 163 days, 5:56, 7 users, load average: 101.27, 102.72,
> 69.93
> Tasks: 443 total, 2 running, 441 sleeping, 0 stopped, 0 zombie
....
> If you notice anything unusual in the above output or if there's
> another command we can use, please let me know.

What are those 441 sleeping processes, and why are they sleeping? Waiting for disk I/O or interprocess communications?

From the man page: "The load averages are the average number of processes ready to run during the last 1, 5 and 15 minutes."

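A note on the man page wording quoted above: on Linux the load average counts tasks in uninterruptible sleep (state D in top, typically waiting on disk I/O) as well as tasks that are ready to run. That is one way the load can sit above 100 while the CPUs are mostly idle, although the output posted here does not by itself prove that is what happened. The kernel exposes the same three averages in /proc/loadavg. The following is a minimal Python sketch, not something posted in the thread, that parses that file. The function names are made up for illustration, and the sample line in the comment is assembled from the figures in the top output above, not captured from the machine.

```python
import os


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dict.

    The file holds five whitespace-separated fields, for example
    (illustrative line built from the thread's numbers):
        101.27 102.72 69.93 2/443 31900
    """
    fields = text.split()
    runnable, total = fields[3].split("/")
    return {
        "load_1": float(fields[0]),
        "load_5": float(fields[1]),
        "load_15": float(fields[2]),
        # Currently runnable / total kernel scheduling entities.
        "runnable_entities": int(runnable),
        "total_entities": int(total),
        "last_pid": int(fields[4]),
    }


def load_per_cpu(load, cpu_count):
    """Normalise a load figure by the number of CPUs."""
    return load / cpu_count


if __name__ == "__main__":
    # Only meaningful on a system that provides /proc/loadavg.
    if os.path.exists("/proc/loadavg"):
        with open("/proc/loadavg") as f:
            info = parse_loadavg(f.read())
        cpus = os.cpu_count() or 1
        print(info)
        print("1-minute load per CPU:", load_per_cpu(info["load_1"], cpus))
```

Dividing by the CPU count gives a rough sense of scale: a 1-minute load of 101.27 on a 4 CPU machine is about 25 tasks per CPU either running, waiting to run, or blocked in state D.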
From: The Natural Philosopher on 21 Aug 2006 19:50

Patrick wrote:
> <nash_rack1(a)yahoo.com> wrote in message
> news:1156193776.060140.111970(a)m79g2000cwm.googlegroups.com
>
>> We have this problem on our database machine where the load average
>> shown by 'top' goes very high ( > 100) at random times and the
>> database becomes really slow. We're trying to find out which process
>> could be causing this load.
...
>> 11:18:06 up 163 days, 5:56, 7 users, load average: 101.27, 102.72,
>> 69.93
>> Tasks: 443 total, 2 running, 441 sleeping, 0 stopped, 0 zombie
> ...
>> If you notice anything unusual in the above output or if there's
>> another command we can use, please let me know.
>
> What are those 441 sleeping processes, and why are they sleeping? Waiting
> for disk I/O or interprocess communications?

Almost certainly. I ran some DB stuff on SCO unix once, and it beat the hell out of it...we increased file, inodes, file names and directory caching by an order of about a thousand, and it went much better...;-)

No idea whether modern Linux needs that or not, or if it's possible..generally it was then a kernel boot time option.

>
> From the man page: "The load averages are the average number of processes
> ready to run during the last 1, 5 and 15 minutes."
>

From: nash_rack1 on 31 Aug 2006 16:51

Most of these sleeping processes are the processes that Oracle creates for pooled database connections. I believe they're sleeping because the application is not doing any SQL activity on those connections.

What other commands can I use to find the root cause of this high load?

Thanks,
Nash.

Patrick wrote:
> <nash_rack1(a)yahoo.com> wrote in message
> news:1156193776.060140.111970(a)m79g2000cwm.googlegroups.com
>
> > We have this problem on our database machine where the load average
> > shown by 'top' goes very high ( > 100) at random times and the
> > database becomes really slow. We're trying to find out which process
> > could be causing this load.
...
> > 11:18:06 up 163 days, 5:56, 7 users, load average: 101.27, 102.72,
> > 69.93
> > Tasks: 443 total, 2 running, 441 sleeping, 0 stopped, 0 zombie
> ...
> > If you notice anything unusual in the above output or if there's
> > another command we can use, please let me know.
>
> What are those 441 sleeping processes, and why are they sleeping? Waiting
> for disk I/O or interprocess communications?
>
> From the man page: "The load averages are the average number of processes
> ready to run during the last 1, 5 and 15 minutes."
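
On the question of other commands: because the Linux load average includes tasks in state D (uninterruptible sleep) as well as state R (runnable), one place to look is the list of tasks that are in either state while the load is high. vmstat reports the number of processes in uninterruptible sleep in its "b" column. The sketch below does roughly the same thing per process by reading /proc/<pid>/stat. It is written in Python with hypothetical function names, assumes the Linux /proc layout, and was not posted by anyone in the thread. It only inspects the main thread of each process; individual threads live under /proc/<pid>/task/ and are not scanned here.

```python
import os

# Task states that contribute to the Linux load average.
LOAD_STATES = {"R", "D"}


def parse_stat_line(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the first "(" and the last ")".
    """
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def tasks_counted_in_load(proc_root="/proc"):
    """List (pid, comm, state) for processes in state R or D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the line could not be parsed
        if state in LOAD_STATES:
            found.append((pid, comm, state))
    return sorted(found)


if __name__ == "__main__":
    # Only meaningful on a system that provides /proc.
    if os.path.isdir("/proc"):
        for pid, comm, state in tasks_counted_in_load():
            print(f"{pid:>7} {state} {comm}")
```

Running something like this repeatedly during a spike and noting which commands keep showing up in state D would narrow down what the blocked tasks have in common. In the top output earlier in the thread, for instance, one of the two busy oracle processes (PID 22906) is already shown in state D.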