From: Denys Fedorysychenko on 31 Mar 2010 12:10

I have a proxy server with a "loaded" squid. At some point I ran sync,
expecting it to finish in a reasonable time. I waited more than 30
minutes and it was still in "sync". This can be reproduced easily.

Here are some stats and info:

Linux SUPERPROXY 2.6.33.1-build-0051 #16 SMP Wed Mar 31 17:23:28 EEST 2010 i686 GNU/Linux

SUPERPROXY ~ # iostat -k -x -d 30
Linux 2.6.33.1-build-0051 (SUPERPROXY)  03/31/10  _i686_  (4 CPU)

Device: rrqm/s  wrqm/s    r/s     w/s   rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda       0.16    0.01   0.08    0.03    3.62     1.33    88.94     0.15 1389.89  59.15   0.66
sdb       4.14   61.25   6.22   25.55   44.52   347.21    24.66     2.24   70.60   2.36   7.49
sdc       4.37  421.28   9.95   98.31  318.27  2081.95    44.34    20.93  193.21   2.31  24.96
sdd       2.34  339.90   3.97  117.47   95.48  1829.52    31.70     1.73   14.23   8.09  98.20
sde       2.29   71.40   2.34   27.97   22.56   397.81    27.74     2.34   77.34   1.66   5.04
dm-0      0.00    0.00   0.19    0.02    3.48     0.02    32.96     0.05  252.11  28.05   0.60

Device: rrqm/s  wrqm/s    r/s     w/s   rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda       0.00    0.00   0.00    0.00    0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb       0.00   54.67   2.93   26.87   12.27   326.13    22.71     2.19   73.49   1.91   5.68
sdc       0.00  420.50   3.43  110.53  126.40  2127.73    39.56    23.82  209.00   2.06  23.44
sdd       0.00  319.63   2.30  122.03  121.87  1765.87    30.37     1.72   13.83   7.99  99.37
sde       0.00   71.67   0.83   30.63    6.93   409.33    26.46     2.66   84.68   1.51   4.76
dm-0      0.00    0.00   0.00    0.00    0.00     0.00     0.00     0.00    0.00   0.00   0.00

CPU:  8.4% usr  7.7% sys  0.0% nic 50.7% idle 27.7% io  0.6% irq  4.7% sirq
Load average: 5.57 4.82 4.46 2/243 2032

 PID  PPID USER     STAT   VSZ %MEM CPU %CPU COMMAND
1769  1552 squid    R     668m  8.3   3 11.7 /usr/sbin/squid -N
1546  1545 root     R    10800  0.1   2  6.0 /config/globax /config/globax.conf
1549  1548 root     S    43264  0.5   2  1.5 /config/globax /config/globax-dld.conf
1531     2 root     DW       0  0.0   0  0.3 [jbd2/sdd1-8]
1418     1 root     S     2500  0.0   3  0.0 /sbin/syslogd -R 80.83.17.2
1524     2 root     SW       0  0.0   0  0.0 [flush-8:32]
1525     2 root     SW       0  0.0   1  0.0 [jbd2/sdc1-8]
1604     2 root     DW       0  0.0   0  0.0 [flush-8:48]
1537     2 root     SW       0  0.0   1  0.0 [jbd2/sde1-8]
  18     2 root     SW       0  0.0   3  0.0 [events/3]
1545     1 root     S     3576  0.0   1  0.0 /config/globax /config/globax.conf
1548     1 root     S     3576  0.0   0  0.0 /config/globax /config/globax-dld.conf
1918     1 ntp      S     3316  0.0   3  0.0 /usr/sbin/ntpd -s
1919     1 root     S     3268  0.0   3  0.0 /usr/sbin/ntpd -s
   1     0 root     S     2504  0.0   0  0.0 /bin/sh /init trynew trynew trynew trynew
1923  1257 root     S     2504  0.0   1  0.0 /sbin/getty 38400 tty1
1924  1257 root     S     2504  0.0   0  0.0 /sbin/getty 38400 tty2
1927  1257 root     S     2504  0.0   0  0.0 /sbin/getty 38400 tty3
2015  2014 root     S     2504  0.0   1  0.0 -ash
2032  2015 root     R     2504  0.0   3  0.0 top
1584     1 root     S     2500  0.0   1  0.0 /usr/bin/ifplugd -i eth0 -a -r /etc/startup/rc.ifup -t 1 -u 1 -d 1
1592     1 root     S     2500  0.0   1  0.0 /usr/bin/ifplugd -i eth2 -a -r /etc/startup/rc.ifup -t 1 -u 1 -d 1
1587     1 root     S     2500  0.0   1  0.0 /usr/bin/ifplugd -i eth1 -a -r /etc/startup/rc.ifup -t 1 -u 1 -d 1
1595     1 root     S     2500  0.0   1  0.0 /usr/bin/ifplugd -i eth3 -a -r /etc/startup/rc.ifup -t 1 -u 1 -d 1
1257     1 root     S     2500  0.0   0  0.0 init
1420     1 root     S     2500  0.0   1  0.0 /sbin/klogd
1432     1 root     S     2500  0.0   3  0.0 /usr/sbin/telnetd -f /etc/issue.telnet
1552     1 root     S     2500  0.0   1  0.0 /bin/sh /bin/squidloop
1743  1742 root     S     2500  0.0   3  0.0 ash -c gs newkernel
1744  1743 root     S     2500  0.0   0  0.0 /bin/sh /bin/gs newkernel
1753  1744 root     D     2368  0.0   0  0.0 sync

SUPERPROXY ~ # cat /proc/1753/stack
[<c019a93c>] bdi_sched_wait+0x8/0xc
[<c019a807>] wait_on_bit+0x20/0x2c
[<c019a9af>] sync_inodes_sb+0x6f/0x10a
[<c019dd53>] __sync_filesystem+0x28/0x49
[<c019ddf3>] sync_filesystems+0x7f/0xc0
[<c019de7a>] sys_sync+0x1b/0x2d
[<c02f7a25>] syscall_call+0x7/0xb
[<ffffffff>] 0xffffffff

SUPERPROXY ~ # cat /proc/1753/status
Name:   sync
State:  D (disk sleep)
Tgid:   1753
Pid:    1753
PPid:   1744
TracerPid:      0
Uid:    0       0       0       0
Gid:    0       0       0       0
FDSize: 32
Groups: 0
VmPeak:     2368 kB
VmSize:     2368 kB
VmLck:         0 kB
VmHWM:       408 kB
VmRSS:       408 kB
VmData:      192 kB
VmStk:        84 kB
VmExe:       488 kB
VmLib:      1540 kB
VmPTE:        12 kB
Threads:        1
SigQ:   2/127824
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: ffffffffffffffff
CapEff: ffffffffffffffff
CapBnd: ffffffffffffffff
Cpus_allowed:   f
Cpus_allowed_list:      0-3
Mems_allowed:   1
Mems_allowed_list:      0
voluntary_ctxt_switches:        124
nonvoluntary_ctxt_switches:     2

SUPERPROXY ~ # cat /proc/meminfo
MemTotal:        8186788 kB
MemFree:         3758908 kB
Buffers:          373656 kB
Cached:          3122824 kB
SwapCached:            0 kB
Active:          1851000 kB
Inactive:        2152916 kB
Active(anon):     446516 kB
Inactive(anon):    63140 kB
Active(file):    1404484 kB
Inactive(file):  2089776 kB
Unevictable:        2152 kB
Mlocked:               0 kB
HighTotal:       7346120 kB
HighFree:        3720868 kB
LowTotal:         840668 kB
LowFree:           38040 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:            109540 kB
Writeback:          1036 kB
AnonPages:        509648 kB
Mapped:             2320 kB
Shmem:                 8 kB
Slab:             263512 kB
SReclaimable:     129176 kB
SUnreclaim:       134336 kB
KernelStack:         980 kB
PageTables:         1460 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     4093392 kB
Committed_AS:     746808 kB
VmallocTotal:     122880 kB
VmallocUsed:        2860 kB
VmallocChunk:     113308 kB
DirectMap4k:        4088 kB
DirectMap2M:      907264 kB

"Tuning" (which probably is wrong?):

SCHED="cfq"
echo ${SCHED} >/sys/block/sdb/queue/scheduler
echo ${SCHED} >/sys/block/sdc/queue/scheduler
echo ${SCHED} >/sys/block/sdd/queue/scheduler
echo ${SCHED} >/sys/block/sde/queue/scheduler
echo ${SCHED} >/sys/block/sdf/queue/scheduler
echo 0 >/sys/block/sdb/queue/iosched/low_latency
echo 0 >/sys/block/sdc/queue/iosched/low_latency
echo 0 >/sys/block/sdd/queue/iosched/low_latency
echo 0 >/sys/block/sde/queue/iosched/low_latency
echo 0 >/sys/block/sdf/queue/iosched/low_latency
echo 128 >/sys/block/sdb/queue/iosched/quantum
echo 128 >/sys/block/sdc/queue/iosched/quantum
echo 128 >/sys/block/sdd/queue/iosched/quantum
echo 128 >/sys/block/sde/queue/iosched/quantum
echo 128 >/sys/block/sdf/queue/iosched/quantum
sysctl -w vm.dirty_background_bytes=600000000
sysctl -w vm.dirty_bytes=800000000

rootfs on / type rootfs (rw)
proc on /proc type proc (rw,relatime)
sysfs on /sys type sysfs (rw,relatime)
none on /dev/pts type devpts (rw,relatime,mode=600)
none on /proc/bus/usb type usbfs (rw,relatime)
/dev/mapper/root on /mnt/flash type ext3 (rw,noatime,errors=continue,data=writeback)
/dev/loop0 on /lib/modules/2.6.33.1-build-0051 type squashfs (ro,relatime)
none on /sys/kernel/debug type debugfs (rw,relatime)
/dev/sdb2 on /cache1 type ext4 (rw,noatime,nodiratime,barrier=0,journal_async_commit,nobh,data=writeback)
/dev/sdc1 on /cache2 type ext4 (rw,noatime,nodiratime,barrier=0,journal_async_commit,nobh,data=writeback)
/dev/sdd1 on /cache3 type ext4 (rw,noatime,nodiratime,barrier=0,journal_async_commit,nobh,data=writeback)
/dev/sde1 on /cache4 type ext4 (rw,noatime,nodiratime,barrier=0,journal_async_commit,nobh,data=writeback)
/dev/sda1 on /mnt/boot type ext2 (rw,relatime,errors=continue)
From: Dave Chinner on 31 Mar 2010 18:20

On Wed, Mar 31, 2010 at 07:07:31PM +0300, Denys Fedorysychenko wrote:
> I have a proxy server with a "loaded" squid. At some point I ran sync,
> expecting it to finish in a reasonable time. I waited more than 30
> minutes and it was still in "sync". This can be reproduced easily.
>
> Here are some stats and info:
>
> Linux SUPERPROXY 2.6.33.1-build-0051 #16 SMP Wed Mar 31 17:23:28 EEST
> 2010 i686 GNU/Linux
>
> SUPERPROXY ~ # iostat -k -x -d 30
> Linux 2.6.33.1-build-0051 (SUPERPROXY)  03/31/10  _i686_  (4 CPU)
>
> Device: rrqm/s  wrqm/s    r/s     w/s   rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
> sda       0.16    0.01   0.08    0.03    3.62     1.33    88.94     0.15 1389.89  59.15   0.66
> sdb       4.14   61.25   6.22   25.55   44.52   347.21    24.66     2.24   70.60   2.36   7.49
> sdc       4.37  421.28   9.95   98.31  318.27  2081.95    44.34    20.93  193.21   2.31  24.96
> sdd       2.34  339.90   3.97  117.47   95.48  1829.52    31.70     1.73   14.23   8.09  98.20
                                                                             ^^^^   ^^^^^

/dev/sdd is IO bound doing small random writeback IO. A service time
of 8ms implies that it is doing lots of large seeks. If you've got
GBs of data to sync and that's the writeback pattern, then sync will
most definitely take a long, long time.

It may be that ext4 is allocating blocks far apart rather than close
together (as appears to be the case for /dev/sdc), so maybe this is
related to how the filesystems are aging or how full they are...

Cheers,

Dave.
-- 
Dave Chinner
david(a)fromorbit.com
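(A back-of-the-envelope check of that reading, from the figures above rather than from the thread itself: sdd is completing about 121 requests/s (3.97 r/s + 117.47 w/s) with an 8.09 ms service time, and 121 x 8.09 ms is roughly 0.98 s of device time per wall-clock second, matching the reported 98.20% utilisation. sdc moves more data, 2081.95 wkB/s against 1829.52 wkB/s, at a 2.31 ms service time and only ~25% utilisation, which is why sdd's pattern reads as seek-bound.)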
From: Denys Fedorysychenko on 1 Apr 2010 06:50

On Thursday 01 April 2010 01:12:54 Dave Chinner wrote:
> On Wed, Mar 31, 2010 at 07:07:31PM +0300, Denys Fedorysychenko wrote:
> > I have a proxy server with a "loaded" squid. At some point I ran sync,
> > expecting it to finish in a reasonable time. I waited more than 30
> > minutes and it was still in "sync". This can be reproduced easily.
> >
> > [...]
>
> /dev/sdd is IO bound doing small random writeback IO. A service time
> of 8ms implies that it is doing lots of large seeks. If you've got
> GBs of data to sync and that's the writeback pattern, then sync will
> most definitely take a long, long time.
>
> It may be that ext4 is allocating blocks far apart rather than close
> together (as appears to be the case for /dev/sdc), so maybe this is
> related to how the filesystems are aging or how full they are...

That's correct, it is quite a busy cache server.

Well, if I stop squid (the cache), sync finishes fast enough. If I
don't, it takes more than an hour; in fact I left that PC after an hour
and it still had not finished. I don't think that is normal. Probably
sync keeps picking up newly dirtied data and trying to flush that too,
and before it finishes, more data arrives.

Actually, all I need is to sync the config directory. I cannot use
fsync, because the files were opened earlier by other processes, so I
was using sync as a trick for that. Instead I get a stuck process, and
the only fast way to recover the system is to kill the cache process,
so the I/O pumping stops for a while and sync() gets a chance to
finish. Sure, there is also the option of remounting the config
partition read-only, but I would expect plain sync to flush only the
pages that are dirty at the moment it is called.

I will do more tests now and give exact numbers: how much time sync
needs with squid running, and how much if I kill squid shortly after
starting sync.

> Cheers,
> Dave.
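(Denys rules out fsync() above because other processes opened the files, but fsync(2) flushes the inode behind whatever descriptor it is given, so a flushing process can open descriptors of its own. Below is a minimal sketch of that per-directory alternative; it is not from the thread, sync_config_dir() and the /config default are illustrative, and it assumes the entries are regular files the caller may open read-only. On Linux, fsync() on an O_RDONLY descriptor still writes out the inode's dirty data, even if another process dirtied it.)

/* Hypothetical sketch, not from the thread: flush one directory's regular
 * files with fsync(2) instead of a system-wide sync(2). */
#include <dirent.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

static int sync_config_dir(const char *dir)
{
	char path[PATH_MAX];
	struct dirent *de;
	struct stat st;
	DIR *d;
	int fd;

	d = opendir(dir);
	if (!d)
		return -1;
	while ((de = readdir(d)) != NULL) {
		snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
		if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
			continue;	/* skip ".", "..", subdirectories */
		fd = open(path, O_RDONLY);
		if (fd < 0)
			continue;
		fsync(fd);		/* flush this file's data and metadata */
		close(fd);
	}
	closedir(d);
	return 0;
}

int main(int argc, char *argv[])
{
	return sync_config_dir(argc > 1 ? argv[1] : "/config") ? 1 : 0;
}

(Unlike sync(2), this waits only on the named files, so writeback traffic from squid's cache disks cannot keep extending it.)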
From: Dave Chinner on 1 Apr 2010 07:20

On Thu, Apr 01, 2010 at 01:42:42PM +0300, Denys Fedorysychenko wrote:
> That's correct, it is quite a busy cache server.
>
> Well, if I stop squid (the cache), sync finishes fast enough. If I
> don't, it takes more than an hour; in fact I left that PC after an hour
> and it still had not finished. I don't think that is normal. Probably
> sync keeps picking up newly dirtied data and trying to flush that too,
> and before it finishes, more data arrives.
>
> Actually, all I need is to sync the config directory. I cannot use
> fsync, because the files were opened earlier by other processes, so I
> was using sync as a trick for that. Instead I get a stuck process, and
> the only fast way to recover the system is to kill the cache process,
> so the I/O pumping stops for a while and sync() gets a chance to
> finish. Sure, there is also the option of remounting the config
> partition read-only, but I would expect plain sync to flush only the
> pages that are dirty at the moment it is called.
>
> I will do more tests now and give exact numbers: how much time sync
> needs with squid running, and how much if I kill squid shortly after
> starting sync.

Ok. What would be interesting is regular output from /proc/meminfo
to see how the dirty memory is changing over the time the sync is
running....

Cheers,

Dave.
-- 
Dave Chinner
david(a)fromorbit.com
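(That request is easy to automate. A minimal sketch, not from the thread, that prints the writeback-related /proc/meminfo counters every five seconds so their movement can be logged while sync runs; the field names match the meminfo dump earlier in the thread.)

/* Sample the dirty/writeback counters from /proc/meminfo every 5 seconds.
 * Sketch for gathering the data Dave asks for above. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	char line[128];
	FILE *f;

	for (;;) {
		f = fopen("/proc/meminfo", "r");
		if (!f)
			return 1;
		while (fgets(line, sizeof(line), f)) {
			if (strncmp(line, "Dirty:", 6) == 0 ||
			    strncmp(line, "Writeback:", 10) == 0 ||
			    strncmp(line, "NFS_Unstable:", 13) == 0)
				fputs(line, stdout);
		}
		fclose(f);
		putchar('\n');		/* blank line between samples */
		fflush(stdout);
		sleep(5);
	}
}

(Redirect stdout to a file and the samples line up against the sync timeline.)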
From: Jeff Moyer on 1 Apr 2010 16:20
Dave Chinner <david(a)fromorbit.com> writes:

> On Thu, Apr 01, 2010 at 01:42:42PM +0300, Denys Fedorysychenko wrote:
>> That's correct, it is quite a busy cache server.
>>
>> Well, if I stop squid (the cache), sync finishes fast enough. If I
>> don't, it takes more than an hour; in fact I left that PC after an
>> hour and it still had not finished. I don't think that is normal.
>> Probably sync keeps picking up newly dirtied data and trying to flush
>> that too, and before it finishes, more data arrives.
>> [...]
>
> Ok. What would be interesting is regular output from /proc/meminfo
> to see how the dirty memory is changing over the time the sync is
> running....

This sounds familiar: http://lkml.org/lkml/2010/2/12/41

Cheers,
Jeff
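(Denys's guess above, that sync keeps picking up data dirtied after it was issued, is straightforward to exercise. A minimal reproducer sketch, not from either thread; the file name and the 64 MB rewrite window are arbitrary choices. Run it on one of the cache filesystems and time sync from another shell: if sync's runtime scales with how long the writer keeps going rather than with the initially dirty data, the livelock behaviour is confirmed.)

/* Keep redirtying the same 64 MB of a file so that a concurrent sync(2)
 * always finds fresh dirty pages.  Hypothetical reproducer, not from the
 * thread. */
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	char buf[4096];
	off_t limit = 64 << 20;		/* rewrite window: 64 MB */
	int fd;

	fd = open(argc > 1 ? argv[1] : "dirty.dat",
		  O_CREAT | O_WRONLY | O_TRUNC, 0644);
	if (fd < 0)
		return 1;
	memset(buf, 'x', sizeof(buf));
	for (;;) {
		if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
			break;
		if (lseek(fd, 0, SEEK_CUR) >= limit)
			lseek(fd, 0, SEEK_SET);	/* wrap, redirty same pages */
	}
	return 0;
}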