From: Zeno Davatz on 14 Jul 2010 02:20

Hi

I got a new Intel core-8 i7 processor.

I am on kernel uname -a

Linux zenogentoo 2.6.35-rc5 #97 SMP Tue Jul 13 16:13:25 CEST 2010 i686
Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz GenuineIntel GNU/Linux

Sometimes in the middle of nowhere all of a sudden all of my 8 cores
are at 100% CPU usage and my machine really lags and hangs and is not
usable anymore. Some random process just grabs a bunch of CPUs according
to htop.

dmesg tells me that

kmemleak: 38 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

I am attaching the file from /sys/kernel/debug/kmemleak

Let me know if you need anything else.

Best
Zeno
From: Pekka Enberg on 14 Jul 2010 04:10

On Wed, Jul 14, 2010 at 9:12 AM, Zeno Davatz <zdavatz(a)gmail.com> wrote:
> I got a new Intel core-8 i7 processor.
>
> I am on kernel uname -a
>
> Linux zenogentoo 2.6.35-rc5 #97 SMP Tue Jul 13 16:13:25 CEST 2010 i686
> Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz GenuineIntel GNU/Linux
>
> Sometimes in the middle of nowhere all of a sudden all of my 8 cores
> are at 100% CPU usage and my machine really lags and hangs and is not
> usable anymore. Some random process just grabs a bunch of CPUs according
> to htop.

Why did you enable CONFIG_DEBUG_KMEMLEAK? Memory leak scanning is
likely the source of these pauses.

> dmesg tells me that
>
> kmemleak: 38 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
> kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
> kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
> kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
> kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
> kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
>
> I am attaching the file from /sys/kernel/debug/kmemleak

Zeno, can you post your dmesg and .config, please?

We have a bunch of suspected leaks here. The first class of leaks is
related to reserve_region():

unreferenced object 0xf6d80740 (size 64):
  comm "swapper", pid 1, jiffies 4294892590 (age 57258.752s)
  hex dump (first 32 bytes):
    00 00 ee c7 00 00 00 00 ff b7 ee c7 00 00 00 00  ................
    7c 09 52 c1 00 00 00 80 00 f2 5e c1 20 ac 6f c1  |.R.......^. .o.
  backtrace:
    [<c145d4eb>] kmemleak_alloc+0x27/0x4d
    [<c10ad53f>] kmem_cache_alloc+0xa3/0xd4
    [<c163b782>] __reserve_region_with_split+0x29/0x149
    [<c163b86a>] __reserve_region_with_split+0x111/0x149
    [<c163b89a>] __reserve_region_with_split+0x141/0x149
    [<c163b89a>] __reserve_region_with_split+0x141/0x149
    [<c163b89a>] __reserve_region_with_split+0x141/0x149
    [<c163b8de>] reserve_region_with_split+0x3c/0x4f
    [<c162e307>] e820_reserve_resources_late+0xea/0x108
    [<c16504e6>] pcibios_resource_survey+0x23/0x2a
    [<c1652022>] pcibios_init+0x61/0x73
    [<c165172b>] pci_subsys_init+0x43/0x48
    [<c1001114>] do_one_initcall+0x27/0x178
    [<c162b357>] kernel_init+0x129/0x1c7
    [<c10238b6>] kernel_thread_helper+0x6/0x10
    [<ffffffff>] 0xffffffff
unreferenced object 0xf6d232a0 (size 32):
  comm "swapper", pid 1, jiffies 4294892601 (age 57258.708s)
  hex dump (first 32 bytes):
    70 6e 70 20 30 30 3a 30 31 00 d2 f6 fa 00 0b c1  pnp 00:01.......
    00 00 00 00 04 aa dc f6 2c 00 00 00 01 00 00 00  ........,.......
  backtrace:
    [<c145d4eb>] kmemleak_alloc+0x27/0x4d
    [<c10ad53f>] kmem_cache_alloc+0xa3/0xd4
    [<c123040b>] reserve_range+0x3b/0x13f
    [<c1230597>] system_pnp_probe+0x88/0xb0
    [<c122b0f7>] pnp_device_probe+0x67/0xaf
    [<c12d5246>] driver_probe_device+0x5b/0x148
    [<c12d539a>] __driver_attach+0x67/0x69
    [<c12d4c33>] bus_for_each_dev+0x46/0x64
    [<c12d512c>] driver_attach+0x19/0x1b
    [<c12d46f5>] bus_add_driver+0x17a/0x225
    [<c12d55b8>] driver_register+0x65/0x110
    [<c122af44>] pnp_register_driver+0x17/0x19
    [<c1647a91>] pnp_system_init+0xd/0xf
    [<c1001114>] do_one_initcall+0x27/0x178
    [<c162b357>] kernel_init+0x129/0x1c7
    [<c10238b6>] kernel_thread_helper+0x6/0x10

I scanned through both call sites briefly but didn't find anything obvious.

The second class of leaks seems to be related to kobjects:

unreferenced object 0xf6951920 (size 32):
  comm "swapper", pid 1, jiffies 4294892614 (age 57258.656s)
  hex dump (first 32 bytes):
    63 70 75 69 64 6c 65 00 2f 76 69 72 74 75 61 6c  cpuidle./virtual
    2f 67 72 61 70 68 69 63 73 2f 66 62 63 6f 6e 00  /graphics/fbcon.
  backtrace:
    [<c11e33c6>] kvasprintf+0x2a/0x47
    [<c11db5d7>] kobject_set_name_vargs+0x17/0x52
    [<c11db629>] kobject_add_varg+0x17/0x41
    [<c11db67a>] kobject_init_and_add+0x27/0x2d
    [<c1389b0c>] cpuidle_add_sysfs+0x3e/0x56
    [<c138944e>] __cpuidle_register_device+0xfb/0x116
    [<c13895fc>] cpuidle_register_device+0x18/0x54
    [<c1645397>] intel_idle_init+0x2b9/0x327
    [<c1001114>] do_one_initcall+0x27/0x178
    [<c162b357>] kernel_init+0x129/0x1c7
    [<c10238b6>] kernel_thread_helper+0x6/0x10
    [<ffffffff>] 0xffffffff
unreferenced object 0xf60045c0 (size 32):
  comm "swapper", pid 1, jiffies 4294893885 (age 57253.572s)
  hex dump (first 32 bytes):
    30 00 64 4b bc a3 bc a3 80 f5 80 f5 a7 15 a7 15  0.dK............
    34 07 34 07 69 4f 69 4f f4 47 f4 47 ef 27 ef 27  4.4.iOiO.G.G.'.'
  backtrace:
    [<c145d4eb>] kmemleak_alloc+0x27/0x4d
    [<c10adb0c>] __kmalloc+0xd4/0x10d
    [<c11e33c6>] kvasprintf+0x2a/0x47
    [<c11db5d7>] kobject_set_name_vargs+0x17/0x52
    [<c11db629>] kobject_add_varg+0x17/0x41
    [<c11db6ac>] kobject_add+0x2c/0x54
    [<c138ad14>] add_sysfs_fw_map_entry+0x43/0x7c
    [<c164f00f>] memmap_init+0x16/0x30
    [<c1001114>] do_one_initcall+0x27/0x178
    [<c162b357>] kernel_init+0x129/0x1c7
    [<c10238b6>] kernel_thread_helper+0x6/0x10
    [<ffffffff>] 0xffffffff

The third class of leaks is related to drm_setversion():

unreferenced object 0xf6b10620 (size 32):
  comm "X", pid 2268, jiffies 4294894722 (age 57250.228s)
  hex dump (first 32 bytes):
    6e 6f 75 76 65 61 75 40 70 63 69 3a 30 30 30 30  nouveau@pci:0000
    3a 30 35 3a 30 30 2e 30 00 00 00 00 00 00 00 00  :05:00.0........
  backtrace:
    [<c145d4eb>] kmemleak_alloc+0x27/0x4d
    [<c10adb0c>] __kmalloc+0xd4/0x10d
    [<c125315e>] drm_setversion+0x140/0x1bf
    [<c12514f2>] drm_ioctl+0x258/0x3d7
    [<c10bdd42>] vfs_ioctl+0x27/0x9b
    [<c10bdee2>] do_vfs_ioctl+0x66/0x54b
    [<c10be3fa>] sys_ioctl+0x33/0x4f
    [<c102339c>] sysenter_do_call+0x12/0x2c
    [<ffffffff>] 0xffffffff

for which I wasn't able to find the allocation call-site. Maybe Zeno
has some out-of-tree DRM module?

The fourth class of leaks is related to per-CPU allocations in the
block layer:

unreferenced object 0xf6681400 (size 1024):
  comm "async/2", pid 1307, jiffies 4294894138 (age 57252.564s)
  hex dump (first 32 bytes):
    80 87 ff ff c4 ff ff ff c4 ff ff ff c4 ff ff ff  ................
    fc ff ff ff fc ff ff ff fc ff ff ff fc ff ff ff  ................
  backtrace:
    [<c145d4eb>] kmemleak_alloc+0x27/0x4d
    [<c10adb0c>] __kmalloc+0xd4/0x10d
    [<c10ae982>] pcpu_mem_alloc+0x18/0x3a
    [<c10af239>] pcpu_extend_area_map+0x1a/0xad
    [<c10af578>] pcpu_alloc+0x2ac/0x82b
    [<c10afb10>] __alloc_percpu+0xa/0xc
    [<c11d4518>] alloc_disk_node+0x2e/0xbf
    [<c11d45b6>] alloc_disk+0xd/0xf
    [<c130260c>] sd_probe+0x54/0x298
    [<c12d5246>] driver_probe_device+0x5b/0x148
    [<c12d53ca>] __device_attach+0x2e/0x32
    [<c12d49f3>] bus_for_each_drv+0x46/0x64
    [<c12d5449>] device_attach+0x5c/0x60
    [<c12d484d>] bus_probe_device+0x1a/0x30
    [<c12d358a>] device_add+0x448/0x509
    [<c12fb881>] scsi_sysfs_add_sdev+0x54/0x212

for which I didn't find anything obvious that could explain it.

I suspect most of the reports are false positives. Catalin, what do you
make of them?

			Pekka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
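Archive note: the grouping into "classes" above was done by hand. A rough way to approximate it is to bucket each report by the first backtrace frame that is not part of the allocator. The Python sketch below assumes the report layout shown in this thread; the skip list, the function name group_reports and the abbreviated SAMPLE text are illustrative choices, not anything Pekka or kmemleak itself provides.

```python
import re
from collections import defaultdict

# Frames that belong to the allocator and say nothing about the caller.
# Assumption: this skip list is only tuned to the traces quoted in this thread.
ALLOCATOR_FRAMES = {"kmemleak_alloc", "kmem_cache_alloc", "__kmalloc"}

OBJECT_RE = re.compile(r"unreferenced object (0x[0-9a-f]+) \(size (\d+)\):")
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] ([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def group_reports(text):
    """Map the first non-allocator frame to a list of (address, size) pairs."""
    groups = defaultdict(list)
    headers = list(OBJECT_RE.finditer(text))
    for i, header in enumerate(headers):
        # A report runs from its header to the start of the next header.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        frames = FRAME_RE.findall(text[header.end():end])
        caller = next((f for f in frames if f not in ALLOCATOR_FRAMES), "<unknown>")
        groups[caller].append((header.group(1), int(header.group(2))))
    return dict(groups)


# Two reports from this thread, abbreviated to their first three frames.
SAMPLE = """\
unreferenced object 0xf6d80740 (size 64):
  comm "swapper", pid 1, jiffies 4294892590 (age 57258.752s)
  backtrace:
    [<c145d4eb>] kmemleak_alloc+0x27/0x4d
    [<c10ad53f>] kmem_cache_alloc+0xa3/0xd4
    [<c163b782>] __reserve_region_with_split+0x29/0x149
unreferenced object 0xf6b10620 (size 32):
  comm "X", pid 2268, jiffies 4294894722 (age 57250.228s)
  backtrace:
    [<c145d4eb>] kmemleak_alloc+0x27/0x4d
    [<c10adb0c>] __kmalloc+0xd4/0x10d
    [<c125315e>] drm_setversion+0x140/0x1bf
"""

for caller, objects in group_reports(SAMPLE).items():
    print(caller, objects)
```

The buckets are coarser than a human reading: both kobject reports above would land under kvasprintf, while the two reserve_region() reports would be split between __reserve_region_with_split and reserve_range.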
From: Zeno Davatz on 14 Jul 2010 04:30

Dear Pekka

On Wed, Jul 14, 2010 at 10:05 AM, Pekka Enberg <penberg(a)cs.helsinki.fi> wrote:
> On Wed, Jul 14, 2010 at 9:12 AM, Zeno Davatz <zdavatz(a)gmail.com> wrote:
>> I got a new Intel core-8 i7 processor.
>>
>> I am on kernel uname -a
>>
>> Linux zenogentoo 2.6.35-rc5 #97 SMP Tue Jul 13 16:13:25 CEST 2010 i686
>> Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz GenuineIntel GNU/Linux
>>
>> Sometimes in the middle of nowhere all of a sudden all of my 8 cores
>> are at 100% CPU usage and my machine really lags and hangs and is not
>> usable anymore. Some random process just grabs a bunch of CPUs according
>> to htop.
>
> Why did you enable CONFIG_DEBUG_KMEMLEAK? Memory leak scanning is
> likely the source of these pauses.

Shall I disable that? I will do that and try again.

>> I am attaching the file from /sys/kernel/debug/kmemleak
>
> Zeno, can you post your dmesg and .config, please?

Sure, see attached files.

> The third class of leaks is related to drm_setversion():
>
> unreferenced object 0xf6b10620 (size 32):
>   comm "X", pid 2268, jiffies 4294894722 (age 57250.228s)
>   hex dump (first 32 bytes):
>     6e 6f 75 76 65 61 75 40 70 63 69 3a 30 30 30 30  nouveau@pci:0000
>     3a 30 35 3a 30 30 2e 30 00 00 00 00 00 00 00 00  :05:00.0........
>   backtrace:
>     [<c145d4eb>] kmemleak_alloc+0x27/0x4d
>     [<c10adb0c>] __kmalloc+0xd4/0x10d
>     [<c125315e>] drm_setversion+0x140/0x1bf
>     [<c12514f2>] drm_ioctl+0x258/0x3d7
>     [<c10bdd42>] vfs_ioctl+0x27/0x9b
>     [<c10bdee2>] do_vfs_ioctl+0x66/0x54b
>     [<c10be3fa>] sys_ioctl+0x33/0x4f
>     [<c102339c>] sysenter_do_call+0x12/0x2c
>     [<ffffffff>] 0xffffffff
>
> for which I wasn't able to find the allocation call-site. Maybe Zeno
> has some out-of-tree DRM module?

I am using the nouveau driver in the kernel, as I have an Nvidia graphics card.

05:00.0 VGA compatible controller: nVidia Corporation G98 [GeForce 8400 GS] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: ASUSTeK Computer Inc. Device 8321
	Flags: bus master, fast devsel, latency 0, IRQ 16
	Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
	Memory at d0000000 (64-bit, prefetchable) [size=256M]
	Memory at f8000000 (64-bit, non-prefetchable) [size=32M]
	I/O ports at ec00 [size=128]
	[virtual] Expansion ROM at fb000000 [disabled] [size=128K]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Endpoint, MSI 00
	Capabilities: [100] Virtual Channel <?>
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [600] Vendor Specific Information <?>
	Kernel driver in use: nouveau

Best
Zeno
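Archive note: the scanning hypothesis can also be checked without rebuilding the kernel, because kmemleak accepts control commands written to its debugfs file. The Python sketch below is a minimal illustration, assuming the commands listed in the kernel's kmemleak documentation (scan, clear, off, scan=on, scan=off, scan=<secs>) and the debugfs path printed in the dmesg lines above. The helper names kmemleak_command and is_valid_command are hypothetical, and writing to the real file needs root.

```python
from pathlib import Path

# Debugfs location as printed in the dmesg lines quoted in this thread.
KMEMLEAK_PATH = Path("/sys/kernel/debug/kmemleak")

# Fixed commands as described in the kernel's kmemleak documentation.
# Assumption: the running kernel supports all of them.
SIMPLE_COMMANDS = {"scan", "clear", "off", "scan=on", "scan=off"}


def is_valid_command(command):
    """Accept the fixed commands plus scan=<secs> with a non-negative integer."""
    if command in SIMPLE_COMMANDS:
        return True
    prefix, _, value = command.partition("=")
    return prefix == "scan" and value.isdigit()


def kmemleak_command(command, path=KMEMLEAK_PATH):
    """Write one control command to the kmemleak file (root needed for the real one)."""
    if not is_valid_command(command):
        raise ValueError(f"unsupported kmemleak command: {command!r}")
    Path(path).write_text(command)
```

Calling kmemleak_command("scan=off") as root should stop the automatic scanning thread. If the CPU spikes continue afterwards, scanning is unlikely to be their cause.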
From: Pekka Enberg on 14 Jul 2010 04:40

Zeno Davatz wrote:
> On Wed, Jul 14, 2010 at 10:31 AM, Damien Wyart <damien.wyart(a)free.fr> wrote:
>
>>> On Wed, Jul 14, 2010 at 9:12 AM, Zeno Davatz <zdavatz(a)gmail.com> wrote:
>>>> I got a new Intel core-8 i7 processor.
>>>> I am on kernel uname -a
>>>> Linux zenogentoo 2.6.35-rc5 #97 SMP Tue Jul 13 16:13:25 CEST 2010 i686
>>>> Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz GenuineIntel GNU/Linux
>>>> Sometimes in the middle of nowhere all of a sudden all of my 8 cores
>>>> are at 100% CPU usage and my machine really lags and hangs and is not
>>>> usable anymore. Some random process just grabs a bunch of CPUs according
>>>> to htop.
>> * Pekka Enberg <penberg(a)cs.helsinki.fi> [2010-07-14 11:05]:
>>> Why did you enable CONFIG_DEBUG_KMEMLEAK? Memory leak scanning is
>>> likely the source of these pauses.
>> I am seeing the same problem with a Core i7 920 and 2.6.35-rc5, and I do
>> not have CONFIG_DEBUG_KMEMLEAK enabled, so I think this is not related.
>>
>> I do not see anything special in the logs, just the load becoming mad
>> and almost preventing ssh access. I've been seeing that since the first
>> 2.6.35 rc I tested (-rc2 or -rc3, I don't remember) and I did not have
>> time to report it before but I was surprised nobody else did. No problem
>> with 2.6.34 and 2.6.34.1.
>
> Same with me. The last build I tested was 2.6.34-rc7. No problems
> there. No CPU jumps out of nowhere.
>
> It is like any application all of a sudden uses 400% CPU, i.e. htop.

Interesting. Let's CC some scheduler folks for help.

			Pekka
From: Zeno Davatz on 14 Jul 2010 04:40

On Wed, Jul 14, 2010 at 10:31 AM, Damien Wyart <damien.wyart(a)free.fr> wrote:
>> On Wed, Jul 14, 2010 at 9:12 AM, Zeno Davatz <zdavatz(a)gmail.com> wrote:
>> > I got a new Intel core-8 i7 processor.
>
>> > I am on kernel uname -a
>
>> > Linux zenogentoo 2.6.35-rc5 #97 SMP Tue Jul 13 16:13:25 CEST 2010 i686
>> > Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz GenuineIntel GNU/Linux
>
>> > Sometimes in the middle of nowhere all of a sudden all of my 8 cores
>> > are at 100% CPU usage and my machine really lags and hangs and is not
>> > usable anymore. Some random process just grabs a bunch of CPUs according
>> > to htop.
>
> * Pekka Enberg <penberg(a)cs.helsinki.fi> [2010-07-14 11:05]:
>> Why did you enable CONFIG_DEBUG_KMEMLEAK? Memory leak scanning is
>> likely the source of these pauses.
>
> I am seeing the same problem with a Core i7 920 and 2.6.35-rc5, and I do
> not have CONFIG_DEBUG_KMEMLEAK enabled, so I think this is not related.
>
> I do not see anything special in the logs, just the load becoming mad
> and almost preventing ssh access. I've been seeing that since the first
> 2.6.35 rc I tested (-rc2 or -rc3, I don't remember) and I did not have
> time to report it before but I was surprised nobody else did. No problem
> with 2.6.34 and 2.6.34.1.

Same with me. The last build I tested was 2.6.34-rc7. No problems
there. No CPU jumps out of nowhere.

It is like any application all of a sudden uses 400% CPU, i.e. htop.

Best
Zeno
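Archive note: since both reporters describe spikes that appear without warning and leave nothing in the logs, timestamps for the spikes would help when comparing kernels. The Python sketch below reads the load averages from /proc/loadavg and flags a sample whose 1-minute load exceeds the number of CPUs. The function names and the threshold factor are illustrative assumptions, and load average is only a proxy for the CPU usage that htop shows.

```python
import os


def parse_loadavg(line):
    """Return the 1, 5 and 15 minute load averages from a /proc/loadavg line."""
    one, five, fifteen = (float(field) for field in line.split()[:3])
    return one, five, fifteen


def is_spike(one_minute_load, cpu_count, factor=1.0):
    """Flag a sample whose 1-minute load exceeds cpu_count * factor."""
    return one_minute_load > cpu_count * factor


def sample(path="/proc/loadavg"):
    """Read the current load averages and report whether they look like a spike."""
    with open(path) as handle:
        loads = parse_loadavg(handle.read())
    cpus = os.cpu_count() or 1  # os.cpu_count() may return None
    return loads, is_spike(loads[0], cpus)
```

Calling sample() every few seconds and printing the result with a timestamp gives a log that can be lined up against dmesg output on both a 2.6.34 and a 2.6.35-rc kernel.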