From: Cong Wang on 5 Feb 2010 04:30

Eric W. Biederman wrote:
> Amerigo Wang <amwang(a)redhat.com> writes:
>
>> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug.
>> As reported by several people, it is something like:
>>
>> [ 6967.926563] ACPI: Preparing to enter system sleep state S3
>> [ 6967.956156] Disabling non-boot CPUs ...
>> [ 6967.970401]
>> [ 6967.970408] =============================================
>> [ 6967.970419] [ INFO: possible recursive locking detected ]
>> [ 6967.970431] 2.6.33-rc2-git6 #27
>> [ 6967.970439] ---------------------------------------------
>> [ 6967.970450] pm-suspend/22147 is trying to acquire lock:
>> [ 6967.970460] (s_active){++++.+}, at: [<c10d2941>] sysfs_hash_and_remove+0x3d/0x4f
>> [ 6967.970493]
>> [ 6967.970497] but task is already holding lock:
>> [ 6967.970506] (s_active){++++.+}, at: [<c10d4110>] sysfs_get_active_two+0x16/0x36
>> [...]
>>
>> Eric already provides a patch for this[1], but it still can't fix the
>> problem. Based on his work and Peter's suggestion, I write this patch,
>> hopefully we can fix the warning completely.
>>
>> This patch put sysfs s_active into two classes, one is for PM, the other
>> is for the rest, so lockdep will distinguish them.
>>
>> 1. http://lkml.org/lkml/2010/1/10/282
>
> What testing has this patch seen?
>
> In particular does this work to actually clear up the pm case?

Sorry, it seems that my machine doesn't support s2ram; I am still trying
to get it working...

I hope the reporters of this bug can help to test this patch.

Thanks.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Eric W. Biederman on 5 Feb 2010 04:40

Xiaotian Feng <xtfeng(a)gmail.com> writes:

> On Fri, Feb 5, 2010 at 2:42 PM, Amerigo Wang <amwang(a)redhat.com> wrote:
>> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug.
>> [...]
>> This patch put sysfs s_active into two classes, one is for PM, the other
>> is for the rest, so lockdep will distinguish them.
>
> I think this patch does not hit the root cause, we have a similar
> warning which is not related with PM.

The root cause is that our locking is crazy complicated. No lockdep
changes are going to fix that.

What we can do, and what the patch does, is teach lockdep to treat some
of the sysfs files as a different group (subclass) from other sysfs
files. That keeps us from overgeneralizing too much and gives us
a better signal-to-noise ratio.

As far as the block device problem goes, I can't easily say that
the block layer is correct. I expect it is, because changing
the scheduler is unlikely to delete block devices. If the block layer
has bugs, then adding another subclass as Amerigo suggests should simply
make lockdep warnings harder to trigger and more accurate, so that
sounds like a path worth walking.

In general I recommend that pieces of code that need to do a lot of
work in a sysfs attribute consider using a work queue or a kernel
thread, as that can be easier to analyze.

Eric
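Eric's point about lock classes can be illustrated with a small user-space model. The sketch below is not kernel code and not the patch under discussion: `LockClassChecker` and `store_then_remove` are hypothetical names, and the label "s_active/pm" is an invented stand-in for the separate PM class the patch describes. It assumes only that lockdep tracks locks by class rather than by instance, and that taking a lock whose class is already held is what triggers the "possible recursive locking" report.

```python
class LockClassChecker:
    """Toy model: track held locks by class, not by instance."""

    def __init__(self):
        self.held = []      # stack of (lock_class, instance) pairs
        self.warnings = []  # reports produced so far

    def acquire(self, lock_class, instance):
        # Report if any currently held lock belongs to the same class,
        # even when it is a different lock instance.
        for held_class, held_instance in self.held:
            if held_class == lock_class:
                self.warnings.append(
                    "possible recursive locking: taking %r (%s) "
                    "while holding %r (%s)"
                    % (instance, lock_class, held_instance, held_class)
                )
                break
        self.held.append((lock_class, instance))

    def release(self, instance):
        # Drop the most recently acquired lock with this instance name.
        for index in range(len(self.held) - 1, -1, -1):
            if self.held[index][1] == instance:
                del self.held[index]
                return
        raise ValueError("%r is not held" % (instance,))


def store_then_remove(store_class, remove_class):
    """Model a write to one sysfs file that removes another sysfs file."""
    checker = LockClassChecker()
    checker.acquire(store_class, "file being written")
    checker.acquire(remove_class, "file being removed")
    checker.release("file being removed")
    checker.release("file being written")
    return checker.warnings


if __name__ == "__main__":
    # One class for every sysfs file: the nesting is reported.
    print(store_then_remove("s_active", "s_active"))
    # Writer's file in its own class: the same nesting is not reported.
    print(store_then_remove("s_active/pm", "s_active"))
```

With a single class the model flags the nesting; with the written file in its own class it stays silent. That matches the trade Eric describes: the annotation changes what lockdep reports, not the locking itself.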
From: Cong Wang on 5 Feb 2010 04:50

Eric W. Biederman wrote:
> Xiaotian Feng <xtfeng(a)gmail.com> writes:
>
>> I think this patch does not hit the root cause, we have a similar
>> warning which is not related with PM.
>
> [...]
>
> As far as the block device problem goes, I can't easily say that
> the block layer is correct. I expect it is, because changing
> the scheduler is unlikely to delete block devices. If the block layer
> has bugs, then adding another subclass as Amerigo suggests should simply
> make lockdep warnings harder to trigger and more accurate, so that
> sounds like a path worth walking.
>
> [...]

Cc'ing Jens Axboe.
From: Xiaotian Feng on 5 Feb 2010 05:10

On Fri, Feb 5, 2010 at 5:39 PM, Eric W. Biederman <ebiederm(a)xmission.com> wrote:
> Xiaotian Feng <xtfeng(a)gmail.com> writes:
>
>> I think this patch does not hit the root cause, we have a similar
>> warning which is not related with PM.
>
> The root cause is that our locking is crazy complicated. No lockdep
> changes are going to fix that.
>
> [...]
>
> In general I recommend that pieces of code that need to do a lot of
> work in a sysfs attribute consider using a work queue or a kernel
> thread, as that can be easier to analyze.

PM case:
    store  /sys/devices/system/cpu1/online
    remove /sys/devices/system/cpu1/cache/

iosched case:
    store  /sys/block/sdx/queue/scheduler
    remove /sys/block/sdx/queue/iosched/

So it looks like this comes from the sysfs layer ....
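Both store/remove pairs above do the removal from inside the write to another sysfs file. Eric's general suggestion, moving that work to a work queue or kernel thread, can be sketched in user space as follows. This is only a model under stated assumptions: `DeferredRemoval` and its methods are hypothetical names, a Python thread stands in for the kernel worker, and a per-thread list stands in for the active references a writer holds. It does not show the real sysfs or workqueue API, and it ignores whether the removal must finish before the write returns.

```python
import queue
import threading


class DeferredRemoval:
    """Toy model: the store handler queues the removal for a worker."""

    def __init__(self):
        self.jobs = queue.Queue()
        self.log = []                   # (action, path, refs held by the actor)
        self.local = threading.local()  # per-thread "active references"
        self.thread = threading.Thread(target=self._run)
        self.thread.start()

    def _refs(self):
        # Each thread gets its own list of held references.
        if not hasattr(self.local, "refs"):
            self.local.refs = []
        return self.local.refs

    def _run(self):
        # Worker loop: run queued jobs until the None sentinel arrives.
        while True:
            job = self.jobs.get()
            if job is None:
                return
            job()

    def remove(self, path):
        # Record which references the removing thread holds right now.
        self.log.append(("remove", path, tuple(self._refs())))

    def store(self, attr, to_remove):
        refs = self._refs()
        refs.append(attr)  # reference held for the duration of the write
        try:
            # Queue the removal instead of doing it inline.
            self.jobs.put(lambda: self.remove(to_remove))
        finally:
            refs.pop()

    def shutdown(self):
        self.jobs.put(None)
        self.thread.join()


if __name__ == "__main__":
    model = DeferredRemoval()
    model.store("/sys/block/sdx/queue/scheduler",
                "/sys/block/sdx/queue/iosched/")
    model.shutdown()
    print(model.log)
```

In the model the removal always runs on the worker, which holds none of the writer's references, so the nesting that lockdep complains about never forms. Whether that is acceptable for the hotplug or iosched paths is a separate question that the thread leaves open.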
From: Cong Wang on 7 Feb 2010 22:20

Xiaotian Feng wrote:
> On Fri, Feb 5, 2010 at 5:39 PM, Eric W. Biederman <ebiederm(a)xmission.com> wrote:
>> [...]
>
> PM case
> store /sys/devices/system/cpu1/online
> remove /sys/devices/system/cpu1/cache/
>
> iosched case
> store /sys/block/sdx/queue/scheduler
> remove /sys/block/sdx/queue/iosched/
>
> So it looks like this is from sysfs layer ....

Right, and both locks are s_active, so I think they are the same problem,
but I haven't checked the iosched case carefully. ;)