From: Andrew Morton on 4 Feb 2010 19:20

On Fri, 5 Feb 2010 01:18:15 +0200 Maxim Levitsky <maximlevitsky(a)gmail.com> wrote:

> Currently removal of the card leads to del_disk called indirectly by mmc core.
> This function expects userspace to be running, which isn't when .resume is called
>
> Fix that by removing the code that did that in mmc_resume_host. It is possible
> because card detection logic will kick it later and remove the card.

I don't really understand. The above implies that to trigger this bug,
one needs to physically remove the card during a resume operation. ie:
a human-vs-computer race. Sounds unlikely?

So... exactly what steps does the user need to take to trigger this
bug?

> Also make mtd workqueue freezeable, so it won't attempt to add/remove the card
> while userspace is frozen.
>
>
> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index 30acd52..879d48d 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -1257,7 +1257,6 @@ int mmc_suspend_host(struct mmc_host *host, pm_message_t state)
>      if (host->caps & MMC_CAP_DISABLE)
>          cancel_delayed_work(&host->disable);
>      cancel_delayed_work(&host->detect);
> -    mmc_flush_scheduled_work();
>
>      mmc_bus_get(host);
>      if (host->bus_ops && !host->bus_dead) {
> @@ -1300,15 +1299,11 @@ int mmc_resume_host(struct mmc_host *host)
>          mmc_select_voltage(host, host->ocr);
>          BUG_ON(!host->bus_ops->resume);
>          err = host->bus_ops->resume(host);
> +
>          if (err) {
>              printk(KERN_WARNING "%s: error %d during resume "
>                      "(card was removed?)\n",
>                      mmc_hostname(host), err);
> -            if (host->bus_ops->remove)
> -                host->bus_ops->remove(host);
> -            mmc_claim_host(host);
> -            mmc_detach_bus(host);
> -            mmc_release_host(host);

afaict that code's been there since March 2009. I'd have thought that
someone would have noticed "kernel hangs on resume" before now.

Do you think the patch should be backported into 2.6.32.x and earlier?

>              /* no need to bother upper layers */
>              err = 0;
>          }
> @@ -1332,7 +1327,7 @@ static int __init mmc_init(void)
>  {
>      int ret;
>
> -    workqueue = create_singlethread_workqueue("kmmcd");
> +    workqueue = create_freezeable_workqueue("kmmcd");
>      if (!workqueue)
>          return -ENOMEM;

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

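The idea in the quoted patch description (do not remove the card from the resume path; let card detection remove it later, from a workqueue that does not run while tasks are frozen) can be modelled in a few lines of userspace C. This is only a toy sketch, not kernel code: `struct host_model`, `model_tasks_frozen`, `model_resume()` and `model_run_detect()` are hypothetical names invented for illustration, and the assumption that a rescan is queued when resume finds no card is mine, based on the remark that "card detection logic will kick it later".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the host state; not struct mmc_host. */
struct host_model {
    bool card_present;    /* is a card physically in the slot? */
    bool card_registered; /* do upper layers still know about the card? */
    bool detect_pending;  /* is detect work queued? */
};

/* Hypothetical flag standing in for "tasks are frozen". */
static bool model_tasks_frozen;

/*
 * Models the patched resume path: a missing card is not removed here.
 * Assumption: a rescan is queued instead, and the error is not
 * propagated to upper layers (the patch keeps "err = 0").
 */
static int model_resume(struct host_model *h)
{
    assert(h != NULL);
    if (!h->card_present)
        h->detect_pending = true;
    return 0;
}

/*
 * Models a freezable worker: queued detect work only runs once tasks
 * are thawed. Returns true if the work actually ran.
 */
static bool model_run_detect(struct host_model *h)
{
    assert(h != NULL);
    if (model_tasks_frozen || !h->detect_pending)
        return false;

    h->detect_pending = false;
    if (!h->card_present)
        h->card_registered = false; /* removal happens here, after thaw */
    return true;
}
```

In this model the card stays registered for as long as tasks are frozen and is only dropped by the detect work after the thaw, which is the ordering the patch description is aiming for.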
From: Maxim Levitsky on 5 Feb 2010 03:40

On Thu, 2010-02-04 at 16:09 -0800, Andrew Morton wrote:
> On Fri, 5 Feb 2010 01:18:15 +0200 Maxim Levitsky <maximlevitsky(a)gmail.com> wrote:
>
> > Currently removal of the card leads to del_disk called indirectly by mmc core.
> > This function expects userspace to be running, which isn't when .resume is called
> >
> > Fix that by removing the code that did that in mmc_resume_host. It is possible
> > because card detection logic will kick it later and remove the card.
>
> I don't really understand. The above implies that to trigger this bug,
> one needs to physically remove the card during a resume operation. ie:
> a human-vs-computer race. Sounds unlikely?
>
> So... exactly what steps does the user need to take to trigger this

Sorry for describing this poorly.
The steps are:

-> Have a kernel with CONFIG_MMC_UNSAFE_RESUME
-> Insert MMC/SD card
-> Suspend/hibernate the system
-> While the system is hibernated/suspended, pull the card out
-> Resume the system
-> Hang

If CONFIG_MMC_UNSAFE_RESUME is set, mmc core allows the user to
suspend/resume the card normally, assuming he won't change the card or
modify it in another system. The former case is actually handled quite
well.

If CONFIG_MMC_UNSAFE_RESUME isn't set, it removes the card during
suspend, and I now think (and will test) that this will still hang the
system, this time on suspend.

Maybe we can make del_disk behave well if called with userspace frozen?
After all, if the user calls it, it is very likely that the hardware is
absent, thus there is no point in syncing (which I think triggers the hang)....

Best regards,
Maxim Levitsky

From: Adrian Hunter on 5 Feb 2010 05:20

ext Andrew Morton wrote:
> On Fri, 5 Feb 2010 01:18:15 +0200 Maxim Levitsky <maximlevitsky(a)gmail.com> wrote:
>
>> Currently removal of the card leads to del_disk called indirectly by mmc core.
>> This function expects userspace to be running, which isn't when .resume is called
>>
>> Fix that by removing the code that did that in mmc_resume_host. It is possible
>> because card detection logic will kick it later and remove the card.
>
> I don't really understand. The above implies that to trigger this bug,
> one needs to physically remove the card during a resume operation. ie:
> a human-vs-computer race. Sounds unlikely?
>
> So... exactly what steps does the user need to take to trigger this
> bug?
>
>> Also make mtd workqueue freezeable, so it won't attempt to add/remove the card
>> while userspace is frozen.
>>
>>
>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>> index 30acd52..879d48d 100644
>> --- a/drivers/mmc/core/core.c
>> +++ b/drivers/mmc/core/core.c
>> @@ -1257,7 +1257,6 @@ int mmc_suspend_host(struct mmc_host *host, pm_message_t state)
>>      if (host->caps & MMC_CAP_DISABLE)
>>          cancel_delayed_work(&host->disable);
>>      cancel_delayed_work(&host->detect);
>> -    mmc_flush_scheduled_work();
>>
>>      mmc_bus_get(host);
>>      if (host->bus_ops && !host->bus_dead) {
>> @@ -1300,15 +1299,11 @@ int mmc_resume_host(struct mmc_host *host)
>>          mmc_select_voltage(host, host->ocr);
>>          BUG_ON(!host->bus_ops->resume);
>>          err = host->bus_ops->resume(host);
>> +
>>          if (err) {
>>              printk(KERN_WARNING "%s: error %d during resume "
>>                      "(card was removed?)\n",
>>                      mmc_hostname(host), err);
>> -            if (host->bus_ops->remove)
>> -                host->bus_ops->remove(host);
>> -            mmc_claim_host(host);
>> -            mmc_detach_bus(host);
>> -            mmc_release_host(host);
>
> afaict that code's been there since March 2009. I'd have thought that
> someone would have noticed "kernel hangs on resume" before now.
>
> Do you think the patch should be backported into 2.6.32.x and earlier?

It looks like the code was introduced in 2.6.32.x by commit

95cdfb72b9bc568803f395c266152c71b034b461

cc'ing the author Nicolas Pitre

>
>>              /* no need to bother upper layers */
>>              err = 0;
>>          }
>> @@ -1332,7 +1327,7 @@ static int __init mmc_init(void)
>>  {
>>      int ret;
>>
>> -    workqueue = create_singlethread_workqueue("kmmcd");
>> +    workqueue = create_freezeable_workqueue("kmmcd");
>>      if (!workqueue)
>>          return -ENOMEM;

From: Maxim Levitsky on 5 Feb 2010 08:50

On Fri, 2010-02-05 at 12:17 +0200, Adrian Hunter wrote:
> ext Andrew Morton wrote:
> > On Fri, 5 Feb 2010 01:18:15 +0200 Maxim Levitsky <maximlevitsky(a)gmail.com> wrote:
> >
> >> Currently removal of the card leads to del_disk called indirectly by mmc core.
> >> This function expects userspace to be running, which isn't when .resume is called
> >>
> >> Fix that by removing the code that did that in mmc_resume_host. It is possible
> >> because card detection logic will kick it later and remove the card.
> >
> > I don't really understand. The above implies that to trigger this bug,
> > one needs to physically remove the card during a resume operation. ie:
> > a human-vs-computer race. Sounds unlikely?
> >
> > So... exactly what steps does the user need to take to trigger this
> > bug?
> >
> >> Also make mtd workqueue freezeable, so it won't attempt to add/remove the card
> >> while userspace is frozen.
> >>
> >>
> >> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> >> index 30acd52..879d48d 100644
> >> --- a/drivers/mmc/core/core.c
> >> +++ b/drivers/mmc/core/core.c
> >> @@ -1257,7 +1257,6 @@ int mmc_suspend_host(struct mmc_host *host, pm_message_t state)
> >>      if (host->caps & MMC_CAP_DISABLE)
> >>          cancel_delayed_work(&host->disable);
> >>      cancel_delayed_work(&host->detect);
> >> -    mmc_flush_scheduled_work();
> >>
> >>      mmc_bus_get(host);
> >>      if (host->bus_ops && !host->bus_dead) {
> >> @@ -1300,15 +1299,11 @@ int mmc_resume_host(struct mmc_host *host)
> >>          mmc_select_voltage(host, host->ocr);
> >>          BUG_ON(!host->bus_ops->resume);
> >>          err = host->bus_ops->resume(host);
> >> +
> >>          if (err) {
> >>              printk(KERN_WARNING "%s: error %d during resume "
> >>                      "(card was removed?)\n",
> >>                      mmc_hostname(host), err);
> >> -            if (host->bus_ops->remove)
> >> -                host->bus_ops->remove(host);
> >> -            mmc_claim_host(host);
> >> -            mmc_detach_bus(host);
> >> -            mmc_release_host(host);
> >
> > afaict that code's been there since March 2009. I'd have thought that
> > someone would have noticed "kernel hangs on resume" before now.
> >
> > Do you think the patch should be backported into 2.6.32.x and earlier?
>
> It looks like the code was introduced in 2.6.32.x by commit
>
> 95cdfb72b9bc568803f395c266152c71b034b461
>
> cc'ing the author Nicolas Pitre

I don't think this is this commit's fault.
The problem lies somewhere in the block layer.
del_disk hangs if called while userspace is frozen.

Because I assume that this code was tested, I guess that it was possible
to call del_disk in this way once.

Fixing CONFIG_MMC_UNSAFE_RESUME=n not to do del_disk won't be easy...

Best regards,
Maxim Levitsky

From: Andrew Morton on 5 Feb 2010 09:20

On Fri, 05 Feb 2010 10:31:42 +0200 Maxim Levitsky <maximlevitsky(a)gmail.com> wrote:

> On Thu, 2010-02-04 at 16:09 -0800, Andrew Morton wrote:
> > On Fri, 5 Feb 2010 01:18:15 +0200 Maxim Levitsky <maximlevitsky(a)gmail.com> wrote:
> >
> > > Currently removal of the card leads to del_disk called indirectly by mmc core.
> > > This function expects userspace to be running, which isn't when .resume is called
> > >
> > > Fix that by removing the code that did that in mmc_resume_host. It is possible
> > > because card detection logic will kick it later and remove the card.
> >
> > I don't really understand. The above implies that to trigger this bug,
> > one needs to physically remove the card during a resume operation. ie:
> > a human-vs-computer race. Sounds unlikely?
> >
> > So... exactly what steps does the user need to take to trigger this
>
> Sorry for describing this poorly.
> The steps are:
>
> -> Have a kernel with CONFIG_MMC_UNSAFE_RESUME
> -> Insert MMC/SD card
> -> Suspend/hibernate the system
> -> While system is hibernated/suspended pull the card off
> -> Resume the system
> -> Hang
>
>
> if CONFIG_MMC_UNSAFE_RESUME is set, mmc core allows the user to
> suspend/resume the card normally assuming he won't change the card or
> modify it in another system. The former case is actually handled quite
> well.
>
> if CONFIG_MMC_UNSAFE_RESUME isn't set, it removes the card during
> suspend, and I now think (and will test) that this will still hang the
> system this time on suspend.
>
> Maybe we can make del_disk behave well if called with userspace frozen?
> After all if user calls it, very likely that hardware is absent thus
> there is no point in syncing (which I think triggers the hang)....
>

There is no del_disk in the kernel.

Let's be more specific (and accurate!) about the hang. I assume it's
mmc_remove_card->device_del->kobject_uevent?

Yes, I'd have thought that it would be a good idea for the
kobject_uevent code (or lower, in call_usermodehelper) to take avoiding
action if userspace is frozen. However such action would probably
involve doing a WARN_ON() too, so we'd still need MMC changes to avoid
that.

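The "avoiding action" suggested above can be illustrated with a small userspace C model: an event emitter that refuses to wait on userspace while it is frozen, and warns instead. This is a hedged sketch only. `model_userspace_frozen`, `model_emit_uevent()` and the counters are hypothetical names invented here; they are not kernel symbols, and the `fprintf` merely stands in for what a `WARN_ON()` would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag standing in for "userspace is frozen". */
static bool model_userspace_frozen;
static int model_warnings;    /* how many times the guard fired */
static int model_events_sent; /* how many events reached "userspace" */

/*
 * Toy model of an event emitter that takes avoiding action.
 * Returns 0 if the event was delivered, -1 if it was dropped
 * because userspace is frozen.
 */
static int model_emit_uevent(const char *action)
{
    assert(action != NULL);

    if (model_userspace_frozen) {
        /* Stands in for WARN_ON(): complain, but never block. */
        fprintf(stderr, "WARNING: uevent \"%s\" dropped, userspace frozen\n",
                action);
        model_warnings++;
        return -1;
    }

    model_events_sent++;
    return 0;
}
```

The model also shows why the message says MMC changes would still be needed: a caller that emits a "remove" event from its resume path would avoid the hang but would trip the warning on every resume with a missing card.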