Prev: futex: Take mmap_sem for get_user_pages in fault_in_user_writeable
Next: [PATCH v9 0/8] Loongson: YeeLoong: add platform drivers
From: Alan Stern on 8 Dec 2009 15:50 On Tue, 8 Dec 2009, Rafael J. Wysocki wrote: > > This is a little more awkward because it requires the parent to iterate > > through its children. > > I can live with that. > > > But it does solve the off-tree dependency problem for suspends. > > That's a plus, but I still think we're trying to create a barrier-alike > mechanism using lock. > > There's one more possibility to consider, though. What if we use a completion > instead of the flag + wait queue? It surely is a standard synchronization > mechanism and it seems it might work here. You're right. I should have thought of that. Linus's original approach couldn't use a completion because during suspend it needed to make one task (the parent) wait for a bunch of others (the children). But if you iterate through the children by hand, that objection no longer applies. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Rafael J. Wysocki on 8 Dec 2009 16:00 On Tuesday 08 December 2009, Alan Stern wrote: > On Tue, 8 Dec 2009, Rafael J. Wysocki wrote: > > > > This is a little more awkward because it requires the parent to iterate > > > through its children. > > > > I can live with that. > > > > > But it does solve the off-tree dependency problem for suspends. > > > > That's a plus, but I still think we're trying to create a barrier-alike > > mechanism using lock. > > > > There's one more possibility to consider, though. What if we use a completion > > instead of the flag + wait queue? It surely is a standard synchronization > > mechanism and it seems it might work here. > > You're right. I should have thought of that. Linus's original > approach couldn't use a completion because during suspend it needed to > make one task (the parent) wait for a bunch of others (the children). > But if you iterate through the children by hand, that objection no > longer applies. BTW, is there a good reason why completion_done() doesn't use spin_lock_irqsave and spin_unlock_irqrestore? complete() and complete_all() use them, so why not here? Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on 8 Dec 2009 16:10 On Tue, 8 Dec 2009, Rafael J. Wysocki wrote: > > Anyway, if we use an rwsem, it won't be checkable from interrupt context just > as well. You can't do a lock() from an interrupt, but the unlocks should be irq-safe. > Suppose we use rwsem and during suspend each child uses a down_read() on a > parent and then the parent uses down_write() on itself. What if, whatever the > reason, the parent is a bit early and does the down_write() before one of the > children has a chance to do the down_read()? Aren't we toast? We're toast, but we're toast for a totally unrealted reason: it means that you tried to resume a child before a parent, which would be a major bug to begin with. Look, I even wrote out the comments, so let me repeat the code one more time. - suspend time calling: // This won't block, because we suspend nodes before parents down_read(node->parent->lock); // Do the part that may block asynchronously async_schedule(do_usb_node_suspend, node); - resume time calling: // This won't block, because we resume parents before children, // and the children will take the read lock. down_write(leaf->lock); // Do the blocking part asynchronously async_schedule(usb_node_resume, leaf); See? So when we take the parent lock for suspend, we are guaranteed to do so _before_ the parent node itself suspends. And conversely, when we take the parent lock (asynchronously) for resume, we're guaranteed to do that _after_ the parent node has done its own down_write. And that all depends on just one trivial thing; that the suspend and resume is called in the right order (children first vs parent first respectively). And that is such a _major_ correctness issue that if that isn't correct, your suspend isn't going to work _anyway_. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on 8 Dec 2009 16:10 On Tue, 8 Dec 2009, Alan Stern wrote: > > That's not the way it should be done. Linus had children taking their > parents' locks during suspend, which is simple but leads to > difficulties. No it doesn't. Name them. > Instead, the PM core should do a down_write() on each device before > starting the device's async suspend routine, and an up_write() when the > routine finishes. No you should NOT do that. If you do that, you serialize the suspend incorrectly and much too early. IOW, think a topology like this: a -> b -> c \ > d -> e where you'd want to suspend 'c' and 'e' asynchronously. If we do a 'down-write()' on b, then we'll delay until 'c' has suspended, an if we have ordered the nodes in the obvious depth-first order, we'll walk the PM device list in the order: c b e d a and now we'll serialize on 'b', waiting for 'c' to suspend. Which we do _not_ want to do, because the whole point was to suspend 'c' and 'e' together. > Parents should, at the start of their async routine, > do down_read() on each of their children plus whatever other devices > they need to wait for. The core can do the waiting for children part > and the driver's suspend routine can handle any other waiting. Why? That just complicates things. Compare to my simple locking scheme I've quoted several times. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Alan Stern on 8 Dec 2009 16:40
On Tue, 8 Dec 2009, Linus Torvalds wrote: > > No, that's not needed. Unlike reads, writes can't move in front of > > data or control dependencies. Or so I've been lead to believe... > > Sure they can. Control dependencies are trivial - it's called "branch > prediction", and everybody does it, and data dependencies don't exist on > many CPU architectures (even to the point of reading through a pointer > that you loaded). Wait a second. Are you saying that with code like this: if (x == 1) y = 5; the CPU may write to y before it has finished reading the value of x? And this write is visible to other CPUs, so that if x was initially 0 and a second CPU sets x to 1, the second CPU may see y == 5 before it executes the write to x (whatever that may mean)? Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |