Prev: [PATCH 5/8] PM: suspend_block: Add suspend_blocker stats
Next: [PATCH/RFC] mutex: Fix optimistic spinning vs. BKL
From: Benjamin Herrenschmidt on 28 Apr 2010 00:50 On Wed, 2010-04-28 at 14:38 +1000, Benjamin Herrenschmidt wrote: > Currently, we can hit a nasty case with optimistic spinning on mutexes: > > CPU A tries to take a mutex, while holding the BKL > > CPU B tried to take the BLK while holding the mutex > > This looks like a AB-BA scenario but in practice, is allowed and happens > due to the auto-release-on-schedule nature of the BKL. .../... BTW. The patch is only compile-tested so far :-) It's going to be hammered with the test case that triggered the original bug hopefully tonight. Cheers, Ben. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Arnd Bergmann on 28 Apr 2010 08:10 On Wednesday 28 April 2010, Benjamin Herrenschmidt wrote: > Now, we -could- make it a bit smarter about the BKL by introducing a > contention counter and only go out if we own the BKL and it is contended, > but I didn't feel like this was worth the effort, time is better spent > removing the BKL from sensitive code path instead. Agreed. > - for (;;) { > + for (timeout = jiffies + 2; jiffies < timeout;) { > [...] > - for (;;) { > + while (jiffies < timeout) { This needs to use time_before() to avoid problems on jiffies wraparound. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Benjamin Herrenschmidt on 28 Apr 2010 18:40 On Wed, 2010-04-28 at 14:06 +0200, Arnd Bergmann wrote: > > This needs to use time_before() to avoid problems on jiffies > wraparound. Ah right, forgot about that, been a while I didn't use jiffies for anything :-) I'll respin later today. Cheers, Ben. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Tony Breeds on 7 May 2010 00:30 On Thu, Apr 29, 2010 at 08:35:21AM +1000, Benjamin Herrenschmidt wrote: > On Wed, 2010-04-28 at 14:06 +0200, Arnd Bergmann wrote: > > > > This needs to use time_before() to avoid problems on jiffies > > wraparound. > > Ah right, forgot about that, been a while I didn't use jiffies for > anything :-) > > I'll respin later today. Like thie perhaps? If this looks good it would be great to get this in .34 diff --git a/include/linux/sched.h b/include/linux/sched.h index 70abfd3..991d86f 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -351,7 +351,8 @@ extern signed long schedule_timeout_killable(signed long timeout); extern signed long schedule_timeout_uninterruptible(signed long timeout); asmlinkage void __schedule(void); asmlinkage void schedule(void); -extern int mutex_spin_on_owner(struct mutex *lock, struct thread_info *owner); +extern int mutex_spin_on_owner(struct mutex *lock, struct thread_info *owner, + unsigned long timeout); struct nsproxy; struct user_namespace; diff --git a/kernel/mutex.c b/kernel/mutex.c index 947b3ad..fcf2573 100644 --- a/kernel/mutex.c +++ b/kernel/mutex.c @@ -145,6 +145,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, struct task_struct *task = current; struct mutex_waiter waiter; unsigned long flags; + unsigned long timeout; preempt_disable(); mutex_acquire(&lock->dep_map, subclass, 0, ip); @@ -168,15 +169,22 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, * to serialize everything. */ - for (;;) { + for (timeout = jiffies + 2; time_before(jiffies, timeout);) { struct thread_info *owner; /* + * If we own the BKL, then don't spin. The owner of the mutex + * might be waiting on us to release the BKL. + */ + if (current->lock_depth >= 0) + break; + + /* * If there's an owner, wait for it to either * release the lock or go to sleep. */ owner = ACCESS_ONCE(lock->owner); - if (owner && !mutex_spin_on_owner(lock, owner)) + if (owner && !mutex_spin_on_owner(lock, owner, timeout)) break; if (atomic_cmpxchg(&lock->count, 1, 0) == 1) { diff --git a/kernel/sched.c b/kernel/sched.c index 663a1d0..ef58dc0 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -5523,7 +5523,8 @@ EXPORT_SYMBOL(schedule); * Look out! "owner" is an entirely speculative pointer * access and not reliable. */ -int mutex_spin_on_owner(struct mutex *lock, struct thread_info *owner) +int mutex_spin_on_owner(struct mutex *lock, struct thread_info *owner, + unsigned long timeout) { unsigned int cpu; struct rq *rq; @@ -5559,7 +5560,7 @@ int mutex_spin_on_owner(struct mutex *lock, struct thread_info *owner) rq = cpu_rq(cpu); - for (;;) { + while (time_before(jiffies, timeout)) { /* * Owner changed, break to re-assess state. */ Yours Tony -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Frederic Weisbecker on 7 May 2010 01:40 On Fri, May 07, 2010 at 02:20:10PM +1000, Tony Breeds wrote: > On Thu, Apr 29, 2010 at 08:35:21AM +1000, Benjamin Herrenschmidt wrote: > > On Wed, 2010-04-28 at 14:06 +0200, Arnd Bergmann wrote: > > > > > > This needs to use time_before() to avoid problems on jiffies > > > wraparound. > > > > Ah right, forgot about that, been a while I didn't use jiffies for > > anything :-) > > > > I'll respin later today. > > Like thie perhaps? If this looks good it would be great to get this in .34 > > diff --git a/include/linux/sched.h b/include/linux/sched.h > index 70abfd3..991d86f 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -351,7 +351,8 @@ extern signed long schedule_timeout_killable(signed long timeout); > extern signed long schedule_timeout_uninterruptible(signed long timeout); > asmlinkage void __schedule(void); > asmlinkage void schedule(void); > -extern int mutex_spin_on_owner(struct mutex *lock, struct thread_info *owner); > +extern int mutex_spin_on_owner(struct mutex *lock, struct thread_info *owner, > + unsigned long timeout); > > struct nsproxy; > struct user_namespace; > diff --git a/kernel/mutex.c b/kernel/mutex.c > index 947b3ad..fcf2573 100644 > --- a/kernel/mutex.c > +++ b/kernel/mutex.c > @@ -145,6 +145,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, > struct task_struct *task = current; > struct mutex_waiter waiter; > unsigned long flags; > + unsigned long timeout; > > preempt_disable(); > mutex_acquire(&lock->dep_map, subclass, 0, ip); > @@ -168,15 +169,22 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, > * to serialize everything. > */ > > - for (;;) { > + for (timeout = jiffies + 2; time_before(jiffies, timeout);) { > struct thread_info *owner; > > /* > + * If we own the BKL, then don't spin. The owner of the mutex > + * might be waiting on us to release the BKL. > + */ > + if (current->lock_depth >= 0) > + break; > + > + /* > * If there's an owner, wait for it to either > * release the lock or go to sleep. > */ > owner = ACCESS_ONCE(lock->owner); > - if (owner && !mutex_spin_on_owner(lock, owner)) > + if (owner && !mutex_spin_on_owner(lock, owner, timeout)) > break; I like the safeguard against the bkl, it looks indeed like something we should have in .34 But I really don't like the timeout. This is going to make the things even worse if we have another cause of deadlock by hiding the worst part of the consequences without actually solving the problem. And since the induced latency or deadlock won't be easily visible anymore, we'll miss there is a problem. So we are going to spin for two jiffies and only someone doing specific latency measurements will notice, if he's lucky enough to meet the bug. Moreover that adds some unnessary (small) overhead in this path. May be can we have it as a debugging option, something that would be part of lockdep, which would require CONFIG_DEBUG_MUTEX to support mutex adaptive spinning. A debugging option that could just dump the held locks and the current one if we spin for an excessive timeslice. Thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
|
Next
|
Last
Pages: 1 2 3 4 Prev: [PATCH 5/8] PM: suspend_block: Add suspend_blocker stats Next: [PATCH/RFC] mutex: Fix optimistic spinning vs. BKL |