From: Benjamin Herrenschmidt on 1 Apr 2010 19:10 On Thu, 2010-04-01 at 12:06 +0100, David Howells wrote: > Can we provide a kmem_cache_create_early()? One that takes no locks and gets > cleaned up with the other __init stuff? Yuck. I hate having to expose more APIs. Also the problem with that is means callers have to know. So we need to propagate up all call chains etc... (ie, radix_tree_init_early(), etc...) This is pretty much exactly the discussion we had when moving sl*b early, and back then, the final word from Linus (heh, for once he agreed with me :-) was that this made no sense. We can bury logic inside kmem_cache_create() though, it's not -that- a hot path. Cheers, Ben. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: David Howells on 2 Apr 2010 10:50 Linus Torvalds <torvalds(a)linux-foundation.org> wrote: > Ahh, yes. In this case, that doesn't likely change anything. The > save/restore versions of the irq-safe locks shouldn't be appreciably more > expensive than the non-saving ones. And architectures that really care > should have done their own per-arch optimized version anyway. That depends on the CPU. Some CPUs have quite expensive interrupt disablement instructions. FRV does for instance; but fortunately, on the FRV, I can use some of the excessive quantities of conditional registers to pretend that I disable interrupts, and only actually do so if an interrupt actually happens. > Maybe we should even document that - so that nobody else makes the mistake > x86-64 did of thinking that the "generic spinlock" version of the rwsem's > is anything but a hacky and bad fallback case. In some cases, it's actually the best way. On a UP machine, for instance, where they reduce to nothing or where your only atomic instruction is an XCHG equivalent. David -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on 2 Apr 2010 11:10 On Fri, 2 Apr 2010, David Howells wrote: > Linus Torvalds <torvalds(a)linux-foundation.org> wrote: > > > Ahh, yes. In this case, that doesn't likely change anything. The > > save/restore versions of the irq-safe locks shouldn't be appreciably more > > expensive than the non-saving ones. And architectures that really care > > should have done their own per-arch optimized version anyway. > > That depends on the CPU. Some CPUs have quite expensive interrupt disablement > instructions. FRV does for instance; but fortunately, on the FRV, I can use > some of the excessive quantities of conditional registers to pretend that I > disable interrupts, and only actually do so if an interrupt actually happens. I think you're missing the part where we're not _adding_ any irq disables: we're just changing the unconditional irq disable to a save-and-disable (and the unconditional irq enable to a restore). So even if irq's are expensive to disable, the change from spin_lock_irq() to spin_lock_irqsave() won't make that code any more expensive. > > Maybe we should even document that - so that nobody else makes the mistake > > x86-64 did of thinking that the "generic spinlock" version of the rwsem's > > is anything but a hacky and bad fallback case. > > In some cases, it's actually the best way. On a UP machine, for instance, > where they reduce to nothing or where your only atomic instruction is an XCHG > equivalent. Again, you seem to think that we used to have just a plain spin_lock. Not so. We currently have a spin_lock_irq(), and it is NOT a no-op even on UP. It does that irq disable. Anyway, I suspect that even with just an atomic xchg, you can do a better job at doing down_read() than using the generic spin-lock version (likely by busy-looping on a special "we're busy" value). But if you do want to use the generic spin-lock version, I doubt any architecture makes that irqsave version noticeable slower than the unconditional irq version. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Kevin Hilman on 7 Apr 2010 15:10 Linus Torvalds <torvalds(a)linux-foundation.org> writes: > On Wed, 31 Mar 2010, H. Peter Anvin wrote: >> >> The obvious way to fix this would be to use >> spin_lock_irqsave..spin_lock_irqrestore in __down_read as well as in the >> other locations; I don't have a good feel for what the cost of doing so >> would be, though. On x86 it's fairly expensive simply because the only >> way to save the state is to push it on the stack, which the compiler >> doesn't deal well with, but this code isn't used on x86. > [...] > So making the slow-path do the spin_[un]lock_irq{save,restore}() versions > sounds like the right thing. It won't be a performance issue: it _is_ the > slow-path, and we're already doing the expensive part (the spinlock itself > and the irq thing). > > So ACK on the idea. Who wants to write the trivial patch and test it? OK, I'll bite since I was seeing boot-time hangs on ARM (TI OMAP3) due to this. Patch below. Kevin From 7baff4008353bbfd2a2e2a4da22b87bc4efa4194 Mon Sep 17 00:00:00 2001 From: Kevin Hilman <khilman(a)deeprootsystems.com> Date: Wed, 7 Apr 2010 11:52:46 -0700 Subject: [PATCH] rwsem generic spinlock: use IRQ save/restore spinlocks rwsems can be used with IRQs disabled, particularily in early boot before IRQs are enabled. Currently the spin_unlock_irq() usage in the slow-patch will unconditionally enable interrupts and cause problems since interrupts are not yet initialized or enabled. This patch uses save/restore versions of IRQ spinlocks in the slowpath to ensure interrupts are not unintentionally disabled. Signed-off-by: Kevin Hilman <khilman(a)deeprootsystems.com> --- lib/rwsem-spinlock.c | 14 ++++++++------ 1 files changed, 8 insertions(+), 6 deletions(-) diff --git a/lib/rwsem-spinlock.c b/lib/rwsem-spinlock.c index ccf95bf..ffc9fc7 100644 --- a/lib/rwsem-spinlock.c +++ b/lib/rwsem-spinlock.c @@ -143,13 +143,14 @@ void __sched __down_read(struct rw_semaphore *sem) { struct rwsem_waiter waiter; struct task_struct *tsk; + unsigned long flags; - spin_lock_irq(&sem->wait_lock); + spin_lock_irqsave(&sem->wait_lock, flags); if (sem->activity >= 0 && list_empty(&sem->wait_list)) { /* granted */ sem->activity++; - spin_unlock_irq(&sem->wait_lock); + spin_unlock_irqrestore(&sem->wait_lock, flags); goto out; } @@ -164,7 +165,7 @@ void __sched __down_read(struct rw_semaphore *sem) list_add_tail(&waiter.list, &sem->wait_list); /* we don't need to touch the semaphore struct anymore */ - spin_unlock_irq(&sem->wait_lock); + spin_unlock_irqrestore(&sem->wait_lock, flags); /* wait to be given the lock */ for (;;) { @@ -209,13 +210,14 @@ void __sched __down_write_nested(struct rw_semaphore *sem, int subclass) { struct rwsem_waiter waiter; struct task_struct *tsk; + unsigned long flags; - spin_lock_irq(&sem->wait_lock); + spin_lock_irqsave(&sem->wait_lock, flags); if (sem->activity == 0 && list_empty(&sem->wait_list)) { /* granted */ sem->activity = -1; - spin_unlock_irq(&sem->wait_lock); + spin_unlock_irqrestore(&sem->wait_lock, flags); goto out; } @@ -230,7 +232,7 @@ void __sched __down_write_nested(struct rw_semaphore *sem, int subclass) list_add_tail(&waiter.list, &sem->wait_list); /* we don't need to touch the semaphore struct anymore */ - spin_unlock_irq(&sem->wait_lock); + spin_unlock_irqrestore(&sem->wait_lock, flags); /* wait to be given the lock */ for (;;) { -- 1.7.0.2 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: H. Peter Anvin on 7 Apr 2010 18:40
On 04/07/2010 02:13 PM, Rafael J. Wysocki wrote: > This message has been generated automatically as a part of a summary report > of recent regressions. > > The following bug entry is on the current list of known regressions > from 2.6.33. Please verify if it still should be listed and let the tracking team > know (either way). > > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15668 > Subject : start_kernel(): bug: interrupts were enabled early > Submitter : Rabin Vincent <rabin(a)rab.in> > Date : 2010-03-25 19:53 (14 days old) > First-Bad-Commit: http://git.kernel.org/git/linus/773e3eb7b81e5ba13b5155dfb3bb75b8ce37f8f9 ^^^^ FWIW, this /git is redundant. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |