sysrq: don't hold the sysrq_key_table_lock during the handler [Kernel]

From: Dmitry Torokhov on 27 Jul 2010 12:40

On Tue, Jul 27, 2010 at 07:57:54AM -0400, Neil Horman wrote:
> On Tue, Jul 27, 2010 at 01:15:52AM -0700, Dmitry Torokhov wrote:
> > On Mon, Jul 26, 2010 at 04:34:20PM -0400, Neil Horman wrote:
> > > On Mon, Jul 26, 2010 at 10:41:54AM -0700, Dmitry Torokhov wrote:
> > > > On Mon, Jul 26, 2010 at 06:51:48AM -0400, Neil Horman wrote:
> > > > > On Mon, Jul 26, 2010 at 05:54:02PM +0800, Xiaotian Feng wrote:
> > > > > <snip>
> > > > >
> > > > > This creates the possibility of a race in the handler. Not that it happens
> > > > > often, but sysrq keys can be registered and unregistered dynamically. If that
> > > > > lock isn't held while we call the key's handler, the code implementing that
> > > > > handler can live in a module that gets removed while it's executing, leading to
> > > > > an oops, etc. I think the better solution would be to use an RCU lock here.
> > > >
> > > > I'd simply change the spinlock to a mutex.
> > > >
> > > I don't think you can do that safely in this path, as sysrqs will be looked up
> > > in both process (echo t > /proc/sysrq-trigger) context and in interrupt
> > > (alt-sysrq-t) context. If a mutex is locked and you try to take it in interrupt
> > > context, you get a sleeping-in-interrupt panic IIRC
> > >
> >
> > Yes, indeed. But then even RCU will not really help us since the keyboard
> > driver will have interrupts disabled anyway.
> >
>
> Hm, that's true. I suppose the right thing to do then is grab a reference on any
> sysrq implemented within code that might be considered transient before
> releasing the lock. I've not tested this patch out, but it should do what we
> need, in that it allows us to release the lock without having to worry about the
> op list changing underneath us, or having the module with the handler code
> disappear.
>

That would only help if you also offload execution to a workqueue (which
may not be desirable in all cases) since keyboard driver^H^H input core
still calls into SysRq code holding [another] spinlock with interrupts
disabled.
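
To illustrate the point above, here is a minimal userspace sketch of deferring the handler out of the atomic path. It is a model only, not kernel code and not the patch under discussion: every name in it (defer_sysrq, input_event, run_deferred, the ring buffer) is hypothetical, and in the kernel the deferral itself would presumably be done with a work item and schedule_work().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define DEFER_MAX 8

typedef void (*sysrq_fn)(int key);

struct deferred_op {
    sysrq_fn fn;
    int key;
};

struct deferred_op defer_queue[DEFER_MAX];
size_t defer_head, defer_tail;  /* ring indices; head == tail means empty */
int irqs_disabled;              /* models "input core holds its spinlock" */
int ran_with_irqs_off;          /* handlers that ran inside the atomic section */
int ran_total;                  /* handlers that ran at all */

/* Example handler: records whether it ran while "interrupts" were off. */
void show_state(int key)
{
    (void)key;
    ran_total++;
    if (irqs_disabled)
        ran_with_irqs_off++;
}

/* Called from the modeled interrupt path: only queue, never run the handler. */
int defer_sysrq(sysrq_fn fn, int key)
{
    size_t next = (defer_tail + 1) % DEFER_MAX;

    assert(fn != NULL);
    if (next == defer_head)
        return -1;              /* queue full, request dropped */
    defer_queue[defer_tail].fn = fn;
    defer_queue[defer_tail].key = key;
    defer_tail = next;
    return 0;
}

/* Models the input core delivering a key event with its own lock held. */
int input_event(int key)
{
    int ret;

    irqs_disabled = 1;
    ret = defer_sysrq(show_state, key);
    irqs_disabled = 0;
    return ret;
}

/* Models the worker: runs queued handlers with "interrupts" enabled. */
int run_deferred(void)
{
    int n = 0;

    while (defer_head != defer_tail) {
        struct deferred_op op = defer_queue[defer_head];

        defer_head = (defer_head + 1) % DEFER_MAX;
        op.fn(op.key);
        n++;
    }
    return n;
}
```

Calling input_event('t') followed by run_deferred() runs the handler only in the second step, outside the section where interrupts are modeled as disabled. The cost is the one noted above: the handler no longer runs at the moment the key is pressed, which may not be desirable for a debugging facility.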

--
Dmitry
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Neil Horman on 27 Jul 2010 15:30

On Tue, Jul 27, 2010 at 09:38:52AM -0700, Dmitry Torokhov wrote:
> That would only help if you also offload execution to a workqueue (which
> may not be desirable in all cases) since keyboard driver^H^H input core
> still calls into SysRq code holding [another] spinlock with interrupts
> disabled.
>

Um, no, I don't think so. The concern that I had with the patch was that after
you unlock that spinlock, a module which previously registered a sysrq handler
could be removed during its execution leaving it executing in unknown memory.
By doing a successful try_module_get we prevent the module remove code from
deleting a module from the kernel, avoiding that condition until the execution
of the requested sysrq handler completes. Offloading execution of the handler
to a workqueue does nothing here, unless you see another problem, independent of
the one I was addressing.

I suppose there is a possibility that the op_p value could change after we unlock
the lock, but we could manage that by copying the pointer (although I don't
think it's needed unless some module tries to unregister sysrq handlers outside
of the module_exit routine it has).
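
The sequence described above (look up the op under the lock, pin the owning module, copy the pointer, drop the lock, call the handler, release the reference) can be sketched as a small userspace model. This is an illustration under assumptions, not the patch itself: the owner field, the fake_* functions and the integer standing in for sysrq_key_table_lock are all hypothetical, and fake_try_get()/fake_put() only imitate the refusal semantics of try_module_get()/module_put().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a module and its reference count. */
struct fake_module {
    int refcnt;                 /* users currently holding the module */
    int going;                  /* set once unload has started */
    int loaded;
};

struct fake_sysrq_op {
    void (*handler)(int key);
    struct fake_module *owner;  /* assumed field, for illustration only */
};

/* Models try_module_get(): fails once unload has begun. */
int fake_try_get(struct fake_module *m)
{
    if (m == NULL)
        return 1;               /* built-in code needs no reference */
    if (m->going || !m->loaded)
        return 0;
    m->refcnt++;
    return 1;
}

/* Models module_put(). */
void fake_put(struct fake_module *m)
{
    if (m == NULL)
        return;
    assert(m->refcnt > 0);
    m->refcnt--;
}

/* Models module removal: refused while anyone holds a reference. */
int fake_unload(struct fake_module *m)
{
    if (m->refcnt > 0)
        return -1;
    m->going = 1;
    m->loaded = 0;
    return 0;
}

struct fake_sysrq_op *key_table[256];
int table_locked;               /* stands in for sysrq_key_table_lock */

/* Look up the op and pin its owner under the lock, then call it unlocked. */
int fake_handle_sysrq(int key)
{
    struct fake_sysrq_op *op;
    struct fake_module *owner;
    void (*fn)(int);

    table_locked = 1;
    op = key_table[(unsigned char)key];
    if (op == NULL || !fake_try_get(op->owner)) {
        table_locked = 0;
        return -1;
    }
    fn = op->handler;           /* copied while locked, so a later table change is harmless */
    owner = op->owner;
    table_locked = 0;

    fn(key);                    /* runs without the table lock held */
    fake_put(owner);
    return 0;
}

/* Demo: a handler during which a "concurrent" unload is attempted. */
struct fake_module demo_mod = { 0, 0, 1 };
int unload_result_during_handler;
int lock_held_during_handler;

void demo_handler(int key)
{
    (void)key;
    lock_held_during_handler = table_locked;
    unload_result_during_handler = fake_unload(&demo_mod);
}

struct fake_sysrq_op demo_op = { demo_handler, &demo_mod };
```

With key_table['x'] = &demo_op, a call to fake_handle_sysrq('x') runs the handler with the lock released, and the unload attempted from inside the handler is refused because the reference is still held. Once the handler returns and the reference is dropped, the unload succeeds and later lookups for that key fail cleanly.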

Neil


From: Andrew Morton on 27 Jul 2010 19:40

On Mon, 26 Jul 2010 17:54:02 +0800
Xiaotian Feng <dfeng(a)redhat.com> wrote:

> sysrq_key_table_lock is used to protect the sysrq_key_table, to make sure
> we get/replace the right operation for the sysrq. But in __handle_sysrq, the
> kernel holds this lock with irqs disabled until op_p->handler() has finished.
> This may cause a false positive watchdog alert when we're doing "show-task-states"
> on a system with many tasks.
>

It would be better to find a suitable point in an inner loop and add an
appropriately-commented touch_nmi_watchdog().

That way the problem gets fixed for all irqs-off callers, not just one
of them.
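
As a rough illustration of the suggestion above, the following userspace sketch models a watchdog that fires when too much time passes without being touched, and a task-dumping loop that touches it once per task. It is a model under assumptions, not kernel code: the tick counts, WATCHDOG_LIMIT and all fake_* names are hypothetical, and fake_touch_watchdog() merely stands in for touch_nmi_watchdog().

```c
#include <assert.h>
#include <stddef.h>

#define WATCHDOG_LIMIT 100      /* hypothetical: ticks allowed between touches */

int watchdog_ticks;             /* ticks since the last touch */
int watchdog_fired;             /* set if the limit was ever exceeded */

/* Models time passing with interrupts off. */
void fake_tick(int n)
{
    assert(n >= 0);
    watchdog_ticks += n;
    if (watchdog_ticks > WATCHDOG_LIMIT)
        watchdog_fired = 1;
}

/* Stands in for touch_nmi_watchdog(): we are still making progress. */
void fake_touch_watchdog(void)
{
    watchdog_ticks = 0;
}

/* Models dumping one task's state; costs a fixed number of ticks. */
void fake_show_task(size_t i)
{
    (void)i;
    fake_tick(10);
}

/* Walks all tasks; touches the watchdog per task only if asked to. */
int fake_show_state(size_t ntasks, int touch)
{
    size_t i;

    watchdog_ticks = 0;
    watchdog_fired = 0;
    for (i = 0; i < ntasks; i++) {
        fake_show_task(i);
        /*
         * Dumping many tasks with irqs off can take a long time;
         * reset the watchdog so steady progress is not reported
         * as a lockup.
         */
        if (touch)
            fake_touch_watchdog();
    }
    return watchdog_fired;
}
```

In this model fake_show_state(50, 0) trips the watchdog while fake_show_state(50, 1) does not, regardless of which caller disabled interrupts. That is the appeal of putting the touch in the inner loop: it covers every irqs-off caller of the loop rather than one call site.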
