Prev: 2.6.34-rc1: pci 0000:00:00.0: address space collision / spontaenous reboots [now 2.6.34-rc1]
Next: [PATCH 1/2] firewire: core: fix Model_ID in modalias
From: Randy Dunlap on 23 Mar 2010 11:30 On Fri, 19 Mar 2010 14:46:10 -0400 Mathieu Desnoyers wrote: > * Randy Dunlap (randy.dunlap(a)oracle.com) wrote: > > On 03/18/10 17:59, Mathieu Desnoyers wrote: > > > * Steven Rostedt (rostedt(a)goodmis.org) wrote: > > >> On Thu, 2010-03-18 at 16:26 -0700, Randy Dunlap wrote: > > >>> I can build/boot 2.6.33 with CONFIG_TRACE/TRACING disabled successfully, > > >>> but when I enable lots of tracing config options and then boot with > > >>> ftrace=nop on the kernel command line, I see a GP fault when the parport & > > >>> parport_pc modules are loading/initializing. > > >> > > >> Do you see it without adding the "ftrace=nop"? The only thing that > > >> should do is expand the ring buffer on boot up. > > >> > > >>> > > >>> It happens in drivers/parport/share.c::parport_register_device(), when that > > >>> function calls try_module_get(). > > >>> > > >>> If I comment out the trace_module_get() calls in include/linux/module.h, > > >>> the kernel boots with no problems. > > >> > > >> > > >> Interesting. Well, trace_module_get() is a TRACE_EVENT tracepoint. But > > >> should be disabled here. It may be something to do with DEFINE_TRACE. > > >> > > >> (added Mathieu to Cc since he wrote that code) > > > > > > can you try replacing the "local_read(__module_ref_addr(module, cpu))" argument > > > with "0" ? > > > > Yes, that boots with no problems. > > clickety-clicketa... git blame include/linux/module.h : > > commit 7ead8b8313d92b3a69a1a61b0dcbc4cd66c960dc > Author: Li Zefan <lizf(a)cn.fujitsu.com> > Date: Mon Aug 17 16:56:28 2009 +0800 > > tracing/events: Add module tracepoints > > (Adding Li Zefan in CC) > > Two things: > > 1) In this commit, most of the tracepoints contain argument with side-effects. > These do not belong there; they should be moved into TRACE_EVENT macros. > > 2) There seem to be a null-pointer bug with > local_read(__module_ref_addr(module, cpu)) in try_module_get(). This should > be investigated even if we move the argument to TRACE_EVENT. Hi Li, Fix this, please? > Thanks, > > Mathieu > > > > > > Arguments with side-effects are not skipped by the jump over disabled > > > instrumentation. This is why we should do that part within the probe declaration > > > in the TRACE_EVENT macros. > > > > > > But if we find out that the problem really is this argument, then it should be > > > fixed, because something would be wrong with it (just moving it to TRACE_EVENT > > > is not a proper solution). > > > > > > Thanks, > > > > > > Mathieu --- ~Randy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Mathieu Desnoyers on 23 Mar 2010 21:30 * Randy Dunlap (randy.dunlap(a)oracle.com) wrote: > On Fri, 19 Mar 2010 14:46:10 -0400 Mathieu Desnoyers wrote: > > > * Randy Dunlap (randy.dunlap(a)oracle.com) wrote: > > > On 03/18/10 17:59, Mathieu Desnoyers wrote: > > > > * Steven Rostedt (rostedt(a)goodmis.org) wrote: > > > >> On Thu, 2010-03-18 at 16:26 -0700, Randy Dunlap wrote: > > > >>> I can build/boot 2.6.33 with CONFIG_TRACE/TRACING disabled successfully, > > > >>> but when I enable lots of tracing config options and then boot with > > > >>> ftrace=nop on the kernel command line, I see a GP fault when the parport & > > > >>> parport_pc modules are loading/initializing. > > > >> > > > >> Do you see it without adding the "ftrace=nop"? The only thing that > > > >> should do is expand the ring buffer on boot up. > > > >> > > > >>> > > > >>> It happens in drivers/parport/share.c::parport_register_device(), when that > > > >>> function calls try_module_get(). > > > >>> > > > >>> If I comment out the trace_module_get() calls in include/linux/module.h, > > > >>> the kernel boots with no problems. > > > >> > > > >> > > > >> Interesting. Well, trace_module_get() is a TRACE_EVENT tracepoint. But > > > >> should be disabled here. It may be something to do with DEFINE_TRACE. > > > >> > > > >> (added Mathieu to Cc since he wrote that code) > > > > > > > > can you try replacing the "local_read(__module_ref_addr(module, cpu))" argument > > > > with "0" ? > > > > > > Yes, that boots with no problems. > > > > clickety-clicketa... git blame include/linux/module.h : > > > > commit 7ead8b8313d92b3a69a1a61b0dcbc4cd66c960dc > > Author: Li Zefan <lizf(a)cn.fujitsu.com> > > Date: Mon Aug 17 16:56:28 2009 +0800 > > > > tracing/events: Add module tracepoints > > > > (Adding Li Zefan in CC) > > > > Two things: > > > > 1) In this commit, most of the tracepoints contain argument with side-effects. > > These do not belong there; they should be moved into TRACE_EVENT macros. > > > > 2) There seem to be a null-pointer bug with > > local_read(__module_ref_addr(module, cpu)) in try_module_get(). This should > > be investigated even if we move the argument to TRACE_EVENT. > > Hi Li, > > Fix this, please? > While we wait for the sun to move to other time zones, can you check if the following patch fixes your problem ? module: fix __module_ref_addr() __module_ref_addr() should use per_cpu_ptr() to obfuscate the pointer (RELOC_HIDE is needed for per cpu pointers). This non-standard per-cpu pointer use has been introduced by commit 720eba31f47aeade8ec130ca7f4353223c49170f Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com> CC: Eric Dumazet <dada1(a)cosmosbay.com> CC: Rusty Russell <rusty(a)rustcorp.com.au> --- include/linux/module.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: linux-2.6-lttng/include/linux/module.h =================================================================== --- linux-2.6-lttng.orig/include/linux/module.h 2010-03-23 18:11:14.000000000 -0400 +++ linux-2.6-lttng/include/linux/module.h 2010-03-23 18:14:07.000000000 -0400 @@ -467,7 +467,7 @@ void symbol_put_addr(void *addr); static inline local_t *__module_ref_addr(struct module *mod, int cpu) { #ifdef CONFIG_SMP - return (local_t *) (mod->refptr + per_cpu_offset(cpu)); + return (local_t *) per_cpu_ptr(mod->refptr, cpu); #else return &mod->ref; #endif > > > Thanks, > > > > Mathieu > > > > > > > > > Arguments with side-effects are not skipped by the jump over disabled > > > > instrumentation. This is why we should do that part within the probe declaration > > > > in the TRACE_EVENT macros. > > > > > > > > But if we find out that the problem really is this argument, then it should be > > > > fixed, because something would be wrong with it (just moving it to TRACE_EVENT > > > > is not a proper solution). > > > > > > > > Thanks, > > > > > > > > Mathieu > > > --- > ~Randy -- Mathieu Desnoyers Operating System Efficiency R&D Consultant EfficiOS Inc. http://www.efficios.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Li Zefan on 23 Mar 2010 21:50 Mathieu Desnoyers wrote: > * Randy Dunlap (randy.dunlap(a)oracle.com) wrote: >> On Fri, 19 Mar 2010 14:46:10 -0400 Mathieu Desnoyers wrote: >> >>> * Randy Dunlap (randy.dunlap(a)oracle.com) wrote: >>>> On 03/18/10 17:59, Mathieu Desnoyers wrote: >>>>> * Steven Rostedt (rostedt(a)goodmis.org) wrote: >>>>>> On Thu, 2010-03-18 at 16:26 -0700, Randy Dunlap wrote: >>>>>>> I can build/boot 2.6.33 with CONFIG_TRACE/TRACING disabled successfully, >>>>>>> but when I enable lots of tracing config options and then boot with >>>>>>> ftrace=nop on the kernel command line, I see a GP fault when the parport & >>>>>>> parport_pc modules are loading/initializing. >>>>>> Do you see it without adding the "ftrace=nop"? The only thing that >>>>>> should do is expand the ring buffer on boot up. >>>>>> >>>>>>> It happens in drivers/parport/share.c::parport_register_device(), when that >>>>>>> function calls try_module_get(). >>>>>>> >>>>>>> If I comment out the trace_module_get() calls in include/linux/module.h, >>>>>>> the kernel boots with no problems. >>>>>> >>>>>> Interesting. Well, trace_module_get() is a TRACE_EVENT tracepoint. But >>>>>> should be disabled here. It may be something to do with DEFINE_TRACE. >>>>>> >>>>>> (added Mathieu to Cc since he wrote that code) >>>>> can you try replacing the "local_read(__module_ref_addr(module, cpu))" argument >>>>> with "0" ? >>>> Yes, that boots with no problems. >>> clickety-clicketa... git blame include/linux/module.h : >>> >>> commit 7ead8b8313d92b3a69a1a61b0dcbc4cd66c960dc >>> Author: Li Zefan <lizf(a)cn.fujitsu.com> >>> Date: Mon Aug 17 16:56:28 2009 +0800 >>> >>> tracing/events: Add module tracepoints >>> >>> (Adding Li Zefan in CC) >>> >>> Two things: >>> >>> 1) In this commit, most of the tracepoints contain argument with side-effects. >>> These do not belong there; they should be moved into TRACE_EVENT macros. >>> >>> 2) There seem to be a null-pointer bug with >>> local_read(__module_ref_addr(module, cpu)) in try_module_get(). This should >>> be investigated even if we move the argument to TRACE_EVENT. >> Hi Li, >> >> Fix this, please? >> > > While we wait for the sun to move to other time zones, can you check if the > following patch fixes your problem ? > Sorry, I overlooked this mail thread.. I'll make a patch to move side-effects arguments from trace_module_xxx() to the definition of TRACE_EVENT(). But it's for reducing overhead when tracepoints are disabled, this should not be the real cultprit of the bug here. > > module: fix __module_ref_addr() > > __module_ref_addr() should use per_cpu_ptr() to obfuscate the pointer > (RELOC_HIDE is needed for per cpu pointers). > > This non-standard per-cpu pointer use has been introduced by commit > 720eba31f47aeade8ec130ca7f4353223c49170f > So the uptream kernel is free from this bug, because __module_ref_addr() has gone. > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com> > CC: Eric Dumazet <dada1(a)cosmosbay.com> > CC: Rusty Russell <rusty(a)rustcorp.com.au> > --- > include/linux/module.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > Index: linux-2.6-lttng/include/linux/module.h > =================================================================== > --- linux-2.6-lttng.orig/include/linux/module.h 2010-03-23 18:11:14.000000000 -0400 > +++ linux-2.6-lttng/include/linux/module.h 2010-03-23 18:14:07.000000000 -0400 > @@ -467,7 +467,7 @@ void symbol_put_addr(void *addr); > static inline local_t *__module_ref_addr(struct module *mod, int cpu) > { > #ifdef CONFIG_SMP > - return (local_t *) (mod->refptr + per_cpu_offset(cpu)); > + return (local_t *) per_cpu_ptr(mod->refptr, cpu); > #else > return &mod->ref; > #endif > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Randy Dunlap on 24 Mar 2010 16:30 On 03/23/10 18:20, Mathieu Desnoyers wrote: > > While we wait for the sun to move to other time zones, can you check if the > following patch fixes your problem ? > Hi Mathieu, Yes, this works. Thanks. I'll test Li's patch now... > > module: fix __module_ref_addr() > > __module_ref_addr() should use per_cpu_ptr() to obfuscate the pointer > (RELOC_HIDE is needed for per cpu pointers). > > This non-standard per-cpu pointer use has been introduced by commit > 720eba31f47aeade8ec130ca7f4353223c49170f > > Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com> > CC: Eric Dumazet <dada1(a)cosmosbay.com> > CC: Rusty Russell <rusty(a)rustcorp.com.au> > --- > include/linux/module.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > Index: linux-2.6-lttng/include/linux/module.h > =================================================================== > --- linux-2.6-lttng.orig/include/linux/module.h 2010-03-23 18:11:14.000000000 -0400 > +++ linux-2.6-lttng/include/linux/module.h 2010-03-23 18:14:07.000000000 -0400 > @@ -467,7 +467,7 @@ void symbol_put_addr(void *addr); > static inline local_t *__module_ref_addr(struct module *mod, int cpu) > { > #ifdef CONFIG_SMP > - return (local_t *) (mod->refptr + per_cpu_offset(cpu)); > + return (local_t *) per_cpu_ptr(mod->refptr, cpu); > #else > return &mod->ref; > #endif -- ~Randy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Steven Rostedt on 24 Mar 2010 16:40
On Wed, 2010-03-24 at 13:21 -0700, Randy Dunlap wrote: > On 03/23/10 18:20, Mathieu Desnoyers wrote: > > > > While we wait for the sun to move to other time zones, can you check if the > > following patch fixes your problem ? > > > > Hi Mathieu, > Yes, this works. Thanks. > > I'll test Li's patch now... > If it works, can you send me your "Tested-by". I have it queued to go. Thanks, -- Steve -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |