Prev: drm fixes + agp + one fb patch (bisected)
Next: x86: enlightenment for ticket spin locks - Xen implementation
From: David Miller on 14 Jul 2010 15:10 From: Thomas Gleixner <tglx(a)linutronix.de> Date: Wed, 14 Jul 2010 17:27:01 +0200 (CEST) > On Wed, 14 Jul 2010, Christoph Hellwig wrote: > >> Turns out this wasn't a regression introduced by a commit, but it >> happens when CONFIG_FUNCTION_GRAPH_TRACER is enabled. From a quick >> look I have no idea why these would interact badly, especially as >> CONFIG_FUNCTION_GRAPH_TRACER works fine with irq stacks if the >> CONFIG_4KSTACKS options is set. > > So you're saying, that the problem appears when > CONFIG_FUNCTION_GRAPH_TRACER is enabled w/o being used and that it > exists prior to your patches with irq stacks and 8k stack size, but > works with 4k stacks. That's definitely more than odd. Some hard-coded check somewhere assuming kernel stack pages won't straddle a page boundary? Just a guess... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Steven Rostedt on 23 Jul 2010 10:20 On Wed, 2010-07-14 at 17:47 +0200, Christoph Hellwig wrote: > > So you're saying, that the problem appears when > > CONFIG_FUNCTION_GRAPH_TRACER is enabled w/o being used and that it > > exists prior to your patches with irq stacks and 8k stack size, but > > works with 4k stacks. That's definitely more than odd. > > No, the problem does not show up with 8k stack size without irqstacks, > and does not show up with 4k stacks with irq stacks, but does show up > with 8k stacks with irqstacks as long as CONFIG_FUNCTION_GRAPH_TRACER is > enabled. Just disabling it in Ingo's example config makes it work, > and enabling it in my usual test configs makes the boot fail with > similar messages to the one Ingo sees. Examining the difference between 32bit and 64bit (where it only triggers for 32bit) I found this: In arch/x86/kernel/dumpstack_64.c: tinfo = task_thread_info(task); [...] bp = ops->walk_stack(tinfo, stack, bp, ops, data, estack_end, &graph); Note: tinfo here that is passed to walk_stack() is the actual thread info structure for the task. In arch/x86/kernel/dumpstack_32.c: context = (struct thread_info *) ((unsigned long)stack & (~(THREAD_SIZE - 1))); bp = ops->walk_stack(context, stack, bp, ops, data, NULL, &graph); Note: here, context (which ends up being tinfo) is just a bitmasking of the current stack. If the stack is the irqstack, then what is passed to walk_stack() is not the actual thread info structure. Note, if THREAD_SIZE is 8k and irqstacks are 4K then context is totally wrong here. Now for the reason that function graph noticed this: static const struct stacktrace_ops print_trace_ops = { .warning = print_trace_warning, .warning_symbol = print_trace_warning_symbol, .stack = print_trace_stack, .address = print_trace_address, .walk_stack = print_context_stack, }; Where walk_stack is print_context_stack: tinfo is pretty much ignored in print_context_stack, but when function graph is enabled we have: print_ftrace_graph_addr(addr, data, ops, tinfo, graph); Which actually plays with the tinfo structure. If tinfo is corrupted here, then we get the bug. -- Steve -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Steven Rostedt on 23 Jul 2010 10:30 On Fri, 2010-07-23 at 10:15 -0400, Steven Rostedt wrote: > On Wed, 2010-07-14 at 17:47 +0200, Christoph Hellwig wrote: > In arch/x86/kernel/dumpstack_32.c: > > context = (struct thread_info *) > ((unsigned long)stack & (~(THREAD_SIZE - 1))); > bp = ops->walk_stack(context, stack, bp, ops, data, NULL, &graph); > > Note: here, context (which ends up being tinfo) is just a bitmasking of > the current stack. If the stack is the irqstack, then what is passed to > walk_stack() is not the actual thread info structure. > > Note, if THREAD_SIZE is 8k and irqstacks are 4K then context is totally > wrong here. irqstacks are indeed 8k as well, but: union irq_ctx { struct thread_info tinfo; u32 stack[THREAD_SIZE/sizeof(u32)]; } __attribute__((aligned(PAGE_SIZE))); The stack is 8k, but it is aligned 4k. Which will have context = (struct thread_info *) ((unsigned long)stack & (~(THREAD_SIZE - 1))); not give the expected result. So we should do: } __attribute__((aligned(THREAD_SIZE))); -- Steve -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Christoph Hellwig on 23 Jul 2010 12:50 On Fri, Jul 23, 2010 at 10:24:02AM -0400, Steven Rostedt wrote: > So we should do: > > } __attribute__((aligned(THREAD_SIZE))); Yes, that looks like a likely culprit. I tried to verify it, but right now I can't reproduce the issue anymore. I've done another completely clean rebuild to see if I can reproduce it again. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Christoph Hellwig on 27 Jul 2010 08:20 Ingo, can you test this patch implementing Steve's suggestion? Unfortunately I'm not able to reproduce the bug on my setup anymore. --- From: Christoph Hellwig <hch(a)lst.de> Subject: [PATCH] x86-32: align irq stacks properly As suggested by Steven Rostedt we need to align the irq stacks on the stack size, not just the page size to make them work for stack traces with 8k stacks. Signed-off-by: Christoph Hellwig <hch(a)lst.de> Index: linux-2.6-tip/arch/x86/kernel/irq_32.c =================================================================== --- linux-2.6-tip.orig/arch/x86/kernel/irq_32.c 2010-07-27 14:06:50.634494682 +0200 +++ linux-2.6-tip/arch/x86/kernel/irq_32.c 2010-07-27 14:06:58.031494680 +0200 @@ -55,7 +55,7 @@ static inline void print_stack_overflow( union irq_ctx { struct thread_info tinfo; u32 stack[THREAD_SIZE/sizeof(u32)]; -} __attribute__((aligned(PAGE_SIZE))); +} __attribute__((aligned(THREAD_SIZE))); static DEFINE_PER_CPU(union irq_ctx *, hardirq_ctx); static DEFINE_PER_CPU(union irq_ctx *, softirq_ctx); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
First
|
Prev
|
Next
|
Last
Pages: 1 2 3 4 Prev: drm fixes + agp + one fb patch (bisected) Next: x86: enlightenment for ticket spin locks - Xen implementation |