Prev: linux-next: build warning after merge of the mfd tree
Next: linux-next: build failure after merge of the driver-core tree
From: Américo Wang on 3 Mar 2010 01:10 Hi, Steven, Frederic, Ingo, I am not sure if this is ftrace's fault, but it is ftrace who triggers the soft lockup. On my machine, it is pretty easy, just run: echo function_graph > current_tracer I found this when 2.6.33-rc7, but forgot to report here. I just tried today's Linus tree, it still has this problem. Attached are the full soft lockup warnings. Below is my related configs: CONFIG_STACKTRACE_SUPPORT=y # CONFIG_RCU_TRACE is not set # CONFIG_TREE_RCU_TRACE is not set CONFIG_TRACEPOINTS=y CONFIG_HAVE_ARCH_TRACEHOOK=y # CONFIG_NF_CONNTRACK is not set # CONFIG_NETFILTER_XT_TARGET_TRACE is not set CONFIG_MOUSE_PS2_TRACKPOINT=y CONFIG_TRACE_IRQFLAGS_SUPPORT=y CONFIG_TRACE_IRQFLAGS=y CONFIG_STACKTRACE=y # CONFIG_BACKTRACE_SELF_TEST is not set CONFIG_USER_STACKTRACE_SUPPORT=y CONFIG_NOP_TRACER=y CONFIG_HAVE_FTRACE_NMI_ENTER=y CONFIG_HAVE_FUNCTION_TRACER=y CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST=y CONFIG_HAVE_DYNAMIC_FTRACE=y CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y CONFIG_HAVE_SYSCALL_TRACEPOINTS=y CONFIG_FTRACE_NMI_ENTER=y CONFIG_EVENT_TRACING=y CONFIG_CONTEXT_SWITCH_TRACER=y CONFIG_TRACING=y CONFIG_GENERIC_TRACER=y CONFIG_TRACING_SUPPORT=y CONFIG_FTRACE=y CONFIG_FUNCTION_TRACER=y CONFIG_FUNCTION_GRAPH_TRACER=y # CONFIG_IRQSOFF_TRACER is not set # CONFIG_SYSPROF_TRACER is not set # CONFIG_SCHED_TRACER is not set # CONFIG_FTRACE_SYSCALLS is not set # CONFIG_BOOT_TRACER is not set # CONFIG_KSYM_TRACER is not set # CONFIG_STACK_TRACER is not set # CONFIG_KMEMTRACE is not set # CONFIG_WORKQUEUE_TRACER is not set CONFIG_BLK_DEV_IO_TRACE=y CONFIG_DYNAMIC_FTRACE=y CONFIG_FTRACE_MCOUNT_RECORD=y # CONFIG_FTRACE_STARTUP_TEST is not set # CONFIG_MMIOTRACE is not set CONFIG_SAMPLE_TRACEPOINTS=m # CONFIG_SAMPLE_TRACE_EVENTS is not set CONFIG_HAVE_MMIOTRACE_SUPPORT=y Sorry that I don't have time to look into this. If you need other info, just say. Thanks!
From: Steven Rostedt on 3 Mar 2010 08:10 On Wed, 2010-03-03 at 14:04 +0800, Am�rico Wang wrote: > Hi, Steven, Frederic, Ingo, > > I am not sure if this is ftrace's fault, but it is ftrace who triggers > the soft lockup. On my machine, it is pretty easy, just run: > > echo function_graph > current_tracer > > I found this when 2.6.33-rc7, but forgot to report here. I just tried today's > Linus tree, it still has this problem. > > Attached are the full soft lockup warnings. > > Below is my related configs: Actually this is not enough for me. Could you send me the full config, as well as the full dmesg, not just the dump. Thanks! -- Steve > > CONFIG_STACKTRACE_SUPPORT=y > # CONFIG_RCU_TRACE is not set > # CONFIG_TREE_RCU_TRACE is not set > CONFIG_TRACEPOINTS=y > CONFIG_HAVE_ARCH_TRACEHOOK=y > # CONFIG_NF_CONNTRACK is not set > # CONFIG_NETFILTER_XT_TARGET_TRACE is not set > CONFIG_MOUSE_PS2_TRACKPOINT=y > CONFIG_TRACE_IRQFLAGS_SUPPORT=y > CONFIG_TRACE_IRQFLAGS=y > CONFIG_STACKTRACE=y > # CONFIG_BACKTRACE_SELF_TEST is not set > CONFIG_USER_STACKTRACE_SUPPORT=y > CONFIG_NOP_TRACER=y > CONFIG_HAVE_FTRACE_NMI_ENTER=y > CONFIG_HAVE_FUNCTION_TRACER=y > CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y > CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST=y > CONFIG_HAVE_DYNAMIC_FTRACE=y > CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y > CONFIG_HAVE_SYSCALL_TRACEPOINTS=y > CONFIG_FTRACE_NMI_ENTER=y > CONFIG_EVENT_TRACING=y > CONFIG_CONTEXT_SWITCH_TRACER=y > CONFIG_TRACING=y > CONFIG_GENERIC_TRACER=y > CONFIG_TRACING_SUPPORT=y > CONFIG_FTRACE=y > CONFIG_FUNCTION_TRACER=y > CONFIG_FUNCTION_GRAPH_TRACER=y > # CONFIG_IRQSOFF_TRACER is not set > # CONFIG_SYSPROF_TRACER is not set > # CONFIG_SCHED_TRACER is not set > # CONFIG_FTRACE_SYSCALLS is not set > # CONFIG_BOOT_TRACER is not set > # CONFIG_KSYM_TRACER is not set > # CONFIG_STACK_TRACER is not set > # CONFIG_KMEMTRACE is not set > # CONFIG_WORKQUEUE_TRACER is not set > CONFIG_BLK_DEV_IO_TRACE=y > CONFIG_DYNAMIC_FTRACE=y > CONFIG_FTRACE_MCOUNT_RECORD=y > # CONFIG_FTRACE_STARTUP_TEST is not set > # CONFIG_MMIOTRACE is not set > CONFIG_SAMPLE_TRACEPOINTS=m > # CONFIG_SAMPLE_TRACE_EVENTS is not set > CONFIG_HAVE_MMIOTRACE_SUPPORT=y > > Sorry that I don't have time to look into this. If you need other info, > just say. > > Thanks! -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Li Zefan on 3 Mar 2010 20:40 Américo Wang wrote: > Hi, Steven, Frederic, Ingo, > > I am not sure if this is ftrace's fault, but it is ftrace who triggers > the soft lockup. On my machine, it is pretty easy, just run: > > echo function_graph > current_tracer > > I found this when 2.6.33-rc7, but forgot to report here. I just tried today's > Linus tree, it still has this problem. > I can also trigger soft lockup while running a stress test for ftrace. I'll post detailed information after more testing. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Frederic Weisbecker on 3 Mar 2010 20:50 On Wed, Mar 03, 2010 at 02:04:23PM +0800, Am�rico Wang wrote: > Mar 3 13:34:37 dhcp-66-70-5 kernel: BUG: soft lockup - CPU#3 stuck for 61s! [kstop/3:4876] > Mar 3 13:34:37 dhcp-66-70-5 kernel: Modules linked in: ipt_REJECT xt_tcpudp x_tables bridge stp deflate zlib_deflate ctr cast5 crypto_null ccm serpent blowfish twofish twofish_common ecb xcbc cbc md5 sha256_generic sha512_generic des_generic aes_x86_64 aes_generic ah6 ah4 esp6 esp4 xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm_ipcomp xfrm6_tunnel tunnel6 af_key autofs4 hidp rfcomm l2cap crc16 bluetooth lockd sunrpc ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i iw_cxgb3 ib_core cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi cpufreq_ondemand powernow_k8 freq_table dm_multipath scsi_dh video output sbs sbshc power_meter hwmon battery acpi_memhotplug ac parport_pc lp parport sg dcdbas sr_mod cdrom serio_raw rtc_cmos rtc_core rtc_lib tpm_tis tpm tpm_bios button tg3 libphy shpchp i2c_nforce2 i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod sata_nv libata sd_mod s > Mar 3 13:34:37 dhcp-66-70-5 kernel: si_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: ip_tables] > Mar 3 13:34:37 dhcp-66-70-5 kernel: irq event stamp: 51397 > Mar 3 13:34:37 dhcp-66-70-5 kernel: hardirqs last enabled at (51396): [<ffffffff8155287c>] restore_args+0x0/0x30 > Mar 3 13:34:37 dhcp-66-70-5 kernel: hardirqs last disabled at (51397): [<ffffffff81003917>] save_args+0x67/0x70 > Mar 3 13:34:37 dhcp-66-70-5 kernel: softirqs last enabled at (6536): [<ffffffff81003881>] return_to_handler+0x0/0x2f > Mar 3 13:34:37 dhcp-66-70-5 kernel: softirqs last disabled at (6567): [<ffffffff81003881>] return_to_handler+0x0/0x2f > Mar 3 13:34:37 dhcp-66-70-5 kernel: CPU 3 > Mar 3 13:34:37 dhcp-66-70-5 kernel: Modules linked in: ipt_REJECT xt_tcpudp x_tables bridge stp deflate zlib_deflate ctr cast5 crypto_null ccm serpent blowfish twofish twofish_common ecb xcbc cbc md5 sha256_generic sha512_generic des_generic aes_x86_64 aes_generic ah6 ah4 esp6 esp4 xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm_ipcomp xfrm6_tunnel tunnel6 af_key autofs4 hidp rfcomm l2cap crc16 bluetooth lockd sunrpc ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i iw_cxgb3 ib_core cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi cpufreq_ondemand powernow_k8 freq_table dm_multipath scsi_dh video output sbs sbshc power_meter hwmon battery acpi_memhotplug ac parport_pc lp parport sg dcdbas sr_mod cdrom serio_raw rtc_cmos rtc_core rtc_lib tpm_tis tpm tpm_bios button tg3 libphy shpchp i2c_nforce2 i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod sata_nv libata sd_mod s > Mar 3 13:34:37 dhcp-66-70-5 kernel: si_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: ip_tables] > Mar 3 13:34:37 dhcp-66-70-5 kernel: > Mar 3 13:34:38 dhcp-66-70-5 kernel: Pid: 4876, comm: kstop/3 Not tainted 2.6.33 #31 0J001K/PowerEdge T105 > Mar 3 13:34:38 dhcp-66-70-5 kernel: RIP: 0010:[<ffffffff8100d3d7>] [<ffffffff8100d3d7>] native_sched_clock+0x2a/0x69 > Mar 3 13:34:38 dhcp-66-70-5 kernel: RSP: 0018:ffff88002fe03da0 EFLAGS: 00000246 > Mar 3 13:34:38 dhcp-66-70-5 kernel: RAX: 00000000b24e3e78 RBX: ffff88002fe03da0 RCX: ffff88002fe03ee0 > Mar 3 13:34:39 dhcp-66-70-5 kernel: RDX: 0000000000000053 RSI: ffffffff810dfde6 RDI: ffffffff810e138d > Mar 3 13:34:41 dhcp-66-70-5 kernel: RBP: ffffffff81003881 R08: 0000000000000110 R09: ffffffff81003896 > Mar 3 13:34:41 dhcp-66-70-5 kernel: R10: ffff88021697dfd8 R11: 0000000000000000 R12: ffff88002fe03d20 > Mar 3 13:34:41 dhcp-66-70-5 kernel: R13: ffffffff810e138d R14: ffff8802287a6680 R15: ffffffff81003881 > Mar 3 13:34:41 dhcp-66-70-5 kernel: FS: 00007fd9db23a720(0000) GS:ffff88002fe00000(0000) knlGS:0000000000000000 > Mar 3 13:34:41 dhcp-66-70-5 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > Mar 3 13:34:41 dhcp-66-70-5 kernel: CR2: 00000000008e42c0 CR3: 0000000219d90000 CR4: 00000000000006e0 > Mar 3 13:34:41 dhcp-66-70-5 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > Mar 3 13:34:41 dhcp-66-70-5 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Mar 3 13:34:41 dhcp-66-70-5 kernel: Process kstop/3 (pid: 4876, threadinfo ffff88021697c000, task ffff8802287a6680) > Mar 3 13:34:41 dhcp-66-70-5 kernel: Stack: > Mar 3 13:34:41 dhcp-66-70-5 kernel: ffff88002fe03db0 ffffffff810e7407 ffff88002fe03df0 ffffffff810ff331 > Mar 3 13:34:41 dhcp-66-70-5 kernel: <0> ffff88002ffceec0 ffff88002fe03e98 ffffffff810e138d ffff8802287a6680 > Mar 3 13:34:41 dhcp-66-70-5 kernel: <0> ffff88002fe03e00 ffffffff810dfde6 ffff88002fe03e40 ffffffff8102cd97 > Mar 3 13:34:41 dhcp-66-70-5 kernel: Call Trace: > Mar 3 13:34:41 dhcp-66-70-5 kernel: <IRQ> > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff810e7407>] ? trace_clock_local+0x2c/0x43 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff810ff331>] ? ftrace_push_return_trace+0xac/0x17a > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff810e138d>] ? __rcu_process_callbacks+0x1d4/0x4a9 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff810dfde6>] ? cpu_needs_another_gp+0x4/0x46 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff8102cd97>] ? prepare_ftrace_return+0xda/0x138 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff8100385a>] ? ftrace_graph_caller+0x4a/0x71 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81003896>] ? return_to_handler+0x15/0x2f > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff810dfdeb>] ? cpu_needs_another_gp+0x9/0x46 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81003881>] ? return_to_handler+0x0/0x2f > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81008d3c>] print_context_stack+0xb6/0x144 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81003881>] ? return_to_handler+0x0/0x2f > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81007375>] dump_trace+0x320/0x3bf > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81003881>] ? return_to_handler+0x0/0x2f > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81008979>] show_trace_log_lvl+0x5f/0x72 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff8108568d>] ? worker_thread+0x0/0x46c > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff8107de2a>] ? exit_signals+0x7/0x253 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81003881>] ? return_to_handler+0x0/0x2f > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff8100703f>] show_stack_log_lvl+0x185/0x19b > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81006d50>] ? do_softirq+0x8c/0x181 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81003881>] ? return_to_handler+0x0/0x2f > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81007516>] show_registers+0x102/0x3ac > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81003881>] ? return_to_handler+0x0/0x2f > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff8100e739>] show_regs+0x1d/0x53 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81003881>] ? return_to_handler+0x0/0x2f > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff810da649>] softlockup_tick+0x268/0x2d2 > Mar 3 13:34:41 dhcp-66-70-5 kernel: <EOI> > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff8107de2a>] ? exit_signals+0x7/0x253 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff8102cce7>] ? prepare_ftrace_return+0x2a/0x138 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81066035>] ? do_exit+0x160/0xad0 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff8108568d>] ? worker_thread+0x0/0x46c > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff8108568d>] ? worker_thread+0x0/0x46c > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff8100385a>] ? ftrace_graph_caller+0x4a/0x71 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff8107de2f>] ? exit_signals+0xc/0x253 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81066044>] ? do_exit+0x16f/0xad0 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff8108ab21>] ? autoremove_wake_function+0x0/0x52 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff8108568d>] ? worker_thread+0x0/0x46c > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81003881>] ? return_to_handler+0x0/0x2f > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81076327>] run_local_timers+0x32/0x3b > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81003881>] ? return_to_handler+0x0/0x2f > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff81076547>] update_process_times+0x51/0xa5 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff810048d4>] ? kernel_thread_helper+0x4/0x10 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff8155287c>] ? restore_args+0x0/0x30 > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff8108a4c9>] ? kthread+0x0/0xec > Mar 3 13:34:41 dhcp-66-70-5 kernel: [<ffffffff810048d0>] ? kernel_thread_helper+0x0/0x10 So it is stuck in stop machine. I wonder where exactly. I see some do_exit at the top but I wonder how much they are reliable. Anyway, as Steve said, we really need a full config to reproduce it. Thanks. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Américo Wang on 3 Mar 2010 22:00
On Wed, Mar 3, 2010 at 9:05 PM, Steven Rostedt <rostedt(a)goodmis.org> wrote: > On Wed, 2010-03-03 at 14:04 +0800, Américo Wang wrote: >> Hi, Steven, Frederic, Ingo, >> >> I am not sure if this is ftrace's fault, but it is ftrace who triggers >> the soft lockup. On my machine, it is pretty easy, just run: >> >> echo function_graph > current_tracer >> >> I found this when 2.6.33-rc7, but forgot to report here. I just tried today's >> Linus tree, it still has this problem. >> >> Attached are the full soft lockup warnings. >> >> Below is my related configs: > > Actually this is not enough for me. Could you send me the full config, > as well as the full dmesg, not just the dump. > Full dmesg: http://pastebin.ca/1822197 Full config: http://pastebin.ca/1822201 Note, I changed some tracing unrelated config after getting that soft lockup, so it would not matter. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |