From: Prarit Bhargava on 27 Apr 2010 14:50

On 04/27/2010 02:34 PM, Konrad Rzeszutek Wilk wrote:
>>> Can you provide a short example of a test scenario? As in, what should I do
>>> to reproduce this problem?
>>
>> Take the latest upstream (well ... to be honest, a bit older than that
>> because of some other bugs) -- take 2.6.33 and try to boot it as a PV
>
> 2.6.34-rc5 PV boots under Xen for me (and pretty much since 2.6.33 +
> Suresh's fix for CONFIG_RODATA_MARK).
>
> Perhaps I am missing some of the .config options you have set that make
> it not work?
>
> The irqbalance daemon looks to be running - but I think you are hitting
> this during bootup? How long do you have to wait for this to trigger?

It happens during bootup. I don't have a 2.6.33 vanilla panic handy but I
do have one from an earlier 2.6.32:

rip: ffffffff81256f45 delay_tsc+0x45
rsp: ffff8800fac95a98
rax: fffffffff6ef46d0   rbx: 00000002   rcx: f6ef46d0   rdx: 0010850c
rsi: 002b3bb6           rdi: 002b3bcc   rbp: ffff8800fac95ab8
 r8: ffffffff            r9: 00000002   r10: 00000002   r11: 00000000
r12: fffffffff6dec1c4   r13: 00000002   r14: 002b3bcc   r15: 00000001
 cs: 0000e033            ds: 00000000    fs: 00000000    gs: 00000000

Stack:
 000000000002ef45 ffff8800fac95c88 0000000000000009 ffff8800fac93540
 ffff8800fac95ac8 ffffffff81256ef6 ffff8800fac95b48 ffffffff814c6341
 0000000000000010 ffff8800fac95b38 ffff880000000008 ffff8800fac95b58
 ffff8800fac95b08 a22d306b065d4a66 0000000000000000 0000000000000000

Code: f3 90 65 8b 1c 25 d8 e3 00 00 44 39 eb 75 23 66 66 90 0f ae e8 <e8> 46 3d dc ff 66 90 48 98 48 89

Call Trace:
 [<ffffffff81256f45>] delay_tsc+0x45  <--
 [<ffffffff81256ef6>] __const_udelay+0x46
 [<ffffffff814c6341>] panic+0x135
 [<ffffffff814ca23c>] oops_end+0xdc
 [<ffffffff81042272>] no_context+0xf2
 [<ffffffff8125946c>] __bitmap_weight+0x8c
 [<ffffffff81042505>] __bad_area_nosemaphore+0x125
 [<ffffffff8105fad4>] find_busiest_group+0x254
 [<ffffffff810425d3>] bad_area_nosemaphore+0x13
 [<ffffffff814cbccf>] do_page_fault+0x2ef
 [<ffffffff814c9595>] page_fault+0x25
 [<ffffffff810302f2>] irq_force_complete_move+0x12
 [<ffffffff81015214>] fixup_irqs+0xa4
 [<ffffffff8102ce59>] cpu_disable_common+0x1a9
 [<ffffffff8100f9c2>] check_events+0x12
 [<ffffffff810c2550>] __stop_machine+0x120
 [<ffffffff8100ff75>] xen_cpu_disable+0x25
 [<ffffffff814b0427>] take_cpu_down+0x17
 [<ffffffff810c25f9>] stop_cpu+0xa9
 [<ffffffff8108869d>] worker_thread+0x16d
 [<ffffffff8100f19d>] xen_force_evtchn_callback+0xd
 [<ffffffff8108dd00>] wake_up_bit+0x40
 [<ffffffff814c90f6>] _spin_unlock_irqrestore+0x16
 [<ffffffff81088530>] create_workqueue_thread+0xd0
 [<ffffffff8108d9a6>] kthread+0x96
 [<ffffffff8101418a>] child_rip+0xa
 [<ffffffff81013351>] int_ret_from_sys_call+0x7
 [<ffffffff81013add>] retint_restore_args+0x5
 [<ffffffff81014180>] kernel_thread+0xe0

> How many CPUs did you assign to your guest?

It didn't matter as long as vcpus > 1 and maxcpus > vcpus.

> What are the "other bugs" you speak of?

I got a different panic (which I've yet to resolve).

>> guest. I'm using a RHEL5 Xen HV fwiw ...
>
> OK, so your control domain is RHEL5. Mine is the Jeremy's xen/next one
> (2.6.32). Let me try to compile RHEL5 under FC11 - any tricks necessary
> to do that?

I haven't tried it -- it might work :)

Also, did you try booting with maxvcpus > vcpus as drjones suggested?

P.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
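The reproduction condition above (vcpus > 1, with more potential than online VCPUs) corresponds to a Xen guest config fragment along these lines; all values here are illustrative, not taken from the reporter's setup:

```
# Xen PV guest config fragment (illustrative values).
# With maxvcpus > vcpus, offline VCPUs exist from boot, so
# onlining/offlining them exercises the fixup_irqs() path in the trace.
maxvcpus = 4   # potential VCPUs the guest may online
vcpus    = 2   # VCPUs online at boot
```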
From: Greg KH on 28 Apr 2010 15:20

On Wed, Apr 28, 2010 at 11:50:39AM -0700, Andrew Morton wrote:
> I worry that if the -stable maintainers see me drop a patch, but the
> patch in Linus's tree doesn't have the stable tag, they might not merge
> the fix into -stable. I bugged them about this scenario recently and
> the reply was a bit waffly ;)

It was? If I see you drop a patch, I try my best to go dig through
Linus's tree to find out whether it landed there. If not, I leave it in
my queue, and do that for a few releases. If, after a long time (like 6
months), it still hasn't shown up, I either ping someone or drop it from
my queue on the guess that someone dropped it for some reason.

If I miss one of these, please let me know.

> By far the safest thing to do is to include the stable tag in your
> changelog right at the outset.

Yes, that's the _easiest_ and will not get lost.

thanks,

greg k-h
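For context, the "stable tag" being discussed is a marker line in the patch changelog that flags the commit for the -stable trees (per the stable kernel rules of that era). A minimal sketch of a commit message carrying it, with the subject and names purely illustrative:

```
subsystem: fix foo oops on CPU offline

Description of the bug and the fix.

Cc: stable@kernel.org
Signed-off-by: A Developer <a.developer@example.com>
```

With the tag in place from the outset, the fix is picked up for -stable automatically once it lands in Linus's tree, instead of depending on someone noticing the drop later.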
From: Konrad Rzeszutek Wilk on 3 May 2010 15:20

>> OK, so your control domain is RHEL5. Mine is the Jeremy's xen/next one
>> (2.6.32). Let me try to compile RHEL5 under FC11 - any tricks necessary
>> to do that?
>
> I haven't tried it -- it might work :)
>
> Also, did you try booting with maxvcpus > vcpus as drjones suggested?

Yes. No luck reproducing the crash/panic. I am just not seeing the
failure you guys are seeing.

Let me build 2.6.33 vanilla once more (with CONFIG_DEBUG_MARK_RODATA=n)
and check this. And also install a vanilla RHEL5 dom0, as it looks
impossible to compile a 2.6.18-era kernel under FC11.

The Xen I am using is xen-unstable - so 4.0.1. I know that the IRQ
balance code in the Xen hypervisor was fixed in 4.0 (it used to run out
of context - now it runs in IRQ context). Maybe this bug you are seeing
(and have the fix for) is just a red herring?
From: Prarit Bhargava on 3 May 2010 16:00

On 05/03/2010 03:16 PM, Konrad Rzeszutek Wilk wrote:
>>> OK, so your control domain is RHEL5. Mine is the Jeremy's xen/next one
>>> (2.6.32). Let me try to compile RHEL5 under FC11 - any tricks necessary
>>> to do that?
>>
>> I haven't tried it -- it might work :)
>>
>> Also, did you try booting with maxvcpus > vcpus as drjones suggested?
>
> Yes. No luck reproducing the crash/panic. I am just not seeing the
> failure you guys are seeing.
>
> Let me build 2.6.33 vanilla once more (with CONFIG_DEBUG_MARK_RODATA=n)
> and check this. And also install a vanilla RHEL5 dom0, as it looks
> impossible to compile a 2.6.18-era kernel under FC11.

Let me try reproducing this on FC11 + 2.6.33.

P.
From: Konrad Rzeszutek Wilk on 4 May 2010 11:10
On Mon, May 03, 2010 at 03:16:34PM -0400, Konrad Rzeszutek Wilk wrote:
> >> OK, so your control domain is RHEL5. Mine is the Jeremy's xen/next one
> >> (2.6.32). Let me try to compile RHEL5 under FC11 - any tricks necessary
> >> to do that?
> >
> > I haven't tried it -- it might work :)
> >
> > Also, did you try booting with maxvcpus > vcpus as drjones suggested?
>
> Yes. No luck reproducing the crash/panic. I am just not seeing the
> failure you guys are seeing.
>
> Let me build 2.6.33 vanilla once more (with CONFIG_DEBUG_MARK_RODATA=n)
> and check this. And also install a vanilla RHEL5 dom0, as it looks
> impossible to compile a 2.6.18-era kernel under FC11.

Rebuilding everything from scratch did it. I am seeing a similar failure,
where xenctx reports:

Call Trace:
 [<ffffffff8107f780>] stop_cpu+0xc6  <--
 [<ffffffff8105520e>] worker_thread+0x15d
 [<ffffffff8107f6ba>] __stop_machine+0x106
 [<ffffffff81058afb>] wake_up_bit+0x25
 [<ffffffff81038720>] spin_unlock_irqrestore+0x9
 [<ffffffff810550b1>] spin_lock_irq+0xb
 [<ffffffff810586cb>] kthread+0x7a
 [<ffffffff8100a964>] kernel_thread_helper+0x4
 [<ffffffff81009d61>] int_ret_from_sys_call+0x7
 [<ffffffff814033dd>] retint_restore_args+0x5
 [<ffffffff8100a960>] gs_change+0x13

With this guest file:

kernel = "/mnt/lab/vs11/vmlinuz"
ramdisk = "/mnt/lab/vs11/initramfs.cpio.gz"
memory = 2048
maxvcpus = 4
vcpus = 2
vif = [ 'mac=00:0F:4B:00:00:71, bridge=switch' ]
vfb = [ 'vnc=1, vnclisten=0.0.0.0, vncunused=1' ]
root = "debug loglevel=10 plymouth:splash=solar plymouth:debug norm console=hvc0 initcall_debug"

This is with the latest Linux kernel:
d93ac51c7a129db7a1431d859a3ef45a0b1f3fc5 (Merge branch 'for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client).

With your patch, the PV guest keeps on going. So:

Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

> The Xen I am using is xen-unstable - so 4.0.1. I know that the IRQ
> balance code in the Xen hypervisor was fixed in 4.0 (it used to run out
> of context - now it runs in IRQ context). Maybe this bug you are seeing
> (and have the fix for) is just a red herring?

Interestingly enough, I couldn't reproduce this on my Intel box, but on
an AMD box with a very whacked TSC (cpu MHz : 2795681.405) I can
reproduce this.