From: Justin Mattock on 7 Aug 2010 01:50

hello,
I just built a fresh clfs system using the tutorial.. right now Im
able to boot and am able to login, the system seems to be running as
it should except for when I try to install gmp and/or do a /sbin/lilo
I see a message appear on screen(below) then if I do any kind of
command(dmesg > dmesg) I get a stuck screen. has there been anything
similar to the below message?

keep in mind the kernel I'm using is 2.6.35-rc6 which on other
machines(same type of system) run just fine without such message.
only real thing different that I did with this build was build the
latest gcc with gmp/mpfr/mpc inside gcc source directory instead of
installing them on the system then using the switches to there
location.

<0>[ 48.976957] ------------[ cut here ]------------
<2>[ 48.977187] kernel BUG at fs/ext4/mballoc.c:2993!
<0>[ 48.977415] invalid opcode: 0000 [#1] SMP
<0>[ 48.977694] last sysfs file: /sys/devices/virtual/vc/vcsa12/uevent
<4>[ 48.977873] CPU 0
<4>[ 48.977873] Modules linked in: uvcvideo videodev v4l1_compat firewire_ohci firewire_core ohci1394 i2c_nforce2 ohci_hcd forcedeth evdev thermal button aes_x86_64 lzo lzo_decompress lzo_compress tun kvm_intel ipcomp xfrm_ipcomp crypto_null sha256_generic cbc des_generic cast5 blowfish serpent camellia twofish twofish_common ctr ah4 esp4 authenc adm1021 raw1394 ieee1394 uhci_hcd ehci_hcd hci_uart rfcomm btusb hidp l2cap bluetooth coretemp acpi_cpufreq processor mperf appletouch applesmc
<4>[ 48.977873]
<4>[ 48.977873] Pid: 1482, comm: lilo Not tainted 2.6.35-rc6 #1 Mac-F2218FC8/iMac9,1
<4>[ 48.977873] RIP: 0010:[<ffffffff81150b02>] [<ffffffff81150b02>] ext4_mb_normalize_request+0x2d3/0x342
<4>[ 48.977873] RSP: 0018:ffff880137a6fa88 EFLAGS: 00010206
<4>[ 48.977873] RAX: ffff88013eef0000 RBX: ffff880138ee5000 RCX: 0000000000000010
<4>[ 48.977873] RDX: 0000000000000010 RSI: 0000000000000010 RDI: ffff88013eee1568
<4>[ 48.977873] RBP: ffff880137a6fad8 R08: 000000000001fff0 R09: ffff880137a6fb08
<4>[ 48.977873] R10: 0000000100006e10 R11: ffff880137a6fc30 R12: 0000000000000010
<4>[ 48.977873] R13: ffff880137a6fc10 R14: 000000000001fff0 R15: 0000000000020000
<4>[ 48.977873] FS: 00007f58b5b65700(0000) GS:ffff880001a00000(0000) knlGS:0000000000000000
<4>[ 48.977873] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
<4>[ 48.977873] CR2: 0000000000669018 CR3: 0000000138463000 CR4: 00000000000406f0
<4>[ 48.977873] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[ 48.977873] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>[ 48.977873] Process lilo (pid: 1482, threadinfo ffff880137a6e000, task ffff880137310f60)
<0>[ 48.977873] Stack:
<4>[ 48.977873] 0000000000000000 0000000000008050 ffff880138faaca8 0002000081150729
<4>[ 48.977873] <0> ffff880137a6fb08 ffff880137a6fc10 ffff880137a6fc64 ffff88013eee1568
<4>[ 48.977873] <0> ffff880138ee5000 0000000000000000 ffff880137a6fb58 ffffffff81154ca0
<0>[ 48.977873] Call Trace:
<4>[ 48.977873] [<ffffffff81154ca0>] ext4_mb_new_blocks+0x173/0x3d3
<4>[ 48.977873] [<ffffffff8114a36f>] ? ext4_ext_find_extent+0x45/0x2a6
<4>[ 48.977873] [<ffffffff8114d2f6>] ext4_ext_map_blocks+0x1732/0x1aeb
<4>[ 48.977873] [<ffffffff811cc9e4>] ? radix_tree_gang_lookup_tag_slot+0x81/0xa2
<4>[ 48.977873] [<ffffffff810bb944>] ? pagevec_lookup_tag+0x20/0x29
<4>[ 48.977873] [<ffffffff8113477b>] ext4_map_blocks+0x115/0x1f4
<4>[ 48.977873] [<ffffffff8113672b>] mpage_da_map_blocks+0xeb/0x364
<4>[ 48.977873] [<ffffffff81144cf9>] ? ext4_journal_start_sb+0xc7/0x103
<4>[ 48.977873] [<ffffffff811370b5>] ext4_da_writepages+0x330/0x579
<4>[ 48.977873] [<ffffffff813e88a9>] ? mutex_unlock+0x9/0xb
<4>[ 48.977873] [<ffffffff810b555f>] ? generic_file_aio_write+0x84/0xa4
<4>[ 48.977873] [<ffffffff810bafa7>] do_writepages+0x1f/0x28
<4>[ 48.977873] [<ffffffff810b4f5c>] __filemap_fdatawrite_range+0x4e/0x50
<4>[ 48.977873] [<ffffffff810b4fee>] filemap_write_and_wait_range+0x28/0x51
<4>[ 48.977873] [<ffffffff811030ca>] vfs_fsync_range+0x36/0x79
<4>[ 48.977873] [<ffffffff8110316b>] vfs_fsync+0x17/0x19
<4>[ 48.977873] [<ffffffff81103196>] do_fsync+0x29/0x3e
<4>[ 48.977873] [<ffffffff81103433>] sys_fdatasync+0xe/0x12
<4>[ 48.977873] [<ffffffff810263c2>] system_call_fastpath+0x16/0x1b
<0>[ 48.977873] Code: 44 8b 45 b8 8b 43 10 89 c2 49 39 d7 7f 07 41 39 c4 76 02 0f 0b 4d 85 f6 74 11 48 8b 7b 08 48 8b 87 28 03 00 00 4c 3b 70 10 76 02 <0f> 0b 44 89 63 20 44 89 43 2c 49 8b 75 28 48 85 f6 74 1f 41 8b
<1>[ 48.977873] RIP [<ffffffff81150b02>] ext4_mb_normalize_request+0x2d3/0x342
<4>[ 48.977873] RSP <ffff880137a6fa88>
<4>[ 48.994547] ---[ end trace 5f3a007a6b3c50ca ]---

--
Justin P. Mattock
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Ted Ts'o on 7 Aug 2010 02:50

On Fri, Aug 06, 2010 at 10:48:40PM -0700, Justin Mattock wrote:
> hello,
> I just built a fresh clfs system using the tutorial.. right now Im
> able to boot and am able to login, the system seems to be running as
> it should except for when I try to install gmp and/or do a /sbin/lilo
> I see a message appear on screen(below) then if I do any kind of
> command(dmesg > dmesg) I get a stuck screen. has there been anything
> similar to the below message?
>
> keep in mind the kernel I'm using is 2.6.35-rc6 which on other
> machines(same type of system) run just fine without such message.

Um, is this a completely modified 2.6.35-rc6 kernel? The reason why I
ask is there is no BUG_ON at line fs/ext4/mballoc.c:2993 for that
kernel version.

There are two BUG_ON statements nearby, but given the line number
doesn't match up with either one, it's hard to say for sure which one
triggered it. What were the kernel messages right before the BUG_ON?
was there a "start NNNNN size NNN, fe_logical NNNN" (where NNNN is
some number) right before the "cut here" message?

Have you tried forcing an fsck run on the file system to make sure
it's not caused by a file-system corruption?

And have you tried using a standard released gcc so we can determine
for sure whether this is a potential kernel bug, file system
corruption issue, or gcc issue?

- Ted

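On the question of which of the two checks fired, the `Code:` line of the oops in the first message offers a hint. On x86, `BUG()` compiles to the `ud2` instruction (bytes `0f 0b`), and the byte wrapped in angle brackets is the one RIP pointed at. The Python sketch below counts the `0f 0b` pairs that appear in the dump before the marked byte. It is a naive byte-pair scan rather than a disassembler, so it could in principle match across instruction boundaries, and the name `locate_trap` is made up for illustration.

```python
# Code: line copied from the oops in the first message
# (log-level prefix and timestamp removed).
CODE_LINE = (
    "44 8b 45 b8 8b 43 10 89 c2 49 39 d7 7f 07 41 39 c4 76 02 0f 0b "
    "4d 85 f6 74 11 48 8b 7b 08 48 8b 87 28 03 00 00 4c 3b 70 10 76 02 "
    "<0f> 0b 44 89 63 20 44 89 43 2c 49 8b 75 28 48 85 f6 74 1f 41 8b"
)

UD2 = ("0f", "0b")  # x86 encoding of ud2, which BUG() emits on x86


def locate_trap(code_line):
    """Return (index of the byte at RIP,
               number of ud2 byte pairs seen before it,
               whether the bytes at RIP are themselves ud2)."""
    tokens = code_line.split()
    # The token written as <xx> marks the byte RIP points at.
    rip_index = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    clean = [t.strip("<>") for t in tokens]
    # Assumption: plain pair matching, no instruction decoding.
    earlier = sum(
        1
        for i in range(rip_index - 1)
        if (clean[i], clean[i + 1]) == UD2
    )
    at_rip = tuple(clean[rip_index:rip_index + 2]) == UD2
    return rip_index, earlier, at_rip


if __name__ == "__main__":
    print(locate_trap(CODE_LINE))
```

On this dump the scan finds one earlier `0f 0b` pair and a `ud2` at RIP, so the trap is the second of two in the dumped window. That would be consistent with the later of the two nearby checks, but only a disassembly of the actual build could confirm which source line it maps to.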
From: Justin P. Mattock on 7 Aug 2010 03:50

On 08/06/2010 11:45 PM, Ted Ts'o wrote:
> On Fri, Aug 06, 2010 at 10:48:40PM -0700, Justin Mattock wrote:
>> hello,
>> I just built a fresh clfs system using the tutorial.. right now Im
>> able to boot and am able to login, the system seems to be running as
>> it should except for when I try to install gmp and/or do a /sbin/lilo
>> I see a message appear on screen(below) then if I do any kind of
>> command(dmesg> dmesg) I get a stuck screen. has there been anything
>> similar to the below message?
>>
>> keep in mind the kernel I'm using is 2.6.35-rc6 which on other
>> machines(same type of system) run just fine without such message.
>
> Um, is this a completely modified 2.6.35-rc6 kernel? The reason why I
> ask is there is no BUG_ON at line fs/ext4/mballoc.c:2993 for that
> kernel version.

no not modified at all. current git commit:
2.6.35-rc6-00191-ga2dccdb
but says 2.6.35-rc6 because git is not installed yet on this system.
(I was able to use ohci1394_dma=early to capture this, no ssh yet)

>
> There are two BUG_ON statements nearby, but given the line number
> doesn't match up with either one, it's hard to say for sure which one
> triggered it. What were the kernel messages right before the BUG_ON?
> was there a "start NNNNN size NNN, fe_logical NNNN" (where NNNN is
> some number) right before the "cut here" message?
>
> Have you tried forcing an fsck run on the file system to make sure
> it's not caused by a file-system corruption?
>

before the cut here message I have loads of avc denials from SELinux
showing up in the log, after the avc's denials I see this:

EXT4-fs (sda3): re-mounted. Opts: errors=remount-ro,user_xattr
EXT4-fs (sda3): re-mounted. Opts: errors=remount-ro,user_xattr

as for fsck I did not do that, but just saw on a reboot that it had
fired off with nothing stating corruption or anything.

> And have you tried using a standard released gcc so we can determine
> for sure whether this is a potential kernel bug, file system
> corruption issue, or gcc issue?
>
> - Ted
>

this is strange.. I ended up taking a kernel from another
machine(literally the same kernel) loaded it up etc.. after booting up
doing /sbin/lilo worked, installing gmp worked.. prior too make install
with gmp would trigger this half way through the installation reliably
as well as /sbin/lilo, and now nothing of the sort of what I posted.

After testing the other machines kernel I recompiled the kernel on the
new system rebooted and did those steps to reproduce with nothing of
the sort of what I had posted as well.

The only thing I can think of is during my building of the system, is
maybe this was happening because I built the kernel as root i.e. I
usually will chroot towards the end of building a system, build the
kernel as root, check the symlinks, configurations, then tar ball the
whole thing and transfer, then once booted into the new system, start
building everything all over again.

as for the gcc version I'm using 4.6.0 20100731
as for this being the culprit.. not sure if building the kernel as root
causes gcc to change things with this version of gcc or not..

Right now, as I write things look normal again, I've done /sbin/lilo
numerous times with all a success, and built gmp mpfr just to make sure
with all being a success.

Justin P. Mattock
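
The check Ted asked for (whether a "start NNNNN size NNN, fe_logical NNNN" line sits right before the "cut here" marker) can be done mechanically on a saved log. The sketch below is in Python and uses only the log lines quoted in this thread. The regular expression is a guess at the message's punctuation, since the exact kernel format string is not quoted here, and the helper names `lines_before_cut` and `has_extent_message` are hypothetical.

```python
import re

CUT_HERE = "------------[ cut here ]------------"

# Assumption: tolerant pattern for the message Ted describes; the commas
# are optional because the thread does not quote the exact format string.
EXTENT_MSG = re.compile(r"start\s+\d+,?\s+size\s+\d+,?\s+fe_logical\s+\d+")


def lines_before_cut(log_lines, context=5):
    """Return up to `context` lines preceding the first 'cut here' marker."""
    for i, line in enumerate(log_lines):
        if CUT_HERE in line:
            return log_lines[max(0, i - context):i]
    return []


def has_extent_message(log_lines, context=5):
    """True if the extent message appears just before the marker."""
    return any(EXTENT_MSG.search(line)
               for line in lines_before_cut(log_lines, context))


# Lines as reported in the thread (the re-mount lines were posted
# without timestamps).
LOG = [
    "EXT4-fs (sda3): re-mounted. Opts: errors=remount-ro,user_xattr",
    "EXT4-fs (sda3): re-mounted. Opts: errors=remount-ro,user_xattr",
    "<0>[ 48.976957] ------------[ cut here ]------------",
    "<2>[ 48.977187] kernel BUG at fs/ext4/mballoc.c:2993!",
]

if __name__ == "__main__":
    print(has_extent_message(LOG))
```

For the lines reported above, the check comes back negative: only the two re-mount messages precede the marker.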