From: Stanislaw Gruszka on
Hello.

After updating to 2.6.34-rc1, I started seeing strange oopses during
boot that look like memory corruption. Bisection shows that the first bad
commit is 59be5a8e8ce765cf739ec7f07176219972de7481 ("x86: Make 32bit
support NO_BOOTMEM"). When I disable CONFIG_NO_BOOTMEM, the system boots.
I'm not sure what information is needed to track down this issue, so please
let me know.

Cheers
Stanislaw
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Yinghai Lu on
On Fri, Mar 19, 2010 at 6:12 AM, Stanislaw Gruszka <sgruszka(a)redhat.com> wrote:
> Hello.
>
> After updating to 2.6.34-rc1, I started seeing strange oopses during
> boot that look like memory corruption. Bisection shows that the first bad
> commit is 59be5a8e8ce765cf739ec7f07176219972de7481 ("x86: Make 32bit
> support NO_BOOTMEM"). When I disable CONFIG_NO_BOOTMEM, the system boots.
> I'm not sure what information is needed to track down this issue, so please
> let me know.

Can you check this patch?

https://patchwork.kernel.org/patch/87081/

Thanks

Yinghai
From: Dave Airlie on
On Fri, Mar 19, 2010 at 11:12 PM, Stanislaw Gruszka <sgruszka(a)redhat.com> wrote:
> Hello.
>
> After updating to 2.6.34-rc1, I started seeing strange oopses during
> boot that look like memory corruption. Bisection shows that the first bad
> commit is 59be5a8e8ce765cf739ec7f07176219972de7481 ("x86: Make 32bit
> support NO_BOOTMEM"). When I disable CONFIG_NO_BOOTMEM, the system boots.
> I'm not sure what information is needed to track down this issue, so please
> let me know.
>

I had a similar issue today and wasted a morning trying to get either -rc1
or -rc2 to boot on my 32-bit desktop here, until I worked out that this
option defaults to on.

I suggest CONFIG_NO_BOOTMEM be default n for now.

Not sure why it was merged as default y.

Dave.
From: Stanislaw Gruszka on
On Sat, Mar 20, 2010 at 11:26:06AM -0700, Yinghai Lu wrote:
> > After updating to 2.6.34-rc1, I started seeing strange oopses during
> > boot that look like memory corruption. Bisection shows that the first bad
> > commit is 59be5a8e8ce765cf739ec7f07176219972de7481 ("x86: Make 32bit
> > support NO_BOOTMEM"). When I disable CONFIG_NO_BOOTMEM, the system boots.
> > I'm not sure what information is needed to track down this issue, so please
> > let me know.
>
> Can you check this patch?
>
> https://patchwork.kernel.org/patch/87081/

The patch helps somewhat. Instead of many random oopses, I now get one
consistent oops. Here is a photo:
http://people.redhat.com/sgruszka/20100322_001.jpg

The oops is at pcpu_alloc+0x1aa, which in the source is:

(gdb) l *(pcpu_alloc +0x1aa)
0xc04c2272 is in prefetch (/mnt/rhel5/usr/src/kernels/linux-2.6-debuginfo/arch/x86/include/asm/processor.h:886).
881      * It's not worth to care about 3dnow prefetches for the K6
882      * because they are microcoded there and very slow.
883      */
884     static inline void prefetch(const void *x)
885     {
886             alternative_input(BASE_PREFETCH,
887                               "prefetchnta (%1)",
888                               X86_FEATURE_XMM,
889                               "r" (x));
890     }
(gdb) l *(pcpu_alloc +0x1a0)
0xc04c2268 is in pcpu_alloc (mm/percpu.c:1137).
1132                                     */
1133                                    goto restart;
1134                            }
1135
1136                            off = pcpu_alloc_area(chunk, size, align);
1137                            if (off >= 0)
1138                                    goto area_found;
1139                    }
1140            }
1141
(gdb) l *(pcpu_alloc +0x1b0)
0xc04c2278 is in pcpu_alloc (mm/percpu.c:1116).
1111            }
1112
1113    restart:
1114            /* search through normal chunks */
1115            for (slot = pcpu_size_to_slot(size); slot < pcpu_nr_slots; slot++) {
1116                    list_for_each_entry(chunk, &pcpu_slot[slot], list) {
1117                            if (size > chunk->contig_hint)
1118                                    continue;
1119
1120                            new_alloc = pcpu_need_to_extend(chunk);

The prefetch at +0x1aa appears to be the one inlined from
list_for_each_entry() at mm/percpu.c:1116, so it seems pcpu_slot[slot] is
somehow corrupted. Looking further, pcpu_slot is allocated by:

pcpu_slot = alloc_bootmem(pcpu_nr_slots * sizeof(pcpu_slot[0]));

So we still have some problem with CONFIG_NO_BOOTMEM on 32-bit.
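
To illustrate why a damaged slot head would fault during the list walk, here
is a minimal userspace sketch. It is not kernel code: struct list_head is
re-declared in simplified form, and init_list_head(), list_add_after(),
clobber() and count_entries() are hypothetical names made up for this
example. It only assumes that a list walk dereferences head->next first,
which is the access pattern of a list_for_each_entry()-style loop.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* An empty circular list points at itself. */
static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert 'entry' right after 'head'; requires an initialized head. */
static void list_add_after(struct list_head *entry, struct list_head *head)
{
    assert(head->next != NULL);
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Hypothetical: simulate another allocation overwriting a slot head. */
static void clobber(struct list_head *head)
{
    head->next = NULL;
    head->prev = NULL;
}

/*
 * Walk the list the way a list_for_each_entry()-style loop does: start at
 * head->next and follow ->next until we are back at head.
 * Returns the number of entries, or -1 when a NULL link is met. The kernel
 * has no such check and would fault on the bad pointer instead.
 */
static int count_entries(const struct list_head *head)
{
    int n = 0;
    const struct list_head *pos;

    for (pos = head->next; pos != head; pos = pos->next) {
        if (pos == NULL)
            return -1;
        n++;
    }
    return n;
}
```

With an array of initialized heads, count_entries() returns 0 for an empty
slot and 1 after one list_add_after(). After clobber() on the same slot it
returns -1, which is the point where a real walk would touch the bad pointer.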

Stanislaw
From: Yinghai Lu on
On 03/23/2010 04:35 AM, Stanislaw Gruszka wrote:
> On Sat, Mar 20, 2010 at 11:26:06AM -0700, Yinghai Lu wrote:
>>> After updating to 2.6.34-rc1, I started seeing strange oopses during
>>> boot that look like memory corruption. Bisection shows that the first bad
>>> commit is 59be5a8e8ce765cf739ec7f07176219972de7481 ("x86: Make 32bit
>>> support NO_BOOTMEM"). When I disable CONFIG_NO_BOOTMEM, the system boots.
>>> I'm not sure what information is needed to track down this issue, so please
>>> let me know.
>>
>> Can you check this patch?
>>
>> https://patchwork.kernel.org/patch/87081/
>
> The patch helps somewhat. Instead of many random oopses, I now get one
> consistent oops. Here is a photo:
> http://people.redhat.com/sgruszka/20100322_001.jpg

What does the e820 map look like?

YH