From: dann frazier on 20 Jul 2010 13:40

Debian's ia64 autobuilders have been experiencing system crashes while
trying to run the gdb test suite:
  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=588574

I was able to reproduce this w/ the latest git tree, and bisected it
down to this commit, introduced in 2.6.32:

commit 62eede62dafb4a6633eae7ffbeb34c60dba5e7b1
Author: Hugh Dickins <hugh.dickins(a)tiscali.co.uk>
Date:   Mon Sep 21 17:03:34 2009 -0700

    mm: ZERO_PAGE without PTE_SPECIAL

    Reinstate anonymous use of ZERO_PAGE to all architectures, not just to
    those which __HAVE_ARCH_PTE_SPECIAL: as suggested by Nick Piggin.

    Contrary to how I'd imagined it, there's nothing ugly about this, just a
    zero_pfn test built into one or another block of vm_normal_page().

    But the MIPS ZERO_PAGE-of-many-colours case demands is_zero_pfn() and
    my_zero_pfn() inlines. Reinstate its mremap move_pte() shuffling of
    ZERO_PAGEs we did from 2.6.17 to 2.6.19? Not unless someone shouts for
    that: it would have to take vm_flags to weed out some cases.

fyi, I found this to not be reproducible on SLES11 SP1 (which is
2.6.32-based). I compared the .configs and found that the relevant
difference is the PAGE_SIZE. It does not fail w/ 64KB pages, but
reliably fails w/ 16KB pages.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
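
[Reader's note] The commit message quoted above mentions a "zero_pfn test" and the MIPS "ZERO_PAGE-of-many-colours" case. As a rough illustration only, here is a small Python model of what such tests might look like. It is not kernel code: the `zero_pfn` value, the choice of eight colours, and the `_coloured` function names are assumptions made up for this sketch, and `PAGE_SHIFT = 14` simply mirrors the 16KB page size from the report.

```python
# Toy model of the zero-page tests named in the commit message.
# All concrete values below are hypothetical; this is not kernel code.

PAGE_SHIFT = 14                     # 16KB pages, as in the failing config
PAGE_SIZE = 1 << PAGE_SHIFT
PAGE_MASK = ~(PAGE_SIZE - 1)

ZERO_PFN = 0x1000                   # hypothetical pfn of the first zero page
NUM_COLOURS = 8                     # hypothetical number of coloured zero pages
# Mask selecting which coloured zero page a virtual address maps to.
ZERO_PAGE_MASK = (PAGE_SIZE * NUM_COLOURS - 1) & PAGE_MASK


def is_zero_pfn(pfn: int) -> bool:
    """Single zero page: a pfn is 'zero' only if it equals ZERO_PFN."""
    return pfn == ZERO_PFN


def is_zero_pfn_coloured(pfn: int) -> bool:
    """Many-colours case: any pfn within the run of zero pages counts.

    C code would typically rely on unsigned wraparound for pfn < ZERO_PFN;
    Python integers do not wrap, so the lower bound is checked explicitly.
    """
    offset = pfn - ZERO_PFN
    return 0 <= offset <= (ZERO_PAGE_MASK >> PAGE_SHIFT)


def my_zero_pfn_coloured(addr: int) -> int:
    """Pick the zero page whose colour matches the virtual address."""
    return ZERO_PFN + ((addr & ZERO_PAGE_MASK) >> PAGE_SHIFT)


if __name__ == "__main__":
    addr = (3 << PAGE_SHIFT) | 0x123          # an address of colour 3
    pfn = my_zero_pfn_coloured(addr)
    print(hex(pfn), is_zero_pfn(pfn), is_zero_pfn_coloured(pfn))
```

The point of the model is only that, with several zero pages, a plain equality test against one pfn is no longer enough, which is presumably why the commit message says inlines were needed for that case.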

From: KAMEZAWA Hiroyuki on 20 Jul 2010 22:00

On Tue, 20 Jul 2010 11:35:12 -0600 dann frazier <dannf(a)debian.org> wrote:
> Debian's ia64 autobuilders have been experiencing system crashes while
> trying to run the gdb test suite:
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=588574
>
> I was able to reproduce this w/ the latest git tree, and bisected it
> down to this commit, introduced in 2.6.32:
>
> commit 62eede62dafb4a6633eae7ffbeb34c60dba5e7b1
> [...]
>
> fyi, I found this to not be reproducible on SLES11 SP1 (which is
> 2.6.32-based). I compared the .configs and found that the relevant
> difference is the PAGE_SIZE. It does not fail w/ 64KB pages, but
> reliably fails w/ 16KB pages.

Sorry, I have no idea...
Hmm, what is the address of empty_zero_page[] on your debian(16kb-page) ?

Thanks,
-Kame

From: dann frazier on 20 Jul 2010 23:10

On Wed, Jul 21, 2010 at 10:51:36AM +0900, KAMEZAWA Hiroyuki wrote:
> On Tue, 20 Jul 2010 11:35:12 -0600 dann frazier <dannf(a)debian.org> wrote:
> > [...]
> > fyi, I found this to not be reproducible on SLES11 SP1 (which is
> > 2.6.32-based). I compared the .configs and found that the relevant
> > difference is the PAGE_SIZE. It does not fail w/ 64KB pages, but
> > reliably fails w/ 16KB pages.
>
> Sorry, I have no idea...
> Hmm, what is the address of empty_zero_page[] on your debian(16kb-page) ?

dannf(a)krebs:~$ grep empty_zero_page /boot/System.map-2.6.32-5-mckinley
a0000001008784c0 d __ksymtab_empty_zero_page
a000000100882688 d __kcrctab_empty_zero_page
a000000100884ca4 r __kstrtab_empty_zero_page
a000000100974000 D empty_zero_page
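
[Reader's note] One thing a reader can check from the System.map output above is how the `empty_zero_page` address sits relative to the two page sizes mentioned in the report. The sketch below is only arithmetic on the quoted address; the helper names are made up for illustration, and the result is not offered as an explanation of the crash, since the 64KB kernel would have its own System.map with a possibly different address.

```python
# Alignment check on the empty_zero_page address quoted from System.map.
# Helper names are hypothetical; only the address comes from the thread.

EMPTY_ZERO_PAGE_ADDR = 0xA000000100974000  # "D empty_zero_page" line above


def offset_in_page(addr: int, page_size: int) -> int:
    """Return the byte offset of addr within a page of page_size bytes.

    page_size must be a power of two.
    """
    return addr & (page_size - 1)


def is_page_aligned(addr: int, page_size: int) -> bool:
    """Return True if addr falls exactly on a page_size boundary."""
    return offset_in_page(addr, page_size) == 0


if __name__ == "__main__":
    for kb in (16, 64):
        size = kb * 1024
        print(f"{kb}KB pages: aligned={is_page_aligned(EMPTY_ZERO_PAGE_ADDR, size)}, "
              f"offset={hex(offset_in_page(EMPTY_ZERO_PAGE_ADDR, size))}")
```

On this 16KB-page kernel the symbol is aligned to a 16KB boundary, as one would expect for a page-sized object; the same address would sit 0x4000 bytes into a 64KB page.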

From: Hugh Dickins on 21 Jul 2010 00:30

On Tue, 20 Jul 2010, dann frazier wrote:
> On Wed, Jul 21, 2010 at 10:51:36AM +0900, KAMEZAWA Hiroyuki wrote:
> > [...]
> > Sorry, I have no idea...
> > Hmm, what is the address of empty_zero_page[] on your debian(16kb-page) ?
>
> dannf(a)krebs:~$ grep empty_zero_page /boot/System.map-2.6.32-5-mckinley
> a0000001008784c0 d __ksymtab_empty_zero_page
> a000000100882688 d __kcrctab_empty_zero_page
> a000000100884ca4 r __kstrtab_empty_zero_page
> a000000100974000 D empty_zero_page

Thanks a lot for reporting this, but I too have no idea yet.

It is likely that the bug is not to be found in that 62eede62, but
rather in one of the preceding patches to mm/memory.c which 62eede62
was extending to ia64 and other architectures without PTE_SPECIAL.

I wonder, from looking at that gdb testsuite log, is it plausible
that all these hangs/crashes occurred when writing out a coredump?
Is that something you could check for us? or rule out the possibility.

I was rather proud of the get_dump_page() simplification,
but perhaps there's something nasty lurking in there.

Hugh

From: KOSAKI Motohiro on 21 Jul 2010 09:00

> On Tue, 20 Jul 2010, dann frazier wrote:
> > [...]
> > dannf(a)krebs:~$ grep empty_zero_page /boot/System.map-2.6.32-5-mckinley
> > a0000001008784c0 d __ksymtab_empty_zero_page
> > a000000100882688 d __kcrctab_empty_zero_page
> > a000000100884ca4 r __kstrtab_empty_zero_page
> > a000000100974000 D empty_zero_page
>
> Thanks a lot for reporting this, but I too have no idea yet.
>
> It is likely that the bug is not to be found in that 62eede62, but
> rather in one of the preceding patches to mm/memory.c which 62eede62
> was extending to ia64 and other architectures without PTE_SPECIAL.
>
> I wonder, from looking at that gdb testsuite log, is it plausible
> that all these hangs/crashes occurred when writing out a coredump?
> Is that something you could check for us? or rule out the possibility.
>
> I was rather proud of the get_dump_page() simplification,
> but perhaps there's something nasty lurking in there.

Ugh. I did test some zero page things on ia64 while 62eede62 was being
developed, but unfortunately I've lost my ia64 test environment to a
physical machine crash, and I don't remember which page size I tested ;)

Umm... I also have no idea. Sorry.