From: Christoph Lameter on 27 Apr 2010 18:30 On Tue, 27 Apr 2010, Mel Gorman wrote: > The problem is that there are some races that either allow migration PTEs to > be copied or a migration PTE to be left behind. Migration still completes and > the page is unlocked but later a fault will call migration_entry_to_page() > and BUG() because the page is not locked. This series aims to close some > of these races. In general migration ptes were designed to cause the code encountering them to go to sleep until migration is finished and a regular pte is available. Looks like we are tolerating the handling of migration entries. Never imagined copying page table with this. There could be recursion issues of various kinds because page migration requires operations on the page tables while performing migration. A simple fix would be to *not* migrate page table tables. > Patch 1 alters fork() to restart page table copying when a migration PTE is > encountered. Can we simply wait like in the fault path? > Patch 3 notes that while a VMA is moved under the anon_vma lock, the page > tables are not similarly protected. Where migration PTEs are > encountered, they are cleaned up. This means they are copied / moved etc and "cleaned" up in a state when the page was unlocked. Migration entries are not supposed to exist when a page is not locked. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Andrea Arcangeli on 27 Apr 2010 18:40 On Tue, Apr 27, 2010 at 05:27:36PM -0500, Christoph Lameter wrote: > Can we simply wait like in the fault path? There is no bug there, no need to wait either. I already audited it before, and I didn't see any bug. Unless you can show a bug with CPU A running the rmap_walk on process1 before process2, there is no bug to fix there. > > > Patch 3 notes that while a VMA is moved under the anon_vma lock, the page > > tables are not similarly protected. Where migration PTEs are > > encountered, they are cleaned up. > > This means they are copied / moved etc and "cleaned" up in a state when > the page was unlocked. Migration entries are not supposed to exist when > a page is not locked. patch 3 is real, and the first thought I had was to lock down the page before running vma_adjust and unlock after move_page_tables. But these are virtual addresses. Maybe there's a simpler way to keep migration away while we run those two operations. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: KAMEZAWA Hiroyuki on 27 Apr 2010 20:20 On Wed, 28 Apr 2010 00:32:42 +0200 Andrea Arcangeli <aarcange(a)redhat.com> wrote: > On Tue, Apr 27, 2010 at 05:27:36PM -0500, Christoph Lameter wrote: > > Can we simply wait like in the fault path? > > There is no bug there, no need to wait either. I already audited it > before, and I didn't see any bug. Unless you can show a bug with CPU A > running the rmap_walk on process1 before process2, there is no bug to > fix there. > I think there is no bug, either. But that safety is fragile. > > > > > Patch 3 notes that while a VMA is moved under the anon_vma lock, the page > > > tables are not similarly protected. Where migration PTEs are > > > encountered, they are cleaned up. > > > > This means they are copied / moved etc and "cleaned" up in a state when > > the page was unlocked. Migration entries are not supposed to exist when > > a page is not locked. > > patch 3 is real, and the first thought I had was to lock down the page > before running vma_adjust and unlock after move_page_tables. But these > are virtual addresses. Maybe there's a simpler way to keep migration > away while we run those two operations. > Doing some check in move_ptes() after vma_adjust() is not safe. IOW, when vma's information and information in page-table is incosistent...objrmap is broken and migartion will cause panic. Then...I think there are 2 ways. 1. use seqcounter in "mm_struct" as previous patch and lock it at mremap. or 2. get_user_pages_fast() when do remap. Thanks, -Kame -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Mel Gorman on 28 Apr 2010 05:20 On Wed, Apr 28, 2010 at 12:32:42AM +0200, Andrea Arcangeli wrote: > On Tue, Apr 27, 2010 at 05:27:36PM -0500, Christoph Lameter wrote: > > Can we simply wait like in the fault path? > > There is no bug there, no need to wait either. I already audited it > before, and I didn't see any bug. Unless you can show a bug with CPU A > running the rmap_walk on process1 before process2, there is no bug to > fix there. > Yes, this patch is now dropped. > > > > > Patch 3 notes that while a VMA is moved under the anon_vma lock, the page > > > tables are not similarly protected. Where migration PTEs are > > > encountered, they are cleaned up. > > > > This means they are copied / moved etc and "cleaned" up in a state when > > the page was unlocked. Migration entries are not supposed to exist when > > a page is not locked. > > patch 3 is real, and the first thought I had was to lock down the page > before running vma_adjust and unlock after move_page_tables. But these > are virtual addresses. Maybe there's a simpler way to keep migration > away while we run those two operations. > I see there is a large discussion on that patch so I'll read that rather than commenting here. -- Mel Gorman Part-time Phd Student Linux Technology Center University of Limerick IBM Dublin Software Lab -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Andrea Arcangeli on 28 Apr 2010 09:30
On Wed, Apr 28, 2010 at 09:13:45AM +0900, KAMEZAWA Hiroyuki wrote: > Doing some check in move_ptes() after vma_adjust() is not safe. > IOW, when vma's information and information in page-table is incosistent...objrmap > is broken and migartion will cause panic. > > Then...I think there are 2 ways. > 1. use seqcounter in "mm_struct" as previous patch and lock it at mremap. > or > 2. get_user_pages_fast() when do remap. 3 take the anon_vma->lock -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |