From: Minchan Kim on 13 Apr 2010 00:30

On Tue, Apr 13, 2010 at 1:23 PM, Minchan Kim <minchan.kim(a)gmail.com> wrote:
> On Tue, Apr 13, 2010 at 10:08 AM, Linus Torvalds
> <torvalds(a)linux-foundation.org> wrote:
>>
>>
>> On Tue, 13 Apr 2010, Johannes Weiner wrote:
>>>
>>> Would you mind pasting that nice description of the error case from your
>>> other email into that changelog? I skimmed over the description but when
>>> I read this patch several hours later, I had to go back to that previous
>>> email to fully make sense of it.
>>
>> It now looks like this..
>>
>>                Linus
>> ---
>> From: Linus Torvalds <torvalds(a)linux-foundation.org>
>> Date: Mon, 12 Apr 2010 12:44:29 -0700
>> Subject: [PATCH 4/4] anonvma: when setting up page->mapping, we need to pick the _oldest_ anonvma
>>
>> Otherwise we might be mapping in a page in a new mapping, but that page
>> (through the swapcache) would later be mapped into an old mapping too.
>> The page->mapping must be the case that works for everybody, not just
>> the mapping that happened to page it in first.
>>
>> Here's the scenario:
>>
>>  - page gets allocated/mapped by process A. Let's call the anon_vma we
>>    associate the page with 'A' to keep it easy to track.
>>
>>  - Process A forks, creating process B. The anon_vma in B is 'B', and has
>>    a chain that looks like 'B' -> 'A'. Everything is fine.
>>
>>  - Swapping happens. The page (with mapping pointing to 'A') gets swapped
>>    out (perhaps not to disk - it's enough to assume that it's just not
>>    mapped any more, and lives entirely in the swap-cache)
>>
>>  - Process B pages it in, which goes like this:
>>
>>        do_swap_page ->
>>          page = lookup_swap_cache(entry);
>>          ...
>>          set_pte_at(mm, address, page_table, pte);
>>          page_add_anon_rmap(page, vma, address);
>>
>>    And think about what happens here!
>>
>>    In particular, what happens is that this will now be the "first"
>>    mapping of that page, so page_add_anon_rmap() used to do
>>
>>        if (first)
>>                __page_set_anon_rmap(page, vma, address);
>>
>>    and notice what anon_vma it will use? It will use the anon_vma for
>>    process B!
>>
>>    What happens then? Trivial: process 'A' also pages it in (nothing
>>    happens, it's not the first mapping), and then process 'B' execve's
>>    or exits or unmaps, making anon_vma B go away.
>>
>>    End result: process A has a page that points to anon_vma B, but
>>    anon_vma B does not exist any more. This can go on forever. Forget
>>    about RCU grace periods, forget about locking, forget anything like
>>    that. The bug is simply that page->mapping points to an anon_vma
>>    that was correct at one point, but was _not_ the one that was shared
>>    by all users of that possible mapping.
>>
>> Changing it to always use the deepest anon_vma in the anonvma chain gets
>> us to the safest model.
>>
>> This can be improved in certain cases: if we know the page is private to
>> just this particular mapping (for example, it's a new page, or it is the
>> only swapcache entry), we could pick the top (most specific) anon_vma.
>>
>> But that's a future optimization. Make it _work_ reliably first.
>>
>> Reviewed-by: Rik van Riel <riel(a)redhat.com>
>> Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
>> Tested-by: Borislav Petkov <bp(a)alien8.de> [ "What do you know, I think you fixed it!" ]
>> Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>

Reviewed-by: Minchan Kim <minchan.kim(a)gmail.com>

Sorry for mistake. I was extremely excited. :)

--
Kind regards,
Minchan Kim