Prev: [Bug #15615] NULL pointer deref in task_is_waking
Next: [Bug #15603] lockdep warning at boot time when determining whether to resume
From: Rik van Riel on 10 Apr 2010 16:40 On 04/10/2010 04:05 PM, Linus Torvalds wrote: > This patch is scary and untested, but the more I look at that code, the > more convinced I am that vma_adjust was _really_ badly screwed up. The > patch below may make things worse. I'll test it myself too, but I'm > sending it out first, since I was writing the email as I was looking at > the piece of cr*p. Your patch looks correct. Gotta love how before, "vma" could be either exporter or importer! I'm guessing that it did not break before my changes, because of plain old luck... Acked-by: Rik van Riel <riel(a)redhat.com> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on 10 Apr 2010 16:40 On Sat, 10 Apr 2010, Rik van Riel wrote: > On 04/10/2010 04:05 PM, Linus Torvalds wrote: > > > And vma_adjust is the one place that does that anon_vma_merge(), which is > > apart from the actual unmapping sequence the only other place that > > actually free's anon_vmas. So there are reasons to be very suspicious of > > that code. > > It frees anon_vma_chain structures, but not actual anon_vmas. Rik, I think you're ignoring the fact that the anon_vma_chain is also the implicit refcount. So when you don't create the chains, you implicitly end up freeing the anon_vma too early. In fact, it might well happen at that 'anon_vma_merge()': when it does the unlink_anon_vmas(), it may be unlinking the last remaining anon_vma ref, and then anon_vma_unlink _will_ in fact free the anon_vma. Even though we have a 'vma->anon_vma' pointer that points to it - because the chains weren't set up correctly. > Walking the anon_vma (from rmap) requires the anon_vma->lock, > which is taken in anon_vma_merge whenever a chain is unlinked. None of that matters. If the dang thing got free'd, the lock isn't reliable any more. > A few lines up from that code, we have: > > if (vma->anon_vma && (insert || importer || start != vma->vm_start)) > anon_vma = vma->anon_vma; > > So anon_vma should always be vma->anon_vma. No. vma->anon_vma is NULL, so the above lines are total no-ops. We're trying to _fill_ it. But we're doing it wrong. So we end up with: anon_vma = next->anon-vma importer = vma and we do: if (anon_vma_clone(importer, vma)) { return -ENOMEM; } importer->anon_vma = anon_vma; do you see? The "anon_vma_clone(importer, vma)" does NOTHING, because it is cloning from the wrong source (from 'vma', rather than from 'next', so it leaves the vma chains empty. And then, despite having empty chains, we do that importer->anon_vma = anon_vma; which sets the anon_vma to the (non-NULL) next->anon_vma. And then, a bit later, we'll do anon_vma_merge(vma, next); which will happily notice that the anon_vma's of both vma and next match (because we just _set_ them to match), and then frees the ONLY REMAINING CHAIN - the one in next. The one we DID NOT CORRECTLY COPY, because we got our sources completely screwed up. > What am I overlooking? Can you see it now? > If we import a chain, from vma to importer, importer->anon_vma > will be equal to vma->anon_vma. The thing you seem to miss is that we aren't supposed to import the chain from 'vma' AT ALL. The anon_vma came from _next_, not from 'vma'! > I do not see how 'importer' could get a state different from 'vma'. Stop worrying about 'vma'. Start worrying about 'next'. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on 10 Apr 2010 16:50 On Sat, 10 Apr 2010, Borislav Petkov wrote: > From: Linus Torvalds <torvalds(a)linux-foundation.org> > Date: Sat, Apr 10, 2010 at 01:12:46PM -0700 > > > So I'm actually pretty optimistic that this really is it. > > Ok, let me verify what/in which order should be tested before I test > something wrongly. The RCU-safe fix for the TLB flush can stay for > correctness reasons, this last patch, obviosly, what happens with the > find_mergeable_anon_vma() changes to use only singleton lists for > merging? Should I keep those too? Yes. So the patches I actually think are important are: - the RCU fix is real, although admittedly the race window is probably too small to ever really hit. - the simplification rule to find_mergeable_anon_vma's is required, because otherwise our anon_vma_merge() will do the wrong thing (maybe Johannes' patch would be an alternative, but quite frankly, I think we want the simpler code, and I don't think we even _want_ to share anon_vma's that are complex due to forking) I like my "cleanup" version (the bigger one with lots of comments) more than the two-liner version, but they should be equivalent. - the vma_adjust() fix is the one that I think may actually end up fixing your problems for good. Knock wood. So I think they are all required, but I suspect that the vma_adjust() one is finally the most direct explanation of the problem you've seen. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Rik van Riel on 10 Apr 2010 16:50 On 04/10/2010 04:34 PM, Linus Torvalds wrote: >> What am I overlooking? > > Can you see it now? Yeah, after reading through your patch it became obvious. It's the code above this code that sets up the problem. It's a small miracle it worked before... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on 10 Apr 2010 17:40
On Sat, 10 Apr 2010, Borislav Petkov wrote: > > Damn, nope, still no joy :(. It looked like it was fixed but one of the > test was to hibernate right after the 3 kvm guests were shut down and I > guess the mem freeing pattern kinda hits it where it most hurts. Damn, I really hoped that was it. Three independent bugs found and fixed, and still no joy? Oh well. > By the way, do we want to create a new thread - the mailchain is off the > screen limits of my netbook :) I prefer to keep it in one thread so that they all show up together if I need to, but feel free to start a new one. Not a biggie. > [ 647.492781] BUG: unable to handle kernel NULL pointer dereference at (null) > [ 647.493001] IP: [<ffffffff810c60a0>] page_referenced+0xee/0x1dc Well, it sure is consistent. I'll start to think about what else could go wrong.. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |