Prev: [Bug #15615] NULL pointer deref in task_is_waking
Next: [Bug #15603] lockdep warning at boot time when determining whether to resume
From: Rik van Riel on 12 Apr 2010 14:50 On 04/12/2010 12:26 PM, Linus Torvalds wrote: > But there is a _much_ more subtle case that involved swapping. > > So guys, here's my fairly simple theory on what happens: That bug looks entirely possible. Given that Borislav has heavy swapping going on, it is quite possible that this is the bug he has been triggering. > The patch below is my largely mindless try at fixing this. It's untested. > I'm not entirely sure that it actually works. But it makes some amount of > conceptual sense. No? The patch would help avoid the bug you described. It does have the drawback of moving all the pages of child processes back into the anon_vma of the parent process after swapin, even if they are privately owned pages by the child process. I am guessing it may need a check to see whether the page and swap slot are exclusively owned by the current process. Page or swap slot shared? => oldest anon_vma Page and swap slot exclusive? => newest anon_vma I suspect the easiest way to achieve this would be to pass a flag in from do_swap_page, where we already check this, a few lines above calling page_add_anon_rmap: if ((flags & FAULT_FLAG_WRITE) && reuse_swap_page(page)) { pte = maybe_mkwrite(pte_mkdirty(pte), vma); flags &= ~FAULT_FLAG_WRITE; } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on 12 Apr 2010 15:30 On Mon, 12 Apr 2010, Borislav Petkov wrote: > > But I'll give the patch a run anyway in an hour or so anyway. Thanks. I suspect you will find that even if there is no actual disk IO swapping going on during any of the normal loads, the shrink_all_memory() thing in your hibernation event will cause swap to happen. Or at least swap-cache entries to be done. Oh, and I've decided that my rcu_read_lock() patch for the tlb_gather() thing for unmapping is bogus. Exactly because the critical issue isn't when the page is free'd (and page->mapping is cleared), but when the page is unmapped (and page_mapped() clears). And that is done correctly even with the delayed frees in tlb_gather. So addign the rcu_read_lock/rcu_read_unlock around it all doesn't actually matter or help. So the patches that I think fix real bugs are - the anon_vma_prepare() fix to only share anon_vma's if they are singletons. - the vma_adjust() fix to copy the right anon_vma chains - the anon_vma_clone() fix to traverse the avc's in reverse order, so that the resulting cloned chain is the same as the original chain You got this patch as part of the "verify_vma()" patch, but the only part of that patch that matters is the one-liner that changes a "for_each_list_entry" to use the "_reverse()" version.. - and that last patch to pick the right anon_vma when mapping a page (which could still be improved: the "insert new page" case does _not_ have to take the oldest anon_vma, and Rik is correct that if we have an exclusive swap cache entry we could also take the top one) I think I'll re-post all four patches with real commit messages, to get ack's for them. I'd like to finally get the much delayed -rc4 out the door. Oh, and if that "pick the right anon_vma" patch doesn't fix it, I suspect we'll have to revert the whole anon_vma changes for 2.6.34. It's getting pretty late in the -rc series to fix this bug. I'm _hoping_ that I really nailed it this time, and that we're ok, but if Borislav reports it still happening, and people not having any other ideas, I think I'll just have to do an -rc4 with it all reverted, and then we can try again for 35 if somebody figures out the bug. Hmm? I'd hate to revert it all now because of the hours I've put in looking at the code (to the point that I feel I understand it), but at the same time, if it was somebody else who was chasing this bug and not being able to fix it, I'd tell them "revert it, it's too late". Amount of effort spent doesn't matter if the bug still happens ;^( Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on 12 Apr 2010 18:30
Oh, btw, I like your email gateway. Only noticed now: mail.skyhub.de (SuperMail on ZX Spectrum 128k) that's a tough little machine. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |