Prev: [Bug #15615] NULL pointer deref in task_is_waking
Next: [Bug #15603] lockdep warning at boot time when determining whether to resume
From: Linus Torvalds on 7 Apr 2010 22:40 On Thu, 8 Apr 2010, KOSAKI Motohiro wrote: > > Now pagefault don't insert anon_vma anymore, right? if so, SIGBUS is better. > Now SIGBUS and VM_FAULT_OOM make different result. > > SIGBUS -> kill current task > VM_FAULT_OOM -> invoke oom-killer (see pagefault_out_of_memory()) Yeah, maybe VM_FAULT_SIGBUS works ok instead of VM_FAULT_OOM. But the cause of it is the system having been oom when themappign was created, so I think either is fine. > If current task can't recover proper anon_vma. we should just kill current > instead random highest badness process. otherwise !anon_vma process continue > to randomly invoke oom-killer. Yes, that is a good point. Anyway, I think it might be interesting to test my anon_vma_prepare() locking change patch together with Rik's _first_ version of his "fix anon_vma_prepare" thing (the one without the spinlock). They should apply independently of each other, and maybe it all even works together. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on 8 Apr 2010 10:20 On Thu, 8 Apr 2010, Borislav Petkov wrote: > > There are still issues: vma_adjust() grabs mapping->i_mmap_lock for file > mappings while we might sleep in anon_vma_prepare(): Ahh. Good catch. So I can't actually do that anon_vma_prepare() thing in __insert_vm_struct. It should be simple enough to just move it into the caller, just after it releases that lock. There's only one user of that __insert_vm_struct() anyway. You can do it yourself, or you can replace my previous patch with this.. [ The patch below also makes it warn once and return SIGBUS for the case where there is no anon_vma. I decided I still want to hear about it if there might be some path that tries to insert a vma on its own ] Linus --- mm/memory.c | 12 +++--------- mm/mmap.c | 17 ++++------------- 2 files changed, 7 insertions(+), 22 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 833952d..08d4423 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -2223,9 +2223,6 @@ reuse: gotten: pte_unmap_unlock(page_table, ptl); - if (unlikely(anon_vma_prepare(vma))) - goto oom; - if (is_zero_pfn(pte_pfn(orig_pte))) { new_page = alloc_zeroed_user_highpage_movable(vma, address); if (!new_page) @@ -2766,8 +2763,6 @@ static int do_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma, /* Allocate our own private page. */ pte_unmap(page_table); - if (unlikely(anon_vma_prepare(vma))) - goto oom; page = alloc_zeroed_user_highpage_movable(vma, address); if (!page) goto oom; @@ -2863,10 +2858,6 @@ static int __do_fault(struct mm_struct *mm, struct vm_area_struct *vma, if (flags & FAULT_FLAG_WRITE) { if (!(vma->vm_flags & VM_SHARED)) { anon = 1; - if (unlikely(anon_vma_prepare(vma))) { - ret = VM_FAULT_OOM; - goto out; - } page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address); if (!page) { @@ -3115,6 +3106,9 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma, pmd_t *pmd; pte_t *pte; + if (WARN_ONCE(!vma->anon_vma, "Mapping with no anon_vma")) + return VM_FAULT_SIGBUS; + __set_current_state(TASK_RUNNING); count_vm_event(PGFAULT); diff --git a/mm/mmap.c b/mm/mmap.c index 75557c6..82392c2 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -463,6 +463,8 @@ static void vma_link(struct mm_struct *mm, struct vm_area_struct *vma, mm->map_count++; validate_mm(mm); + + anon_vma_prepare(vma); } /* @@ -628,6 +630,8 @@ again: remove_next = 1 + (end > next->vm_end); if (mapping) spin_unlock(&mapping->i_mmap_lock); + anon_vma_prepare(vma); + if (remove_next) { if (file) { fput(file); @@ -1674,12 +1678,6 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address) if (!(vma->vm_flags & VM_GROWSUP)) return -EFAULT; - /* - * We must make sure the anon_vma is allocated - * so that the anon_vma locking is not a noop. - */ - if (unlikely(anon_vma_prepare(vma))) - return -ENOMEM; anon_vma_lock(vma); /* @@ -1720,13 +1718,6 @@ static int expand_downwards(struct vm_area_struct *vma, { int error; - /* - * We must make sure the anon_vma is allocated - * so that the anon_vma locking is not a noop. - */ - if (unlikely(anon_vma_prepare(vma))) - return -ENOMEM; - address &= PAGE_MASK; error = security_file_mmap(NULL, 0, 0, 0, address, 1); if (error) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Rik van Riel on 8 Apr 2010 14:30 On 04/08/2010 10:11 AM, Linus Torvalds wrote: > > > On Thu, 8 Apr 2010, Borislav Petkov wrote: >> >> There are still issues: vma_adjust() grabs mapping->i_mmap_lock for file >> mappings while we might sleep in anon_vma_prepare(): > > Ahh. Good catch. So I can't actually do that anon_vma_prepare() thing in > __insert_vm_struct. > > It should be simple enough to just move it into the caller, just after it > releases that lock. There's only one user of that __insert_vm_struct() > anyway. You can do it yourself, or you can replace my previous patch with > this.. > > [ The patch below also makes it warn once and return SIGBUS for the case > where there is no anon_vma. I decided I still want to hear about it if > there might be some path that tries to insert a vma on its own ] Reviewed-by: Rik van Riel <riel(a)redhat.com> I haven't seen any places that insert VMAs by itself. Several strange places that allocate them, but they all appear to use the standard functions to insert them. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on 8 Apr 2010 14:40 On Thu, 8 Apr 2010, Rik van Riel wrote: > > Reviewed-by: Rik van Riel <riel(a)redhat.com> Yeah, I think I'll commit it as-is, assuming we get confirmation that it (along with your patch) actually ends up fixing the original problem. I had actually had lockdep etc on with that patch, but for some reason I'd overlooked the SPINLOCK_SLEEP debugging, so I hadn't seen the stupid issue that Borislav pointed out. I wonder if LOCKDEP or spinlock debugging hould just select it. Small detail, but I should have caught that obvious bug myself. > I haven't seen any places that insert VMAs by itself. > Several strange places that allocate them, but they > all appear to use the standard functions to insert them. Yeah, it's complicated enough to add a vma with all the rbtree etc stuff that I hope nobody actually cooks their own. But I too grepped for vma allocations, and there were more of them than I expected, so... Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on 8 Apr 2010 19:30
On Thu, 8 Apr 2010, Borislav Petkov wrote: > > And this happens quite often - I changed the WARN_ONCE to WARN and can't > start kvm, iceowl (mozilla calendar) and the console-kit-daemon craps up > upon boot too: Hmm. I tried console-kit-daemon, which I had installed, but didn't get anything like that. Probably some setup difference. I also went through every user of 'vm_area_cachep', and saw nothing suspicious at least for the mmu case (I didn't check the nommu.c code). I must have missed something. One thing you could do is to add some more debugging info when that "no anon_vma" warning happens. In particular, if you still have the SLUB debugging on, you could try to do that page = virt_to_head_page(vma); object_err(vm_area_cachep, page, (void *)vma, "NULL anon_vma"); and it should give you _which_ routine did the kmem_cache_alloc() for the vma that doesn't have an anon_vma. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |