Prev: [PATCH] Documentation: Add doc about aerdriver.* parameters to kernel-parameters.txt
Next: [PATCH] Allow empty e820 map if EFI map will be provided later.
From: Oleg Nesterov on 2 Jun 2010 11:50 (add Roland) On 06/02, KOSAKI Motohiro wrote: > > > Otoh, if we make do_coredump() interruptible (and we should do this > > in any case), then perhaps the TIF_MEMDIE+PF_COREDUMP is not really > > needed? Afaics we always send SIGKILL along with TIF_MEMDIE. > > How is to make per-process oom flag + interruptible coredump? > > this per-process oom flag can be used vmscan shortcut exiting too. > (IOW, It can help DavidR mmap_sem issue) Firstly, this solution is not complete. We should make it really interruptible (from user-space too), but we need more changes for this (in particular we need to distinguish group-exit/exec cases from the explicit SIGKILL case). Let's not discuss this here, this is the different story. But. I agree very much that it makes sense to add the quick fix right now. Even if this fix will be superseded by the "proper" fixes later. > --- a/fs/binfmt_elf.c > +++ b/fs/binfmt_elf.c > @@ -2038,6 +2038,11 @@ static int elf_core_dump(struct coredump_params *cprm) > page_cache_release(page); > } else > stop = !dump_seek(cprm->file, PAGE_SIZE); > + > + /* Now, The process received OOM. Exit soon! */ > + if (current->signal->oom_victim) > + stop = 1; Agreed, most problems with memory allocations should come from this loop. > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -544,6 +544,9 @@ struct signal_struct { > int notify_count; > struct task_struct *group_exit_task; > > + /* true mean the process is OOM-killer victim. */ > + bool oom_victim; Well, the new word in signal_struct is not nice. It is better to set SIGNAL_OOM_XXX in ->signal->flags (this needs ->siglock). But. I don't think that signal_struct is the right place for the marker. The thread which actually dumps the core doesn't necessarily belong to the same thread group, but it can share ->mm with the selected oom victim. IOW, we should mark ->mm instead (perhaps mm->flags) or mm->core_state. This in turn means we need find_lock_task_mm(). What do you think? Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Roland McGrath on 2 Jun 2010 13:40 Why not just test TIF_MEMDIE? Thanks, Roland -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Oleg Nesterov on 2 Jun 2010 14:00 On 06/02, Roland McGrath wrote: > > Why not just test TIF_MEMDIE? Because it is per-thread. when select_bad_process() finds the task P to kill it can participate in the core dump (sleep in exit_mm), but we should somehow inform the thread which actually dumps the core: P->mm->core_state->dumper. Well, we can use TIF_MEMDIE if we chose the right thread, I think. But perhaps mm->flags |= MMF_OOM is better, it can have other user. I dunno. Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Roland McGrath on 2 Jun 2010 15:00 > Because it is per-thread. I see. > when select_bad_process() finds the task P to kill it can participate > in the core dump (sleep in exit_mm), but we should somehow inform the > thread which actually dumps the core: P->mm->core_state->dumper. Perhaps it should simply do that: if you would choose P to oom-kill, and P->mm->core_state!=NULL, then choose P->mm->core_state->dumper instead. > Well, we can use TIF_MEMDIE if we chose the right thread, I think. > But perhaps mm->flags |= MMF_OOM is better, it can have other user. > I dunno. This is all the quick hack before get around to just making core dumping fully-interruptible, no? So we should go with whatever is the simplest change now. Perhaps this belongs in another thread as you suggested. But I wonder what we might get just from s/TASK_UNINTERRUPTIBLE/TASK_KILLABLE/ in exit_mm. Thanks, Roland -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Oleg Nesterov on 2 Jun 2010 16:50
On 06/02, Roland McGrath wrote: > > > when select_bad_process() finds the task P to kill it can participate > > in the core dump (sleep in exit_mm), but we should somehow inform the > > thread which actually dumps the core: P->mm->core_state->dumper. > > Perhaps it should simply do that: if you would choose P to oom-kill, and > P->mm->core_state!=NULL, then choose P->mm->core_state->dumper instead. .... to set TIF_MEMDIE which should be checked in elf_core_dump(). Probably yes. > > Well, we can use TIF_MEMDIE if we chose the right thread, I think. > > But perhaps mm->flags |= MMF_OOM is better, it can have other user. > > I dunno. > > This is all the quick hack before get around to just making core dumping > fully-interruptible, no? So we should go with whatever is the simplest > change now. Yes. > Perhaps this belongs in another thread as you suggested. But I wonder what > we might get just from s/TASK_UNINTERRUPTIBLE/TASK_KILLABLE/ in exit_mm. Oh. This needs more thinking. Definitely the task sleeping in exit_mm() must not exit until core_state->dumper->thread returns from do_coredump(). If nothing else, the dumper can use its task_struct and it relies on the stable core_thread->next list. And right now TASK_KILLABLE can't work anyway, it is possible that fatal_signal_pending() is true. But perhaps we can do something later. Assuming that do_coredump() is interruptible, TASK_KILLABLE can make the difference only if the dumper belongs to another thread-group. Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |