OOM killer, page fault [Kernel]


From: KOSAKI Motohiro on 2 Nov 2009 03:40

> On Mon, 2 Nov 2009 13:24:06 +0900 (JST)
> KOSAKI Motohiro <kosaki.motohiro(a)jp.fujitsu.com> wrote:
>
> > Hi,
> >
> > (Cc to linux-mm)
> >
> > Wow, this is a very strange log.
> >
> > > Dear all,
> > >
> > > (please Cc)
> > >
> > > With 2.6.32-rc5 I got that one:
> > > [13832.210068] Xorg invoked oom-killer: gfp_mask=0x0, order=0, oom_adj=0
> >
> > order = 0
>
> I think this problem results from 'gfp_mask = 0x0'.
> Is it possible?
>
> If it isn't a H/W problem, who passes gfp_mask as 0x0?
> That would be the culprit.
>
> Could you add BUG_ON(gfp_mask == 0x0) in __alloc_pages_nodemask's head?

No.
In the page fault case, gfp_mask shows a meaningless value. Please ignore it.
pagefault_out_of_memory() always passes gfp_mask==0 to the OOM killer.

mm/oom_kill.c
====================================
void pagefault_out_of_memory(void)
{
	unsigned long freed = 0;

	blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
	if (freed > 0)
		/* Got some memory back in the last second. */
		return;

	/*
	 * If this is from memcg, oom-killer is already invoked.
	 * and not worth to go system-wide-oom.
	 */
	if (mem_cgroup_oom_called(current))
		goto rest_and_return;

	if (sysctl_panic_on_oom)
		panic("out of memory from page fault. panic_on_oom is selected.\n");

	read_lock(&tasklist_lock);
	__out_of_memory(0, 0);       <---- here!
	read_unlock(&tasklist_lock);
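
To illustrate why the log line shows gfp_mask=0x0 and order=0, here is a
minimal userspace sketch. It is not kernel code. The message format is
copied from the log quoted in this thread; the function names
format_oom_header(), format_pagefault_oom_header() and
looks_like_pagefault_oom() are hypothetical and exist only for this
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace model of the OOM report header.
 * The format string mirrors the log line quoted in this thread.
 */
int format_oom_header(char *buf, size_t len, const char *comm,
		      unsigned int gfp_mask, int order, int oom_adj)
{
	assert(buf != NULL && comm != NULL);
	return snprintf(buf, len,
			"%s invoked oom-killer: gfp_mask=0x%x, order=%d, oom_adj=%d",
			comm, gfp_mask, order, oom_adj);
}

/*
 * Model of the page-fault path: it has no allocation context to report,
 * so it passes 0 for both gfp_mask and order, as __out_of_memory(0, 0)
 * does in the excerpt above.
 */
int format_pagefault_oom_header(char *buf, size_t len, const char *comm,
				int oom_adj)
{
	return format_oom_header(buf, len, comm, 0, 0, oom_adj);
}

/*
 * Heuristic (an assumption of this sketch, not a kernel facility):
 * a header with gfp_mask=0x0 and order=0 is consistent with an OOM
 * raised from the page-fault path.
 */
int looks_like_pagefault_oom(const char *line)
{
	return strstr(line, "gfp_mask=0x0, order=0,") != NULL;
}
```

Calling format_pagefault_oom_header(buf, sizeof(buf), "Xorg", 0) yields the
same header text as in the report, which is why gfp_mask=0x0 by itself does
not point at a broken caller of the page allocator.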

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Minchan Kim on 2 Nov 2009 03:40

On Mon, Nov 2, 2009 at 3:59 PM, KOSAKI Motohiro
<kosaki.motohiro(a)jp.fujitsu.com> wrote:
>> On Mon, 2 Nov 2009 13:24:06 +0900 (JST)
>> KOSAKI Motohiro <kosaki.motohiro(a)jp.fujitsu.com> wrote:
>>
>> > Hi,
>> >
>> > (Cc to linux-mm)
>> >
>> > Wow, this is a very strange log.
>> >
>> > > Dear all,
>> > >
>> > > (please Cc)
>> > >
>> > > With 2.6.32-rc5 I got that one:
>> > > [13832.210068] Xorg invoked oom-killer: gfp_mask=0x0, order=0, oom_adj=0
>> >
>> > order = 0
>>
>> I think this problem results from 'gfp_mask = 0x0'.
>> Is it possible?
>>
>> If it isn't a H/W problem, who passes gfp_mask as 0x0?
>> That would be the culprit.
>>
>> Could you add BUG_ON(gfp_mask == 0x0) in __alloc_pages_nodemask's head?
>
> No.
> In the page fault case, gfp_mask shows a meaningless value. Please ignore it.
> pagefault_out_of_memory() always passes gfp_mask==0 to the OOM killer.
>
>
> mm/oom_kill.c
> ====================================
> void pagefault_out_of_memory(void)
> {
> 	unsigned long freed = 0;
>
> 	blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
> 	if (freed > 0)
> 		/* Got some memory back in the last second. */
> 		return;
>
> 	/*
> 	 * If this is from memcg, oom-killer is already invoked.
> 	 * and not worth to go system-wide-oom.
> 	 */
> 	if (mem_cgroup_oom_called(current))
> 		goto rest_and_return;
>
> 	if (sysctl_panic_on_oom)
> 		panic("out of memory from page fault. panic_on_oom is selected.\n");
>
> 	read_lock(&tasklist_lock);
> 	__out_of_memory(0, 0);       <---- here!
> 	read_unlock(&tasklist_lock);
>
>

Yep. Kame already noticed it. :)
Thanks for pointing it out to me again.

I already suggested another patch.
What do you think about it?

--
Kind regards,
Minchan Kim

From: Norbert Preining on 2 Nov 2009 09:20

Hi all,

wow, many messages ... In the end I lost track of which patch I should try.

BTW, that happened only once, and whatever I do I cannot reproduce that.

I will anyway include any patch you send me and hope that it happens again.

Thanks

Norbert

-------------------------------------------------------------------------------
Dr. Norbert Preining Associate Professor
JAIST Japan Advanced Institute of Science and Technology preining(a)jaist.ac.jp
Vienna University of Technology preining(a)logic.at
Debian Developer (Debian TeX Task Force) preining(a)debian.org
gpg DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
-------------------------------------------------------------------------------
BAUMBER
A fitted elasticated bottom sheet which turns your mattress
bananashaped.
--- Douglas Adams, The Meaning of Liff

From: Minchan Kim on 2 Nov 2009 09:50

Hi.

On Mon, Nov 2, 2009 at 11:19 PM, Norbert Preining <preining(a)logic.at> wrote:
> Hi all,
>
> wow, many messages ... In the end I lost track of which patch I should try.
>
> BTW, that happened only once, and whatever I do I cannot reproduce that.
>
> I will anyway include any patch you send me and hope that it happens again.

Please forget my previous patch.
Could you test the following patch?

diff --git a/mm/memory.c b/mm/memory.c
index 7e91b5f..47e4b15 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2713,7 +2713,11 @@ static int __do_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	vmf.page = NULL;
 
 	ret = vma->vm_ops->fault(vma, &vmf);
-	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE)))
+	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE))) {
+		printk(KERN_DEBUG "vma->vm_ops->fault : 0x%lx\n",
+			(unsigned long)vma->vm_ops->fault);
+		WARN_ON(1);
 		return ret;
+	}
 
 	if (unlikely(PageHWPoison(vmf.page))) {


--
Kind regards,
Minchan Kim

From: Hugh Dickins on 2 Nov 2009 11:30

On Mon, 2 Nov 2009, Minchan Kim wrote:
> On Mon, 2 Nov 2009 14:02:16 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com> wrote:
> >
> > Maybe some code returns VM_FAULT_OOM by mistake and pagefault_out_of_memory()
> > is called. Digging into mm/memory.c is necessary...
> >
> > I wonder why... The code currently looks like this:
> > ===
> > static int do_nonlinear_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> > 		unsigned long address, pte_t *page_table, pmd_t *pmd,
> > 		unsigned int flags, pte_t orig_pte)
> > {
> > 	pgoff_t pgoff;
> >
> > 	flags |= FAULT_FLAG_NONLINEAR;
> >
> > 	if (!pte_unmap_same(mm, pmd, page_table, orig_pte))
> > 		return 0;
> >
> > 	if (unlikely(!(vma->vm_flags & VM_NONLINEAR))) {
> > 		/*
> > 		 * Page table corrupted: show pte and kill process.
> > 		 */
> > 		print_bad_pte(vma, address, orig_pte, NULL);
> > 		return VM_FAULT_OOM;
> > 	}
> >
> > 	pgoff = pte_to_pgoff(orig_pte);
> > 	return __do_fault(mm, vma, address, pmd, pgoff, flags, orig_pte);
> > }
> > ==
> > Then, OOM... is this really OOM?
>
> It seems the goal is to kill the process via an OOM trick, as the comment says.
>
> I found that it comes from Hugh's commit 65500d234e74fc4e8f18e1a429bc24e51e75de4a.
> I think it's not a real OOM.
>
> BTW, if it is the culprit in this case, print_bad_pte should have left some log. :)

Yes, the chances are that this is not related to Norbert's problem.
But thank you for reminding me of that not-very-nice hack of mine.

It was kind-of valid at the time that I wrote it (2.6.15), when
VM_FAULT_OOM did kill the faulting process. But since then the fault
path has rightly been changed (in x86 at least, I didn't check the rest)
to let the OOM killer decide who to kill: so now there's a danger that
a pagetable corruption there will instead kill some unrelated process.

Being lazy, I'm inclined simply to change that to VM_FAULT_SIGBUS now:
which doesn't actually guarantee that the process will be killed, but
should be better than just repeatedly re-faulting on the entry. (I
don't much want to SIGKILL current since mm might not be current's.)
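
The difference between the two return codes can be sketched in userspace
as follows. This is only a model under stated assumptions, not the arch
fault handler: the MODEL_* names, the flag values and both functions are
hypothetical, chosen to show that an OOM result hands victim selection to
the OOM killer while a SIGBUS result stays with the faulting task.

```c
#include <assert.h>

/* Illustrative flag values for this model only. */
enum model_fault {
	MODEL_FAULT_OOM    = 0x1,
	MODEL_FAULT_SIGBUS = 0x2,
};

/* What the (modelled) fault handler does with a result code. */
enum model_action {
	MODEL_RETURN_TO_USER,     /* fault handled, nothing else to do */
	MODEL_RUN_OOM_KILLER,     /* OOM killer picks a victim, possibly unrelated */
	MODEL_SIGBUS_TO_FAULTER,  /* signal goes to the task that faulted */
};

enum model_action model_fault_action(unsigned int ret)
{
	if (ret & MODEL_FAULT_OOM)
		return MODEL_RUN_OOM_KILLER;
	if (ret & MODEL_FAULT_SIGBUS)
		return MODEL_SIGBUS_TO_FAULTER;
	return MODEL_RETURN_TO_USER;
}

/*
 * Result reported for a corrupted page table entry.
 * use_sigbus == 0 models the current behaviour described above,
 * use_sigbus != 0 models the proposed change.
 */
unsigned int model_bad_pte_result(int use_sigbus)
{
	return use_sigbus ? MODEL_FAULT_SIGBUS : MODEL_FAULT_OOM;
}
```

In this model, model_fault_action(model_bad_pte_result(0)) ends in the OOM
killer, while model_fault_action(model_bad_pte_result(1)) signals only the
faulting task.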

That aberrant use of VM_FAULT_OOM has recently been copied into
do_swap_page() (the first instance; the second instance is right -
hmm, well, the second instance is normally right, but I guess it
also covers pagetable corruption cases which we can't distinguish
there; oh well) and should be corrected there too.

Does VM_FAULT_SIGBUS sound good enough to you?

Hugh
