From: Tony Luck on 4 Jun 2010 15:50

> I've fixed it (by reversing the order of those lines) for tomorrow's
> linux-next.

Somewhere between next-20100602 and next-20100604 something was changed
that results in ia64 taking a NULL dereference oops in sys_init_module() ...

Freeing unused kernel memory: 1984kB freed
modprobe[1851]: NaT consumption 2216203124768 [1]
Modules linked in:

Pid: 1851, CPU 2, comm: modprobe
psr : 0000121008526030 ifs : 8000000000000794 ip : [<a0000001000f0f31>] Not tainted (2.6.35-rc1-generic-smp-next-20100604)
ip is at sys_init_module+0x131/0x420

At the point of dereference it looks like we were trying
to load a 4-byte data object from offset 552 into the
"struct module *" that was returned by load_module().

-Tony
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Linus Torvalds on 4 Jun 2010 16:10

On Fri, 4 Jun 2010, Tony Luck wrote:
>
> At the point of dereference it looks like we were trying
> to load a 4-byte data object from offset 552 into the
> "struct module *" that was returned by load_module().

Sounds like 'mod->num_ctors' loaded by do_mod_ctors(). It's a 4-byte field
in roughly that area.

What does a NaT consumption fault mean, and does it give the invalid
address it was loaded off?

In the successful path of "load_module()", we will have dereferenced the
"mod" pointer we return just before, so I wonder if there's some error
case that incorrectly returns a positive errno instead of a negative one,
and causes us to miss the "IS_ERR()" check or something.

There's a couple of checking routines in module.c that do not return a
negative error, but instead return 0/1. The one I looked at was converted
into a negative error, but there are several cases of

	if (err)
		return ERR_PTR(err)

and if something does that on a 0/1 value, it will return a bogus pointer.

		Linus
From: Luck, Tony on 4 Jun 2010 16:50

> What does a NaT consumption fault mean, and does it give the invalid
> address it was loaded off?

This almost always means that we dereferenced a NULL pointer ... though
any access into the bottom PAGE_SIZE of kernel virtual address space
will result in this trap. This happens on ia64 because we have a "NaT"
page mapped at 0x0 so that speculative loads that chase NULL pointers
at the end of lists behave more rationally.

Sadly I don't have the actual address. The register that was used
for the dereference isn't included in the OOPS output.

-Tony
From: Linus Torvalds on 4 Jun 2010 18:20

On Fri, 4 Jun 2010, Luck, Tony wrote:
>
> This almost always means that we dereferenced a NULL pointer ... though
> any access into the bottom PAGE_SIZE of kernel virtual address space
> will result in this trap. This happens on ia64 because we have a "NaT"
> page mapped at 0x0 so that speculative loads that chase NULL pointers
> at the end of lists behave more rationally.
>
> Sadly I don't have the actual address. The register that was used
> for the dereference isn't included in the OOPS output.

Ok, so it confirms just that load_module() has returned a pointer that is
either NULL or at least within PAGE_SIZE-552.

It could be a negative error pointer (and the offset of 552 turns it into
the NULL page), but that's what the whole IS_ERR() thing checks for, so
that's not the case.

So the

	if (err)
		return ERR_PTR(err);

case does seem pretty likely (most of them with a "goto <error-case>", but
some directly). Many of them have the stricter form of "if (err < 0)", but
there's a number that do not.

And in fact, I think I see the bad one:

	/* Figure out module layout, and allocate all the memory. */
	mod = layout_and_allocate(&info);
	if (IS_ERR(mod))
		goto free_copy;

which looks fine, but "free_copy:" expects the error number in "err",
which is what the other error cases do.

I think this was introduced by Rusty's commit 5d3f5be82944 ("module:
layout_and_allocate"), and here's a suggested fix..

The easiest fix is to actually change the "free_copy" target to return
"mod" as the above goto expects, and then just do a conversion before the
fall-through from the other error cases (that have it in 'err').

Does this fix it? I stopped looking for other possible causes when I found
this one.
		Linus

---
 kernel/module.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index 69a3f12..9a0b275 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2653,9 +2653,10 @@ static struct module *load_module(void __user *umod,
 	module_unload_free(mod);
  free_module:
 	module_deallocate(mod, &info);
+	mod = ERR_PTR(err);
  free_copy:
 	free_copy(&info);
-	return ERR_PTR(err);
+	return mod;
 }
 
 /* Call module constructors. */
From: Luck, Tony on 4 Jun 2010 19:00

> Does this fix it? I stopped looking for other possible causes when I found
> this one.

It gets rid of the oops. So that's good. Something is still
hokey in linux-next land though because no modules get loaded.
So no ehci/uhci available :-(

No obvious looking error messages on the console.

-Tony