From: David Rientjes on 2 Mar 2010 16:20 On Tue, 2 Mar 2010, Andi Kleen wrote: > The patch looks far more complicated than my simple fix. > > Is more complicated now better? > If you still believe these are "fixes," then perhaps you don't fully understand the issue: slab completely lacked memory hotplug support when a node is either being onlined or offlined that do not have hotadded or hotremoved cpus. It's as simple as that. To be fair, my patch may appear more complex because it implements full memory hotplug support so that the nodelists are properly drained and freed when the same memory regions you onlined for memory hot-add are now offlined. Notice, also, how it touches no other slab code as implementing new support for something shouldn't. There is no need for additional hacks to be added in other slab code if you properly allocate and initialize the nodelists for the memory being added before it is available for use by the kernel. If you'd test my patch out on your setup, that would be very helpful. I can address any additional issues that you may undercover if you post the oops while doing either memory online or offline. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: KAMEZAWA Hiroyuki on 2 Mar 2010 20:40 On Tue, 2 Mar 2010 14:20:06 -0600 (CST) Christoph Lameter <cl(a)linux-foundation.org> wrote: > > Not sure how this would sync with slab use during node bootstrap and > shutdown. Kame-san? > > Otherwise > > Acked-by: Christoph Lameter <cl(a)linux-foundation.org> > What this patch fixes ? Maybe I miss something... At node hot-add * pgdat is allocated from other node (because we have no memory for "nid") * memmap for the first section (and possiby others) will be allocated from other nodes. * Once a section for the node is onlined, any memory can be allocated localy. (Allocating memory from local node requires some new implementation as bootmem allocater, we didn't that.) Before this patch, slab's control layer is allocated by cpuhotplug. So, at least keeping this order, memory online -> cpu online slab's control layer is allocated from local node. When node-hotadd is done in this order cpu online -> memory online kmalloc_node() will allocate memory from other node via fallback. After this patch, slab's control layer is allocated by memory hotplug. Then, in any order, slab's control will be allocated via fallback routine. If this patch is an alternative fix for Andi's this logic == Index: linux-2.6.32-memhotadd/mm/slab.c =================================================================== --- linux-2.6.32-memhotadd.orig/mm/slab.c +++ linux-2.6.32-memhotadd/mm/slab.c @@ -4093,6 +4093,9 @@ static void cache_reap(struct work_struc * we can do some work if the lock was obtained. */ l3 = searchp->nodelists[node]; + /* Note node yet set up */ + if (!l3) + break; == I'm not sure this really happens. cache_reap() is for checking local node. The caller is set up by CPU_ONLINE. searchp->nodelists[] is filled by CPU_PREPARE. Then, cpu for the node should be onlined. (and it's done under proper mutex.) I'm sorry if I miss something important. But how anyone cause this ? Thanks, -Kame -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: David Rientjes on 2 Mar 2010 21:40 On Wed, 3 Mar 2010, KAMEZAWA Hiroyuki wrote: > At node hot-add > > * pgdat is allocated from other node (because we have no memory for "nid") > * memmap for the first section (and possiby others) will be allocated from > other nodes. > * Once a section for the node is onlined, any memory can be allocated localy. > Correct, and the struct kmem_list3 is also alloacted from other nodes with my patch. > (Allocating memory from local node requires some new implementation as > bootmem allocater, we didn't that.) > > Before this patch, slab's control layer is allocated by cpuhotplug. > So, at least keeping this order, > memory online -> cpu online > slab's control layer is allocated from local node. > > When node-hotadd is done in this order > cpu online -> memory online > kmalloc_node() will allocate memory from other node via fallback. > > After this patch, slab's control layer is allocated by memory hotplug. > Then, in any order, slab's control will be allocated via fallback routine. > Again, this addresses memory hotplug that requires a new node to be onlined that do not have corresponding cpus that are being onlined. On x86, these represent ACPI_SRAT_MEM_HOT_PLUGGABLE regions that are onlined either by the acpi hotplug or done manually with CONFIG_ARCH_MEMORY_PROBE. On other architectures such as powerpc, this is done in different ways. All of this is spelled out in the changelog for the patch. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: KAMEZAWA Hiroyuki on 2 Mar 2010 22:00 On Tue, 2 Mar 2010 18:39:20 -0800 (PST) David Rientjes <rientjes(a)google.com> wrote: > On Wed, 3 Mar 2010, KAMEZAWA Hiroyuki wrote: > > > At node hot-add > > > > * pgdat is allocated from other node (because we have no memory for "nid") > > * memmap for the first section (and possiby others) will be allocated from > > other nodes. > > * Once a section for the node is onlined, any memory can be allocated localy. > > > > Correct, and the struct kmem_list3 is also alloacted from other nodes with > my patch. > > > (Allocating memory from local node requires some new implementation as > > bootmem allocater, we didn't that.) > > > > Before this patch, slab's control layer is allocated by cpuhotplug. > > So, at least keeping this order, > > memory online -> cpu online > > slab's control layer is allocated from local node. > > > > When node-hotadd is done in this order > > cpu online -> memory online > > kmalloc_node() will allocate memory from other node via fallback. > > > > After this patch, slab's control layer is allocated by memory hotplug. > > Then, in any order, slab's control will be allocated via fallback routine. > > > > Again, this addresses memory hotplug that requires a new node to be > onlined that do not have corresponding cpus that are being onlined. On > x86, these represent ACPI_SRAT_MEM_HOT_PLUGGABLE regions that are onlined > either by the acpi hotplug or done manually with CONFIG_ARCH_MEMORY_PROBE. > On other architectures such as powerpc, this is done in different ways. > > All of this is spelled out in the changelog for the patch. > Ah, ok. for cpu-less node and kmallco_node() against that node. Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com> Thanks, -Kame -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Andi Kleen on 3 Mar 2010 09:40
> > The patch looks far more complicated than my simple fix. > > I wouldn't exactly call the fallback_alloc() games "simple". I have to disagree on that. It was the most simple fix I could come up with, least intrusive to legacy like slab is. > > Is more complicated now better? > > Heh, heh. You can't post the oops, you don't want to rework your The missing oops was about the timer race, not about this one. > patches as per review comments, and now you complain about David's > patch without one bit of technical content. I'm sorry but I must Well sorry I'm just a bit frustrated about the glacial progress on what should be relatively straight forward fixes. IMHO something like my patch should have gone into .33 and any more complicated reworks like this into .34. > But anyway, if you have real technical concerns over the patch, please > make them known; otherwise I'd much appreciate a Tested-by tag from > you for David's patch. If it works it would be ok for me. The main concern would be to actually get it fixed. -Andi -- ak(a)linux.intel.com -- Speaking for myself only. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |