From: Shaohui Zheng on 13 May 2010 08:20

This email appeared to be lost when I checked the LKML, so I am resending
it. Sorry if it is duplicated.

Hi, All

This patchset introduces a NUMA hotplug emulator for x86. It touches many
files and might introduce new bugs, so we are sending an RFC to the
community first and expect comments and suggestions. Thanks.

* WHAT IS HOTPLUG EMULATOR

"NUMA hotplug emulator" is the collective name for the hotplug emulation.
It is able to emulate NUMA node hotplug purely in software. It is intended
to help people easily debug and test node/cpu/memory hotplug related code
on a machine without NUMA hotplug support, even a UMA machine.

The emulator provides a mechanism to emulate the process of physical
cpu/mem hot-add, which makes it possible for kernel developers to debug CPU
and memory hotplug on machines without NUMA support. It offers an interface
for CPU and memory hotplug testing.

* WHY DO WE USE HOTPLUG EMULATOR

We have been focusing on hotplug emulation for a few months. The emulator
helps the team reproduce all the major hotplug bugs. It plays an important
role in hotplug code quality assurance. Because of the hotplug emulator, we
have already moved most of the debugging work to a virtual environment.
We send it to

* EXPECT BUGS

This is the first version sent to the community, but it is already the 3rd
version internally. It is expected to have bugs.

OPEN: The kernel might use part of the hidden memory region as a RAM
buffer. For now the emulator simply hides 128M of extra space to work
around this issue. Is there a better way to avoid this conflict? We expect
a better solution from the community (for patch 002).

* Principles & Usages

The NUMA hotplug emulator includes 3 different parts. We add a menu item to
menuconfig to enable/disable them
(refer to http://shaohui.org/images/hpe-krnl-cfg.jpg).

1) Node hotplug emulation:

The emulator first hides RAM via the E820 table, and then it can fake
offlined nodes with the hidden RAM. After system bootup, the user is able
to hotplug-add these offlined nodes, which is similar to the behavior of
real hotplug hardware.

Use the boot option "numa=hide=N*size" to fake offlined nodes:
 - N is the number of hidden nodes
 - size is the memory size (in MB) per hidden node.

There is a sysfs entry "probe" under /sys/devices/system/node/ for the user
to hotplug the fake offlined nodes:

 - to show all fake offlined nodes:
    $ cat /sys/devices/system/node/probe

 - to hotadd a fake offlined node, e.g. nodeid is N:
    $ echo N > /sys/devices/system/node/probe

2) CPU hotplug emulation:

The emulator reserves CPUs through a grub parameter. The reserved CPUs can
be hot-added/hot-removed in software, which emulates the process of
physical cpu hotplug.

 - to hide CPUs
    - Use the boot option "maxcpus=N" to hide CPUs.
      N is the number of initialized CPUs.
    - Use the boot option "cpu_hpe=on" to enable cpu hotplug emulation.
      When cpu_hpe is enabled, the remaining CPUs will not be initialized.

 - to hot-add a CPU to a node
    $ echo nid > cpu/probe

 - to hot-remove a CPU
    $ echo nid > cpu/release

3) Memory hotplug emulation:

The emulator reserves memory before the OS boots. The reserved memory
region is removed from the e820 table, and it can be hot-added via the
probe interface. This interface was extended to support adding memory to a
specified node, and it maintains backwards compatibility.

The difficulty of memory release is well known, and we have no plan for it
so far.

 - reserve memory through the grub parameter
    mem=1024m

 - add a memory section to node 3
    $ echo 0x40000000,3 > memory/probe
   OR
    $ echo 1024m,3 > memory/probe

* ACKNOWLEDGMENT

The hotplug emulator is the result of a team's efforts. Thanks to all of
them. They are:
Andi Kleen, Haicheng Li, Shaohui Zheng, Fengguang Wu and Yongkang You

--
Thanks & Regards,
Shaohui

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
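
The two memory probe forms in the email above, 0x40000000 and 1024m, name the same address: with mem=1024m the reserved memory starts at 1024 * 1024 * 1024 bytes. The sketch below only illustrates that arithmetic, plus the total RAM hidden by a "numa=hide=N*size" option. It is not part of the patchset; the function names and the k/m/g suffix handling are assumptions made for illustration.

```shell
#!/bin/sh
# Convert a size such as "1024m" to bytes. The accepted suffixes (k, m, g)
# are an assumption; the patch's own parser may differ.
size_to_bytes() {
    num=${1%[kKmMgG]}
    case $1 in
        *[kK]) echo $((num * 1024)) ;;
        *[mM]) echo $((num * 1024 * 1024)) ;;
        *[gG]) echo $((num * 1024 * 1024 * 1024)) ;;
        *)     echo $((num)) ;;
    esac
}

# Hex address at which memory reserved by "mem=SIZE" starts.
probe_addr() {
    printf '0x%x\n' "$(size_to_bytes "$1")"
}

# Total RAM in MB hidden by "numa=hide=N*size" (size is in MB per node).
hide_total_mb() {
    spec=${1#numa=hide=}
    nodes=${spec%%\**}
    size=${spec#*\*}
    echo $((nodes * size))
}

probe_addr 1024m                  # prints 0x40000000
hide_total_mb 'numa=hide=2*512'   # prints 1024
```

The functions use plain POSIX sh variables (no "local"), so they overwrite num, spec, nodes and size in the calling shell.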
From: Greg KH on 13 May 2010 13:00

On Thu, May 13, 2010 at 07:48:35PM +0800, Shaohui Zheng wrote:
> Userland interface to hotplug-add fake offlined nodes.

Why include 2 copies of the patch in one email?

> Add a sysfs entry "probe" under /sys/devices/system/node/:
>
>  - to show all fake offlined nodes:
>     $ cat /sys/devices/system/node/probe
>
>  - to hotadd a fake offlined node, e.g. nodeid is N:
>     $ echo N > /sys/devices/system/node/probe

As you are trying to add a new sysfs file, please create the matching
Documentation/ABI/ file as well.

Also note that sysfs files are "one value per file", which I don't think
this file follows, right?

thanks,

greg k-h
From: Dave Hansen on 13 May 2010 14:00

On Thu, 2010-05-13 at 09:55 -0700, Greg KH wrote:
> > Add a sysfs entry "probe" under /sys/devices/system/node/:
> >
> >  - to show all fake offlined nodes:
> >     $ cat /sys/devices/system/node/probe
> >
> >  - to hotadd a fake offlined node, e.g. nodeid is N:
> >     $ echo N > /sys/devices/system/node/probe
>
> As you are trying to add a new sysfs file, please create the matching
> Documentation/ABI/ file as well.
>
> Also note that sysfs files are "one value per file", which I don't think
> this file follows, right?

I think in this case, it was meant to be a list of acceptable parameters
rather than a set of values, kinda like /sys/power/state. Instead, I guess
we could have:

	/sys/devices/system/node/probeable/3
	/sys/devices/system/node/probeable/43
	/sys/devices/system/node/probeable/65
	/sys/devices/system/node/probeable/5145

and the knowledge that you need to pick one of those to echo into
/sys/devices/system/node/probe. But, it's a lot more self explanatory if
you 'cat /sys/devices/system/node/probe', and then pick one of those to
echo back into the file.

Seems like a decent place to violate the "rule". :)

-- Dave
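
To make the "cat it, then echo one back" workflow concrete, here is a minimal sketch of a wrapper that only writes a node id if the probe file lists it. It assumes the file prints the ids separated by whitespace, which the thread does not specify. The function names and the overridable PROBE variable are hypothetical, added so the logic can be tried against a plain file.

```shell
#!/bin/sh
# Path of the probe file; overridable so the functions can be exercised
# against an ordinary file instead of sysfs.
PROBE=${PROBE:-/sys/devices/system/node/probe}

# Succeed if node id $1 appears as a whole word in the probe listing.
# Assumes whitespace-separated ids (not confirmed by the patch).
node_is_probeable() {
    for id in $(cat "$PROBE" 2>/dev/null); do
        [ "$id" = "$1" ] && return 0
    done
    return 1
}

# Write the node id back into the probe file only if it was listed.
hotadd_node() {
    if node_is_probeable "$1"; then
        echo "$1" > "$PROBE"
    else
        echo "node $1 is not listed in $PROBE" >&2
        return 1
    fi
}
```

With the probeable/ directory layout instead, the check would become a test for the existence of probeable/N, but the user would then need to know about two separate paths.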
From: Dave Hansen on 13 May 2010 14:10
On Thu, 2010-05-13 at 09:56 -0700, Greg KH wrote:
> On Thu, May 13, 2010 at 08:00:16PM +0800, Shaohui Zheng wrote:
> > hotplug emulator: extend memory probe interface to support NUMA
> >
> > Extend memory probe interface to support an extra parameter nid,
> > the reserved memory can be added into this node if node exists.
> >
> > Add a memory section (128M) to node 3 (boots with mem=1024m)
> >
> > 	echo 0x40000000,3 > memory/probe

I dunno. If we're going to put multiple values into the file now and add
to the ABI, can we be more explicit about it?

	echo "physical_address=0x40000000 numa_node=3" > memory/probe

I'd *GREATLY* prefer that over this new syntax. The existing mechanism is
obtuse enough, and the ',3' makes it more so. We should have the code
around to parse arguments like that, too, since we use it for the boot
command-line.

-- Dave
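
As a rough user-space illustration of the key=value syntax proposed above, the sketch below parses such a string and maps it back onto the comma form from the patch. It is not kernel code and not the boot command-line parser. The function names are hypothetical, and treating numa_node as optional is an assumption based on the patch's stated backwards compatibility.

```shell
#!/bin/sh
# Parse "physical_address=ADDR numa_node=NID" into $phys_addr and $nid.
# Keys may come in any order; unknown keys are rejected.
parse_probe_args() {
    phys_addr=
    nid=
    for arg in $1; do
        key=${arg%%=*}
        val=${arg#*=}
        case $key in
            physical_address) phys_addr=$val ;;
            numa_node)        nid=$val ;;
            *) echo "unknown key: $key" >&2; return 1 ;;
        esac
    done
    [ -n "$phys_addr" ] || { echo "physical_address is required" >&2; return 1; }
}

# Print the equivalent "ADDR,NID" form, or just "ADDR" when no node is given.
to_comma_form() {
    parse_probe_args "$1" || return 1
    if [ -n "$nid" ]; then
        echo "$phys_addr,$nid"
    else
        echo "$phys_addr"
    fi
}

to_comma_form 'physical_address=0x40000000 numa_node=3'   # prints 0x40000000,3
```

Named keys make the order irrelevant and let a parser reject anything it does not recognise, which is the explicitness argued for in the reply.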