From: Chetan Loke on 30 Jun 2010 17:30

Hello Dmitry,

On Wed, Jun 30, 2010 at 3:27 PM, Dmitry Torokhov <dtor(a)vmware.com> wrote:
> Hi Chetan,
>
> On Wednesday, June 30, 2010 11:42:53 am Chetan Loke wrote:
>> Q1)Does vmtools handle pvscsi correctly?
>>
>
> Yes, as long as it compiled as a module or installer will not overwrite
> distribution-supplied version unless user explicitly requests installer
> to clobber it.
>

perfect.

> So far distributions have not tried building their kernels with pvscsi
> or vmxnet3 built-in, but did so with our ballon driver, which prompted
> this particular change.
>

We are building ISOs which will then be used to build/create an ESX
appliance. So we would need the pvscsi driver from the start. vNICs
will be populated post-install, at which point vmxnet[2/3] will
kick in via vmtools.

>> Q2)In case if a VM wants to be a good citizen, is there a way for a
>> guest to know about the balloon-event?
>
> I am not sure I follow. Ballooning supposed to be as transparent as
> possible...
>

This is too product specific. I will send you an email separately.

>> Q3)What if an app mlock's its memory resources and driver's have
>> pinned down their pages then how does inflation work?
>
> We will inflate as much as we can. Obviously if there are no more
> memory balloon may not grow to its full target size.
>
> Balloon driver communicates to the hypervisor the total amount of
> memory in the guest, we may want to adjust that number by subtracting
> memory allocated by the kernel, mlocked memory and so on, but it is
> not done currently.

Ok.

I'm stuck with one question -

A) Ballooning will trigger the guest's native memory management policy.
A.1) So this could mean the guest might swap its pages to its vdisk,
correct?

Consider this setup -
B) VM1..VMn have backing store (data and OS partitions) on LUNs (SAN).
Further, data LUNs are mounted as RDMs. I chose RDMs just to keep it
simple.
C) Say there's memory pressure. How? Well, a few VMs are blasting I/O
to the LUNs. Plus, a backup triggered. Plus, whatever else happened.
   C.1) VMs now seem to need more and more memory.
   C.2) The hypervisor's block-layer/other-layers also need more memory.
   C.3) The hypervisor's memory-management algorithm kicks in.
   ......
   C.3.x) Ballooning triggers - now some VMs (excluding the ones
   from C.1) are giving up memory and if A.1) above is true then the
   guest's pages will be swapped out to the LUNs via the
   hypervisor's SCSI-LLDD. But look at C.2) above. Is
   this a soft-deadlock?

Oh, it's a linux-guest and if C.1) times out then the guest will send
aborts and eventually a LUN reset ;).

In this particular case, if my suspicion is valid and if all the
signatures match (swap is out on the SAN, block-congestion etc) then
the balloon driver could just bail out.

> Thanks.
> --
> Dmitry

Thanks
Chetan Loke
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
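
The adjustment Dmitry mentions in the quoted Q3 answer (reporting total guest memory minus kernel-allocated and mlocked memory, which the driver does not do currently) can be illustrated with a small user-space sketch. This is not driver code. It assumes /proc/meminfo-style text as input, and the choice of fields to subtract (`UNBALLOONABLE_FIELDS`) is a hypothetical one made only for illustration.

```python
# Sketch only: estimate how much guest memory a balloon could plausibly
# reclaim by subtracting memory that cannot be given up.
# UNBALLOONABLE_FIELDS is an assumed, illustrative selection of
# /proc/meminfo fields, not something taken from the balloon driver.
UNBALLOONABLE_FIELDS = ("Mlocked", "Slab", "KernelStack", "PageTables")


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: value_in_kB}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def balloonable_kb(meminfo):
    """Total memory minus the fields assumed to be unreclaimable."""
    pinned = sum(meminfo.get(f, 0) for f in UNBALLOONABLE_FIELDS)
    return max(meminfo["MemTotal"] - pinned, 0)


# Made-up sample values, chosen only to keep the arithmetic easy to follow.
SAMPLE = """\
MemTotal:        4096000 kB
MemFree:          512000 kB
Mlocked:          200000 kB
Slab:             150000 kB
KernelStack:        8000 kB
PageTables:        42000 kB
"""

if __name__ == "__main__":
    print(balloonable_kb(parse_meminfo(SAMPLE)), "kB")
```

On a live guest the same functions could be fed the contents of /proc/meminfo; which fields really count as unreclaimable is a policy question the thread leaves open.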
From: Dmitry Torokhov on 30 Jun 2010 17:40

On Wednesday, June 30, 2010 02:26:40 pm Chetan Loke wrote:
> Hello Dmitry,
>
> On Wed, Jun 30, 2010 at 3:27 PM, Dmitry Torokhov <dtor(a)vmware.com> wrote:
> > Hi Chetan,
> >
> > On Wednesday, June 30, 2010 11:42:53 am Chetan Loke wrote:
> >> Q1)Does vmtools handle pvscsi correctly?
> >
> > Yes, as long as it compiled as a module or installer will not overwrite
> > distribution-supplied version unless user explicitly requests installer
> > to clobber it.
>
> perfect.
>
> > So far distributions have not tried building their kernels with pvscsi
> > or vmxnet3 built-in, but did so with our ballon driver, which prompted
> > this particular change.
>
> We are building iso's which will then be used to build/create an ESX
> appliance. So we would need the pvscsi driver from the start.

Well, with a typical setup, even though pvscsi is a module, as long as it
is in the initramfs it will still be loaded automatically. If you are
building a truly custom appliance and require pvscsi built-in you'll have
to modify the tools installer script.

> vNICs
> will be populated post-install. At which point vmxnet[2/3] will
> kick-in via vmtools.

Depending on what you base your appliance on, vmxnet3 might already be in
the kernel along with pvscsi.

>
> >> Q2)In case if a VM wants to be a good citizen, is there a way for a
> >> guest to know about the balloon-event?
> >
> > I am not sure I follow. Ballooning supposed to be as transparent as
> > possible...
>
> This is too product specific. I will send you an email separately.
>

OK.

> >> Q3)What if an app mlock's its memory resources and driver's have
> >> pinned down their pages then how does inflation work?
> >
> > We will inflate as much as we can. Obviously if there are no more
> > memory balloon may not grow to its full target size.
> >
> > Balloon driver communicates to the hypervisor the total amount of
> > memory in the guest, we may want to adjust that number by subtracting
> > memory allocated by the kernel, mlocked memory and so on, but it is
> > not done currently.
>
> Ok.
>
> I'm stuck with one question -
>
> A) Ballooning will trigger guest's native memory management policy.
> A.1) So this could mean guest might swap it's pages on it's vdisk,
> correct?
>

Yes.

> Consider this setup -
> B) VM1..VMn have backing store(data and OS partitions) on LUNs(SAN).
> Further, data LUNs are mounted as RDMs. I chose RDMs just to keep it
> simple.
> C) Say there's memory pressure. How? Well, few VM's are blasting I/O
> to the LUNs. Plus, a backup triggered. Plus, whatever else happened.
>    C.1) VM's now seem to need more and more memory.
>    C.2) hypervisors block-layer/other-layers also need more memory.
>    C.3) Hypervisor's memory-management algorithm kicks-in.
>    ......
>    C.3.x) Ballooning triggers - now some VM's (excluding the ones
>    from C.1) are giving up memory and if A.1) above is true then the
>    guest's pages will be swapped out on the LUNs via
>    hypervisor's SCSI-LLDD. But look at C.2) above. Is
>    this a soft-deadlock?
>

If there is no memory, something will have to give. If you look at the
balloon driver you will see that when it switches from non-sleeping to
sleeping allocations, or otherwise starts getting allocation errors, it
will throttle the inflation rate to give the box a "breather" and not
choke it completely right then and there.

> Oh, it's a linux-guest and if C.1) timesout then the guest will send
> aborts and eventually a LUN reset ;).
>
> In this particular case, if my suspicion is valid and if all the
> signatures match(swap is out on the SAN, block-congestion etc) then
> the balloon driver could just bail out.
>

Yes, it is not guaranteed that the balloon will reach its target, and in
that case the host itself might start swapping, causing severe performance
issues. Realistically it all boils down to this: even though you may
overcommit, you still have to adequately provision your hosts so they
can handle the load.

Thanks.

--
Dmitry
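
The throttling behaviour Dmitry describes (start with non-sleeping allocations, fall back to sleeping ones, and slow the inflation rate whenever allocations come up short) can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the vmware_balloon implementation: the halving policy, the `start_rate` and `min_rate` values, and the `Pool` stand-in for guest memory are all hypothetical.

```python
# Toy model of throttled balloon inflation. All names and constants here
# are illustrative assumptions, not taken from the real driver.

class Pool:
    """Stand-in for guest memory: some pages are free right away, others
    can only be obtained by an allocation that is allowed to sleep."""

    def __init__(self, easy, hard):
        self.easy = easy  # pages obtainable without sleeping
        self.hard = hard  # pages obtainable only by sleeping (reclaim)

    def alloc_nosleep(self, n):
        got = min(n, self.easy)
        self.easy -= got
        return got

    def alloc_sleep(self, n):
        got = min(n, self.easy + self.hard)
        from_easy = min(got, self.easy)
        self.easy -= from_easy
        self.hard -= got - from_easy
        return got


def inflate(target, pool, start_rate=16, min_rate=1):
    """Inflate toward `target` pages; return (pages_held, rate_history)."""
    held, rate, can_sleep, history = 0, start_rate, False, []
    while held < target:
        history.append(rate)
        want = min(rate, target - held)
        alloc = pool.alloc_sleep if can_sleep else pool.alloc_nosleep
        got = alloc(want)
        held += got
        if got == want:
            continue  # no pressure seen, keep the current rate
        if can_sleep and got == 0 and rate == min_rate:
            break  # nothing left to take: stop short of the target
        # Shortfall: allow sleeping allocations from now on and slow down.
        can_sleep = True
        rate = max(rate // 2, min_rate)
    return held, history


if __name__ == "__main__":
    held, history = inflate(100, Pool(easy=20, hard=10))
    print(held, history)
```

In this model the balloon stops at 30 of the requested 100 pages, which mirrors the point made in the reply: reaching the target is not guaranteed, and the rate drops as soon as allocations start failing.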
From: Alexander Clouter on 2 Jul 2010 05:10
Dmitry Torokhov <dtor(a)vmware.com> wrote:
>
>> > Now we have 2 drivers fighting. There is no backing device and so driver
>> > core will not save us by refusing to bind to already claimed device.
>>
>> If vmware_balloon is present in /sys/modules or is loaded, don't load
>> vmmemctl. And vice versa.
>>
>> I dunno - it's silly for me to sit here proposing solutions. it's
>> better that you do it!
>
> Unfortunately I do not have a good solution at the moment. I guess we'll
> have to work with distributions to make sure they keep it as a module
> (it also makes most sense for them since not everyone runs on our
> platform).
>

I cannot seriously believe you consider "everyone[1] must abide by
these rules otherwise our installer might barf" to be a viable solution.

The only beneficiary of this patch is your installer, and the effect is
an undocumented and peculiar constraint on a kernel module.

Seriously, add something so that you get an entry in /sys/modules
(maybe it's time for something in /sys/class?) or maybe do something so
that you have:

VMWARE_BALLOON_CMD(STATUS, ...)

where the guest can say if there is already something ballooning for it.
Surely the guest should be aware if there is more than one balloon
driver at play?

I think a friend of mine summed it up rather well: "Fixing the kernel
instead of fixing the VMWare installer is an inspired move".

Cheers

[1] the dropdown menu on distrowatch lists 319 distributions

--
Alexander Clouter
..sigmonster says: May the bluebird of happiness twiddle your bits.
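
The "don't load one if the other is present" check proposed in the quoted text can be sketched from user space as follows. Note that the sysfs directory on Linux is /sys/module (singular), not /sys/modules as written in the thread. The helper names `module_present` and `may_load` are hypothetical, and the sketch assumes the rival driver actually shows up under /sys/module, which for a built-in driver is exactly the concern this thread raises.

```python
import os


def module_present(name, sysfs_root="/sys/module"):
    """Return True if sysfs has a directory for the named kernel module."""
    return os.path.isdir(os.path.join(sysfs_root, name))


def may_load(rival, sysfs_root="/sys/module"):
    """An installer-side guard: only load our balloon driver if the rival
    balloon driver is not already visible in sysfs."""
    return not module_present(rival, sysfs_root)


if __name__ == "__main__":
    # Before loading vmmemctl, check for the in-kernel vmware_balloon.
    if may_load("vmware_balloon"):
        print("vmware_balloon not found; loading vmmemctl would be safe")
    else:
        print("vmware_balloon is present; skip vmmemctl")
```

The `sysfs_root` parameter exists only so the check can be exercised against a scratch directory instead of the real sysfs.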