Prev: + tmpfs-fix-oops-on-remounts-with-mpol=default.patch added to -mm tree
Next: [PATCH 5/5] doc: add the documentation for mpol=local
From: Avi Kivity on 16 Mar 2010 01:50 On 03/16/2010 07:27 AM, Zhang, Yanmin wrote: > From: Zhang, Yanmin<yanmin_zhang(a)linux.intel.com> > > Based on the discussion in KVM community, I worked out the patch to support > perf to collect guest os statistics from host side. This patch is implemented > with Ingo, Peter and some other guys' kind help. Yang Sheng pointed out a > critical bug and provided good suggestions with other guys. I really appreciate > their kind help. > > The patch adds new subcommand kvm to perf. > > perf kvm top > perf kvm record > perf kvm report > perf kvm diff > > The new perf could profile guest os kernel except guest os user space, but it > could summarize guest os user space utilization per guest os. > > Below are some examples. > 1) perf kvm top > [root(a)lkp-ne01 norm]# perf kvm --host --guest --guestkallsyms=/home/ymzhang/guest/kallsyms > --guestmodules=/home/ymzhang/guest/modules top > > Excellent, support for guest kernel != host kernel is critical (I can't remember the last time I ran same kernels). How would we support multiple guests with different kernels? Perhaps a symbol server that perf can connect to (and that would connect to guests in turn)? > diff -Nraup linux-2.6_tipmaster0315/arch/x86/kvm/vmx.c linux-2.6_tipmaster0315_perfkvm/arch/x86/kvm/vmx.c > --- linux-2.6_tipmaster0315/arch/x86/kvm/vmx.c 2010-03-16 08:59:11.825295404 +0800 > +++ linux-2.6_tipmaster0315_perfkvm/arch/x86/kvm/vmx.c 2010-03-16 09:01:09.976084492 +0800 > @@ -26,6 +26,7 @@ > #include<linux/sched.h> > #include<linux/moduleparam.h> > #include<linux/ftrace_event.h> > +#include<linux/perf_event.h> > #include "kvm_cache_regs.h" > #include "x86.h" > > @@ -3632,6 +3633,43 @@ static void update_cr8_intercept(struct > vmcs_write32(TPR_THRESHOLD, irr); > } > > +DEFINE_PER_CPU(int, kvm_in_guest) = {0}; > + > +static void kvm_set_in_guest(void) > +{ > + percpu_write(kvm_in_guest, 1); > +} > + > +static int kvm_is_in_guest(void) > +{ > + return percpu_read(kvm_in_guest); > +} > There is already PF_VCPU for this. > +static struct perf_guest_info_callbacks kvm_guest_cbs = { > + .is_in_guest = kvm_is_in_guest, > + .is_user_mode = kvm_is_user_mode, > + .get_guest_ip = kvm_get_guest_ip, > + .reset_in_guest = kvm_reset_in_guest, > +}; > Should be in common code, not vmx specific. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Ingo Molnar on 16 Mar 2010 03:30 * Avi Kivity <avi(a)redhat.com> wrote: > On 03/16/2010 07:27 AM, Zhang, Yanmin wrote: > >From: Zhang, Yanmin<yanmin_zhang(a)linux.intel.com> > > > >Based on the discussion in KVM community, I worked out the patch to support > >perf to collect guest os statistics from host side. This patch is implemented > >with Ingo, Peter and some other guys' kind help. Yang Sheng pointed out a > >critical bug and provided good suggestions with other guys. I really appreciate > >their kind help. > > > >The patch adds new subcommand kvm to perf. > > > > perf kvm top > > perf kvm record > > perf kvm report > > perf kvm diff > > > >The new perf could profile guest os kernel except guest os user space, but it > >could summarize guest os user space utilization per guest os. > > > >Below are some examples. > >1) perf kvm top > >[root(a)lkp-ne01 norm]# perf kvm --host --guest --guestkallsyms=/home/ymzhang/guest/kallsyms > >--guestmodules=/home/ymzhang/guest/modules top > > > > Excellent, support for guest kernel != host kernel is critical (I > can't remember the last time I ran same kernels). > > How would we support multiple guests with different kernels? Perhaps a > symbol server that perf can connect to (and that would connect to guests in > turn)? The highest quality solution would be if KVM offered a 'guest extension' to the guest kernel's /proc/kallsyms that made it easy for user-space to get this information from an authorative source. That's the main reason why the host side /proc/kallsyms is so popular and so useful: while in theory it's mostly redundant information which can be gleaned from the System.map and other sources of symbol information, it's easily available and is _always_ trustable to come from the host kernel. Separate System.map's have a tendency to go out of sync (or go missing when a devel kernel gets rebuilt, or if a devel package is not installed), and server ports (be that a TCP port space server or an UDP port space mount-point) are both a configuration hassle and are not guest-transparent. So for instrumentation infrastructure (such as perf) we have a large and well founded preference for intrinsic, built-in, kernel-provided information: i.e. a largely 'built-in' and transparent mechanism to get to guest symbols. Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Zhang, Yanmin on 16 Mar 2010 03:50 On Tue, 2010-03-16 at 07:41 +0200, Avi Kivity wrote: > On 03/16/2010 07:27 AM, Zhang, Yanmin wrote: > > From: Zhang, Yanmin<yanmin_zhang(a)linux.intel.com> > > > > Based on the discussion in KVM community, I worked out the patch to support > > perf to collect guest os statistics from host side. This patch is implemented > > with Ingo, Peter and some other guys' kind help. Yang Sheng pointed out a > > critical bug and provided good suggestions with other guys. I really appreciate > > their kind help. > > > > The patch adds new subcommand kvm to perf. > > > > perf kvm top > > perf kvm record > > perf kvm report > > perf kvm diff > > > > The new perf could profile guest os kernel except guest os user space, but it > > could summarize guest os user space utilization per guest os. > > > > Below are some examples. > > 1) perf kvm top > > [root(a)lkp-ne01 norm]# perf kvm --host --guest --guestkallsyms=/home/ymzhang/guest/kallsyms > > --guestmodules=/home/ymzhang/guest/modules top > > > > > Thanks for your kind comments. > Excellent, support for guest kernel != host kernel is critical (I can't > remember the last time I ran same kernels). > > How would we support multiple guests with different kernels? With the patch, 'perf kvm report --sort pid" could show summary statistics for all guest os instances. Then, use parameter --pid of 'perf kvm record' to collect single problematic instance data. > Perhaps a > symbol server that perf can connect to (and that would connect to guests > in turn)? > > > diff -Nraup linux-2.6_tipmaster0315/arch/x86/kvm/vmx.c linux-2.6_tipmaster0315_perfkvm/arch/x86/kvm/vmx.c > > --- linux-2.6_tipmaster0315/arch/x86/kvm/vmx.c 2010-03-16 08:59:11.825295404 +0800 > > +++ linux-2.6_tipmaster0315_perfkvm/arch/x86/kvm/vmx.c 2010-03-16 09:01:09.976084492 +0800 > > @@ -26,6 +26,7 @@ > > #include<linux/sched.h> > > #include<linux/moduleparam.h> > > #include<linux/ftrace_event.h> > > +#include<linux/perf_event.h> > > #include "kvm_cache_regs.h" > > #include "x86.h" > > > > @@ -3632,6 +3633,43 @@ static void update_cr8_intercept(struct > > vmcs_write32(TPR_THRESHOLD, irr); > > } > > > > +DEFINE_PER_CPU(int, kvm_in_guest) = {0}; > > + > > +static void kvm_set_in_guest(void) > > +{ > > + percpu_write(kvm_in_guest, 1); > > +} > > + > > +static int kvm_is_in_guest(void) > > +{ > > + return percpu_read(kvm_in_guest); > > +} > > > > There is already PF_VCPU for this. Right, but there is a scope between kvm_guest_enter and really running in guest os, where a perf event might overflow. Anyway, the scope is very narrow, I will change it to use flag PF_VCPU. > > > +static struct perf_guest_info_callbacks kvm_guest_cbs = { > > + .is_in_guest = kvm_is_in_guest, > > + .is_user_mode = kvm_is_user_mode, > > + .get_guest_ip = kvm_get_guest_ip, > > + .reset_in_guest = kvm_reset_in_guest, > > +}; > > > > Should be in common code, not vmx specific. Right. I discussed with Yangsheng. I will move above data structures and callbacks to file arch/x86/kvm/x86.c, and add get_ip, a new callback to kvm_x86_ops. Yanmin -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Zhang, Yanmin on 16 Mar 2010 05:30 On Tue, 2010-03-16 at 15:48 +0800, Zhang, Yanmin wrote: > On Tue, 2010-03-16 at 07:41 +0200, Avi Kivity wrote: > > On 03/16/2010 07:27 AM, Zhang, Yanmin wrote: > > > From: Zhang, Yanmin<yanmin_zhang(a)linux.intel.com> > > > > > > Based on the discussion in KVM community, I worked out the patch to support > > > perf to collect guest os statistics from host side. This patch is implemented > > > with Ingo, Peter and some other guys' kind help. Yang Sheng pointed out a > > > critical bug and provided good suggestions with other guys. I really appreciate > > > their kind help. > > > > > > The patch adds new subcommand kvm to perf. > > > > > > perf kvm top > > > perf kvm record > > > perf kvm report > > > perf kvm diff > > > > > > The new perf could profile guest os kernel except guest os user space, but it > > > could summarize guest os user space utilization per guest os. > > > > > > Below are some examples. > > > 1) perf kvm top > > > [root(a)lkp-ne01 norm]# perf kvm --host --guest --guestkallsyms=/home/ymzhang/guest/kallsyms > > > --guestmodules=/home/ymzhang/guest/modules top > > > > > > > > > Thanks for your kind comments. > > > Excellent, support for guest kernel != host kernel is critical (I can't > > remember the last time I ran same kernels). > > > > How would we support multiple guests with different kernels? > With the patch, 'perf kvm report --sort pid" could show > summary statistics for all guest os instances. Then, use > parameter --pid of 'perf kvm record' to collect single problematic instance data. Sorry. I found currently --pid isn't process but a thread (main thread). Ingo, Is it possible to support a new parameter or extend --inherit, so 'perf record' and 'perf top' could collect data on all threads of a process when the process is running? If not, I need add a new ugly parameter which is similar to --pid to filter out process data in userspace. Yanmin -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Avi Kivity on 16 Mar 2010 05:30
On 03/16/2010 09:24 AM, Ingo Molnar wrote: > * Avi Kivity<avi(a)redhat.com> wrote: > > >> On 03/16/2010 07:27 AM, Zhang, Yanmin wrote: >> >>> From: Zhang, Yanmin<yanmin_zhang(a)linux.intel.com> >>> >>> Based on the discussion in KVM community, I worked out the patch to support >>> perf to collect guest os statistics from host side. This patch is implemented >>> with Ingo, Peter and some other guys' kind help. Yang Sheng pointed out a >>> critical bug and provided good suggestions with other guys. I really appreciate >>> their kind help. >>> >>> The patch adds new subcommand kvm to perf. >>> >>> perf kvm top >>> perf kvm record >>> perf kvm report >>> perf kvm diff >>> >>> The new perf could profile guest os kernel except guest os user space, but it >>> could summarize guest os user space utilization per guest os. >>> >>> Below are some examples. >>> 1) perf kvm top >>> [root(a)lkp-ne01 norm]# perf kvm --host --guest --guestkallsyms=/home/ymzhang/guest/kallsyms >>> --guestmodules=/home/ymzhang/guest/modules top >>> >>> >> Excellent, support for guest kernel != host kernel is critical (I >> can't remember the last time I ran same kernels). >> >> How would we support multiple guests with different kernels? Perhaps a >> symbol server that perf can connect to (and that would connect to guests in >> turn)? >> > The highest quality solution would be if KVM offered a 'guest extension' to > the guest kernel's /proc/kallsyms that made it easy for user-space to get this > information from an authorative source. > > That's the main reason why the host side /proc/kallsyms is so popular and so > useful: while in theory it's mostly redundant information which can be gleaned > from the System.map and other sources of symbol information, it's easily > available and is _always_ trustable to come from the host kernel. > > Separate System.map's have a tendency to go out of sync (or go missing when a > devel kernel gets rebuilt, or if a devel package is not installed), and server > ports (be that a TCP port space server or an UDP port space mount-point) are > both a configuration hassle and are not guest-transparent. > > So for instrumentation infrastructure (such as perf) we have a large and well > founded preference for intrinsic, built-in, kernel-provided information: i.e. > a largely 'built-in' and transparent mechanism to get to guest symbols. > The symbol server's client can certainly access the bits through vmchannel. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ |