From: Zhang, Yanmin on 13 Jul 2010 04:20

Peter,

perf doesn't work on my Nehalem EX machine.
1) The 1st start of 'perf top' is ok;
2) Kill the 1st perf and restart it. It doesn't work. No data is shown.

I located below commit:
commit 1ac62cfff252fb668405ef3398a1fa7f4a0d6d15
Author: Peter Zijlstra <peterz(a)infradead.org>
Date:   Fri Mar 26 14:08:44 2010 +0100

    perf, x86: Add Nehelem PMU programming errata workaround

    workaround From: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
    Date: Fri Mar 26 13:59:41 CET 2010

    Implement the workaround for Intel Errata AAK100 and AAP53.

    Also, remove the Core-i7 name for Nehalem events since there are
    also Westmere based i7 chips.


If I comment out the workaround in function intel_pmu_nhm_enable_all,
perf could work.

A quick glance shows:
	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x3);
should be:
	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x7);

I triggered sysrq to dump PMU registers and found the last bit of
global status register is 1. I added a status reset operation like below patch:

--- linux-2.6.35-rc5/arch/x86/kernel/cpu/perf_event_intel.c	2010-07-14 09:38:11.000000000 +0800
+++ linux-2.6.35-rc5_fork/arch/x86/kernel/cpu/perf_event_intel.c	2010-07-14 14:41:42.000000000 +0800
@@ -505,8 +505,13 @@ static void intel_pmu_nhm_enable_all(int
 	wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 1, 0x4300B1);
 	wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 2, 0x4300B5);
 
-	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x3);
+	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x7);
 	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x0);
+	/*
+	 * Reset the last 3 bits of global status register in case
+	 * previous enabling causes overflows.
+	 */
+	wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, 0x7);
 
 	for (i = 0; i < 3; i++) {
 		struct perf_event *event = cpuc->events[i];

However, it still doesn't work. Current right way is to comment out
the workaround.

Yanmin

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
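
The 0x3 versus 0x7 question above comes down to which general-purpose
counters the global control mask covers. As a rough user-space illustration
(not kernel code; the helper names are invented here, and the sketch assumes
the architectural layout in which bit n of the global control MSR enables
general-purpose counter n):

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Illustrative helper (hypothetical name): returns 1 if general-purpose
 * counter `idx` is enabled by `global_ctrl`, assuming bit n enables
 * counter n. `idx` must be below 64.
 */
static int gp_counter_enabled(uint64_t global_ctrl, unsigned int idx)
{
	return (int)((global_ctrl >> idx) & 1);
}

/* Print the state of the first `n` general-purpose counters for a mask. */
static void print_gp_counters(uint64_t global_ctrl, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		printf("PMC%u: %s\n", i,
		       gp_counter_enabled(global_ctrl, i) ? "enabled" : "disabled");
}
```

Under that assumption, a mask of 0x3 leaves the third counter (index 2)
disabled while 0x7 enables all three, which lines up with the loop over
three events in the patch.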
From: Ingo Molnar on 13 Jul 2010 05:00

* Zhang, Yanmin <yanmin_zhang(a)linux.intel.com> wrote:

> Peter,
>
> perf doesn't work on my Nehalem EX machine.
> 1) The 1st start of 'perf top' is ok;
> 2) Kill the 1st perf and restart it. It doesn't work. No data is shown.
>
> I located below commit:
> commit 1ac62cfff252fb668405ef3398a1fa7f4a0d6d15
> Author: Peter Zijlstra <peterz(a)infradead.org>
> Date:   Fri Mar 26 14:08:44 2010 +0100
>
>     perf, x86: Add Nehelem PMU programming errata workaround
>
>     workaround From: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
>     Date: Fri Mar 26 13:59:41 CET 2010
>
>     Implement the workaround for Intel Errata AAK100 and AAP53.
>
>     Also, remove the Core-i7 name for Nehalem events since there are
>     also Westmere based i7 chips.
>
>
> If I comment out the workaround in function intel_pmu_nhm_enable_all,
> perf could work.
>
> A quick glance shows:
> 	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x3);
> should be:
> 	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x7);
>
> I triggered sysrq to dump PMU registers and found the last bit of
> global status register is 1. I added a status reset operation like below patch:
>
> --- linux-2.6.35-rc5/arch/x86/kernel/cpu/perf_event_intel.c	2010-07-14 09:38:11.000000000 +0800
> +++ linux-2.6.35-rc5_fork/arch/x86/kernel/cpu/perf_event_intel.c	2010-07-14 14:41:42.000000000 +0800
> @@ -505,8 +505,13 @@ static void intel_pmu_nhm_enable_all(int
>  	wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 1, 0x4300B1);
>  	wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 2, 0x4300B5);
>
> -	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x3);
> +	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x7);
>  	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x0);
> +	/*
> +	 * Reset the last 3 bits of global status register in case
> +	 * previous enabling causes overflows.
> +	 */
> +	wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, 0x7);
>
>  	for (i = 0; i < 3; i++) {
>  		struct perf_event *event = cpuc->events[i];
>
>
> However, it still doesn't work. Current right way is to comment out
> the workaround.

Well, how about doing it like this:

	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x7);
	/*
	 * Reset the last 3 bits of global status register in case
	 * previous enabling causes overflows.
	 */
	wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, 0x7);

	for (i = 0; i < 3; i++) {
		struct perf_event *event = cpuc->events[i];
		...
	}

	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x0);

I.e. global-mask, overflow-clear, explicit-enable, then global-enable?

Thanks,

	Ingo
From: Stephane Eranian on 13 Jul 2010 11:20

On Tue, Jul 13, 2010 at 10:14 AM, Zhang, Yanmin
<yanmin_zhang(a)linux.intel.com> wrote:
> Peter,
>
> perf doesn't work on my Nehalem EX machine.
> 1) The 1st start of 'perf top' is ok;
> 2) Kill the 1st perf and restart it. It doesn't work. No data is shown.
>
> I located below commit:
> commit 1ac62cfff252fb668405ef3398a1fa7f4a0d6d15
> Author: Peter Zijlstra <peterz(a)infradead.org>
> Date:   Fri Mar 26 14:08:44 2010 +0100
>
>     perf, x86: Add Nehelem PMU programming errata workaround
>
>     workaround From: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
>     Date: Fri Mar 26 13:59:41 CET 2010
>
>     Implement the workaround for Intel Errata AAK100 and AAP53.
>
>     Also, remove the Core-i7 name for Nehalem events since there are
>     also Westmere based i7 chips.
>
>
> If I comment out the workaround in function intel_pmu_nhm_enable_all,
> perf could work.
>
> A quick glance shows:
> 	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x3);
> should be:
> 	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x7);
>
>
> I triggered sysrq to dump PMU registers and found the last bit of
> global status register is 1. I added a status reset operation like below patch:
>
What do you call the last bit? bit0 or bit63?

> --- linux-2.6.35-rc5/arch/x86/kernel/cpu/perf_event_intel.c	2010-07-14 09:38:11.000000000 +0800
> +++ linux-2.6.35-rc5_fork/arch/x86/kernel/cpu/perf_event_intel.c	2010-07-14 14:41:42.000000000 +0800
> @@ -505,8 +505,13 @@ static void intel_pmu_nhm_enable_all(int
>  	wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 1, 0x4300B1);
>  	wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 2, 0x4300B5);
>
> -	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x3);
> +	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x7);
>  	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x0);
> +	/*
> +	 * Reset the last 3 bits of global status register in case
> +	 * previous enabling causes overflows.
> +	 */

The workaround cannot cause an overflow because the associated counters
won't count anything given their umask value is 0 (which does not correspond
to anything for event 0xB1, event 0xB5 is undocumented). This is for the events
described in table A.2. If NHM-EX has a different definition for 0xB1, 0xB5,
then that's another story.

> +	wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, 0x7);
>
>  	for (i = 0; i < 3; i++) {
>  		struct perf_event *event = cpuc->events[i];
>
>
> However, it still doesn't work. Current right way is to comment out
> the workaround.
>
> Yanmin
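
To make the umask argument concrete, the values written by the workaround
can be split into fields. This is a small user-space sketch, not kernel
code; the struct and function names are made up for illustration, and the
bit positions assume the architectural event-select layout (event in bits
0-7, umask in bits 8-15, USR bit 16, OS bit 17, interrupt enable bit 20,
counter enable bit 22):

```c
#include <stdint.h>

/* Decoded fields of an event-select value (illustrative struct). */
struct evtsel_fields {
	uint8_t event;	/* bits 0-7: event select */
	uint8_t umask;	/* bits 8-15: unit mask */
	int usr;	/* bit 16: count in user mode */
	int os;		/* bit 17: count in kernel mode */
	int pmi;	/* bit 20: raise an interrupt on overflow */
	int enable;	/* bit 22: counter enable */
};

/* Split a raw event-select value into the fields above. */
static struct evtsel_fields decode_evtsel(uint64_t val)
{
	struct evtsel_fields f;

	f.event  = (uint8_t)(val & 0xff);
	f.umask  = (uint8_t)((val >> 8) & 0xff);
	f.usr    = (int)((val >> 16) & 1);
	f.os     = (int)((val >> 17) & 1);
	f.pmi    = (int)((val >> 20) & 1);
	f.enable = (int)((val >> 22) & 1);
	return f;
}
```

With that layout, decode_evtsel(0x4300B1) yields event 0xB1 with umask 0 and
the USR, OS and enable bits set, and 0x4300B5 differs only in the event
code. That is the umask-0 case the reply above relies on.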
From: Zhang, Yanmin on 13 Jul 2010 20:20

On Tue, 2010-07-13 at 17:16 +0200, Stephane Eranian wrote:
> On Tue, Jul 13, 2010 at 10:14 AM, Zhang, Yanmin
> <yanmin_zhang(a)linux.intel.com> wrote:
> > Peter,
> >
> > perf doesn't work on my Nehalem EX machine.
> > 1) The 1st start of 'perf top' is ok;
> > 2) Kill the 1st perf and restart it. It doesn't work. No data is shown.
> >
> > I located below commit:
> > commit 1ac62cfff252fb668405ef3398a1fa7f4a0d6d15
> > Author: Peter Zijlstra <peterz(a)infradead.org>
> > Date:   Fri Mar 26 14:08:44 2010 +0100
> >
> >     perf, x86: Add Nehelem PMU programming errata workaround
> >
> >     workaround From: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
> >     Date: Fri Mar 26 13:59:41 CET 2010
> >
> >     Implement the workaround for Intel Errata AAK100 and AAP53.
> >
> >     Also, remove the Core-i7 name for Nehalem events since there are
> >     also Westmere based i7 chips.
> >
> >
> > If I comment out the workaround in function intel_pmu_nhm_enable_all,
> > perf could work.
> >
> > A quick glance shows:
> > 	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x3);
> > should be:
> > 	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x7);
> >
> >
> > I triggered sysrq to dump PMU registers and found the last bit of
> > global status register is 1. I added a status reset operation like below patch:
> >
> What do you call the last bit? bit0 or bit63?
Sorry for confusing you. It's bit0, mapping to PERFMON_EVENTSEL0.

>
> > --- linux-2.6.35-rc5/arch/x86/kernel/cpu/perf_event_intel.c	2010-07-14 09:38:11.000000000 +0800
> > +++ linux-2.6.35-rc5_fork/arch/x86/kernel/cpu/perf_event_intel.c	2010-07-14 14:41:42.000000000 +0800
> > @@ -505,8 +505,13 @@ static void intel_pmu_nhm_enable_all(int
> >  	wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 1, 0x4300B1);
> >  	wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 2, 0x4300B5);
> >
> > -	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x3);
> > +	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x7);
> >  	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x0);
> > +	/*
> > +	 * Reset the last 3 bits of global status register in case
> > +	 * previous enabling causes overflows.
> > +	 */
>
> The workaround cannot cause an overflow because the associated counters
> won't count anything given their umask value is 0 (which does not correspond
> to anything for event 0xB1, event 0xB5 is undocumented). This is for the events
> described in table A.2. If NHM-EX has a different definition for 0xB1, 0xB5,
> then that's another story.
I found the status bit is set by triggering sysrq to dump PMU registers.

If I start perf under gdb, perf sometimes works. I found one processor's 1st status
register is equal to 0 while other processors' are 1. If just starting perf, all 1st
status registers are equal to 1.

>
>
> > +	wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, 0x7);
> >
> >  	for (i = 0; i < 3; i++) {
> >  		struct perf_event *event = cpuc->events[i];
> >
> >
> > However, it still doesn't work. Current right way is to comment out
> > the workaround.
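
For reference, the status check and the reset discussed here can be modelled
in user space as follows. This is only a model, under the assumption that
bit n of the global status value reports an overflow of general-purpose
counter n and that writing a 1 to the same bit of the overflow control
register clears it; the function names are invented and nothing here
touches real MSRs:

```c
#include <stdint.h>

/*
 * Model only (hypothetical name): returns 1 if the overflow bit for
 * general-purpose counter `idx` is set in a global status value.
 * `idx` must be below 64.
 */
static int gp_counter_overflowed(uint64_t status, unsigned int idx)
{
	return (int)((status >> idx) & 1);
}

/*
 * Model only (hypothetical name): status value left after writing
 * `ovf_ctrl` to a write-1-to-clear overflow control register.
 */
static uint64_t status_after_ovf_clear(uint64_t status, uint64_t ovf_ctrl)
{
	return status & ~ovf_ctrl;
}
```

In this model a status of 0x1 means counter 0 has overflowed, and a write of
0x7 clears bits 0 to 2 while leaving any higher bits untouched.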
From: Stephane Eranian on 13 Jul 2010 20:40

What about running simpler commands like perf stat?

On Wed, Jul 14, 2010 at 2:13 AM, Zhang, Yanmin
<yanmin_zhang(a)linux.intel.com> wrote:
> On Tue, 2010-07-13 at 17:16 +0200, Stephane Eranian wrote:
>> On Tue, Jul 13, 2010 at 10:14 AM, Zhang, Yanmin
>> <yanmin_zhang(a)linux.intel.com> wrote:
>> > Peter,
>> >
>> > perf doesn't work on my Nehalem EX machine.
>> > 1) The 1st start of 'perf top' is ok;
>> > 2) Kill the 1st perf and restart it. It doesn't work. No data is shown.
>> >
>> > I located below commit:
>> > commit 1ac62cfff252fb668405ef3398a1fa7f4a0d6d15
>> > Author: Peter Zijlstra <peterz(a)infradead.org>
>> > Date:   Fri Mar 26 14:08:44 2010 +0100
>> >
>> >     perf, x86: Add Nehelem PMU programming errata workaround
>> >
>> >     workaround From: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
>> >     Date: Fri Mar 26 13:59:41 CET 2010
>> >
>> >     Implement the workaround for Intel Errata AAK100 and AAP53.
>> >
>> >     Also, remove the Core-i7 name for Nehalem events since there are
>> >     also Westmere based i7 chips.
>> >
>> >
>> > If I comment out the workaround in function intel_pmu_nhm_enable_all,
>> > perf could work.
>> >
>> > A quick glance shows:
>> > 	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x3);
>> > should be:
>> > 	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x7);
>> >
>> >
>> > I triggered sysrq to dump PMU registers and found the last bit of
>> > global status register is 1. I added a status reset operation like below patch:
>> >
>> What do you call the last bit? bit0 or bit63?
> Sorry for confusing you. It's bit0, mapping to PERFMON_EVENTSEL0.
>
>>
>> > --- linux-2.6.35-rc5/arch/x86/kernel/cpu/perf_event_intel.c	2010-07-14 09:38:11.000000000 +0800
>> > +++ linux-2.6.35-rc5_fork/arch/x86/kernel/cpu/perf_event_intel.c	2010-07-14 14:41:42.000000000 +0800
>> > @@ -505,8 +505,13 @@ static void intel_pmu_nhm_enable_all(int
>> >  	wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 1, 0x4300B1);
>> >  	wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + 2, 0x4300B5);
>> >
>> > -	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x3);
>> > +	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x7);
>> >  	wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0x0);
>> > +	/*
>> > +	 * Reset the last 3 bits of global status register in case
>> > +	 * previous enabling causes overflows.
>> > +	 */
>>
>> The workaround cannot cause an overflow because the associated counters
>> won't count anything given their umask value is 0 (which does not correspond
>> to anything for event 0xB1, event 0xB5 is undocumented). This is for the events
>> described in table A.2. If NHM-EX has a different definition for 0xB1, 0xB5,
>> then that's another story.
> I found the status bit is set by triggering sysrq to dump PMU registers.
>
> If I start perf under gdb, perf sometimes works. I found one processor's 1st status
> register is equal to 0 while other processors' are 1. If just starting perf, all 1st
> status registers are equal to 1.
>
>>
>>
>> > +	wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, 0x7);
>> >
>> >  	for (i = 0; i < 3; i++) {
>> >  		struct perf_event *event = cpuc->events[i];
>> >
>> >
>> > However, it still doesn't work. Current right way is to comment out
>> > the workaround.