From: Peter Zijlstra on 1 Jul 2010 11:10

On Thu, 2010-07-01 at 16:36 +0200, Peter Zijlstra wrote:
>
> Pushed out a new git tree with the below delta folded in.
>

Said tree also cures x86 and folds the SH build fix.

Matt, you said it broke SH completely, but did you try perf stat? perf
record is not supposed to work on SH due to the hardware not having an
overflow interrupt.

Which made me think, what on SH guarantees we update the counter often
enough not to suffer from counter wrap? Would it make sense to make the
SH code hook into their arch tick handler and update the counters from
there?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
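A minimal userspace sketch of the periodic update asked about above: fold
the events counted since the last update into a 64-bit software total, so
that a narrow hardware register may wrap between updates without losing
counts. This is not the SH kernel code. The struct and function names are
hypothetical, and the sketch assumes a counter that wraps to zero at most
once between two updates; a counter that stops at its maximum needs
different handling.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one hardware counter. */
struct hw_counter {
    unsigned width;     /* number of valid bits in the raw register */
    uint64_t prev_raw;  /* raw value seen at the previous update */
    uint64_t total;     /* 64-bit running total kept in software */
};

/*
 * Called periodically (for example from a tick) with the current raw
 * register value. Returns the updated software total.
 */
static uint64_t counter_update(struct hw_counter *c, uint64_t raw)
{
    assert(c->width >= 1 && c->width <= 64);

    uint64_t mask = (c->width == 64) ? UINT64_MAX
                                     : ((UINT64_C(1) << c->width) - 1);

    /* Modular subtraction gives the right delta across a single wrap. */
    uint64_t delta = (raw - c->prev_raw) & mask;

    c->prev_raw = raw & mask;
    c->total += delta;
    return c->total;
}
```

The update interval has to be shorter than the time the counter needs to
run through its full range, otherwise two wraps are indistinguishable
from one.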
From: Matt Fleming on 1 Jul 2010 11:40

On Thu, Jul 01, 2010 at 05:02:35PM +0200, Peter Zijlstra wrote:
>
> Matt, you said it broke SH completely, but did you try perf stat? perf
> record is not supposed to work on SH due to the hardware not having an
> overflow interrupt.

perf record does work to some degree. It definitely worked before
applying your changes but not after. I admit I haven't really read the
perf event code, but Paul will know.

> Which made me think, what on SH guarantees we update the counter often
> enough not to suffer from counter wrap? Would it make sense to make the
> SH code hook into their arch tick handler and update the counters from
> there?

This was the way that the oprofile code used to work. Paul and I were
talking about using a hrtimer to sample performance counters as opposed
to piggy-backing on the tick handler.
From: Peter Zijlstra on 1 Jul 2010 11:50

On Thu, 2010-07-01 at 16:31 +0100, Matt Fleming wrote:
> On Thu, Jul 01, 2010 at 05:02:35PM +0200, Peter Zijlstra wrote:
> >
> > Matt, you said it broke SH completely, but did you try perf stat? perf
> > record is not supposed to work on SH due to the hardware not having an
> > overflow interrupt.
>
> perf record does work to some degree. It definitely worked before
> applying your changes but not after. I admit I haven't really read the
> perf event code, but Paul will know.

Ok, let me look at that again.

> > Which made me think, what on SH guarantees we update the counter often
> > enough not to suffer from counter wrap? Would it make sense to make the
> > SH code hook into their arch tick handler and update the counters from
> > there?
>
> This was the way that the oprofile code used to work. Paul and I were
> talking about using a hrtimer to sample performance counters as
> opposed to piggy-backing on the tick handler.

Ah, for sampling for sure, simply group a software perf event and a
hardware perf event together and use PERF_SAMPLE_READ.

But suppose it's a non-sampling counter, how do you avoid overflows of
the hardware register?
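The grouping suggested above can be sketched from userspace as two
perf_event_attr structures: a software clock event as group leader that
supplies the sampling period, and a hardware event as group member whose
value is read out with each sample. The helper names, the choice of
cpu-clock and cycles, and the period are illustrative assumptions, not
something taken from this thread. The sketch only fills in the
attributes; it does not open the events.

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>

/*
 * Group leader: a software clock event that drives sampling. For the
 * clock events the period is understood to be in nanoseconds.
 */
static void init_leader(struct perf_event_attr *attr,
                        unsigned long long period_ns)
{
    assert(period_ns > 0);

    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_SOFTWARE;
    attr->config = PERF_COUNT_SW_CPU_CLOCK;
    attr->sample_period = period_ns;
    /* Each sample carries the counter values of the whole group. */
    attr->sample_type = PERF_SAMPLE_READ;
    attr->read_format = PERF_FORMAT_GROUP;
    attr->disabled = 1; /* enable once the group is fully set up */
}

/*
 * Group member: a plain counting hardware event. It has no sampling
 * period of its own, so it needs no overflow interrupt.
 */
static void init_member(struct perf_event_attr *attr)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_HARDWARE;
    attr->config = PERF_COUNT_HW_CPU_CYCLES;
}
```

To use this, the leader would be passed to the perf_event_open system
call with a group_fd of -1, and the member with group_fd set to the
descriptor returned for the leader. glibc provides no wrapper for that
call, so it is normally made through syscall().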
From: Matt Fleming on 1 Jul 2010 12:10

On Thu, Jul 01, 2010 at 05:39:53PM +0200, Peter Zijlstra wrote:
> On Thu, 2010-07-01 at 16:31 +0100, Matt Fleming wrote:
> > On Thu, Jul 01, 2010 at 05:02:35PM +0200, Peter Zijlstra wrote:
> > >
> > > Which made me think, what on SH guarantees we update the counter often
> > > enough not to suffer from counter wrap? Would it make sense to make the
> > > SH code hook into their arch tick handler and update the counters from
> > > there?
> >
> > This was the way that the oprofile code used to work. Paul and I were
> > talking about using a hrtimer to sample performance counters as
> > opposed to piggy-backing on the tick handler.
>
> Ah, for sampling for sure, simply group a software perf event and a
> hardware perf event together and use PERF_SAMPLE_READ.
>
> But suppose it's a non-sampling counter, how do you avoid overflows of
> the hardware register?

Hmm.. good question! I'm not entirely sure we do. As you were saying,
without using the arch tick handler, I don't think we can guarantee
avoiding counter overflows. Currently the counters are chained such that
the counters are at least 48 bits. I guess all my tests were short
enough to not cause the counters to wrap ;-)

At some point we will want to not require chaining, giving us 32 bits.
So yeah, this issue is gonna crop up then.

Interestingly, the counters on SH don't wrap when they reach their
maximum value, they just stop incrementing.
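To put the 48-bit and 32-bit widths in perspective, here is a small
worked calculation of how long a counter takes to run through its full
range. The event rate in the usage note is a hypothetical figure chosen
for illustration, not a measured SH clock rate, and the function name is
made up for this sketch.

```c
#include <assert.h>

/*
 * Seconds until a counter of the given width has counted 2^width events
 * at a constant event rate.
 */
static double seconds_until_full(unsigned width, double events_per_sec)
{
    assert(width >= 1 && width <= 64);
    assert(events_per_sec > 0.0);

    /* Build 2^width in floating point to avoid shifting by 64. */
    double capacity = 1.0;
    for (unsigned i = 0; i < width; i++)
        capacity *= 2.0;

    return capacity / events_per_sec;
}
```

At an assumed one event per nanosecond (1e9 events per second), a 32-bit
counter fills in about 4.3 seconds, while a 48-bit counter takes about
281,475 seconds, a little over three days. That fits the observation
that short tests never hit the limit with chained counters, and that
unchained 32-bit counters would.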
From: Paul Mundt on 1 Jul 2010 23:00

On Thu, Jul 01, 2010 at 05:39:53PM +0200, Peter Zijlstra wrote:
> On Thu, 2010-07-01 at 16:31 +0100, Matt Fleming wrote:
> > On Thu, Jul 01, 2010 at 05:02:35PM +0200, Peter Zijlstra wrote:
> > >
> > > Matt, you said it broke SH completely, but did you try perf stat? perf
> > > record is not supposed to work on SH due to the hardware not having an
> > > overflow interrupt.
> >
> > perf record does work to some degree. It definitely worked before
> > applying your changes but not after. I admit I haven't really read the
> > perf event code, but Paul will know.
>
> Ok, let me look at that again.
>

Any perf record functionality observed is entirely coincidental and not
by design. It was something I planned to revisit, but most of what we
have right now is only geared at the one-shot perf stat case.

> > > Which made me think, what on SH guarantees we update the counter often
> > > enough not to suffer from counter wrap? Would it make sense to make the
> > > SH code hook into their arch tick handler and update the counters from
> > > there?
> >
> > This was the way that the oprofile code used to work. Paul and I were
> > talking about using a hrtimer to sample performance counters as
> > opposed to piggy-backing on the tick handler.
>
> Ah, for sampling for sure, simply group a software perf event and a
> hardware perf event together and use PERF_SAMPLE_READ.
>
> But suppose it's a non-sampling counter, how do you avoid overflows of
> the hardware register?

At the moment it's not an issue since we have big enough counters that
overflows don't really happen, especially if we're primarily using them
for one-shot measuring.

SH-4A style counters behave in such a fashion that we have 2 general
purpose counters, and 2 counters for measuring bus transactions. These
bus counters can optionally be disabled and used in a chained mode to
provide the general purpose counters a 64-bit counter (the actual
validity in the upper half of the chained counter varies depending on
the CPUs, but all of them can do at least 48-bits when chained).

Each counter has overflow detection and asserts an overflow bit, but
there are no exceptions associated with this, so it's something that we
would have to tie in to the tick or defer to a bottom half handler in
the non-sampling case (or simply test on every read, and accept some
degree of accuracy loss).

Any perf record functionality we implement with this sort of scheme is
only going to provide ballpark figures anyways, so it's certainly within
the parameters of acceptable loss in exchange for increased
functionality.

Different CPUs also implement their overflows differently, some will
roll and resume counting, but most simply stop until the overflow bit is
cleared.

My main plan was to build on top of the multi-pmu stuff, unchain the
counters, and expose the bus counters with their own event map as a
separate PMU instance. All of the other handling logic can pretty much
be reused directly, but it does mean that we need to be a bit smarter
about overflow detection/handling.

Sampling and so on is also on the TODO list, but is as of yet still not
supported.