perf, x86: try to handle unknown nmis with running perfctrs [Kernel]

Prev: [PATCH 14/18 v2] staging: rtl8187se: call disable_pci_device() if pci_probe() failed
Next: [PATCH 1/1] Char: mxser, call pci_disable_device from probe/remove

From: Cyrill Gorcunov on 12 Aug 2010 10:40

On Thu, Aug 12, 2010 at 03:24:19PM +0200, Robert Richter wrote:
> On 10.08.10 15:24:28, Cyrill Gorcunov wrote:
> > On Tue, Aug 10, 2010 at 09:05:41PM +0200, Robert Richter wrote:
> > > On 10.08.10 13:24:51, Cyrill Gorcunov wrote:
> > >
> > > > It gets masked on NMI arrival, at least for some models (Core Duo, P4,
> > > > P6 M and I suspect more theh that, that was the reason oprofile has
> > > > it, also there is a note in SDM V3a page 643).
> > >
> > > Yes, that's right, I never noticed that. Maybe it is better to
> > > implement the apic write it in model specific code then.
> > >
> >
> > Perhaps we can make it simplier I think, ie like it was before -- we just
> > move it under your new DIE_NMIUNKNOWN, in separate patch of course. Though
> > I'm fine with either way.
>
> I do not understand why you want to put this in the 'unknown'
> path. Isn't it necessary to unmask the vector with every call of the
> nmi handler?
>
> -Robert
>

Heh, it's simple - I'm screwed. Robert you're right, of course it should NOT
be under every 'unknown' nmi. I thought about small optimization here, but
I think all this should be done only _after_ your patch is merged.

Sorry for confuse ;)

-- Cyrill
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Frederic Weisbecker on 13 Aug 2010 00:30

On Thu, Aug 12, 2010 at 12:00:58AM +0200, Robert Richter wrote:
> I was debuging this a little more, see version 2 below.
>
> -Robert
>
> --
>
> From 8bb831af56d118b85fc38e0ddc2e516f7504b9fb Mon Sep 17 00:00:00 2001
> From: Robert Richter <robert.richter(a)amd.com>
> Date: Thu, 5 Aug 2010 16:19:59 +0200
> Subject: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs
>
> When perfctrs are running it is valid to have unhandled nmis, two
> events could trigger 'simultaneously' raising two back-to-back
> NMIs. If the first NMI handles both, the latter will be empty and daze
> the CPU.
>
> The solution to avoid an 'unknown nmi' massage in this case was simply
> to stop the nmi handler chain when perfctrs are runnning by stating
> the nmi was handled. This has the drawback that a) we can not detect
> unknown nmis anymore, and b) subsequent nmi handlers are not called.
>
> This patch addresses this. Now, we check this unknown NMI if it could
> be a perfctr back-to-back NMI. Otherwise we pass it and let the kernel
> handle the unknown nmi.
>
> This is a debug log:
>
> cpu #6, nmi #32333, skip_nmi #32330, handled = 1, time = 1934364430
> cpu #6, nmi #32334, skip_nmi #32330, handled = 1, time = 1934704616
> cpu #6, nmi #32335, skip_nmi #32336, handled = 2, time = 1936032320
> cpu #6, nmi #32336, skip_nmi #32336, handled = 0, time = 1936034139
> cpu #6, nmi #32337, skip_nmi #32336, handled = 1, time = 1936120100
> cpu #6, nmi #32338, skip_nmi #32336, handled = 1, time = 1936404607
> cpu #6, nmi #32339, skip_nmi #32336, handled = 1, time = 1937983416
> cpu #6, nmi #32340, skip_nmi #32341, handled = 2, time = 1938201032
> cpu #6, nmi #32341, skip_nmi #32341, handled = 0, time = 1938202830
> cpu #6, nmi #32342, skip_nmi #32341, handled = 1, time = 1938443743
> cpu #6, nmi #32343, skip_nmi #32341, handled = 1, time = 1939956552
> cpu #6, nmi #32344, skip_nmi #32341, handled = 1, time = 1940073224
> cpu #6, nmi #32345, skip_nmi #32341, handled = 1, time = 1940485677
> cpu #6, nmi #32346, skip_nmi #32347, handled = 2, time = 1941947772
> cpu #6, nmi #32347, skip_nmi #32347, handled = 1, time = 1941949818
> cpu #6, nmi #32348, skip_nmi #32347, handled = 0, time = 1941951591
> Uhhuh. NMI received for unknown reason 00 on CPU 6.
> Do you have a strange power saving mode enabled?
> Dazed and confused, but trying to continue
>
> Deltas:
>
> nmi #32334 340186
> nmi #32335 1327704
> nmi #32336 1819 <<<< back-to-back nmi [1]
> nmi #32337 85961
> nmi #32338 284507
> nmi #32339 1578809
> nmi #32340 217616
> nmi #32341 1798 <<<< back-to-back nmi [2]
> nmi #32342 240913
> nmi #32343 1512809
> nmi #32344 116672
> nmi #32345 412453
> nmi #32346 1462095 <<<< 1st nmi (standard) handling 2 counters
> nmi #32347 2046 <<<< 2nd nmi (back-to-back) handling one counter
> nmi #32348 1773 <<<< 3rd nmi (back-to-back) handling no counter! [3]
>
> For back-to-back nmi detection there are the following rules:
>
> The perfctr nmi handler was handling more than one counter and no
> counter was handled in the subsequent nmi (see [1] and [2] above).
>
> There is another case if there are two subsequent back-to-back nmis
> [3]. In this case we measure the time between the first and the
> 2nd. The 2nd is detected as back-to-back because the first handled
> more than one counter. The time between the 1st and the 2nd is used to
> calculate a range for which we assume a back-to-back nmi. Now, the 3rd
> nmi triggers, we measure again the time delta and compare it with the
> first delta from which we know it was a back-to-back nmi. If the 3rd
> nmi is within the range, it is also a back-to-back nmi and we drop it.
>
> Signed-off-by: Robert Richter <robert.richter(a)amd.com>
> ---

That time based thing looks a bit complicated.

I'm still not sure why you don't want to use a simple flag:

After handled a perf NMI:

if (handled more than one counter)
__get_cpu_var(skip_unknown) = 1;

While handling an unknown NMI:

if (__get_cpu_var(skip_unknown)) {
__get_cpu_var(skip_unknow) = 0;
return NOTIFY_STOP;
}

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Frederic Weisbecker on 13 Aug 2010 00:40

On Wed, Aug 11, 2010 at 01:10:46PM +0200, Robert Richter wrote:
> On 10.08.10 22:44:55, Frederic Weisbecker wrote:
> > On Tue, Aug 10, 2010 at 04:48:56PM -0400, Don Zickus wrote:
> > > @@ -1200,7 +1200,7 @@ void perf_events_lapic_init(void)
> > > apic_write(APIC_LVTPC, APIC_DM_NMI);
> > > }
> > >
> > > -static DEFINE_PER_CPU(unsigned int, perfctr_handled);
> > > +static DEFINE_PER_CPU(unsigned int, perfctr_skip);
>
> Yes, using perfctr_skip is better to understand ...
>
> > > @@ -1229,14 +1228,11 @@ perf_event_nmi_handler(struct notifier_block *self,
> > > * was handling a perfctr. Otherwise we pass it and
> > > * let the kernel handle the unknown nmi.
> > > *
> > > - * Note: this could be improved if we drop unknown
> > > - * NMIs only if we handled more than one perfctr in
> > > - * the previous NMI.
> > > */
> > > - this_nmi = percpu_read(irq_stat.__nmi_count);
> > > - prev_nmi = __get_cpu_var(perfctr_handled);
> > > - if (this_nmi == prev_nmi + 1)
> > > + if (__get_cpu_var(perfctr_skip)){
> > > + __get_cpu_var(perfctr_skip) -=1;
> > > return NOTIFY_STOP;
> > > + }
> > > return NOTIFY_DONE;
> > > default:
> > > return NOTIFY_DONE;
> > > @@ -1246,11 +1242,21 @@ perf_event_nmi_handler(struct notifier_block *self,
> > >
> > > apic_write(APIC_LVTPC, APIC_DM_NMI);
> > >
> > > - if (!x86_pmu.handle_irq(regs))
> > > + handled = x86_pmu.handle_irq(regs);
> > > + if (!handled)
> > > + /* not our NMI */
> > > return NOTIFY_DONE;
> > > -
> > > - /* handled */
> > > - __get_cpu_var(perfctr_handled) = percpu_read(irq_stat.__nmi_count);
> > > + else if (handled > 1)
> > > + /*
> > > + * More than one perfctr triggered. This could have
> > > + * caused a second NMI that we must now skip because
> > > + * we have already handled it. Remember it.
> > > + *
> > > + * NOTE: We have no way of knowing if a second NMI was
> > > + * actually triggered, so we may accidentally skip a valid
> > > + * unknown nmi later.
> > > + */
> > > + __get_cpu_var(perfctr_skip) +=1;
>
> ... but this will not work. You have to mark the *absolute* nmi number
> here. If you only raise a flag, the next unknown nmi will be dropped,
> every.

Isn't it what we want? Only the next unknown nmi gets dropped.

> Because, in between there could have been other nmis that
> stopped the chain and thus the 'unknown' path is not executed.

I'm not sure what you mean here. Are you thinking about a third
NMI source that triggers while we are still handling the first
NMI in the back to back sequence?

> The trick in my patch is that you *know*, which nmi you want to skip.

Well with the flag you also know which nmi you want to skip.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

First | Prev |
Pages: 1 2
Prev: [PATCH 14/18 v2] staging: rtl8187se: call disable_pci_device() if pci_probe() failed
Next: [PATCH 1/1] Char: mxser, call pci_disable_device from probe/remove