From: Lin Ming on 4 Aug 2010 05:30

With nmi_watchdog enabled, perf_event_nmi_handler always returns
NOTIFY_STOP (because active_events > 0), so the notifier call chain
does not call any further handlers.

If the NMI was not a perf NMI, can the perf NMI handler prevent the real
NMI handler from being called, because NOTIFY_STOP is returned?

static int __kprobes
perf_event_nmi_handler(struct notifier_block *self,
                       unsigned long cmd, void *__args)
{
        struct die_args *args = __args;
        struct pt_regs *regs;

        if (!atomic_read(&active_events)) ===> With nmi_watchdog enabled, active_events > 0
                return NOTIFY_DONE;

        switch (cmd) {
        case DIE_NMI:
        case DIE_NMI_IPI:
                break;

        default:
                return NOTIFY_DONE;
        }

        regs = args->regs;

        apic_write(APIC_LVTPC, APIC_DM_NMI);
        /*
         * Can't rely on the handled return value to say it was our NMI, two
         * events could trigger 'simultaneously' raising two back-to-back NMIs.
         *
         * If the first NMI handles both, the latter will be empty and daze
         * the CPU.
         */
        x86_pmu.handle_irq(regs);

        return NOTIFY_STOP;
}

Thanks,
Lin Ming
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Peter Zijlstra on 4 Aug 2010 06:00

On Wed, 2010-08-04 at 17:21 +0800, Lin Ming wrote:
> With nmi_watchdog enabled, perf_event_nmi_handler always returns
> NOTIFY_STOP (because active_events > 0), so the notifier call chain
> does not call any further handlers.
>
> If the NMI was not a perf NMI, can the perf NMI handler prevent the real
> NMI handler from being called, because NOTIFY_STOP is returned?
>
> static int __kprobes
> perf_event_nmi_handler(struct notifier_block *self,
>                        unsigned long cmd, void *__args)
> {
>         struct die_args *args = __args;
>         struct pt_regs *regs;
>
>         if (!atomic_read(&active_events)) ===> With nmi_watchdog enabled, active_events > 0
>                 return NOTIFY_DONE;
>
>         switch (cmd) {
>         case DIE_NMI:
>         case DIE_NMI_IPI:
>                 break;
>
>         default:
>                 return NOTIFY_DONE;
>         }
>
>         regs = args->regs;
>
>         apic_write(APIC_LVTPC, APIC_DM_NMI);
>         /*
>          * Can't rely on the handled return value to say it was our NMI, two
>          * events could trigger 'simultaneously' raising two back-to-back NMIs.
>          *
>          * If the first NMI handles both, the latter will be empty and daze
>          * the CPU.
>          */
>         x86_pmu.handle_irq(regs);
>
>         return NOTIFY_STOP;
> }

Urgh... right, so what is the alternative? We don't seem to have a
reliable way of telling where an NMI originated.

As that comment says, the PMU can raise an NMI and then set the pending
NMI latch for a second overflow. The first NMI will then likely see the
overflow status for both events and clear both. The second NMI will see
a zero overflow status and report that it wasn't the PMU. But since the
PMU did raise it, nobody else will claim it, and we get these silly
"dazed and confused" messages.

What NMI source are you concerned about, and can it reliably tell
whether it raised the NMI or not?

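The back-to-back case described above can be modelled in a few lines of
user-space C. This is only a sketch under stated assumptions, not kernel
code: struct pmu_model, counter_overflow() and handle_nmi() are
hypothetical names, and the overflow status is reduced to a plain bitmask
with one bit per counter. It shows why the "handled" count of the second
NMI cannot be used to decide whether the PMU raised it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a PMU: not a real kernel structure. */
struct pmu_model {
    uint32_t overflow_status; /* one bit per counter that has overflowed */
    int pending_nmis;         /* NMIs raised but not yet serviced */
};

/* A counter overflows: set its status bit and raise an NMI. */
static void counter_overflow(struct pmu_model *p, unsigned int counter)
{
    p->overflow_status |= (uint32_t)1 << counter;
    p->pending_nmis++;
}

/*
 * Service one NMI. The handler clears every overflow bit it finds,
 * not only the bit that caused this particular NMI, and returns the
 * number of counters it handled.
 */
static int handle_nmi(struct pmu_model *p)
{
    int handled = 0;

    assert(p->pending_nmis > 0);
    p->pending_nmis--;

    while (p->overflow_status) {
        /* Clear the lowest set bit. */
        p->overflow_status &= p->overflow_status - 1;
        handled++;
    }
    return handled;
}
```

If two counters overflow before the first NMI is serviced, the first
call to handle_nmi() returns 2 and the second returns 0, even though the
PMU raised both NMIs. A handler that returned NOTIFY_DONE for the second
one would leave it unclaimed.
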
From: Robert Richter on 4 Aug 2010 06:10

On 04.08.10 05:21:10, Lin Ming wrote:
> With nmi_watchdog enabled, perf_event_nmi_handler always returns
> NOTIFY_STOP (because active_events > 0), so the notifier call chain
> does not call any further handlers.
>
> If the NMI was not a perf NMI, can the perf NMI handler prevent the real
> NMI handler from being called, because NOTIFY_STOP is returned?

There is no general mechanism for recording the NMI source (except if
it was externally triggered, e.g. by the southbridge). Also, all NMIs
are mapped to NMI vector 2, so there is no way to find out the reason
by using the APIC mask registers.

Now, if multiple perfctrs trigger an NMI, it may happen that a handler
has nothing to do because the counter was already handled by the
previous one. Thus, it is valid to have unhandled NMIs caused by
perfctrs.

So, with counters enabled we always have to return stop for *all* NMIs,
as we cannot detect whether it was a perfctr NMI. Otherwise we could
trigger an unhandled NMI. To ensure that all other NMI handlers are
called, the perfctr's NMI handler must have the lowest priority. Then
the handler will be the last in the chain.

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center

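The ordering argument above can be illustrated with a small user-space
model of a priority-ordered notifier chain. This is a sketch, not the
kernel's notifier implementation: struct chain, chain_register(),
chain_call(), the SRC_* constants and both handlers are hypothetical,
and the NOTIFY_* values are illustrative only. The assumption is that
higher-priority handlers run first and that the walk stops at the first
handler returning NOTIFY_STOP.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values, not the kernel's actual constants. */
enum { NOTIFY_DONE = 0, NOTIFY_STOP = 1 };

/* Hypothetical NMI sources for the model. */
enum { SRC_PMU = 1, SRC_OTHER = 2 };

#define MAX_HANDLERS 8

typedef int (*nmi_handler_fn)(int source);

struct handler {
    nmi_handler_fn fn;
    int priority;               /* higher priority runs first */
};

struct chain {
    struct handler h[MAX_HANDLERS];
    size_t count;
};

/* Insert a handler, keeping the array sorted by descending priority. */
static void chain_register(struct chain *c, nmi_handler_fn fn, int priority)
{
    size_t i;

    assert(c->count < MAX_HANDLERS);
    i = c->count++;
    while (i > 0 && c->h[i - 1].priority < priority) {
        c->h[i] = c->h[i - 1];
        i--;
    }
    c->h[i].fn = fn;
    c->h[i].priority = priority;
}

/* Walk the chain until NOTIFY_STOP; return how many handlers ran. */
static int chain_call(const struct chain *c, int source)
{
    size_t i;
    int called = 0;

    for (i = 0; i < c->count; i++) {
        called++;
        if (c->h[i].fn(source) == NOTIFY_STOP)
            break;
    }
    return called;
}

static int other_handled;

/* A handler that can reliably tell whether the NMI was its own. */
static int other_handler(int source)
{
    if (source != SRC_OTHER)
        return NOTIFY_DONE;
    other_handled++;
    return NOTIFY_STOP;
}

/* The perf-style handler: cannot tell, so it claims every NMI. */
static int perf_handler(int source)
{
    (void)source;
    return NOTIFY_STOP;
}
```

Registering perf_handler with a lower priority than other_handler means
an SRC_OTHER NMI is claimed by its real owner before the perf-style
handler is reached, while anything left over is still swallowed at the
end of the chain. With the priorities reversed, other_handler would
never run.
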
From: Robert Richter on 4 Aug 2010 06:30

On 04.08.10 06:24:18, Peter Zijlstra wrote:
> On Wed, 2010-08-04 at 12:01 +0200, Robert Richter wrote:
> > To ensure that all other nmi handlers are
> > called, the perfctr's nmi handler must have the lowest priority. Then,
> > the handler will be the last in the chain.
>
> Well, unless another NMI handler has the exact same issue and also
> starts eating all NMIs, just in case.

In this case we will have to change the implementation for unhandled
NMIs. But I don't know of other sources with this issue.

-Robert

--
Advanced Micro Devices, Inc.
Operating System Research Center

From: Peter Zijlstra on 4 Aug 2010 06:30

On Wed, 2010-08-04 at 12:01 +0200, Robert Richter wrote:
> To ensure that all other nmi handlers are
> called, the perfctr's nmi handler must have the lowest priority. Then,
> the handler will be the last in the chain.

Well, unless another NMI handler has the exact same issue and also
starts eating all NMIs, just in case.