From: Maciej W. Rozycki on 14 Jul 2010 12:50
On Wed, 14 Jul 2010, Mathieu Desnoyers wrote:

> This patch makes all faults, traps and exception safe to be called from NMI
> context *except* single-stepping, which requires iret to restore the TF (trap
> flag) and jump to the return address in a single instruction. Sorry, no kprobes

Watch out for the RF flag too, that is not set correctly by POPFD -- that
may be important for faulting instructions that also have a hardware
breakpoint set at their address.

> support in NMI handlers because of this limitation. This cannot be emulated
> with popf/lret, because lret would be single-stepped. It does not apply to
> "immediate values" because they do not use single-stepping. This code detects if
> the TF flag is set and uses the iret path for single-stepping, even if it
> reactivates NMIs prematurely.

What about the VM flag for VM86 tasks? It cannot be changed by POPFD
either.

How about only using the special return path when a nested exception is
about to return to the NMI handler? You'd avoid all the odd cases then
that do not happen in the NMI context.

  Maciej
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
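The concern raised above about POPFD can be made concrete with a small sketch. It is only a simplified model, not code from the patch: it covers just the three flags named in this message (TF, RF, VM), and the helper names `flags_lost_by_popf` and `image_needs_iret` are hypothetical. The bit positions are the architectural EFLAGS positions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural EFLAGS bit positions on x86. */
#define EFLAGS_TF 0x00000100u /* trap flag, bit 8 */
#define EFLAGS_RF 0x00010000u /* resume flag, bit 16 */
#define EFLAGS_VM 0x00020000u /* virtual-8086 mode, bit 17 */

/*
 * Hypothetical helper: bits of a saved EFLAGS image that a popf-based
 * return would fail to restore. Simplified model: only RF and VM, the
 * two flags named in the message above, are considered.
 */
static inline uint32_t flags_lost_by_popf(uint32_t saved_eflags)
{
    return saved_eflags & (EFLAGS_RF | EFLAGS_VM);
}

/*
 * Hypothetical helper: true when the saved image can only be restored
 * faithfully by iret. TF is included because popf would arm
 * single-stepping one instruction early, so the trap would hit the
 * return instruction that follows popf.
 */
static inline bool image_needs_iret(uint32_t saved_eflags)
{
    return (saved_eflags & (EFLAGS_TF | EFLAGS_RF | EFLAGS_VM)) != 0;
}
```

For an ordinary kernel-mode image such as 0x00000202 (IF set, nothing else of interest) both helpers report that nothing is lost, which is the case where a popf-based return is harmless.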
From: Mathieu Desnoyers on 14 Jul 2010 14:20
* Maciej W. Rozycki (macro(a)linux-mips.org) wrote:
> On Wed, 14 Jul 2010, Mathieu Desnoyers wrote:
>
> > This patch makes all faults, traps and exception safe to be called from NMI
> > context *except* single-stepping, which requires iret to restore the TF (trap
> > flag) and jump to the return address in a single instruction. Sorry, no kprobes
>
> Watch out for the RF flag too, that is not set correctly by POPFD -- that
> may be important for faulting instructions that also have a hardware
> breakpoint set at their address.
>
> > support in NMI handlers because of this limitation. This cannot be emulated
> > with popf/lret, because lret would be single-stepped. It does not apply to
> > "immediate values" because they do not use single-stepping. This code detects if
> > the TF flag is set and uses the iret path for single-stepping, even if it
> > reactivates NMIs prematurely.
>
> What about the VM flag for VM86 tasks? It cannot be changed by POPFD
> either.
>
> How about only using the special return path when a nested exception is
> about to return to the NMI handler? You'd avoid all the odd cases then
> that do not happen in the NMI context.

This is exactly what this patch does :-) It selects the return path with

+ testl $NMI_MASK,TI_preempt_count(%ebp)
+ jz resume_kernel /* Not nested over NMI ? */

In addition, about int3 breakpoint use in the kernel, AFAIK the handler does
not explicitly set the RF flag, and the breakpoint instruction (int3) appears
not to set it. (from my understanding of Intel's
Intel Architecture Software Developer's Manual Volume 3: System Programming
15.3.1.1. INSTRUCTION-BREAKPOINT EXCEPTION C)

So it should be safe to set an int3 breakpoint in an NMI handler with this patch.

It's just the "single-stepping" feature of kprobes which is problematic.
Luckily, only int3 is needed for code patching bypass.

Thanks,

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
From: Maciej W. Rozycki on 14 Jul 2010 15:30
On Wed, 14 Jul 2010, Mathieu Desnoyers wrote:

> > How about only using the special return path when a nested exception is
> > about to return to the NMI handler? You'd avoid all the odd cases then
> > that do not happen in the NMI context.
>
> This is exactly what this patch does :-)

Ah, OK then -- I understood you actually tested the value of TF in the
image to be restored.

> It selects the return path with
>
> + testl $NMI_MASK,TI_preempt_count(%ebp)
> + jz resume_kernel /* Not nested over NMI ? */
>
> In addition, about int3 breakpoint use in the kernel, AFAIK the handler does
> not explicitly set the RF flag, and the breakpoint instruction (int3) appears
> not to set it. (from my understanding of Intel's
> Intel Architecture Software Developer's Manual Volume 3: System Programming
> 15.3.1.1. INSTRUCTION-BREAKPOINT EXCEPTION C)

The CPU only sets RF itself in the image saved in certain cases -- you'd
see it set in the page fault handler for example, so that once the handler
has finished any instruction breakpoint does not hit (presumably again,
because the instruction breakpoint debug exception has the highest
priority). You mentioned the need to handle these faults.

> So it should be safe to set an int3 breakpoint in an NMI handler with this patch.
>
> It's just the "single-stepping" feature of kprobes which is problematic.
> Luckily, only int3 is needed for code patching bypass.

Actually, the breakpoint exception handler should probably set RF
explicitly, but that depends on the exact debugging scenario, so I can't
comment on it further. I don't know how INT3 is used in this context, so
I'm just noting this may be a danger zone.

  Maciej
From: Mathieu Desnoyers on 14 Jul 2010 16:00
* Maciej W. Rozycki (macro(a)linux-mips.org) wrote:
> On Wed, 14 Jul 2010, Mathieu Desnoyers wrote:
>
> > > How about only using the special return path when a nested exception is
> > > about to return to the NMI handler? You'd avoid all the odd cases then
> > > that do not happen in the NMI context.
> >
> > This is exactly what this patch does :-)
>
> Ah, OK then -- I understood you actually tested the value of TF in the
> image to be restored.

It tests it too. When it detects that the return path is about to return to an
NMI handler, it checks if the TF flag is set. If it is set, then "iret" is
really needed, because TF can only single-step an instruction when set by
"iret". The popf/ret scheme would otherwise trap at the "ret" instruction that
follows popf. Anyway, single-stepping is really discouraged in nmi handlers,
because there is no way to go around the iret.

>
> > It selects the return path with
> >
> > + testl $NMI_MASK,TI_preempt_count(%ebp)
> > + jz resume_kernel /* Not nested over NMI ? */
> >
> > In addition, about int3 breakpoint use in the kernel, AFAIK the handler does
> > not explicitly set the RF flag, and the breakpoint instruction (int3) appears
> > not to set it. (from my understanding of Intel's
> > Intel Architecture Software Developer's Manual Volume 3: System Programming
> > 15.3.1.1. INSTRUCTION-BREAKPOINT EXCEPTION C)
>
> The CPU only sets RF itself in the image saved in certain cases -- you'd
> see it set in the page fault handler for example, so that once the handler
> has finished any instruction breakpoint does not hit (presumably again,
> because the instruction breakpoint debug exception has the highest
> priority). You mentioned the need to handle these faults.

Well, the only case where I think it might make sense to allow a breakpoint in
NMI handler code would be to temporarily replace a static branch, which should
in no way be able to trigger any other fault.

>
> > So it should be safe to set an int3 breakpoint in an NMI handler with this patch.
> >
> > It's just the "single-stepping" feature of kprobes which is problematic.
> > Luckily, only int3 is needed for code patching bypass.
>
> Actually the breakpoint exception handler should actually probably set RF
> explicitly, but that depends on the exact debugging scenario, so I can't
> comment on it further. I don't know how INT3 is used in this context, so
> I'm just noting this may be a danger zone.

In the case of temporary bypass, the int3 is only there to divert the
instruction execution flow to somewhere else, and we come back to the original
code at the address following the instruction which has the breakpoint. So
basically, we never come back to the original instruction, ever. We might as
well just clear the RF flag from the EFLAGS image before popf.

Thanks,

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
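The return-path selection described in this message can be sketched in C as follows. This is an illustration under stated assumptions, not the patch itself (which is assembly in the kernel entry code): `SKETCH_NMI_MASK`, `choose_return_path` and `popf_image` are hypothetical names, and the mask value is only a stand-in for whatever bit the kernel's NMI_MASK occupies in preempt_count.

```c
#include <stdint.h>

#define EFLAGS_TF 0x00000100u /* trap flag, bit 8 */
#define EFLAGS_RF 0x00010000u /* resume flag, bit 16 */

/* Hypothetical stand-in for the kernel's NMI_MASK bit in preempt_count. */
#define SKETCH_NMI_MASK 0x04000000u

enum return_path {
    RETURN_NORMAL,  /* not nested over an NMI: usual resume_kernel path */
    RETURN_IRET,    /* nested over an NMI, TF set: iret is required,
                       even though it re-enables NMIs early */
    RETURN_POPF_RET /* nested over an NMI, TF clear: popf followed by ret */
};

/* Mirrors the two tests described above: NMI nesting first, then TF. */
static enum return_path choose_return_path(uint32_t preempt_count,
                                           uint32_t saved_eflags)
{
    if (!(preempt_count & SKETCH_NMI_MASK))
        return RETURN_NORMAL;
    if (saved_eflags & EFLAGS_TF)
        return RETURN_IRET;
    return RETURN_POPF_RET;
}

/*
 * EFLAGS image handed to popf on the RETURN_POPF_RET path, with RF
 * cleared as suggested for the int3 bypass case, where execution never
 * returns to the original instruction.
 */
static uint32_t popf_image(uint32_t saved_eflags)
{
    return saved_eflags & ~EFLAGS_RF;
}
```

Keeping the nesting test first means that the unusual cases (RF, VM86) never reach the popf path unless the exception really interrupted an NMI handler.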
From: Maciej W. Rozycki on 14 Jul 2010 16:40
On Wed, 14 Jul 2010, Mathieu Desnoyers wrote:

> It tests it too. When it detects that the return path is about to return to a
> NMI handler, it checks if the TF flag is set. If it is set, then "iret" is
> really needed, because TF can only single-step an instruction when set by
> "iret". The popf/ret scheme would otherwise trap at the "ret" instruction that
> follows popf. Anyway, single-stepping is really discouraged in nmi handlers,
> because there is no way to go around the iret.

Hmm, with Pentium Pro and more recent processors there is actually a
nasty hack that will let you get away with POPF/RET and TF set. ;) You can
try it if you like and can arrange for an appropriate scenario.

> In the case of temporary bypass, the int3 is only there to divert the
> instruction execution flow to somewhere else, and we come back to the original
> code at the address following the instruction which has the breakpoint. So
> basically, we never come back to the original instruction, ever. We might as
> well just clear the RF flag from the EFLAGS image before popf.

Yes, if you return to elsewhere, then that's actually quite desirable
IMHO.

This RF flag is quite complicated to handle and there are some errata
involved too. If I understand it correctly, all fault-class exception
handlers are expected to set it manually in the image to be restored if
they return to the original faulting instruction (that includes the debug
exception handler if it was invoked as a fault, i.e. in response to an
instruction breakpoint). Then all trap-class exception handlers are
expected to clear the flag (and that includes the debug exception handler
if it was invoked as a trap, e.g. in response to a data breakpoint or a
single step). I haven't checked if Linux gets these bits right, but it
may be worth doing so.

For the record -- GDB hardly cares, because it removes any instruction
breakpoints before it is asked to resume execution of an instruction that
has a breakpoint set at, single-steps the instruction with all the other
threads locked out and then reinserts the breakpoints so that they can hit
again. Then it proceeds with whatever should be done next to fulfil the
execution request.

  Maciej
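The RF convention described in this message (set RF when a fault-class handler returns to the faulting instruction, clear it otherwise) could be sketched as below. This is only a reading of the rule as stated in the thread, hedged the same way the message is; the names `exception_class` and `fixup_rf` are hypothetical and nothing here is taken from the Linux sources.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFLAGS_RF 0x00010000u /* resume flag, bit 16 */

/* Hypothetical classification of the exception being returned from. */
enum exception_class {
    EXC_FAULT, /* e.g. page fault, or debug exception on an instruction breakpoint */
    EXC_TRAP   /* e.g. debug exception on a data breakpoint or single step */
};

/*
 * Adjust RF in the EFLAGS image to be restored:
 *  - fault-class handler returning to the faulting instruction: set RF,
 *    so an instruction breakpoint at that address does not hit again;
 *  - trap-class handler, or a return to some other address: clear RF.
 */
static uint32_t fixup_rf(uint32_t saved_eflags, enum exception_class cls,
                         bool returns_to_faulting_insn)
{
    if (cls == EXC_FAULT && returns_to_faulting_insn)
        return saved_eflags | EFLAGS_RF;
    return saved_eflags & ~EFLAGS_RF;
}
```

The int3 bypass case discussed earlier falls in the second branch: execution resumes after the patched instruction, so clearing RF before popf loses nothing.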