use of setjmp/longjmp in x86 emulator. [Kernel]


From: Gleb Natapov on 1 Mar 2010 12:50

On Mon, Mar 01, 2010 at 06:13:53AM -1000, Zachary Amsden wrote:
> On 02/28/2010 11:18 PM, Gleb Natapov wrote:
> >I am looking at improving KVM x86 emulator. Current code does not
> >handle some special cases correctly (code execution from ROM, ins/outs
> >to/from MMIO) and many exception conditions during instruction emulation
> >are not handled correctly. There is a lot of code in emulator that is
> >there only for exception propagation. Using setjmp/longjmp will be very
> >beneficial here as exception condition during instruction execution
> >maps very naturally to setjmp/longjmp, so my question is what about
> >adding setjmp/longjmp implementation to the kernel, or alternatively,
> >if there is a fear that it can be abused, add it locally to emulator.c?
> >Note that instruction emulation is always done in process context.
>
> I'm all for radical ideas, but from a pragmatic point of view, you
> shouldn't use longjmp in the kernel. Seriously bad things are
> happening with it; it leaves local variables undefined, doesn't undo
> global state changes.
>
> So if you:
>
>     spin_lock(&s->lock);
>     if (!s->active)
>         longjmp(buf, -1);
>
How is this different from a goto that skips the unlock? But in general I
agree with you, and that is why I propose implementing a local version of
setjmp/longjmp just for use inside emulator.c. There are no locks inside
this file, not even memory allocations, only pure instruction emulation.

> ... you are broken. This case can be made very much more complex
> and hard to reason about by using local variables which are reset by
> the longjmp.
>
> Further, it requires use of the volatile keyword to interact
> properly with logic involving more than one variable, and thus, by
> definition is impossible to use in the kernel, which does not
> implement the volatile keyword. :)
volatile is a language keyword; how can the kernel not implement it?
And why is volatile needed to implement longjmp?

>
> Instead, for this case, use the fact that there is an
> architecturally designed finite number of exceptions that can be
> processed simultaneously. This means if you queue exceptions to a
> pending list of control-flow interrupting events to be processed, as
> long as the queue is appropriately sized, you will never overflow
> this queue and never require dynamic allocation. Further, you can
> then naturally follow the exception priority rules at the top-level
> of the emulator and never need to pass back complex exception
> structures, merely a simple return value which indicates whether to
> return to top-level control logic or continue with instruction
> emulation. I believe using this style of programming will make your
> need for setjmp/longjmp go away.
>
Of course it is possible to use return values instead. This is what code
does currently and this is completely unrelated to exception queue
depth. Code will be much simpler if we will be able to bail out from the
depth of emulator immediately if exception condition is met or exit to
userspace is required instead of passing the condition up the call
chain.
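
For reference, the fixed-size pending-event queue suggested in the quoted text could look roughly like the sketch below. Everything in it is an assumption made for illustration: the names (`event_queue`, `queue_event`, `take_event`), the bound `MAX_PENDING`, and the single integer `priority` field, which merely stands in for the real architectural priority rules.

```c
#include <stddef.h>

#define MAX_PENDING 4   /* assumed bound; the real limit would come from the architecture */

struct pending_event {
    int vector;         /* exception vector */
    int priority;       /* hypothetical: lower value is delivered first */
};

struct event_queue {
    struct pending_event ev[MAX_PENDING];
    size_t count;
};

/* Queue an event without any dynamic allocation.
 * Returns 0 on success, -1 if the fixed-size queue is full. */
int queue_event(struct event_queue *q, int vector, int priority)
{
    if (q->count == MAX_PENDING)
        return -1;
    q->ev[q->count].vector = vector;
    q->ev[q->count].priority = priority;
    q->count++;
    return 0;
}

/* Remove the most urgent pending event and store it in *out.
 * Returns 1 if an event was taken, 0 if the queue was empty. */
int take_event(struct event_queue *q, struct pending_event *out)
{
    size_t best = 0;
    size_t i;

    if (q->count == 0)
        return 0;
    for (i = 1; i < q->count; i++)
        if (q->ev[i].priority < q->ev[best].priority)
            best = i;
    *out = q->ev[best];
    q->ev[best] = q->ev[q->count - 1];   /* fill the hole with the last entry */
    q->count--;
    return 1;
}
```

In this arrangement the emulation helpers only call `queue_event()` and return a simple status, while the top-level loop drains the queue with `take_event()`. The status still has to travel up the call chain, which is the part the reply above objects to.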

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Zachary Amsden on 1 Mar 2010 13:50

On 03/01/2010 07:47 AM, Gleb Natapov wrote:
> On Mon, Mar 01, 2010 at 06:13:53AM -1000, Zachary Amsden wrote:
>
>
>> ... you are broken. This case can be made very much more complex
>> and hard to reason about by using local variables which are reset by
>> the longjmp.
>>
>> Further, it requires use of the volatile keyword to interact
>> properly with logic involving more than one variable, and thus, by
>> definition is impossible to use in the kernel, which does not
>> implement the volatile keyword. :)
>>
> volatile is a language keyword; how can the kernel not implement it?
> And why is volatile needed to implement longjmp?
>

Local variables which are not volatile are "undefined" after a longjmp.
Thus setjmp() return value is the only valid rvalue otherwise.

As I said, the kernel does not implement the volatile keyword :)
(i.e. its use is heavily discouraged to the point one can consider it
not implemented)

>> Instead, for this case, use the fact that there is an
>> architecturally designed finite number of exceptions that can be
>> processed simultaneously. This means if you queue exceptions to a
>> pending list of control-flow interrupting events to be processed, as
>> long as the queue is appropriately sized, you will never overflow
>> this queue and never require dynamic allocation. Further, you can
>> then naturally follow the exception priority rules at the top-level
>> of the emulator and never need to pass back complex exception
>> structures, merely a simple return value which indicates whether to
>> return to top-level control logic or continue with instruction
>> emulation. I believe using this style of programming will make your
>> need for setjmp/longjmp go away.
>>
>>
> Of course it is possible to use return values instead. This is what code
> does currently and this is completely unrelated to exception queue
> depth. Code will be much simpler if we will be able to bail out from the
> depth of emulator immediately if exception condition is met or exit to
> userspace is required instead of passing the condition up the call
> chain.
>

Anything that can generate exceptions is going to need logic to handle
error cases anyway... the depth can not be that bad. Especially if you
structure it so as to optimize for tail calling.

Zach

From: Luca Barbieri on 1 Mar 2010 13:50

How about an interface that works like setjmp/longjmp, but requires
passing a function pointer to setjmp, which calls that function, and
allows longjmp to work within that function only?

This avoids all concerns about local variables and should be cleaner,
faster and simpler to implement.
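
One possible reading of this proposal, written as a userspace sketch on top of the standard `<setjmp.h>` rather than a kernel implementation. The names `run_guarded`, `escape`, `escape_ctx` and `double_if_nonnegative` are hypothetical. The frame that calls `setjmp()` belongs to the wrapper, so the guarded function's own locals are never examined after the jump; the restriction to "that function only" is a convention here and is not enforced.

```c
#include <setjmp.h>

/* Hypothetical context handed to the guarded function. */
struct escape_ctx {
    jmp_buf env;
    volatile int code;   /* volatile: written after setjmp(), read after longjmp() */
};

typedef void (*guarded_fn)(struct escape_ctx *ctx, void *arg);

/* Abandon the guarded function and make run_guarded() return 'code'.
 * 'code' should be nonzero so it can be told apart from a normal return. */
void escape(struct escape_ctx *ctx, int code)
{
    ctx->code = code;
    longjmp(ctx->env, 1);
}

/* Call fn(ctx, arg). Returns 0 if fn returned normally,
 * otherwise the code that fn (or a callee) passed to escape(). */
int run_guarded(guarded_fn fn, void *arg)
{
    struct escape_ctx ctx;

    ctx.code = 0;
    if (setjmp(ctx.env) != 0)
        return ctx.code;     /* reached via escape() */
    fn(&ctx, arg);
    return 0;
}

/* Example guarded function: doubles *arg, bails out on negative input. */
void double_if_nonnegative(struct escape_ctx *ctx, void *arg)
{
    int *value = arg;

    if (*value < 0)
        escape(ctx, 22);     /* arbitrary nonzero error code */
    *value *= 2;
}
```

A caller would write `rc = run_guarded(double_if_nonnegative, &x);` and inspect only `rc`, never a local that changed between the `setjmp()` and the `longjmp()`.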

From: Gleb Natapov on 1 Mar 2010 14:10

On Mon, Mar 01, 2010 at 08:39:49AM -1000, Zachary Amsden wrote:
> On 03/01/2010 07:47 AM, Gleb Natapov wrote:
> >On Mon, Mar 01, 2010 at 06:13:53AM -1000, Zachary Amsden wrote:
> >
> >>... you are broken. This case can be made very much more complex
> >>and hard to reason about by using local variables which are reset by
> >>the longjmp.
> >>
> >>Further, it requires use of the volatile keyword to interact
> >>properly with logic involving more than one variable, and thus, by
> >>definition is impossible to use in the kernel, which does not
> >>implement the volatile keyword. :)
> >volatile is a language keyword; how can the kernel not implement it?
> >And why is volatile needed to implement longjmp?
>
> Local variables which are not volatile are "undefined" after a
> longjmp. Thus setjmp() return value is the only valid rvalue
> otherwise.
>
That is nothing special. This is how setjmp/longjmp works. If a
nonvolatile automatic variable local to the function in which
setjmp is called is changed between the setjmp and longjmp calls,
its state is indeterminate after the longjmp.

In practice return value from setjmp is all I need.
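
The rule about nonvolatile automatic variables can be illustrated with a small userspace sketch using the standard `<setjmp.h>`. The function names `fail` and `steps_before_bailout` are made up for the example.

```c
#include <setjmp.h>

/* Hypothetical helper: jump back to whoever called setjmp() on env. */
void fail(jmp_buf env)
{
    longjmp(env, 1);
}

/* Returns the number of steps completed before the bail-out. */
int steps_before_bailout(void)
{
    jmp_buf env;
    volatile int steps = 0;   /* modified between setjmp() and longjmp() */

    if (setjmp(env) != 0) {
        /* Reached via longjmp(). Reading 'steps' here is well defined
         * only because it is volatile; a plain int would be indeterminate. */
        return steps;
    }

    steps = 1;
    steps = 2;
    fail(env);                /* does not return */
    return -1;                /* not reached */
}
```

If the code after the jump looks only at the value returned by `setjmp()`, as described above, no local needs to be volatile at all.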

> As I said, the kernel does not implement the volatile keyword :)
> (i.e. its use is heavily discouraged to the point one can consider
> it not implemented)
>
> >>Instead, for this case, use the fact that there is an
> >>architecturally designed finite number of exceptions that can be
> >>processed simultaneously. This means if you queue exceptions to a
> >>pending list of control-flow interrupting events to be processed, as
> >>long as the queue is appropriately sized, you will never overflow
> >>this queue and never require dynamic allocation. Further, you can
> >>then naturally follow the exception priority rules at the top-level
> >>of the emulator and never need to pass back complex exception
> >>structures, merely a simple return value which indicates whether to
> >>return to top-level control logic or continue with instruction
> >>emulation. I believe using this style of programming will make your
> >>need for setjmp/longjmp go away.
> >>
> >Of course it is possible to use return values instead. This is what code
> >does currently and this is completely unrelated to exception queue
> >depth. Code will be much simpler if we will be able to bail out from the
> >depth of emulator immediately if exception condition is met or exit to
> >userspace is required instead of passing the condition up the call
> >chain.
>
> Anything that can generate exceptions is going to need logic to
> handle error cases anyway... the depth can not be that bad.
> Especially if you structure it so as to optimize for tail calling.
>
Tail call is not what usually happens. Usually emulation goes like this:

    if (check some conditions) {
        queue exception A
        return exception queued
    }
    if (check other conditions) {
        queue exception B
        return exception queued
    }
    do some emulation
    try to read guest memory
    if (read failed) {
        queue exception C
        return exception queued
    }
    if (read needs exit to userspace for device emulation)
        return please go out and retrieve me the data

    continue emulation
    try to write guest memory
    if (write failed) {
        queue exception C
        return exception queued
    }
    if (write needs exit to userspace for device emulation)
        return please go out and process the data

    emulate some more.

    return emulation done
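
The flow above, reduced to a compilable C sketch of the return-code style. It is not taken from emulator.c: the types, the tiny 16-byte guest RAM, the `MMIO_BASE` threshold and all function names are assumptions chosen to keep the example self-contained. Vector 14 is the x86 page-fault vector.

```c
#include <stdint.h>

enum emu_result { EMU_DONE, EMU_EXCEPTION_QUEUED, EMU_EXIT_TO_USERSPACE };

#define GUEST_RAM_SIZE 16u
#define MMIO_BASE      0x100u   /* assumed: addresses at or above this need userspace */
#define PF_VECTOR      14       /* x86 page fault */

struct emu_ctxt {
    uint8_t ram[GUEST_RAM_SIZE];
    int pending_exception;      /* -1 when nothing is queued */
};

/* Classify an access: device access, fault, or plain RAM. */
enum emu_result check_access(struct emu_ctxt *c, uint32_t addr)
{
    if (addr >= MMIO_BASE)
        return EMU_EXIT_TO_USERSPACE;
    if (addr >= GUEST_RAM_SIZE) {
        c->pending_exception = PF_VECTOR;   /* "queue exception C" */
        return EMU_EXCEPTION_QUEUED;
    }
    return EMU_DONE;
}

enum emu_result read_guest(struct emu_ctxt *c, uint32_t addr, uint8_t *val)
{
    enum emu_result rc = check_access(c, addr);

    if (rc == EMU_DONE)
        *val = c->ram[addr];
    return rc;
}

enum emu_result write_guest(struct emu_ctxt *c, uint32_t addr, uint8_t val)
{
    enum emu_result rc = check_access(c, addr);

    if (rc == EMU_DONE)
        c->ram[addr] = val;
    return rc;
}

/* Emulate a hypothetical "copy one byte" instruction. Every helper
 * result must be checked and passed up by hand. */
enum emu_result emulate_copy_byte(struct emu_ctxt *c, uint32_t src, uint32_t dst)
{
    uint8_t val;
    enum emu_result rc;

    rc = read_guest(c, src, &val);
    if (rc != EMU_DONE)
        return rc;
    rc = write_guest(c, dst, val);
    if (rc != EMU_DONE)
        return rc;
    return EMU_DONE;
}
```

Each additional level of helpers adds another `if (rc != EMU_DONE) return rc;` pair, which is the propagation code the message is arguing about.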

--
Gleb.

From: john cooper on 1 Mar 2010 14:20

Gleb Natapov wrote:

> Think about what happens if in the middle of
> instruction emulation some data from device emulated in userspace is
> needed. Emulator should be able to tell KVM that exit to userspace is
> needed and restart instruction emulation when data is available.

setjmp/longjmp are useful constructs in general but
IME are better suited for infrequent exceptions vs.
routine usage.

If the issue is finding some clean and regular way
to back out from (and possibly reenter) logic
expressed within nested function invocations, have
you considered turning the problem inside out and
using a state machine approach?
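
A rough sketch of what such a state machine could look like, under invented names (`insn_state`, `emulate_step`) and with a deliberately trivial two-step "instruction". The point is only that progress lives in an explicit structure, so the emulator can return to its caller for device data and later be re-entered at the same step.

```c
#include <stdint.h>

enum step   { STEP_READ, STEP_WRITE, STEP_DONE };
enum status { ST_DONE, ST_NEED_USERSPACE };

/* All progress of the instruction is kept here, not on the C stack. */
struct insn_state {
    enum step step;
    int io_ready;       /* set by the caller once userspace supplied the data */
    uint8_t io_data;    /* data supplied by userspace */
    uint8_t value;      /* intermediate result */
    uint8_t dest;       /* final destination of the emulated instruction */
};

/* Advance the instruction as far as possible. May be called again
 * after ST_NEED_USERSPACE; it resumes at the step that was blocked. */
enum status emulate_step(struct insn_state *s)
{
    for (;;) {
        switch (s->step) {
        case STEP_READ:
            if (!s->io_ready)
                return ST_NEED_USERSPACE;   /* back out, state preserved */
            s->value = s->io_data;
            s->step = STEP_WRITE;
            break;
        case STEP_WRITE:
            s->dest = s->value;
            s->step = STEP_DONE;
            break;
        case STEP_DONE:
            return ST_DONE;
        }
    }
}
```

The first call returns `ST_NEED_USERSPACE`; once the caller has filled in `io_data` and set `io_ready`, a second call finishes the instruction without repeating any work.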

--
john.cooper(a)third-harmonic.com
