From: "Andy "Krazy" Glew" on 18 Jan 2010 11:38

Noob wrote:
> Andy Glew wrote:
>
>> I wrote the following for my wiki,
>> http://semipublic.comp-arch.net/wiki/SYSENTER/SYSEXIT_vs._SYSCALL/SYSRET
>> and thought that USEnet comp.arch might be interested
>
> This old post by Linus Torvalds seems somewhat relevant.
> http://lkml.org/lkml/2002/12/18/218

Looks like Linus figured out what I intended. Would have been easier
if I had been allowed to talk to him, back then.

From: Jeremy Linton on 18 Jan 2010 12:40

On 1/18/2010 3:59 AM, Anton Ertl wrote:
> "Andy \"Krazy\" Glew"<ag-news(a)patten-glew.net> writes:
>> I still think that both Intel and AMD missed a big opportunity, to make
>> system calls truly
>> as fast as function calls. Chicken and egg.
>
> But given that system calls have to do much more sanity checking on
> their arguments, and there is the common prelude that you mentioned
> (what is it for?), I don't see system calls ever becoming as fast as
> function calls, even with fast system call and system return
> instructions.

My experience (working on one of the commercial Unixes) was that having
a fast sysenter type instruction had the opposite effect and resulted in
a fairly slow system call interface. That's because there was always
demand to have some specific functionality faster than the general case.
So, the syscall handler was chock full of special case checking for one
function or the other. Plus, since the main OS was written in C, it
ended up saving all kinds of "extra" context anyway.

In the end, IIRC there were 150-200 instructions on the kernel side
before it started processing a "normal" system call. Sure, one or two
"critical" calls used in benchmarks were faster, but the whole system
suffered.

I was never convinced that a few percent faster for a few special cases
outweighed the few percent slower for everything else in the system.

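The trade-off described above can be made concrete with a toy model. The
following C sketch is only an illustration under stated assumptions: the
handler names, call numbers, and the idea of counting compare-and-branch
tests are all hypothetical, and nothing here is taken from the commercial
kernel mentioned in the post. It counts only the tests executed before a
handler is reached; it does not model the context saving that the fast
paths were meant to avoid.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*syscall_fn)(long arg);

/* Hypothetical handlers, for illustration only. */
static long sys_fast_a(long arg)  { return arg; }
static long sys_fast_b(long arg)  { return arg * 2; }
static long sys_generic(long arg) { return arg + 1; }

/* Hypothetical call numbers. */
enum { NR_FAST_A, NR_FAST_B, NR_GENERIC, NR_COUNT };

static const syscall_fn sys_table[NR_COUNT] = {
    [NR_FAST_A]  = sys_fast_a,
    [NR_FAST_B]  = sys_fast_b,
    [NR_GENERIC] = sys_generic,
};

/* Plain table dispatch: one bounds check, whatever the call number. */
static long dispatch_plain(unsigned nr, long arg, unsigned *checks)
{
    assert(checks != NULL);
    *checks = 1;                       /* the bounds check below */
    if (nr >= (unsigned)NR_COUNT)
        return -1;
    return sys_table[nr](arg);
}

/* Special-cased dispatch: "hot" calls are tested first, so every other
 * call pays for those tests before it reaches the table. */
static long dispatch_special_cased(unsigned nr, long arg, unsigned *checks)
{
    assert(checks != NULL);
    *checks = 1;                       /* test for fast call A */
    if (nr == (unsigned)NR_FAST_A)
        return sys_fast_a(arg);
    *checks = 2;                       /* test for fast call B */
    if (nr == (unsigned)NR_FAST_B)
        return sys_fast_b(arg);
    *checks = 3;                       /* finally, the bounds check */
    if (nr >= (unsigned)NR_COUNT)
        return -1;
    return sys_table[nr](arg);
}
```

In this model a "normal" call goes through three tests in the
special-cased dispatcher and one in the plain dispatcher. Each extra
special case adds one more test to every call that is not special.
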
From: EricP on 18 Jan 2010 14:48

Andy "Krazy" Glew wrote:
> <big snip>

From a usability point of view, when I was toying with this
I found there to be two problems with SysEnter/SysExit.

Firstly, and most critically, SysExit does not load the EFLAGS
register, specifically the interrupt flag. This was a problem
because I needed a small non-interruptible system service return
sequence during the transition to test for user mode software
interrupt delivery in a lossless manner. I wanted to disable
interrupts, check a boolean, and return to user mode if nothing
was pending, with the interrupts being re-enabled by the return.
This was a show stopper and made it unusable.

Secondly, the problem with SysEnter is that it assumes that
EDX will be preloaded with the restart EIP, but the x86 provides
no easy method to load an arbitrary offset of the current EIP
into EDX except the kludgey call +0, pop edx method.

So to use SysEnter you have to preload EDX with a constant restart
EIP, and that presumes the entry sequence is at a predefined location,
which limits the utility of SysEnter somewhat.

The position dependent code method:

  push ecx
  push edx
  mov ecx, esp          // Save stack pointer
  mov eax, 123          // System service routine number
  mov edx, #RestartAddr // Constant restart address
  sysenter
RestartAddr:
  pop edx
  pop ecx

The position independent code method:

  push ecx
  push edx
  mov ecx, esp          // Save stack pointer
  mov eax, 123          // System service routine number
  call +0               // Load restart address of pop edx
  pop edx
  add edx, 6
  sysenter
  pop edx
  pop ecx

What would have been nice is if there was an instruction to move
EIP to a general register and add a constant offset at the same time.

  push ecx
  push edx
  mov ecx, esp          // Save stack pointer
  mov eax, 123          // System service routine number
  mov edx, eip+4        // Load restart address of pop edx
  sysenter
  pop edx
  pop ecx

Eric

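As a cross-check on the constant 6 in the position independent sequence
above, here is a small C sketch that sums the encoded lengths of the
instructions lying between the address pushed by `call +0` and the
restart point after `sysenter`. The byte lengths are an assumption based
on the usual 32-bit encodings (`pop edx` = 5A, `add edx, imm8` = 83 C2 ib,
`sysenter` = 0F 34); the struct and function names are made up for
illustration and do not come from the post.

```c
#include <assert.h>
#include <stddef.h>

/* One instruction in the sequence that follows `call +0`.
 * Lengths assume the standard 32-bit encodings (an assumption,
 * not something stated in the post). */
struct insn {
    const char *text;
    unsigned    length;   /* encoded size in bytes */
};

/* Instructions between the address pushed by `call +0`
 * and the restart point after `sysenter`. */
static const struct insn pic_tail[] = {
    { "pop edx",    1 },  /* 5A       */
    { "add edx, 6", 3 },  /* 83 C2 06 */
    { "sysenter",   2 },  /* 0F 34    */
};

/* Sum the lengths: the result is the constant that has to be
 * added to the popped EIP to reach the restart point. */
static unsigned restart_offset(const struct insn *seq, size_t n)
{
    assert(seq != NULL);
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += seq[i].length;
    return total;
}

/* Offset for the position independent sequence in the post. */
static unsigned pic_restart_offset(void)
{
    return restart_offset(pic_tail, sizeof pic_tail / sizeof pic_tail[0]);
}
```

Under those encoding assumptions `pic_restart_offset()` agrees with the
`add edx, 6` in the post. The constant would have to change if the
assembler picked a longer encoding for any of the three instructions,
which is part of what makes the sequence fragile.
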
From: Terje Mathisen on 18 Jan 2010 16:07

EricP wrote:
> What would have been nice is if there was an instruction to move
> EIP to a general register and add a constant offset at the same time.
>
> push ecx
> push edx
> mov ecx, esp          // Save stack pointer
> mov eax, 123          // System service routine number
> mov edx, eip+4        // Load restart address of pop edx
> sysenter
> pop edx
> pop ecx

Didn't DEC use to have a patent on IP-relative addressing, specifically
to make PIC much easier?

It must be more than 17 years ago at least!

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

From: Gavin Scott on 18 Jan 2010 16:33

"Andy \"Krazy\" Glew" <ag-news(a)patten-glew.net> wrote:
> System calls are really just function calls. With security. They have
> to switch stacks, etc.

Well, except when they don't.

I know of one significant, successful OS that really didn't have a
kernel stack at all, and executed pretty much everything except
hardware interrupt context on top of the user's own stack.

This was MPE/XL (later MPE/iX) on PA-RISC, HP's operating system for
their HP-3000 systems.

The page-level protection mechanisms and explicit long addressing modes
were used to produce a system with an effectively flat single address
space in which any process could form and dereference any possible
memory address.

This mostly eliminated the distinction between user and kernel code.
Some functions were in privileged libraries that were flagged to cause
privilege promotion, but a call to a system library function
encountered no fixed overhead relative to an ordinary call to an
unprivileged user library.

Each privileged function was required to enforce system security policy
and could do so in any way that it liked, rather than being forced to
sanity-check every parameter before it was known that it would be
dereferenced, for example. And no complicated copy-in/out to deal with
moving things in and out of "kernel" space, etc.

Now any modern security architect would probably run screaming at this
point, and there definitely were challenges in this area (user
asynchronous unprivileged trap/event handlers being a rather obvious
one), but I'm not aware of any dramatic failures resulting from this
design.

On the other hand I don't think anyone would be likely to make such
design choices again in today's world. But the resulting system was a
relative joy to use, develop for (both user and system-level code), and
definitely easier to debug.

G.