From: Bernd Paysan on 18 Jan 2010 08:41

nmm1(a)cam.ac.uk wrote:
> The key is to have a clean system design, so the amount of sanity
> checking and the size of a standard prelude are minimal. For example,
> a high proportion of system calls in many applications can be very
> simple, 'unprivileged' ones like reading the clock or debugger hooks.

Actually, with a clean system design, many of those unprivileged ones can
be simple unprivileged library calls. rdtsc is unprivileged; all you need
is a factor (clocks per second) and a global offset - then you can do your
gettimeofday() completely in userland (AFAIK, people have already done
that).

This can go a lot further. In effect, you can do most system stuff in
userland, including even reading and writing file data, and schedule file
metadata for changes ("schedule" means that finally, when committed, the
data is sanity checked by the kernel - but that doesn't need to happen too
frequently). All the system needs to do for you is to map those parts of
the disk which you can read or write into your memory map - read-only
stuff read-only, read-write data stuff on the disk read-write. The actual
reads and writes from and to the disk still happen in kernel land, but as
long as the program works from cache, no OS intervention is necessary.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/
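The "factor plus global offset" arithmetic described above can be sketched roughly as follows. This is only an illustration under stated assumptions: Python cannot execute rdtsc directly, so `time.perf_counter_ns()` stands in for the raw counter, and the names `read_counter`, `publish_offset_us` and `userland_gettimeofday` are hypothetical, not part of any real kernel or libc interface.

```python
import time

# Assumption: the stand-in counter ticks once per nanosecond. With a real
# TSC this factor would be the measured clock rate of the CPU.
TICKS_PER_SECOND = 1_000_000_000


def read_counter():
    """Stand-in for an unprivileged counter read such as rdtsc."""
    return time.perf_counter_ns()


def publish_offset_us():
    """Hypothetical one-time setup step: wall-clock time (in microseconds)
    that corresponds to counter value zero."""
    wall_us = time.time_ns() // 1000
    counter_us = read_counter() * 1_000_000 // TICKS_PER_SECOND
    return wall_us - counter_us


def userland_gettimeofday(offset_us, ticks=None):
    """Return (seconds, microseconds) computed from the counter alone."""
    if ticks is None:
        ticks = read_counter()
    now_us = offset_us + ticks * 1_000_000 // TICKS_PER_SECOND
    return divmod(now_us, 1_000_000)


if __name__ == "__main__":
    offset = publish_offset_us()
    print(userland_gettimeofday(offset))
```

Once the factor and offset are published, each time query is a counter read plus a multiply, a divide and an add. The sketch ignores the practical problems raised elsewhere in the thread, such as counters that are not synchronized across cores.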
From: Noob on 18 Jan 2010 08:59

Terje Mathisen wrote:

> Anton Ertl wrote:
>
>> Andy Glew wrote:
>>
>>> I still think that both Intel and AMD missed a big opportunity, to make
>>> system calls truly as fast as function calls. Chicken and egg.
>>> Nobody wants to make the investment in hardware without a proven
>>> software benefit, but existing software is optimized to avoid
>>> expensive system call privilege level changes.
>>
>> But given that system calls have to do much more sanity checking on
>> their arguments, and there is the common prelude that you mentioned
>> (what is it for?), I don't see system calls ever becoming as fast as
>> function calls, even with fast system call and system return
>> instructions.
>
> _Some_ system calls don't need that checking code!
>
> I.e. using a very fast syscall(), you can return an OS timestamp within
> a few nanoseconds, totally obviating the need for application code to
> develop their own timers, based on RDTSC() (single-core/single-cpu
> systems only), ACPI timers or whatever else is available.
>
> Even if this is only possible for system calls that deliver very simple
> results, and where the checking code is negligible, this is still an
> important subset.
>
> The best solution today is to take away all attempts at security and
> move all those calls into a user-level library, right?

What about Linux VDSO / vsyscalls?

http://www.x86-64.org/pipermail/patches/2006-November/003498.html
http://juliusdavies.ca/posix_clocks/clock_realtime_linux_faq.html

Regards.
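One rough way to see whether a clock read is cheaper than a call that really enters the kernel is to time both in a loop. The sketch below assumes Linux and a libc that routes `clock_gettime` through the vDSO, which depends on the kernel version and clock source. The helper names `average_call_ns` and `read_clock` are made up for illustration, and Python interpreter overhead is included in both numbers, so only the difference between them is meaningful.

```python
import os
import time


def average_call_ns(fn, calls=200_000):
    """Return the mean wall-clock cost of one call to fn, in nanoseconds."""
    start = time.perf_counter_ns()
    for _ in range(calls):
        fn()
    return (time.perf_counter_ns() - start) / calls


def read_clock():
    # May be served from the vDSO without a kernel entry (an assumption,
    # not something this script can verify by itself).
    return time.clock_gettime(time.CLOCK_REALTIME)


if __name__ == "__main__":
    # os.getppid() is used here as an example of a call that is expected
    # to enter the kernel every time.
    print("clock_gettime:", average_call_ns(read_clock), "ns per call")
    print("getppid:      ", average_call_ns(os.getppid), "ns per call")
```

Running the script under `strace -c` is a more direct check: if the clock reads go through the vDSO, no `clock_gettime` system calls should show up in the summary.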
From: Noob on 18 Jan 2010 09:02

Andy Glew wrote:

> I wrote the following for my wiki,
> http://semipublic.comp-arch.net/wiki/SYSENTER/SYSEXIT_vs._SYSCALL/SYSRET
> and thought that USEnet comp.arch might be interested

This old post by Linus Torvalds seems somewhat relevant.

http://lkml.org/lkml/2002/12/18/218
From: "Andy "Krazy" Glew" on 18 Jan 2010 11:30

Anton Ertl wrote:
> "Andy \"Krazy\" Glew" <ag-news(a)patten-glew.net> writes:
>> I still think that both Intel and AMD missed a big opportunity, to make
>> system calls truly as fast as function calls. Chicken and egg.
>> Nobody wants to make the investment in hardware without a proven
>> software benefit, but existing software is optimized to avoid
>> expensive system call privilege level changes.
>
> But given that system calls have to do much more sanity checking on
> their arguments, and there is the common prelude that you mentioned
> (what is it for?), I don't see system calls ever becoming as fast as
> function calls, even with fast system call and system return
> instructions.
>
> - anton

There are ways to reduce the work involved in doing sanity checking on
arguments. E.g. a properly designed capability machine architecture can do
this. Or even just "perform access as if in user mode" in the VM. Rather
like Sun's load/store alternate address space, except not for I/O.

This may very well be one of the ways in which I fell short: I have an
agenda to make syscalls faster, of which fast system call instructions
such as SYSENTER/SYSEXIT and SYSCALL/SYSRET are just one step. It's not
clear how much value one step done in isolation has. And all-or-nothing
feature agendas tend not to happen, unless there is big demand.

--

But as for the value: I think that it is there. Many of the security holes
in modern systems arise because syscalls and cross-domain transfers are
too expensive: instead of putting things in different security domains,
processes or the like, we put them in the same security domain. And then
we get surprised when, e.g., a bug in a graphics device driver can allow
an OS-level break-in.
From: "Andy "Krazy" Glew" on 18 Jan 2010 11:35

mac wrote:
>> I have observed that this concern about interrupts that cannot be
>> blocked is a key source of complexity in system architecture. The RISC
>> approach may be to assume that all interrupts, even NMIs, can be
>> blocked, briefly, as the syscall code sets things up. But the advent of
>> things like virtual machines, SMIs, etc., means that you can't make this
>> assumption.
>
> Didn't Alpha PALcode have something like this? Special execution
> environment, no interrupts, privileged register access?
> I don't know much about it, but it looked like a clever hook for CISC
> operations.

Yes.

And, for that matter, microcode in machines like the x86 amounts to the
same thing.