From: kenney on 21 Jan 2010 07:23

In article
<0db80478-326d-4b55-b6bd-33d75a811166(a)36g2000yqu.googlegroups.com>,
robertwessel2(a)yahoo.com () wrote:

> running on separate cores can't tell that the order of time values
> stored is actually slightly out of sync across the machine or
> cluster.

However, nowadays there are external time sources that are accurate to
milliseconds and guaranteed to be unique. A trivial example of their use
is the self-adjusting radio clock. I doubt that implementing the use
of the time signal would be easier than anything suggested so far, but
each cluster could have its own time source with no synchronisation
problems.

Ken Young
From: Morten Reistad on 21 Jan 2010 11:38

In article <rfKdncKLWbTF2sXWnZ2dnUVZ7qydnZ2d(a)giganews.com>,
<kenney(a)cix.compulink.co.uk> wrote:
>In article
><0db80478-326d-4b55-b6bd-33d75a811166(a)36g2000yqu.googlegroups.com>,
>robertwessel2(a)yahoo.com () wrote:
>
>> running on separate cores can't tell that the order of time values
>> stored is actually slightly out of sync across the machine or
>> cluster.
>
>However, nowadays there are external time sources that are accurate to
>milliseconds and guaranteed to be unique. A trivial example of their use
>is the self-adjusting radio clock. I doubt that implementing the use
>of the time signal would be easier than anything suggested so far, but
>each cluster could have its own time source with no synchronisation
>problems.

And you can synchronise clocks on different processors by having simple
counters incremented by pulses on a wire from a coherent source. Just see
to it that the delay on the wire and electronics is stable, that the wires
are all the same length, and that a timebase for the counters can be
established. Then you tweak the speed of the source with NTP and
adjtime-like behaviour.

Such a counter should be able to run at about a quarter of the basic
switching speed, way faster than any instruction or memory access.
Reading it at L2 cache speeds should not be a problem, either.

This is how the telcos synchronised public clocks half a century ago.

-- mrr
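
The "timebase for the counters" step can be sketched as follows. This is a
minimal illustration, not anything from the post: it assumes every processor
can read the same free-running 64-bit pulse counter, and the names
`struct timebase` and `counter_to_ns` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Shared description of how counter values map to time (hypothetical). */
struct timebase {
    uint64_t count0;    /* counter value when the timebase was established */
    uint64_t epoch_ns;  /* time in nanoseconds that count0 corresponds to */
    uint64_t hz;        /* pulse rate of the common source, pulses per second */
};

/* Convert a raw counter reading to nanoseconds.
   Whole seconds and the remainder are converted separately so the
   multiplication stays within 64 bits for pulse rates below about 18 GHz. */
uint64_t counter_to_ns(const struct timebase *tb, uint64_t count)
{
    assert(tb->hz != 0);
    uint64_t ticks = count - tb->count0;   /* pulses since the timebase */
    uint64_t sec = ticks / tb->hz;         /* whole seconds elapsed */
    uint64_t rem = ticks % tb->hz;         /* leftover pulses, less than hz */
    return tb->epoch_ns + sec * 1000000000ull + rem * 1000000000ull / tb->hz;
}
```

Two processors that read the same count and share the same timebase compute
the same time without talking to each other. Rate corrections happen at the
pulse source, as the post describes, so this conversion never has to change.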
From: MitchAlsup on 21 Jan 2010 12:51

After reading this thread several times, it seems that the timer one is
looking for has several properties:

A: can be read at least a billion times per second, uniformly over a
   whole system of thousands of nodes
B: always returns a unique number, and this number is related to time
   in some way
C: this number is ultimately used to determine order (i.e.
   synchronization winners and losers)
D: uses all the fast access pathways in the system (i.e. cache hierarchy)
E: but never uses any of the slow parts of the system (i.e. cache
   coherence mechanism, OS calls)
F: leverages fast access techniques (user-mode instructions, TLB)
G: is safe, secure, fast, and <blah blah>

This reminds me of what the physicists were probably talking about just
after the turn of the previous century, between the discovery of the
photoelectric effect and the development of quantum mechanics.

Mitch
From: Terje Mathisen on 21 Jan 2010 13:48

kenney(a)cix.compulink.co.uk wrote:
> In article
> <0db80478-326d-4b55-b6bd-33d75a811166(a)36g2000yqu.googlegroups.com>,
> robertwessel2(a)yahoo.com () wrote:
>
>> running on separate cores can't tell that the order of time values
>> stored is actually slightly out of sync across the machine or
>> cluster.
>
> However, nowadays there are external time sources that are accurate to
> milliseconds and guaranteed to be unique. A trivial example of their use

The canonical "cheap but accurate" time source these days is a Garmin
GPS18LVC. Together with an RS232 DB9 connector and a USB cable, you have
all the hw needed for a ~1 us timing reference, at a total cost of around
$60-80, plus half an hour's work.

> is the self-adjusting radio clock. I doubt that implementing the use
> of the time signal would be easier than anything suggested so far, but
> each cluster could have its own time source with no synchronisation
> problems.

Right, see above. :-)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
From: Mayan Moudgill on 21 Jan 2010 13:49
Andy "Krazy" Glew wrote:
> Tim McCaffrey wrote:
>
>> In article <4B540900.4060107(a)patten-glew.net>, ag-news(a)patten-glew.net
>> says...
>>
>>> I wrote the following for my wiki,
>>> http://semipublic.comp-arch.net/wiki/SYSENTER/SYSEXIT_vs._SYSCALL/SYSRET
>>> and thought that USEnet comp.arch might be interested:

Sorry to jump in late.

One reason a process needs to cross protection/privilege domains is that
it needs to execute an instruction sequence that is completely safe, but
contains instructions that, in isolation, are unsafe, and are therefore
unavailable in the process's original domain.

A somewhat contrived example: assume that, in a multi-threaded processor,
there is a register that controls an executing thread's priority. Since we
don't want to allow threads to willy-nilly grab 100% of the CPU resources,
writes to that register are privileged. However, lowering your own thread
priority is a safe operation. So, the function

    void decrement_thread_priority(int n)
    {
        int x = read_thread_priority();
        if (n > x) {
            x = 0;
        } else {
            x -= n;
        }
        write_thread_priority(x);
    }

is a completely safe operation (assuming n is non-negative). Unfortunately,
because it contains a privileged operation, a process must somehow change
its privilege level before executing the operation.

Now, one way to do this is to associate privileges with code pages: if the
CPU is executing a privileged operation and the page has the
EXECUTE-PRIVILEGED-CODE bit set, then it's OK to execute the instruction.

This "solution" suffers from the hole that a malicious process could branch
to the middle of the protected code sequence. So, we have to guarantee that
protection transitions only occur at the start (and end) of safe code
fragments. Thus, we must have an operation that simultaneously changes
privilege level AND instruction pointer.

One possible solution is to have the privilege change operation always
branch to the same address, and pass it the "address" of the function to be
executed. This would be equivalent to saying:

    execute_at_elevated_priority(decrement_thread_priority, N);

    void execute_at_elevated_priority(void (*fn)(int), int arg)
    {
        if (safe_to_execute(fn)) {
            fn(arg);
        }
    }

The initial privilege escalation+branch can be done by SYSCALLs or a
software-generated interrupt. This scheme can be extended to have multiple
fixed entry points, by having a parameter to the SYSCALL equivalent or by
providing for multiple interrupts.

(Going back to the privileged-code-page approach) Alternatively, we could
guarantee that every instruction on a page was the start of a safe code
sequence. This could be done trivially by having each of the instructions
be a branch to the actual function. But then the function body itself would
still need to be guarded somehow. A possible solution would be to keep
instructions that are available only in privileged mode, but add a
page-protection mode such that, if an instruction from that page is
executed, the privilege level of the executing process is escalated. So,
the page will have an EXECUTE-AND-CHANGE-PRIVILEGE bit set; if any
instruction from that page is executed, the privilege of the process is
increased, and that page contains branches to the actual functions. Of
course, the hardware could simply treat such a page as a vector of
pointers, and the branch-and-change-privilege picks an instruction from the
page to branch to.

(Going back to the privileged-code-page approach) Another alternative is to
allow entry to pages with EXECUTE-PRIVILEGED-CODE only at the beginning of
the page. This has the drawback of requiring an entire page to be devoted
to what might be a small function. That is not a big deal on a desktop
processor, though there may be a performance penalty associated with the
additional TLB entries.
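
The fixed-entry-point idea can be sketched in ordinary user-space C. This is
only a simulation under stated assumptions: the `privileged` flag and
`thread_priority` variable stand in for hardware state, and `gate_call` and
`safe_table` are hypothetical names. Instead of checking a raw function
address with something like `safe_to_execute`, the sketch has the caller
name a slot in a table of vetted functions, which is closer to the
"vector of pointers" variant. It also rejects a negative `n`, which the
function in the post does not check.

```c
#include <assert.h>
#include <stddef.h>

/* Simulated machine state; a stand-in for real hardware (assumption). */
static int privileged = 0;        /* nonzero while at elevated privilege */
static int thread_priority = 5;   /* the privileged priority register */

int read_thread_priority(void) { return thread_priority; }

/* Privileged write: fails (returns -1) when attempted from user mode. */
int write_thread_priority(int x)
{
    if (!privileged) return -1;
    thread_priority = x;
    return 0;
}

/* The safe sequence from the post. A negative n is ignored because it
   would otherwise raise the priority (a guard added for this sketch). */
void decrement_thread_priority(int n)
{
    if (n < 0) return;
    int x = read_thread_priority();
    x = (n > x) ? 0 : x - n;
    write_thread_priority(x);
}

/* Table of vetted entry points; callers name a slot, never a raw address. */
typedef void (*safe_fn)(int);
static const safe_fn safe_table[] = { decrement_thread_priority };
#define SAFE_COUNT (sizeof safe_table / sizeof safe_table[0])

/* The single gate: raise privilege, run a vetted function from its first
   instruction, then drop privilege. Returns 0 on success, -1 if rejected. */
int gate_call(size_t slot, int arg)
{
    assert(!privileged);          /* the gate is only entered from user mode */
    if (slot >= SAFE_COUNT) return -1;
    privileged = 1;
    safe_table[slot](arg);
    privileged = 0;
    return 0;
}
```

A direct `write_thread_priority(9)` from user mode fails, while
`gate_call(0, 2)` lowers the priority by two. Because the caller can only
supply an index, there is no way to enter the middle of a protected
sequence.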