From: nmm1 on 20 Jan 2010 15:47

In article <0db80478-326d-4b55-b6bd-33d75a811166(a)36g2000yqu.googlegroups.com>,
robertwessel2(a)yahoo.com <robertwessel2(a)yahoo.com> wrote:
>>
>> The other is maintaining global uniqueness and monotonicity while
>> increasing the precision to nanoseconds and the number of cores
>> to thousands. All are needed, but it is probably infeasible to
>> deliver all of them, simultaneously :-(
>
>You only need to keep the clocks well enough synchronized that threads
>running on separate cores can't tell that the order of time values
>stored is actually slightly out of sync across the machine or
>cluster. Basically this is approximately the physical propagation
>delay between nodes, and synchronizing to less than that is relatively
>straight-forward.

Grrk. Not really. To get from one corner of a board to another and
back is (say) 5 nanoseconds, and that's just the speed of light. But
let's say that you can synchronise to 1 nanosecond. The killer is
that two of the most close-coupled cores can often communicate
faster than that, so you can end up with visible discrepancies.
To solve that, you either have to constrain how often each core
can get timestamps - or ensure that the closer each core is, the
better synchronised its clocks are.

I am still doubtful that you can deliver the global properties that
are wanted (essentially sequential consistency, to a higher precision
than any other communication mechanism). Perhaps it can be done, but
I can't see how, and every real product I have seen has added some
constraints.

>Then making sure the values are unique just requires an extension at
>the low end of the time value, and a fixed value per-core to be stored
>there. So effectively core number 13 always stores time values of the
>form "nnnnnnnn.nnnnnnnnn013" and two actually simultaneous stores have
>an artificial difference inserted at the low end. And so long as the
>prior condition (about time/event visibility) is met, you're covered
>here too.

Yes, you are right. It's too long since I worked in this area, and
was forgetting!

Regards,
Nick Maclaren.
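
A minimal sketch of the per-core suffix idea quoted above, assuming a nanosecond time value and a three-digit decimal core number appended below it. The names `tag_timestamp`, `format_tagged` and `CORE_ID_DIGITS` are hypothetical and only illustrate the layout "nnnnnnnn.nnnnnnnnn013"; they are not taken from any real system.

```python
CORE_ID_DIGITS = 3  # assumption: three decimal digits, so up to 1000 cores


def tag_timestamp(time_ns: int, core_id: int) -> int:
    """Append a fixed per-core suffix below the nanosecond digit.

    Two cores that read the same nanosecond still produce different
    values, and values from different nanoseconds keep their time order.
    """
    scale = 10 ** CORE_ID_DIGITS
    if not 0 <= core_id < scale:
        raise ValueError("core_id does not fit in the suffix field")
    return time_ns * scale + core_id


def format_tagged(tagged: int) -> str:
    """Render a tagged value as seconds.nanoseconds followed by the core suffix."""
    scale = 10 ** CORE_ID_DIGITS
    time_ns, core_id = divmod(tagged, scale)
    seconds, nanos = divmod(time_ns, 1_000_000_000)
    return f"{seconds}.{nanos:09d}{core_id:0{CORE_ID_DIGITS}d}"


# Core 13 and core 14 read the clock in the same nanosecond.
a = tag_timestamp(12_000_000_345, 13)
b = tag_timestamp(12_000_000_345, 14)
print(format_tagged(a), format_tagged(b))
```

The suffix only separates values from different cores. A single core that reads the same nanosecond twice would still need its own tie-breaker, and the ordering across cores is only meaningful if the visibility condition discussed in the post holds.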
From: Terje Mathisen on 20 Jan 2010 17:18

nmm1(a)cam.ac.uk wrote:
> In article<b6gj27-5bn.ln1(a)ntp.tmsw.no>,
> Terje Mathisen<"terje.mathisen at tmsw.no"> wrote:
>> You and I have both written NTP-type code, so as I wrote in another
>> message: Separate motherboards should use NTP to stay in sync, with or
>> without hw assists like ethernet timing hw and/or a global PPS source.
>
> Yes, but I was thinking of a motherboard with a thousand cores on it.
> While it could use NTP-like protocols between cores, and for each
> core to maintain its own clock, that's a fairly crazy approach.
>
> All right, realistically, it would be 64 groups of 16 cores, or
> whatever, but the point stands. Having to use TWO separate
> protocols on a single board isn't nice.

I agree.

Anything located on a single board should be able to share a common
timing reference, i.e. core crystal.

That only leaves the OS with the task of syncing up the base counter
values during startup.

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
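
A rough sketch of the startup step described above, assuming every core has a free-running counter driven by the same crystal, so the counters differ only by a constant. The function names are hypothetical, and the sketch assumes the boot code can sample all counters at the same instant, which real hardware would have to approximate.

```python
def boot_offsets(raw_counters: dict[int, int], reference_core: int) -> dict[int, int]:
    """Compute, once at startup, the constant to add to each core's raw counter.

    raw_counters maps core number -> counter value sampled at the same instant.
    """
    reference = raw_counters[reference_core]
    return {core: reference - raw for core, raw in raw_counters.items()}


def read_time(raw_counter: int, offset: int) -> int:
    """A core's view of the common time base: raw counter plus its boot offset."""
    return raw_counter + offset


# Counters sampled at boot; core 0 is chosen as the reference.
at_boot = {0: 1000, 1: 400, 2: 1750}
offsets = boot_offsets(at_boot, reference_core=0)

# 250 ticks of the shared crystal later, every core reports the same time.
later = {core: raw + 250 for core, raw in at_boot.items()}
times = {core: read_time(raw, offsets[core]) for core, raw in later.items()}
print(times)
```

Because all counters advance from one crystal, the offsets never need recomputing after startup in this model. That is the property that makes a second, NTP-like protocol unnecessary on a single board.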
From: Terje Mathisen on 20 Jan 2010 17:20

nmm1(a)cam.ac.uk wrote:
> In article<1531844.zBA62FjkXi(a)elfi.zetex.de>,
> Bernd Paysan<bernd.paysan(a)gmx.de> wrote:
>> It's not so bad as you think. As long as your uncertainty of time is
>> smaller than the communication delay between the nodes, you are fine, i.e.
>> your values are unique - you only have to make sure that the adjustments
>> propagate through the shortest path.
>
> Er, no. How do you stop two threads delivering the same timestamp
> if they execute a 'call' at the same time without having a single
> time server? Ensuring global uniqueness is the problem.

No!

Global uniqueness is a separate, but also quite important problem.

It is NOT fair to saddle every single timestamp call with the overhead
required for a globally unique value!

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
From: robertwessel2 on 20 Jan 2010 17:39

On Jan 20, 2:47 pm, n...(a)cam.ac.uk wrote:
> In article <0db80478-326d-4b55-b6bd-33d75a811...(a)36g2000yqu.googlegroups.com>,
>
> robertwess...(a)yahoo.com <robertwess...(a)yahoo.com> wrote:
>
> >> The other is maintaining global uniqueness and monotonicity while
> >> increasing the precision to nanoseconds and the number of cores
> >> to thousands. All are needed, but it is probably infeasible to
> >> deliver all of them, simultaneously :-(
>
> >You only need to keep the clocks well enough synchronized that threads
> >running on separate cores can't tell that the order of time values
> >stored is actually slightly out of sync across the machine or
> >cluster. Basically this is approximately the physical propagation
> >delay between nodes, and synchronizing to less than that is relatively
> >straight-forward.
>
> Grrk. Not really. To get from one corner of a board to another and
> back is (say) 5 nanoseconds, and that's just the speed of light. But
> let's say that you can synchronise to 1 nanosecond. The killer is
> that two of the most close-coupled cores can often communicate
> faster than that, so you can end up with visible discrepancies.
> To solve that, you either have to constrain how often each core
> can get timestamps - or ensure that the closer each core is, the
> better synchronised its clocks are.

I should have been clearer, but that's exactly right: the degree of
synchronization required varies based on the distance between nodes,
but it has to be such that no given pair of nodes can see the slop.
In fact, zSeries clusters do just that - the degree of "real"
synchronization within a single machine is substantially higher than
between machines in a cluster.

I don't know if there is a TOD clock synchronization hierarchy
internal to a single zSeries machine, but it's possible - zSeries
machines are built out of 1-4 "books," each of which contains five
quad core chips, which provides a natural hierarchy. On the flip
side, these are approximately 1000 MIPS cores, and managing even 1 ns
synchronization across a meter or two of distance isn't that hard.
But such a thing would clearly be possible - cores in separate books
are clearly at least several tens of ns apart while two cores on a
chip are much closer. But such a hierarchy of synchronization levels
would almost have to naturally match the hardware architecture of a
big system.

> I am still doubtful that you can deliver the global properties that
> are wanted (essentially sequential consistency, to a higher precision
> than any other communication mechanism). Perhaps it can be done, but
> I can't see how, and every real product I have seen has added some
> constraints.

I think the cluster wide TOD clock on zSeries clusters (Sysplex) comes
pretty close, at least.
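
The pairwise condition in this post ("no given pair of nodes can see the slop") can be sketched as a simple check: a pair is safe when its clock skew is smaller than the fastest message delay between the two nodes. The node names, offsets and latencies below are made-up illustrative numbers, not zSeries measurements, and `visible_pairs` is a hypothetical helper.

```python
from itertools import combinations


def visible_pairs(offset_ns: dict[str, float],
                  latency_ns: dict[frozenset, float]) -> list[tuple[str, str]]:
    """Return the pairs that could observe out-of-order timestamps.

    offset_ns:  each node's clock error relative to true time, in ns.
    latency_ns: fastest one-way message delay for each unordered pair, in ns.
    """
    bad = []
    for a, b in combinations(sorted(offset_ns), 2):
        skew = abs(offset_ns[a] - offset_ns[b])
        if skew >= latency_ns[frozenset((a, b))]:
            bad.append((a, b))
    return bad


# Assumed hierarchy: two cores on one chip are ~1 ns apart,
# a core in another book is ~30 ns away from both.
latency = {
    frozenset(("chip0.core0", "chip0.core1")): 1.0,
    frozenset(("chip0.core0", "book1.core0")): 30.0,
    frozenset(("chip0.core1", "book1.core0")): 30.0,
}

# Tight sync on-chip, loose sync across books: nobody can see the slop.
hierarchical = {"chip0.core0": 0.0, "chip0.core1": 0.3, "book1.core0": 8.0}
print(visible_pairs(hierarchical, latency))

# The same on-chip pair with 1.5 ns of skew can see it.
sloppy = {"chip0.core0": 0.0, "chip0.core1": 1.5, "book1.core0": 8.0}
print(visible_pairs(sloppy, latency))
```

The first case passes even though the cross-book skew is 8 ns, far worse than the on-chip skew, which is the point of matching the synchronization hierarchy to the hardware hierarchy.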
From: Stephen Fuld on 20 Jan 2010 18:31

On 1/20/2010 2:20 PM, Terje Mathisen wrote:
> nmm1(a)cam.ac.uk wrote:
>> In article<1531844.zBA62FjkXi(a)elfi.zetex.de>,
>> Bernd Paysan<bernd.paysan(a)gmx.de> wrote:
>>> It's not so bad as you think. As long as your uncertainty of time is
>>> smaller than the communication delay between the nodes, you are fine,
>>> i.e. your values are unique - you only have to make sure that the
>>> adjustments propagate through the shortest path.
>>
>> Er, no. How do you stop two threads delivering the same timestamp
>> if they execute a 'call' at the same time without having a single
>> time server? Ensuring global uniqueness is the problem.
>
> No!
>
> Global uniqueness is a separate, but also quite important problem.
>
> It is NOT fair to saddle every single timestamp call with the overhead
> required for a globally unique value!

There is a simple solution to this problem. Assume that the time stamp
is updated every microsecond, and that it is a hardware register within
the chip. Further assume that the timer field has enough bits to allow
for, say, nanoseconds, but these bits are not guaranteed to be accurate.
Then the hardware can use those bits as a "request counter". That is,
the value is incremented once every request and reset to zero every time
the clock increments the least significant bit (i.e. microseconds in our
example). This guarantees uniqueness with a trivial amount of hardware
and no additional overhead.

Of course, you have to pick the sizes to allow for future
implementations, etc., but this isn't hard.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)
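
A small software model of the "request counter" register described above, assuming a microsecond tick and a low-order field wide enough for 1000 requests per tick. The class name, the field width and the overflow behaviour are assumptions made for illustration; the post only says the sizes must be chosen with room to spare.

```python
class RequestCountedClock:
    """Models one shared hardware timestamp register.

    High part: microseconds, advanced by the clock.
    Low part:  number of reads since the last tick, reset on every tick.
    """

    SUBTICKS = 1000  # assumed capacity of the low-order field

    def __init__(self) -> None:
        self.micros = 0
        self.requests = 0

    def tick(self) -> None:
        """The clock increments the microsecond field and clears the counter."""
        self.micros += 1
        self.requests = 0

    def read(self) -> int:
        """Return a unique value and count the request."""
        if self.requests >= self.SUBTICKS:
            # Assumption: overflow is treated as an error in this sketch.
            raise OverflowError("more requests in one tick than the field can hold")
        value = self.micros * self.SUBTICKS + self.requests
        self.requests += 1
        return value


clock = RequestCountedClock()
first = clock.read()    # two reads within the same microsecond
second = clock.read()
clock.tick()            # the microsecond field advances
third = clock.read()
print(first, second, third)
```

Values read from the register are unique and strictly increasing as long as the low field never overflows within one tick. Since the model is a single register, it covers uniqueness within one chip, which matches the scope stated in the post.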