From: Joe Seigh on 6 Sep 2005 09:49

David Hopwood wrote:
> Joe Seigh wrote:
>
>> David Hopwood wrote:
>>>
>>> But OSes, thread libraries and language implementations *aren't*
>>> portable code.
>>
>> I do not think that word means what you think it means.
>>
>> Note that I am an ex-kernel developer and have created enough
>> synchronization api's that run on totally different platforms.
>
> You are totally missing the point. OSes, thread libraries and language
> implementations have some code that needs to be adapted to each hardware
> architecture. If the memory model were to change in future processors
> that are otherwise x86-like, this code would have to change. It's not a
> big deal, because this platform-specific code is maintained by people who
> know how to change it, and because there are few enough OSes, thread
> libraries, and language implementations for the total effort involved
> not to be very great. It would, however, be a big deal if existing x86
> *applications* stopped working on an otherwise x86-compatible processor.
> I am talking about that.

You insist on maintaining that I advocate applications hardcode
platform-specific assembly code into their source. I never have
advocated that. But when you design these api's you have to have a
pretty good idea what kinds of things can be ported and what assumptions
you are making about the memory model. Since I've actually done this
kind of stuff I probably have a much better idea than you have what the
actual issues are.

And yes, there isn't any assumption about the memory model that can't be
broken by a hardware designer. The only thing that keeps hardware
companies from breaking widely used api's like Posix pthreads is they
might go out of business if they did. Hence, shorting Intel stock might
be a good idea if you believe they did do that. But saying that we
should only use widespread api's and not ever create any new ones is
ridiculous.

--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.

From: Eric P. on 6 Sep 2005 10:26

Alexander Terekhov wrote:
>
> My reading of the specs is that MFENCE is guaranteed to provide
> store-load barrier.
>
> P1: X = 1; R1 = Y;
> P2: Y = 1; R2 = X;
>
> (R1, R2) = (0, 0) is allowed under pure PC, but
>
> P1: X = 1; MFENCE; R1 = Y;
> P2: Y = 1; MFENCE; R2 = X;
>
> (R1, R2) = (0, 0) is NOT allowed.

Are you sure you are not being inconsistent in example 2 here?
(wrt what you answered yesterday about S/LFENCE).

If MFENCE is just an SFENCE+LFENCE, and neither of those guarantee
delivery or receipt of invalidates, then P1 can have a stale Y
and P2 a stale X. The MFENCE does nothing but prevent bypassing.

Eric

From: Eric P. on 6 Sep 2005 10:58

"Eric P." wrote:
>
> Alexander Terekhov wrote:
> >
> > My reading of the specs is that MFENCE is guaranteed to provide
> > store-load barrier.
> >
> > P1: X = 1; R1 = Y;
> > P2: Y = 1; R2 = X;
> >
> > (R1, R2) = (0, 0) is allowed under pure PC, but
> >
> > P1: X = 1; MFENCE; R1 = Y;
> > P2: Y = 1; MFENCE; R2 = X;
> >
> > (R1, R2) = (0, 0) is NOT allowed.
>
> Are you sure you are not being inconsistent in example 2 here?
> (wrt what you answered yesterday about S/LFENCE).
>
> If MFENCE is just an SFENCE+LFENCE, and neither of those guarantee
> delivery or receipt of invalidates, then P1 can have a stale Y
> and P2 a stale X. The MFENCE does nothing but prevent bypassing.
>
> Eric

Forget it, I see. With two processors Y can be stale on P1,
or X stale on P2, but not both.

Eric

From: Alexander Terekhov on 6 Sep 2005 11:29

"Eric P." wrote:
>
> Alexander Terekhov wrote:
> >
> > My reading of the specs is that MFENCE is guaranteed to provide
> > store-load barrier.
> >
> > P1: X = 1; R1 = Y;
> > P2: Y = 1; R2 = X;
> >
> > (R1, R2) = (0, 0) is allowed under pure PC, but
> >
> > P1: X = 1; MFENCE; R1 = Y;
> > P2: Y = 1; MFENCE; R2 = X;
> >
> > (R1, R2) = (0, 0) is NOT allowed.
>
> Are you sure you are not being inconsistent in example 2 here?
> (wrt what you answered yesterday about S/LFENCE).

PC implies both LFENCE and SFENCE ordering constraints. I don't
think that you've got invalidations stuff entirely accurate, but
the basic logic is correct.

>
> If MFENCE is just an SFENCE+LFENCE,

No. SFENCE is store-store barrier and LFENCE is load-load barrier.

store-store + load-load != store-load.

MFENCE ensures that preceding writes are made globally visible
before subsequent reads are performed (store-load barrier)... plus
it imposes all other PC ordering constraints (load-load + load-store
+ store-store).

regards,
alexander.

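[Editor's note] The store-load argument in the posts above can be checked on a toy machine. The Python sketch below is an illustration under stated assumptions, not the Intel specification: each processor gets a FIFO store buffer that drains to shared memory at arbitrary times, a load reads the newest matching entry in its own buffer before falling back to memory, and MFENCE is modelled as "cannot proceed until my store buffer is empty". The function name `outcomes` and the tuple encoding of instructions are hypothetical, invented for this sketch. It enumerates every reachable interleaving of the two-processor example and collects the final (R1, R2) pairs.

```python
def outcomes(programs):
    """Return all final register tuples reachable on a toy store-buffer machine.

    Assumed model (not a vendor spec): one FIFO store buffer per processor,
    stores drain to memory at any time, loads forward from the local buffer,
    and "mfence" blocks until the local buffer is empty.
    """
    n = len(programs)
    # State: program counters, per-processor store buffers, memory, registers.
    start = ((0,) * n, ((),) * n, frozenset(), frozenset())
    seen, stack, finals = {start}, [start], set()

    def push(state):
        if state not in seen:
            seen.add(state)
            stack.append(state)

    while stack:
        pcs, bufs, mem, regs = stack.pop()
        memory = dict(mem)
        if all(pcs[i] == len(programs[i]) for i in range(n)):
            # Registers are final; order the values by register name.
            finals.add(tuple(val for _, val in sorted(regs)))
            continue
        for i in range(n):
            # Choice 1: drain the oldest buffered store of processor i.
            if bufs[i]:
                var, val = bufs[i][0]
                new_mem = dict(memory)
                new_mem[var] = val
                new_bufs = bufs[:i] + (bufs[i][1:],) + bufs[i + 1:]
                push((pcs, new_bufs, frozenset(new_mem.items()), regs))
            # Choice 2: execute the next instruction of processor i.
            if pcs[i] == len(programs[i]):
                continue
            op = programs[i][pcs[i]]
            new_pcs = pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:]
            if op[0] == "store":
                _, var, val = op
                new_bufs = bufs[:i] + (bufs[i] + ((var, val),),) + bufs[i + 1:]
                push((new_pcs, new_bufs, mem, regs))
            elif op[0] == "load":
                _, var, reg = op
                pending = [v for (x, v) in bufs[i] if x == var]
                val = pending[-1] if pending else memory.get(var, 0)
                push((new_pcs, bufs, mem, regs | {(reg, val)}))
            elif op[0] == "mfence" and not bufs[i]:
                push((new_pcs, bufs, mem, regs))
    return finals


# P1: X = 1; R1 = Y;          P2: Y = 1; R2 = X;
plain = [[("store", "X", 1), ("load", "Y", "R1")],
         [("store", "Y", 1), ("load", "X", "R2")]]

# P1: X = 1; MFENCE; R1 = Y;  P2: Y = 1; MFENCE; R2 = X;
fenced = [[("store", "X", 1), ("mfence",), ("load", "Y", "R1")],
          [("store", "Y", 1), ("mfence",), ("load", "X", "R2")]]

print("without MFENCE:", sorted(outcomes(plain)))
print("with MFENCE:   ", sorted(outcomes(fenced)))
```

In this toy model the pair (0, 0) is reachable only in the unfenced program, because both stores can still be sitting in their buffers when both loads execute. With the fence, each load waits for its own processor's store to reach memory, so at most one of the two loads can see a stale value, which matches Eric's "but not both" observation. Whether real hardware behaves like this model is exactly what the thread is arguing about.
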
From: Alexander Terekhov on 14 Sep 2005 04:07

Hey Mr. andy.glew(a)intel.com, you better fix the specs, really.
It's not funny anymore.

http://msdn.microsoft.com/msdnmag/issues/05/10/MemoryModels/default.aspx

"When multiprocessor systems based on the x86 architecture were being
designed, the designers needed a memory model that would make most
programs just work, while still allowing the hardware to be reasonably
efficient. The resulting specification requires writes from a single
processor to remain in order with respect to other writes, but does
not constrain reads at all.

Unfortunately, a guarantee about write order means nothing if reads
are unconstrained. After all, it does not matter that A is written
before B if every reader reading B followed by A has reads reordered
so that the pre-update value of B and the post-update value of A is
seen. The end result is the same: write order seems reversed. Thus,
as specified, the x86 model does not provide any stronger guarantees
than the ECMA model.

It is my belief, however, that the x86 processor actually implements
a slightly different memory model than is documented. While this
model has never failed to correctly predict behavior in my
experiments, and it is consistent with what is publicly known about
how the hardware works, it is not in the official specification. New
processors might break it."

regards,
alexander.
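
[Editor's note] The quoted claim that ordered writes are of little use without ordered reads can be illustrated with a small enumeration. The Python sketch below is a simplified model under loud assumptions: memory is a single shared dictionary, the writer always performs A = 1 and then B = 1 in that order, and "read reordering" is modelled simply by letting the reader perform its two loads in the opposite order from the program text. The function name `observations` is hypothetical. It lists every (B, A) pair a reader can observe across all interleavings.

```python
from itertools import combinations


def observations(reader_order):
    """Return every (b, a) pair a reader can observe.

    The writer performs A = 1 and then B = 1, always in that order.
    reader_order is the order in which the reader's two loads are actually
    performed, which in this toy model may differ from program order.
    """
    writer = [("A", 1), ("B", 1)]
    total = len(writer) + len(reader_order)
    seen = set()
    # Pick which steps of the global interleaving belong to the writer.
    for writer_slots in combinations(range(total), len(writer)):
        memory = {"A": 0, "B": 0}
        loaded = {}
        writes = iter(writer)
        reads = iter(reader_order)
        for slot in range(total):
            if slot in writer_slots:
                var, val = next(writes)
                memory[var] = val
            else:
                var = next(reads)
                loaded[var] = memory[var]
        seen.add((loaded["B"], loaded["A"]))
    return seen


# Program text is "read B, then read A" in both cases.
in_order = observations(["B", "A"])    # loads performed as written
reordered = observations(["A", "B"])   # assumed hardware performs the load of A first

print("loads in order: ", sorted(in_order))
print("loads reordered:", sorted(reordered))
```

With the loads performed in program order, a reader that sees the new B can never see the old A, since A was written first. Once the loads may be swapped, the pair (B = 1, A = 0) becomes reachable, which is the situation in which the write order appears reversed to the reader even though the writer never reordered anything.
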