From: Alexei A. Frounze on 18 May 2010 03:02

On May 17, 4:05 pm, "Skybuck Flying" <IntoTheFut...(a)hotmail.com> wrote:
> "MitchAlsup" <MitchAl...(a)aol.com> wrote in message
> news:eb7ac133-0067-4c2c-bdda-338cc9c66b45(a)d12g2000vbr.googlegroups.com...
> On May 17, 11:48 am, "Skybuck Flying" <IntoTheFut...(a)hotmail.com>
> wrote:
>
> > Why is there no LOCK prefix for BT instruction ???
>
> "
> What functionality do you think would a LOCK prefix add to BT anyways?
> "
>
> Suppose processor 1 has bit 5 in cache. (Bit 5 could be 0)
>
> Suppose processor 2 has bit 5 in cache. (Bit 5 could be 1)

According to the MESI protocol employed on x86, this is only possible when P1's cache line is invalid and P2's cache line is invalid/exclusive/shared/modified (or vice versa). Invalid cache lines must first transition to the shared or exclusive state before being used (they must contain valid data from the physical memory).

> Processor 1 had bit 5 first.
>
> Processor 2 had bit 5 second.
>
> Maybe the system knows that processor 1 must update the memory from it's
> cache first.
>
> Then processor 2 must update the memory from it's cache next.

When a cache line transitions to the modified state on one of the CPUs (and this is the only way the physical memory can change under normal conditions), this same line gets invalidated on all the other CPUs. All other CPUs will first have to reload this line before using it. The hardware synchronization protocols take care of this. So, no problem here, not at the cache level.

> I was thinking maybe a LOCK will "flush" the processor's caches towards
> memory ?

LOCK only tells all other CPUs not to touch the involved cache line(s) while the read-modify-write instruction is using them. It doesn't flush. And again, this is done in hardware.

> After which the BT instruction can execute on the memory to get a
> consistent/updated view of the bit ?

There's no consistency problem, at least not at the cache level.

The caches are always consistent in the sense that the same cache line can't have different values on different CPUs in non-invalid states. If it's shared/exclusive/modified, it's valid. Invalid otherwise. If it's shared/exclusive, its value is the same as in the physical memory. If it's modified, its value isn't guaranteed to be the same as in the physical memory because an update to the memory hasn't occurred yet.

The problem is in a different place. Once a CPU reads from a cache line into a register, this register's value is likely to soon become inconsistent with the memory. A register isn't a cache line. There's also no guaranteed order in which CPUs will execute their instructions with respect to one another (that is, P1 first or P2 first), or which CPU will first copy from a register to the cache to the memory. There are other quirks too.

But for BT, there's no problem. It's just that one CPU may be unlucky enough to read from the memory to the cache to the register shortly before another CPU writes to that same location, after which the first CPU will likely have something different in its register from what has already propagated to the memory from the second CPU.

If BT sees a bit set, then this bit was set at some recent point in the past. If it sees it reset, it was reset at some point in the past. It won't lie. But it can't and won't guarantee that the value it has read is still the same value in the memory.

A nice article on this is here:
http://cobweb.ecn.purdue.edu/~eigenman/ECE563/Handouts/x86_memory.pdf

Alex
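The distinction drawn in the post above, between a plain bit test (a single read, so there is nothing for LOCK to protect) and a read-modify-write (where atomicity matters), can be sketched in C11. This is only an illustration: the names test_bit, test_and_set_bit and demo are made up for this example, and whether a compiler lowers atomic_fetch_or to a LOCK-prefixed x86 instruction is an assumption about typical code generation, not something the thread states.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Plain test, in the spirit of BT: one read of the word, then a mask.
 * The result says what the bit was at the moment of the read; it says
 * nothing about what the bit is once the function returns. */
static bool test_bit(_Atomic uint32_t *word, unsigned bit)
{
    return ((atomic_load(word) >> bit) & 1u) != 0;
}

/* Atomic test-and-set, in the spirit of LOCK BTS: the read, the OR and
 * the write back happen as one indivisible step. Returns the old bit. */
static bool test_and_set_bit(_Atomic uint32_t *word, unsigned bit)
{
    uint32_t mask = UINT32_C(1) << bit;
    return (atomic_fetch_or(word, mask) & mask) != 0;
}

/* Single-threaded walk-through of the two helpers (hypothetical). */
static void demo(void)
{
    _Atomic uint32_t flags = 0;

    assert(!test_bit(&flags, 5));         /* bit 5 starts clear      */
    assert(!test_and_set_bit(&flags, 5)); /* old value 0, now set    */
    assert(test_bit(&flags, 5));          /* a later read sees the 1 */
    assert(test_and_set_bit(&flags, 5));  /* old value 1, still set  */
}
```

With several threads, two concurrent callers of test_and_set_bit on the same bit cannot both get false back, whereas two callers of test_bit can both see the same stale value. That is the only sense in which the read-only test is "inconsistent".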
From: MitchAlsup on 18 May 2010 12:12

On May 17, 6:05 pm, "Skybuck Flying" <IntoTheFut...(a)hotmail.com> wrote:
> "MitchAlsup" <MitchAl...(a)aol.com> wrote in message
> What functionality do you think would a LOCK prefix add to BT anyways?
> "
>
> Suppose processor 1 has bit 5 in cache. (Bit 5 could be 0)
>
> Suppose processor 2 has bit 5 in cache. (Bit 5 could be 1)

The only scenario where this can happen is when one uses diagnostic access to the cache. These accesses are Supervisor only, and incoherent, so if this happens it's your fault, and you went way out of your way to inflict this upon yourself.

Under normal cache coherence (both MESI and MOESI) this cannot happen.

> I was thinking maybe a LOCK will "flush" the processor's caches towards
> memory ?

That is what the WBINVD instruction is for.

Mitch
From: Piotr Wyderski on 19 May 2010 05:28

MitchAlsup wrote:
> The only scenario where this can happen is when one uses diagnostic
> access to the cache. These accesses are Supervisor only, and
> incoherent

How does one perform that? Through JTAG or from kernel code?

Best regards
Piotr Wyderski
From: MitchAlsup on 19 May 2010 12:02

On May 19, 4:28 am, "Piotr Wyderski" <piotr.wyder...(a)mothers.against.spam.gmail.com> wrote:
> MitchAlsup wrote:
> > The only scenario where this can happen is when one uses diagnostic
> > access to the cache. These accesses are Supervisor only, and
> > incoherent
>
> How does one perform that? Through JTAG or from kernel code?

The functionality exists, but I'm not sure that the details of how to do it have ever left the confines of AMD.

Mitch
From: wolfgang kern on 19 May 2010 14:24

MitchAlsup said:
>>> The only scenario where this can happen is when one uses diagnostic
>>> access to the cache. These accesses are Supervisor only, and
>>> incoherent

>> How does one perform that? Through JTAG or from kernel code?

> The functionality exists, I'm not sure that how to do it has ever left
> the confines of AMD.

Sorry Mitch, any attempt to use LOCK with an instruction to which it doesn't apply will invoke EXC06 on AMDs and, AFAIK, also on Intel CPUs. The old default of ignoring/skipping such a LOCK prefix has become a museum piece by now.
__
wolfgang
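As an illustration of the rule in the post above, here is a small C sketch of a check for whether LOCK may be applied to an instruction. The mnemonic list is the one given in the LOCK prefix description of the Intel manuals as far as I know it; the function name lock_prefix_allowed, the dest_is_memory flag and the uppercase-only matching are assumptions made for this example and do not come from any real assembler or decoder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Read-modify-write instructions that accept a LOCK prefix
 * (assumed list, taken from the vendor documentation). */
static const char *const LOCKABLE[] = {
    "ADC", "ADD", "AND", "BTC", "BTR", "BTS", "CMPXCHG", "CMPXCHG8B",
    "CMPXCHG16B", "DEC", "INC", "NEG", "NOT", "OR", "SBB", "SUB",
    "XADD", "XCHG", "XOR",
};

/* Returns true if LOCK is legal for this (uppercase) mnemonic.
 * LOCK also requires the destination operand to be in memory;
 * anything else is expected to raise the invalid-opcode exception. */
static bool lock_prefix_allowed(const char *mnemonic, bool dest_is_memory)
{
    if (!dest_is_memory)
        return false;
    for (size_t i = 0; i < sizeof LOCKABLE / sizeof LOCKABLE[0]; i++) {
        if (strcmp(mnemonic, LOCKABLE[i]) == 0)
            return true;
    }
    return false;
}

/* Hypothetical self-check tying the table back to the thread. */
static void self_check(void)
{
    assert(!lock_prefix_allowed("BT", true));   /* read-only: no LOCK   */
    assert(lock_prefix_allowed("BTS", true));   /* RMW on memory: fine  */
    assert(!lock_prefix_allowed("BTS", false)); /* register destination */
}
```

BT is absent from the table because it only reads its operand, which matches the earlier answers in this thread: there is no write for LOCK to make atomic.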