From: Paul A. Clayton on 12 Aug 2010 20:50

On Aug 12, 7:55 am, "nedbrek" <nedb...(a)yahoo.com> wrote:
[snip]
> Right, the mystery is resolved using physical register numbers. The renamer
> provides the number for each source. You would like there to be one source
> at this point, although you could make the bypass logic execute the cmov -
> this would require the renamer to produce 3 numbers (remember the flags have
> a producer!).

For a simple single small FLAGS register, the source operation could
use its operation number (ROB number) plus one and the consuming
operation its operation number. The trick is then to elide any
intermediate names (for x86, intermediate names would probably be
rare since FLAGS consumers usually immediately follow the source
operation--correct?).

Because FLAGS is small, replication is relatively inexpensive;
because it is singular, special handling might be simpler and/or
more cost-effective--such features should be exploitable.
(For non-selected consuming operations, writing the FLAGS value
into the operation might make sense.)

One might choose to handle the nearby consumers differently,
inserting FLAGS 'reassert' operations to wake up later consumers.

(The above is severely lacking in several areas, but I think it
contains a rough start for the creation of an idea.)

Paul A. Clayton
just a technophile
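One possible reading of the "ROB number plus one" naming can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the post: ROB slots are numbered in program order without wrap-around, every FLAGS writer is treated as writing the whole register, and the names `assign_flags_tags` and `needs_elision` are hypothetical. A producer in slot p broadcasts tag p+1; a consumer in slot c expects tag c by default, so a consumer directly behind its producer matches with no lookup, and only the other consumers need their name redirected.

```python
def assign_flags_tags(ops):
    """ops: list of (name, writes_flags, reads_flags) in ROB order.

    Returns one (name, produced_tag, consumed_tag) tuple per op.
    """
    result = []
    live_tag = None  # tag of the youngest FLAGS producer seen so far
    for rob, (name, writes, reads) in enumerate(ops):
        consumed = live_tag if reads else None   # skip over non-writers
        produced = rob + 1 if writes else None   # producer name: ROB number + 1
        result.append((name, produced, consumed))
        if writes:
            live_tag = produced
    return result


def needs_elision(rob, consumed_tag):
    """True when the consumer's default name (its own ROB number)
    differs from the tag its producer will broadcast."""
    return consumed_tag is not None and consumed_tag != rob


# Hypothetical instruction sequence, chosen only for illustration.
ops = [
    ("cmp",   True,  False),
    ("jnz",   False, True),   # directly behind its producer
    ("add",   True,  False),
    ("mov",   False, False),  # intermediate op that does not touch FLAGS
    ("cmovz", False, True),   # one slot away from its producer
]
tags = assign_flags_tags(ops)
for rob, (name, produced, consumed) in enumerate(tags):
    print(rob, name, produced, consumed, needs_elision(rob, consumed))
```

In this sequence the `jnz` matches its producer by position alone, while the `cmovz` is the case where the intermediate name would have to be elided.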
From: Terje Mathisen on 13 Aug 2010 01:47

Paul A. Clayton wrote:
> On Aug 12, 7:55 am, "nedbrek"<nedb...(a)yahoo.com> wrote:
> [snip]
>> Right, the mystery is resolved using physical register numbers. The renamer
>> provides the number for each source. You would like there to be one source
>> at this point, although you could make the bypass logic execute the cmov -
>> this would require the renamer to produce 3 numbers (remember the flags have
>> a producer!).
>
> For a simple single small FLAGS register, the source operation could
> use its operation number (ROB number) plus one and the consuming
> operation its operation number. The trick is then to elide any
> intermediate names (for x86, intermediate names would probably be
> rare since FLAGS consumers usually immediately follow the source

In compiled code, (intentional) FLAGS consumers will almost always
follow within the next 2-4 instructions.

In hand-optimized code, and even 16-bit operating systems, you can
even see FLAGS values that are used as part of the return value, i.e.
they have to survive for quite a while.

The TCP/IP copy/checksum code is an example:

next:
    ADC EAX,EDX          ;; Uses previous carry, sets new
    MOV [EDI+ESI],EDX
first_iteration:
    MOV EDX,[ESI]
    LEA ESI,[ESI+4]
    DEC ECX              ;; Updates all flags EXCEPT carry
    JNZ next

(The core loop is usually unrolled of course.)

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
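The flag flow in that loop can be modelled in Python to show why the counter update matters. This is a rough sketch, not the original routine: the add is moved inside a plain loop rather than software-pipelined, the store to the destination is omitted, and the entry state (sum and carry cleared) and the final `ADC EAX,0` are assumptions. The `counter_op` parameter is hypothetical and exists only to contrast `DEC ECX`, which leaves CF alone, with `SUB ECX,1`, which overwrites it.

```python
MASK32 = 0xFFFFFFFF


def adc32(a, b, carry_in):
    """Model ADC on 32-bit values: return (result, carry_out)."""
    total = a + b + carry_in
    return total & MASK32, total >> 32


def checksum_words(words, counter_op="dec"):
    """Sum 32-bit words with the carry chained from one add to the next.

    counter_op="dec" models DEC ECX, which leaves CF untouched.
    counter_op="sub" models SUB ECX,1, which overwrites CF.
    """
    eax, cf = 0, 0                  # assumed entry state: sum and carry cleared
    ecx = len(words)
    i = 0
    while ecx != 0:
        eax, cf = adc32(eax, words[i], cf)   # ADC EAX,EDX
        i += 1                               # LEA ESI,[ESI+4] changes no flags
        if counter_op == "sub":
            cf = 0                           # SUB writes CF; no borrow while ECX >= 1
        ecx -= 1                             # DEC/SUB sets ZF for the JNZ
    eax, _ = adc32(eax, 0, cf)               # assumed epilogue: ADC EAX,0
    return eax


words = [0xFFFFFFFF, 1, 5]
print(hex(checksum_words(words)))            # 0x6: carry survives the DEC
print(hex(checksum_words(words, "sub")))     # 0x5: carry destroyed by the SUB
```

The carry produced by one ADC has to stay live across the load, the LEA, the DEC and the branch before the next ADC consumes it, which is the long-lived FLAGS case described above.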
From: Paul A. Clayton on 13 Aug 2010 19:05

On Aug 13, 7:28 am, "nedbrek" <nedb...(a)yahoo.com> wrote:
[snip]
> This assumes that physical registers are bound to ROB entries. That is true
> for some machines, but not for all... as you increase ROB size, you like to
> be able to size the physical register separately (a lot of instructions
> don't need regs - stores and jumps being the most common). Of course, there
> are even more radical proposals (reference counted, shared values) that will
> break this.

The proto-idea might be simplest when the FLAGS are associated with
the ROB entries (the ordinary registers are independently renamed).
However, it might be possible that the name is not an address for
storage. FLAGS (being small) could probably (?) be written into the
instruction payload or forwarded if selected for execution
immediately after the source instruction. A future file could
perhaps handle the dependent operations which issue after the
FLAGS producing operation has been selected for execution.

(Small can be beautiful.)

Paul A. Clayton
just a technophile
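A single-entry future file for FLAGS might look roughly like the following Python sketch. It is an assumption-laden toy, not a description of any real machine: the class and method names are hypothetical, tags are arbitrary integers, and recovery after a branch misprediction or exception is ignored. A consumer renamed after the youngest producer has executed reads the value directly (it could be copied into the instruction payload); otherwise it is told which tag to wait for on the forwarding path.

```python
class FlagsFutureFile:
    """Toy single-entry future file holding the speculative FLAGS value."""

    def __init__(self, initial=0):
        self.value = initial   # FLAGS written by the youngest executed producer
        self.pending = None    # tag of the youngest producer not yet executed

    def rename_producer(self, tag):
        """A new FLAGS writer enters the window; later readers depend on it."""
        self.pending = tag

    def rename_consumer(self):
        """Return ("value", flags) if FLAGS is ready, else ("wait", tag)."""
        if self.pending is None:
            return ("value", self.value)
        return ("wait", self.pending)

    def execute_producer(self, tag, flags):
        """Update the file only if this is still the youngest producer."""
        if tag == self.pending:
            self.value = flags
            self.pending = None


ff = FlagsFutureFile()
ff.rename_producer(1)
early = ff.rename_consumer()    # renamed before the producer has executed
ff.execute_producer(1, 0x40)
late = ff.rename_consumer()     # renamed after the producer has executed
print(early, late)
```

An older producer that executes after a younger one has been renamed does not touch the file, so late-arriving consumers never see a stale value.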