From: George Neuner on 22 Jun 2010 14:03

On Tue, 22 Jun 2010 07:35:30 -0700, Andy 'Krazy' Glew
<ag-news(a)patten-glew.net> wrote:

>More commonly: load with sign extension is usually slower than
>loading without sign extension [*], since in normal
>representation it involves waiting for the MSB and smearing it
>over many upper bits. So many new instruction proposals have
>proposed doing away with signed loads.

I guess the question is: is a sign-extended load faster than code
that zeros the register, performs a short load, tests the high bit of
the value and possibly ANDs the value with (2's complement) -1?

Depending on the ISA that's 4-7 instructions vs 1.

George
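The software sequence George describes can be sketched in C. This is only an illustration under stated assumptions: a 16-bit value widened to 32 bits, with hypothetical function names that do not come from the thread. The fix-up step is written as an OR with the upper bits set, since ANDing with -1 would leave the value unchanged. A branch-free variant is included for comparison.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret 32 bits as a signed value.
   int32_t is two's complement by definition, so copying the bits is
   well defined. */
int32_t bits_to_i32(uint32_t bits) {
    int32_t out;
    memcpy(&out, &bits, sizeof out);
    return out;
}

/* Zero-extending load followed by a manual sign fix-up. */
int32_t sign_extend16_manual(uint16_t raw) {
    uint32_t v = raw;          /* zeroed register plus short load */
    if (v & 0x8000u) {         /* test the high bit of the value */
        v |= 0xFFFF0000u;      /* smear the sign over the upper bits */
    }
    return bits_to_i32(v);
}

/* Branch-free variant: flip the sign bit, then subtract it again. */
int32_t sign_extend16_branchless(uint16_t raw) {
    uint32_t v = raw;
    v = (v ^ 0x8000u) - 0x8000u;   /* unsigned arithmetic, modulo 2^32 */
    return bits_to_i32(v);
}

/* What a single sign-extending load expresses in C. */
int32_t sign_extend16_load(const int16_t *p) {
    return *p;
}
```

On an ISA with a sign-extending load the last function is typically one instruction, while the first two expand to several; how many depends on the ISA, as George says.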
From: Anton Ertl on 22 Jun 2010 13:55

MitchAlsup <MitchAlsup(a)aol.com> writes:
>On Jun 22, 7:34 am, n...(a)cam.ac.uk wrote:
>> From the viewpoint of a high-level language, that is insane behaviour.
>> And, for better or worse, ISO C attempts to be a high-level language.
>
>This is one of those "for the worse" results.
>C is and was supposed to be a portable assembler.

I would put only a minor part of the blame on the standard itself.
This kind of standard necessarily only standardizes a subset of the
language. And they also try to identify what's portable to funny
kinds of machines that are irrelevant for most programmers (such as
machines with sign-magnitude representation for integers), and define
a subset of the language that's portable even to that kind of
machine.

The mistake is when other people, especially compiler writers (in
particular gcc maintainers), see such a standard as defining the
whole of the language, and feel free to miscompile everything outside
that subset.

Example: The behaviour on integer overflow is different for machines
with 2s-complement, 1s-complement, and sign-magnitude integers, so
the C standard does not define what happens on overflow for signed
integers. Now the gcc maintainers take this as excuse to miscompile
"x-1>x" into "0", even for programs that were only ever intended to
run on machines with wraparound on overflow, and on machines that
most naturally work that way.

- anton
--
M. Anton Ertl  Some things have to be seen to be believed
anton(a)mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html
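The "x-1>x" case can be written out as a minimal C sketch. The function names are hypothetical and chosen only for illustration. The signed form is undefined when x is INT_MIN, which is what permits a compiler to fold it to 0; the unsigned form is defined modulo 2^N; the third function asks the same question without overflowing at all.

```c
#include <limits.h>

/* The expression from the post. For x == INT_MIN the subtraction
   overflows, which ISO C leaves undefined, so a compiler is allowed
   to assume the comparison is always false. */
int signed_decrement_compare(int x) {
    return x - 1 > x;
}

/* Same shape with unsigned operands: arithmetic is modulo 2^N, so
   this is well defined and true exactly when x == 0. */
int unsigned_decrement_wraps(unsigned int x) {
    return x - 1u > x;
}

/* Portable way to ask "would x - 1 overflow?" with no overflow. */
int decrement_would_overflow(int x) {
    return x == INT_MIN;
}
```

For programs that rely on wraparound for signed values, gcc also has the -fwrapv option, which asks the compiler to treat signed overflow as wrapping.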
From: Tim McCaffrey on 22 Jun 2010 14:12

In article
<dd4f0201-da56-4baa-acfd-9798fa72e059(a)x27g2000yqb.googlegroups.com>,
MitchAlsup(a)aol.com says...
>
>On Jun 17, 12:33 am, Brett Davis <gg...(a)yahoo.com> wrote:
>> What am I missing.
>
>The measure of performance is not instructions, but the <lack of> time
>it takes to execute the semantic contents of the program. Given the
>program at hand, there will be more gate delays of logic to execute
>the x86 program than to execute the RISC program. x86 instructions
>take more gates to parse and decode, the x86 data path has an
>additional operand formatting multiplexer in the integer data path,
>and some additional logic after integer computations to deal with
>partial word writes in the register files. So, instead of having about
>an 80 gate delay per instruction pipeline, the x86 has about a 100
>gate delay instruction pipeline. An extra pipe stage and you can make
>this added complexity almost vanish.
>
>Where x86 wins, is that they (now Intel; AMD used to do this too) can
>throw billions at FAB development technology (i.e. making faster
>transistors and faster interconnect wire).
>
>Secondarily, once you microarchitect a fully out-of-order processor,
>it really does not matter what the native instruction set is. ReRead
>that previous sentence until you understand. The native instruction
>set no longer matters once the microarchitecture has gone fully OoO!
>

You say ISA doesn't matter, but you note several cases where extra
gates were added to handle x86isms.

If an ISA was designed to make the underlying micro-architecture
fast/easy what would be its characteristics?

I would think something like the following:

1) Reduce code size: Increases the I-cache hit rate (which Andy noted
   reduces power consumption because off-chip accesses cost).
2) Easy to decode: reduces gate count, which reduces power
   consumption, and potentially removes a pipeline stage (maybe).
   AFAICT, every x86 has a limitation of only being able to
   decode/issue one instruction if it hasn't been executed before.
   It appears all x86 implementations use the I-cache to mark
   instruction boundaries for parallel decoding on the following
   passes.
3) No PSW, remove the need to merge flag values and the consequent
   logic.
4) No partial register updates.

(I'm sure there are more).

In general, if you reduce the number of gates, you reduce power
consumption and allow the possibility of increasing clock speed. If
you reduce the number of pipeline stages you reduce the effects of
branch-misprediction.

I'm not sure how many (ISA level) registers are useful for an OoO
microarch. Too few and you have lots of instructions just moving
stuff back and forth from memory (probably the stack), too many and
it increases the code size without really adding performance.

- Tim
From: Anton Ertl on 22 Jun 2010 14:13

MitchAlsup <MitchAlsup(a)aol.com> writes:
>On Jun 22, 6:47 am, Andrew Reilly <areilly...(a)bigpond.net.au> wrote:
>> No: I want the 2's complement, fixed-point integers to wrap, just like
>> the hardware does.
>
>Note this only when using 'unsigned' arithmetic.

I don't understand what you mean here, but 2s-complement is a
representation for signed numbers, so Andrew Reilly obviously had
signed numbers in mind.

- anton
--
M. Anton Ertl  Some things have to be seen to be believed
anton(a)mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html
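One way to read the exchange above: ISO C guarantees wraparound only for unsigned types, while 2s-complement hardware produces the same bit pattern for signed and unsigned addition. A hedged sketch of how wrapping signed addition can be expressed portably follows; the name wrap_add_i32 is hypothetical, and the approach (add as unsigned, then reinterpret the bits) is one option among several.

```c
#include <stdint.h>
#include <string.h>

/* Wrapping 32-bit signed add. The addition is carried out on
   unsigned values, where the result is defined modulo 2^32, and the
   resulting bit pattern is then reinterpreted as int32_t, which is
   two's complement by definition. */
int32_t wrap_add_i32(int32_t a, int32_t b) {
    uint32_t usum = (uint32_t)a + (uint32_t)b;  /* no signed overflow here */
    int32_t sum;
    memcpy(&sum, &usum, sizeof sum);            /* reinterpret the bits */
    return sum;
}
```

On common 2s-complement targets a compiler can be expected to reduce this to a plain add instruction, so the wrapping behaviour Andrew Reilly asks for costs nothing extra at run time.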
From: nmm1 on 22 Jun 2010 14:31
In article
<fcf2275e-29ab-46bb-8588-fd4b07f7e4fc(a)8g2000vbg.googlegroups.com>,
MitchAlsup <MitchAlsup(a)aol.com> wrote:
>On Jun 22, 7:34 am, n...(a)cam.ac.uk wrote:
>> From the viewpoint of a high-level language, that is insane behaviour.
>> And, for better or worse, ISO C attempts to be a high-level language.
>
>This is one of those "for the worse" results.
>C is and was supposed to be a portable assembler.

No, it wasn't. It was a semi-portable assembler - i.e. it was
syntactically portable, and semantically portable PROVIDED that you
stayed away from problematic areas (like overflow). There have been
truly portable assemblers, dating from the 1970s and onwards.

In the late 1980s, there was massive pressure from commercial
application developers to improve C for use as a portable high level
language. You may think that it was a mistake for ISO to accept that
as a criterion, but the fact is that is what it did.

I am not going to disagree with your view - merely your claimed
facts.

Regards,
Nick Maclaren.