From: nmm1 on 19 Apr 2010 10:48
In article <jwvhbn7bkri.fsf-monnier+comp.arch(a)gnu.org>,
Stefan Monnier <monnier(a)iro.umontreal.ca> wrote:
>>> 2.9: Register rotation, someone needs to be locked in a rubber room. ;)
>> Yes and no. They can be VERY effective, as on the Hitachi SR2201.
>
>How did they work?

Floating-point only, software-controlled, bypassing the cache.
It was called pseudovectorisation, which describes it very well.
Good, vectorisable, Fortran code ran like the clappers, as the
technique almost completely eliminated memory latency.

To achieve that, it dropped (most) support of IEEE denorms etc.,
and you had a heavyweight switch between vector and scalar modes.
A slightly different form was used on the SR8000, but I have no
personal experience of that.

While it could have been extended to other forms of use, that
wouldn't have been very far. Similarly, older systems that used
register rotation for procedure calls had to put quite a lot of
restrictions on that would fit badly with some modern languages.

I think that it could be used on a general-purpose CPU, but ONLY
if you designed the architecture round it - not, as on the Itanic,
bolting it on together with every knob, frob, bell and whistle
that you could find in the lumber room.

Regards,
Nick Maclaren.
From: Robert Myers on 19 Apr 2010 10:54
On Apr 18, 11:25 pm, Brett Davis <gg...(a)yahoo.com> wrote:
> It is my opinion that Itanic is a disaster at any speed. ;)

Andy seemed to express his opinion that there were too many opinions
to begin with. If you could design-by-opinion, I'm sure that
development would be much less expensive.

Although Andy has not mentioned it, I suspect that the
compiler-that-never-really-arrived played a significant role in
keeping the design process a battle of opinions for a very long time.

Intel managed to keep Itanium at or near the lead in a number of SPEC
benchmarks, and I assume that it was compiler tuning that allowed them
to do that. Intel's success at that enterprise, I'm sure, left end
users puzzled as to why the chip never looked as good in their
applications as it did in the benchmarks. Internally, you could say,
"See. The compiler *can* do it, if only in a limited number of cases."

Burning watts at runtime to schedule and virtually mandating clever
and sometimes obscure hand-coding in order to achieve acceptable
performance are both inferior to having a compiler that can schedule
naive code successfully enough to compete with run-time scheduling.

Beating up on the feature set of Itanium seems pretty pointless. The
question that still begs to be answered is: how much can you push out
to a compiler (and not tricky hand coding) and how do you do it? The
features that were added to Itanium, whatever their merits or obvious
disadvantages, don't seem to have helped enough. The question remains
whether one could do better.

Robert.
From: Anton Ertl on 19 Apr 2010 11:40
nmm1(a)cam.ac.uk writes:
>In article <jwvhbn7bkri.fsf-monnier+comp.arch(a)gnu.org>,
>Stefan Monnier <monnier(a)iro.umontreal.ca> wrote:
>>>> 2.9: Register rotation, someone needs to be locked in a rubber room. ;)
>>> Yes and no. They can be VERY effective, as on the Hitachi SR2201.
>>
>>How did they work?
>
>Floating-point only, software-controlled, bypassing the cache.
>It was called pseudovectorisation, which describes it very well.
>Good, vectorisable, Fortran code ran like the clappers

Sounds like IA-64, in particular Itanium II.

>I think that it could be used on a general-purpose CPU, but ONLY
>if you designed the architecture round it - not, as on the Itanic,
>bolting it on together with every knob, frob, bell and whistle
>that you could find in the lumber room.

It was used on IA-64, an architecture intended to be general-purpose.
And it did run vectorizable loops fast. The problem is that the
performance for most other stuff is mediocre, mostly because the clock
rate does not keep up with the competition.

- anton
--
M. Anton Ertl                     Some things have to be seen to be believed
anton(a)mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html
From: Stefan Monnier on 19 Apr 2010 13:28
>>>> 2.9: Register rotation, someone needs to be locked in a rubber room. ;)
>>> Yes and no. They can be VERY effective, as on the Hitachi SR2201.
>> How did they work?
> Floating-point only, software-controlled, bypassing the cache.
> It was called pseudovectorisation, which describes it very well.

By "work" I meant: what were the instructions provided to setup/control
the register rotation feature and what were their semantics?

Stefan
From: nmm1 on 19 Apr 2010 13:48
In article <jwviq7n9wwm.fsf-monnier+comp.arch(a)gnu.org>,
Stefan Monnier <monnier(a)iro.umontreal.ca> wrote:
>>>>> 2.9: Register rotation, someone needs to be locked in a rubber room. ;)
>>>> Yes and no. They can be VERY effective, as on the Hitachi SR2201.
>>> How did they work?
>> Floating-point only, software-controlled, bypassing the cache.
>> It was called pseudovectorisation, which describes it very well.
>
>By "work" I meant: what were the instructions provided to setup/control
>the register rotation feature and what were their semantics?

It was 12+ years ago now, and I didn't program in assembler on that
system, anyway. I might have a specification somewhere, but I would
have to search.

Regards,
Nick Maclaren.
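
The thread leaves the question of semantics open for the SR2201, so the following is only a rough illustration of the general idea, loosely in the style of the rotating registers on IA-64 that Anton mentions. It is a toy model, not the SR2201 or IA-64 instruction set: the names `RotatingRegisterFile`, `pipelined_scale` and `latency` are made up for this sketch. The assumption is that a rename base is adjusted once per loop iteration, so a value written to logical register 0 is seen as logical register 1 in the next iteration, and a load can be issued several iterations before the instruction that consumes it without any register copying.

```python
class RotatingRegisterFile:
    """Toy model of a rotating register file (hypothetical, heavily simplified)."""

    def __init__(self, size):
        self.phys = [None] * size  # physical registers
        self.rrb = 0               # rename base, adjusted once per loop iteration

    def _index(self, logical):
        # Map a logical register number to a physical one via the rename base.
        return (logical + self.rrb) % len(self.phys)

    def read(self, logical):
        return self.phys[self._index(logical)]

    def write(self, logical, value):
        self.phys[self._index(logical)] = value

    def rotate(self):
        # Decrementing the base makes the value written as logical register n
        # readable as logical register n + 1 after the rotation.
        self.rrb = (self.rrb - 1) % len(self.phys)


def pipelined_scale(src, factor, latency=2):
    """Return [x * factor for x in src], issuing each 'load' `latency`
    iterations before the multiply that consumes it."""
    regs = RotatingRegisterFile(latency + 1)
    out = []
    n = len(src)
    for i in range(n + latency):          # extra iterations drain the pipeline
        if i < n:
            regs.write(0, src[i])         # stage 1: "load" into logical r0
        if i >= latency:
            # stage 2: the value loaded `latency` iterations ago has rotated
            # into logical register `latency`
            out.append(regs.read(latency) * factor)
        regs.rotate()                     # one rotation per loop iteration
    return out


if __name__ == "__main__":
    print(pipelined_scale([1.0, 2.0, 3.0, 4.0], 2.0))
```

The loop body is the same in every iteration; only the rename base changes, which is what lets one short loop keep several memory accesses in flight. In this sketch the `if` tests stand in for whatever mechanism real hardware uses to switch stages on and off while the pipeline fills and drains.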