From: Bill Todd on 24 Oct 2009 04:55

Terje Mathisen wrote:
> On Oct 23, 8:11 pm, ga...(a)allegro.com (Gavin Scott) wrote:
>> For PA-RISC capability HP had very high hopes for dynamic translation.
>> One slide from fairly early on suggests they expected to get to 50%
>> of native performance using translation. In reality they failed to
>> scrounge up enough cleverness to do it well, and the PA-RISC
>> compatibility on IPF has always been poor enough that the performance
>> is commonly considered unacceptable even for business applications.
>
> That's interesting:
>
> IA64 seemed to have a close to complete superset of all PA-RISC
> features/instructions, including some very funky address shift/
> combination operations specifically claimed to be there to support
> PA-RISC features.
>
> The register set was so much larger that it could be mapped
> statically.
>
> If all this hw support didn't get at least 50%, then the clock rate
> must have been very disappointing (which it was, right?).

In a general sense, yes - but not by comparison with PA-RISC's clock rate, which was always slower (only about 2/3 the Itanic clock rate since 2003).

- bill
From: Mayan Moudgill on 24 Oct 2009 06:24

Terje Mathisen wrote:
> On Oct 23, 1:45 pm, Mayan Moudgill <ma...(a)bestweb.net> wrote:
>
>>Andy "Krazy" Glew wrote:
>>
>>>E.g. Terje, you're known to be a Larrabee fan. Can you vectorize CABAC?
>>
>>Not a chance.
>>
>>
>>>For example: divide the image up into subblocks, and run CABAC on each
>>>subblock in parallel.
>>
>>Problem is with the standard. H.264 specifies that the frame is CABAC
>>encoded.
>
>
> Not quite:
>
> H.264 defines two alternate encoding schemes, of which CABAC gets the
> better compression, but it is fully compliant to use the other (I
> don't remember the name of it) if the encoder wants to.

VLC. IIRC, it is also defined on a per-frame basis and also requires inherently serial decoding (i.e. it is non-vectorizable); it just has a better constant factor.

> However, since a decoder has to be able to handle CABAC as well, that
> limits the maximum bitrate that you can support in sw.
>
> Terje
From: jgd on 24 Oct 2009 09:12

In article <46ednbx9zPp4JULXnZ2dnUVZ_tqdnZ2d(a)metrocastcablevision.com>, billtodd(a)metrocast.net (Bill Todd) wrote:

> Why not? It ran x86 code natively in an integrated manner on a
> native Itanic OS. As with most things Merced the original cut wasn't
> impressive in terms of speed, but the relative sizes of the x86 and
> Itanic processors (especially given the amount of the chip area
> dedicated to cache) made it clear that full-fledged x86 cores could
> be included later if necessary as soon as the next process
> generations appeared.

I used it a bit. On both Merced and McKinley, x86 code had about one-third of the throughput of native Itanium code: I was benchmarking with the same source built both ways. The reasons for the poor performance seemed to be:

(a) It was an x86 front-end driving the Itanium back-end execution units. This didn't allow for the kind of speculative and out-of-order execution that was normal in the x86 world by that time with the Pentium Pro/II/III family, Athlon and Pentium 4. You were dropping back to something that was essentially a fast-clocked 486.

(b) At least under Windows, you had to go through a complete execution transition to Itanium mode and back again on every system call. This was kind of slow, and meant that the x86-hosted compilers that generated Itanium code, when run on an Itanium, managed much less than 1/3 of native performance.

The only other Itanium platform I ever used was HP-UX, where the x86 was not significant. By the time people were asking for our software on Itanium Linux, our answer was "That's going to cost you more than you are willing to pay."

The kind of guys who take pride in being corporate "power users", who often drive uptake of technology, even if they don't have much insight into it, hit severe problems with Itanium. They thought "Wow, here's this amazing new 64-bit thing that also runs my MS Office work", got one, and found that Office had slowed down a lot for them. That kind of ego-driven customer really hates being wrong, and holds it against the platform, rather than questioning their own judgement. By contrast, AMD64 gave them just what they wanted.

These people can be quite significant, even if they are basically idiots: my employers used to belong to EDS, and there were regular corporate edicts against buying Alpha boxes in the nineties and Itania around the turn of the millennium, to prevent those guys wasting money. If you had a real need for the kit, and could explain why, you could buy them through the company, but it was a long explanation each time.

--
John Dallman, jgd(a)cix.co.uk, HTML mail is treated as probable spam.
From: jgd on 24 Oct 2009 09:12

In article <7kfe6aF39sdhsU1(a)mid.individual.net>, delcecchiofthenorth(a)gmail.com (Del Cecchi) wrote:

> "Bill Todd" <billtodd(a)metrocast.net> wrote in message
> > Save for the grace of AMD it still might have: without a credible,
> > inexpensive, and pervasive 64-bit alternative Intel could have just
> > waited until desktops began to demand 64-bit processors.

Yup. I remain grateful to AMD for saving me from a lifetime of Itanium low-level debugging.

> I don't put the death of PA-RISC at Itanium's door, since HP was from
> all appearances one of the parents of the Itanium architecture and
> perhaps the ones that sold it to Intel, rather than vice versa.
>
> They certainly were co-conspirators, so to speak.

As the Intel porting training course explained it in mid-1999, the project had started as PA-RISC 3.0 at HP. HP had realised that it would be too expensive to develop just for the PA-RISC replacement market, and sought a partnership with Intel.

--
John Dallman, jgd(a)cix.co.uk, HTML mail is treated as probable spam.
From: Robert Myers on 24 Oct 2009 09:12

On Oct 24, 3:08 am, Terje Mathisen <terje.wiig.mathi...(a)gmail.com> wrote:
> On Oct 23, 8:11 pm, ga...(a)allegro.com (Gavin Scott) wrote:
>
> > For PA-RISC capability HP had very high hopes for dynamic translation.
> > One slide from fairly early on suggests they expected to get to 50%
> > of native performance using translation. In reality they failed to
> > scrounge up enough cleverness to do it well, and the PA-RISC
> > compatibility on IPF has always been poor enough that the performance
> > is commonly considered unacceptable even for business applications.
>
> That's interesting:
>
> IA64 seemed to have a close to complete superset of all PA-RISC
> features/instructions, including some very funky address shift/
> combination operations specifically claimed to be there to support
> PA-RISC features.
>
> The register set was so much larger that it could be mapped
> statically.
>
> If all this hw support didn't get at least 50%, then the clock rate
> must have been very disappointing (which it was, right?).
>

If I remember the numbers Anton provided, 50% per clock for untuned code and a less than optimal compiler seems about right, even without accounting for translation overhead, and I doubt that the existence of a natural mapping to the instruction set provides much relief.

As I'm writing this, I'm wondering how code translators interact with branch predictors. It seems like a hard problem to me, and Itanium doesn't like surprises.

Robert.