From: Eugene Miya on
In article <2eec6144-f9ea-449c-aa2e-ec383c4fb610(a)f12g2000yqn.googlegroups.com>,
Robert Myers <rbmyersusa(a)gmail.com> wrote:
>On Jan 31, 8:24 pm, "nedbrek" <nedb...(a)yahoo.com> wrote:
>> "Robert Myers" <rbmyers...(a)gmail.com> wrote in message
>> news:ea2c8491-2403-499f-94dd-1fb3d37cd8f5(a)o28g2000yqh.googlegroups.com...
>> > Three R Myers prizes to G Bell for the second presentation.
>> >http://research.microsoft.com/en-us/um/people/gbell/ISHPC99.ppt
>>
>> > What is the Processor Architecture (slam dunk for vectors, good reason
>> > to close down CS departments wholesale)
>>
>> Are you saying vectors are good or bad? And is it short vectors (SSE)
>> or long vectors (Arana)?

What's Arana?

>That was my comment on Gordon Bell's slide, which summarized two
>artificially-opposed points of view. I like his false dichotomy,
>though.

I remember when our computational chemists complained when our 205 went
away. That was over two decades ago. We briefly had an ETA-10, and I've
run on some of the Japanese machines. We have never looked back; some
people faced serious jail time, not just money, over this issue.

>SSE is better than nothing, but my long-term bet is on streaming
>architectures, which are a generalization of Cray-style vector
>parallelism.

No, those are different things. You are confusing a stream with a long
bunch (use whatever word you want) of short vectors linked together.

The first-order argument for vectors is that many problems have a nice
dense structure. The second order is attained with gather/scatter
hardware in memory, software to use it, masking, and all kinds of extra
stuff.
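
To make the two orders concrete, here is a minimal sketch (all names
invented for illustration). The first loop is dense and unit-stride,
the kind any vector machine eats for breakfast; the second needs the
gather/scatter machinery to vectorize at all:

    /* First order: dense, unit-stride. A vectorizing compiler turns
       this directly into vector loads, a multiply-add, and vector
       stores. */
    void axpy_dense(double *y, const double *x, double a, int n)
    {
        for (int i = 0; i < n; i++)
            y[i] += a * x[i];
    }

    /* Second order: indirect addressing (assume idx has no repeated
       entries). Vectorizing this takes gather/scatter hardware,
       software to use it, masking, etc. */
    void axpy_indirect(double *y, const double *x, const int *idx,
                       double a, int n)
    {
        for (int i = 0; i < n; i++)
            y[idx[i]] += a * x[idx[i]];
    }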

The most impassioned defense of this I have seen was in Willi
Schönauer's book. Any person would say that "it makes <common> sense".
Our users aren't any person. You start getting into problems like
multigrid, asymmetric systems, etc.

>CS departments and much of the discussion here have focused on
>approaches that are essentially dead-ends--academic masturbation for
>those with CS degrees.

Most of the guys who started computational science departments look at
computer science departments as prostitutes and drug dealers; I am
using kinder language. A few of them would love to shut CS departments
down. I wonder who would write their systems code for them then.

>NO MORE NEW LANGUAGES. Asm only, if necessary, but figure out how to
>expand the space that can be handled with a streaming paradigm.

8^)
Dream on.

>Fortunately, GPGPU will save us from all of the expensive mistakes the
>US has been making ever since Seymour left the scene. That's my hope,
>anyway.

Likely insufficient.

--

Looking for an H-912 (container).

From: Eugene Miya on
In article <hk6iv7$jlr$1(a)news.eternal-september.org>,
nedbrek <nedbrek(a)yahoo.com> wrote:
>"Terje Mathisen" <"terje.mathisen at tmsw.no"> wrote in message
>news:20bi37-lic2.ln1(a)ntp.tmsw.no...
>> Robert Myers wrote:
>>> SSE is better than nothing, but my long-term bet is on streaming
>>> architectures, which are a generalization of Cray-style vector
>>> parallelism.
>> >
>>> Fortunately, GPGPU will save us from all of the expensive mistakes the
>>> US has been making ever since Seymour left the scene. That's my hope,
>>> anyway.
>>
>> Pure streaming architectures, even in the form of GPGPU, are dead.
>
>Terje is right. Even GPUs are moving to short vectors (for ray tracing).

I'm uncertain. I'm not sure I agree with either Robert or Terje.
I defer a decision.

>Long vectors are just so specialized, there is no market for them.

No, there are markets. It's just that no one is willing to pay for them.

>The super guys are riding the coat tails of products oriented towards
>consumers.

Sort of.

They mostly buy Intel and derivatives.

A few people, who can really afford supercomputers, can pay for
specialized hardware. If you have to ask, you can't play.

>That's why clusters are so important. They're the only way to
>get more power.

They are commodity components, and they are quick plug and play.
Except software.

> Long vectors are "embarrassingly parallel", so you can use
>just about any method to exploit them.

You have to know how to program them.
The last big long-vector compiler died.
Your average programmer is incapable of maintaining tight parallelism.
I'm not sure that I would put the Multiflow/Bulldog compiler in the
long-vector category. The FTN200 compiler was behind in optimization;
if John McC were still around, he would try to defend it (the ETA was
his favorite machine). But you are talking decades-old technology.

--

Looking for an H-912 (container).

From: Andy (Krazy) Glew on
> > Pure streaming architectures, even in the form of GPGPU, are dead.
>
> Terje is right.  Even GPUs are moving to short vectors (for ray tracing).
> Long vectors are just so specialized, there is no market for them.

AFAIK no GPU has ever implemented long vectors or vector streaming.
(They have implemented streaming in the sense of get/put engines.)

In fact, AFAIK no GPU has ever implemented vectors longer than, say,
128 bits. 4x32b.

One wonders whether part of the problem with Larrabee was/is that it
was implementing moderately long vectors. 512 bits.

---

Now, of course, the GPUs gain the effect of long vectors via SIMT /
Coherent Threading. Instead of implementing a 64 entry x 32b/entry
vector (2048 bits) they implement a 64-wide warp or wavefront. That
can also be used more flexibly for stuff that cannot be described as a
strided vector. With some, but not all, of the issues of scatter/
gather.
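
For concreteness, a minimal CUDA sketch of that flexibility (sizes and
names are illustrative, not any particular GPU): each "lane" is a
thread computing its own address, so an indexed gather falls out for
free, and the bounds check is just predicated divergence:

    #include <stdio.h>

    /* Each thread is one lane of the warp/wavefront. The per-thread
       index computation gives gather addressing with no special
       vector instruction. */
    __global__ void gather_add(float *y, const float *x,
                               const int *idx, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)               /* divergence, handled in hardware */
            y[i] += x[idx[i]];   /* per-lane address: an implicit gather */
    }

    int main(void)
    {
        const int n = 1024;
        float *x, *y;
        int *idx;
        cudaMallocManaged(&x, n * sizeof *x);   /* managed memory just */
        cudaMallocManaged(&y, n * sizeof *y);   /* to keep the sketch  */
        cudaMallocManaged(&idx, n * sizeof *idx); /* short             */
        for (int i = 0; i < n; i++) {
            x[i] = 1.0f; y[i] = 0.0f; idx[i] = n - 1 - i;
        }
        gather_add<<<(n + 63) / 64, 64>>>(y, x, idx, n);  /* 64-wide */
        cudaDeviceSynchronize();
        printf("y[0] = %g\n", y[0]);
        return 0;
    }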

---

However, while I substantially agree with Terje wrt streaming and
vectors, and more substantially wrt the importance of caches (cache:
get used to it!), I do want to point out that the SIMT/CT
microarchitecture is not incompatible with long vector instruction
sets. Just sequenced or pipelined over the vector lanes. Indeed,
long vector instructions can amortize the instruction decode, and
reduce the performance loss due to SIMT/CT divergence and
fragmentation.

Therefore, my take is that wide parallel vector microarchitectures are
a dead end. Better to go SIMT/CT.

Also, fixed width vector ISAs are a dead end. At the least have a VL
(vector length) register. So that you can use an efficient sequenced
time domain vector implementation.
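
A sketch of what the VL register buys (MAXVL here is a hypothetical
machine maximum): software strip-mines to whatever length the hardware
reports, and the implementation is free to sequence the body over its
lanes in the time domain:

    /* Strip-mining against a hypothetical VL register. The inner
       loop stands in for a single vector instruction of length vl;
       hardware with fewer lanes simply sequences it over more
       cycles. */
    enum { MAXVL = 64 };   /* illustrative machine maximum */

    void vadd(double *c, const double *a, const double *b, long n)
    {
        for (long i = 0; i < n; i += MAXVL) {
            long vl = (n - i < MAXVL) ? n - i : MAXVL;  /* set VL */
            for (long j = 0; j < vl; j++)  /* one vector op, length vl */
                c[i + j] = a[i + j] + b[i + j];
        }
    }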

But since vector instructions are a slippery slope, maybe it is better
not to have any, and just have a SIMT/CT microarchitecture.
From: Robert Myers on
On Feb 1, 10:33 pm, "Andy (Krazy) Glew" <comp.a...(a)patten-glew.net>
wrote:

>
> Now, of course, the GPUs gain the effect of long vectors via SIMT /
> Coherent Threading.   Instead of implementing a 64 entry x 32b/entry
> vector (2048 bits) they implement a 64-wide warp or wavefront.   That
> can also be used more flexibly for stuff that cannot be described as a
> strided vector.  With some, but not all, of the issues of scatter/
> gather.
>
ayup.

> ---
>
> However, while I substantially agree with Terje wrt streaming and
> vectors, and more substantially wrt the importance of caches (cache:
> get used to it!), I do want to point out that the SIMT/CT
> microarchitecture is not incompatible with long vector instruction
> sets.  Just sequenced or pipelined over the vector lanes.   Indeed,
> long vector instructions can amortize the instruction decode, and
> reduce the performance loss due to SIMT/CT divergence and
> fragmentation.
>
> Therefore, my take is that wide parallel vector microarchitectures are
> a dead end.  Better to go SIMT/CT.
>
> Also, fixed width vector ISAs are a dead end.  At the least have a VL
> (vector length) register.  So that you can use an efficient sequenced
> time domain vector implementation.
>
> But since vector instructions are a slippery slope, maybe it is
> better not to have any, and just have a SIMT/CT microarchitecture.

ayup.

Robert.
From: nmm1 on
In article <4b6780bb$1(a)darkstar>, Eugene Miya <eugene(a)cse.ucsc.edu> wrote:
>Robert Myers <rbmyersusa(a)gmail.com> wrote:
>
>> My desktop will run circles around the
>>supercomputers I used to use and that I still think of as the genuine
>>article.)
>
>That's because of the past tense. That little 'd' at the end of "use."
>You could go back....

Each of my hearing aids is tens of thousands of times more powerful
than the first supercomputer I used :-)


Regards,
Nick Maclaren.