VLIW pre-history [Computer Architecture]


From: Mark Smotherman on 11 May 2007 18:39

I came across P.M. Melliar-Smith, "A design for a fast computer
for scientific calculations," in 1969 AFIPS FJCC, pp. 201-208.
He proposes "direct functional control" for inner loops in array
processing applications, by which he means a noninterlocked VLIW
design with exposed pipelining. (He's writing in reaction to
execution resources "squandered" and "wasted" by a Tomasulo-like
E-box coupled with a one-instruction-decode-per-cycle I-box.)

I wonder if there are other VLIW designs prior to the definition
of the term in the 1980's. I am aware of:

- IBM SSEC (1948) - two instructions in a "line of sequence",
which could be used to specify two separate operations
within the same program or duplicate operations using separate
resources to provide checking [see US 2,636,672]

- various horizontal microprogramming approaches (e.g., the
1953 paper suggesting horizontal microcode by Wilkes and
Stringer; some suggest Turing's ACE, ca. 1946)

- Elliott 152 (1950) and Elliott 153 (1954) - the 153 had a
64-bit instruction specifying multiple register transfers
(ALU, multiplier, I/O, branching, and control of two
scratchpad memories); I'm hoping to locate more details on
the 152

- array processors, including
IBM 2938 Array Processor (1969)
IBM 3838 Array Processor (1974)
FPS AP-120B (1975)

- Culler patent (1973) - "Data processor with parallel operations
per instruction" [US 3,771,141]

- Pomerene patent (1981) - "Machine for multiple instruction
execution" [US 4,295,193]

- Rau's Polycyclic Architecture project at TRW/ESL (1981)
and then his work at Cydrome (1983-1988)

- Fisher's ELI-512 design (1983)
and then his work at Multiflow (1984-1990)

Anyone know of other VLIW-like designs pre-1980?

(Rau and Fisher say "The earliest VLIW processors built were the
so-called attached array processors ..." [IBM 2938/3838, AP-120B],
"Instruction-level parallel processing: History, overview, and
perspective," J. Supercomputing, 1993.)

From: Stephen Fuld on 12 May 2007 01:13

Mark Smotherman wrote:
> I came across P.M. Melliar-Smith, "A design for a fast computer
> for scientific calculations," in 1969 AFIPS FJCC, pp. 201-208.
> He proposes "direct functional control" for inner loops in array
> processing applications, by which he means a noninterlocked VLIW
> design with exposed pipelining. (He's writing in reaction to
> execution resources "squandered" and "wasted" by a Tomasulo-like
> E-box coupled with a one-instruction-decode-per-cycle I-box.)
>
> I wonder if there are other VLIW designs prior to the definition
> of the term in the 1980's. I am aware of:
>
> - IBM SSEC (1948) - two instructions in a "line of sequence",
> which could be used to specify two separate operations
> within the same program or duplicate operations using separate
> resources to provide checking [see US 2,636,672]
>
> - various horizontal microprogramming approaches (e.g., the
> 1953 paper suggesting horizontal microcode by Wilkes and
> Stringer; some suggest Turing's ACE, ca. 1946)
>
> - Elliott 152 (1950) and Elliott 153 (1954) - the 153 had a
> 64-bit instruction specifying multiple register transfers
> (ALU, multiplier, I/O, branching, and control of two
> scratchpad memories); I'm hoping to locate more details on
> the 152
>
> - array processors, including
> IBM 2938 Array Processor (1969)
> IBM 3838 Array Processor (1974)
> FPS AP-120B (1975)
>
> - Culler patent (1973) - "Data processor with parallel operations
> per instruction" [US 3,771,141]
>
> - Pomerene patent (1981) - "Machine for multiple instruction
> execution" [US 4,295,193]
>
> - Rau's Polycyclic Architecture project at TRW/ESL (1981)
> and then his work at Cydrome (1983-1988)
>
> - Fisher's ELI-512 design (1983)
> and then his work at Multiflow (1984-1990)
>
>
> Anyone know of other VLIW-like designs pre-1980?

I'm not sure if you want to count these, but in the 1970s there was a
series of parts from AMD called bit-slice components (the 2900 series).
You could build your own "computer" by wiring together the number of
4-bit ALU slices you needed for your desired register width, and the
number of 4-bit sequencer slices you needed for your desired addressing
range (you could also choose a fixed, IIRC, 10-bit sequencer part). Since
these were independent, you could design an instruction format that had
the commands to drive the ALU in one part of the instruction and the
sequencer in another, essentially allowing an ALU command simultaneously
with a sequence type command.

I worked on such a system, which was used as the main CPU for a high-end
PCM disk controller. It had a 40-bit instruction word, of which 20 bits
were ALU and 20 were mostly used for sequencing. I say mostly, since
you could use a field in the sequence half as an operand to the ALU
instead of a jump target, in which case the sequence part was limited
to a NOP, a return to the address on the stack, etc.
Of course, you couldn't always fill both parts with useful work, so,
just as with other VLIW machines, you had a lot of empty instruction parts.
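
To make the 20/20 split concrete, here is a rough Python sketch of
packing and unpacking such a 40-bit word and counting how many slots
do useful work. Only the two 20-bit halves come from the description
above; placing the ALU half in the high bits, using 0 as the NOP
encoding, and the names pack, unpack and slot_utilization are all
assumptions made for illustration, not the real controller's format.

```python
# Illustrative model of a 40-bit instruction word split into two 20-bit halves.
# Assumptions (not from the real machine): ALU half in the high bits,
# sequencer half in the low bits, and an all-zero field meaning NOP.
ALU_BITS = 20
SEQ_BITS = 20
ALU_MASK = (1 << ALU_BITS) - 1
SEQ_MASK = (1 << SEQ_BITS) - 1
ALU_NOP = 0  # hypothetical encoding
SEQ_NOP = 0  # hypothetical encoding


def pack(alu, seq):
    """Combine an ALU field and a sequencer field into one 40-bit word."""
    if not 0 <= alu <= ALU_MASK:
        raise ValueError("ALU field does not fit in 20 bits")
    if not 0 <= seq <= SEQ_MASK:
        raise ValueError("sequencer field does not fit in 20 bits")
    return (alu << SEQ_BITS) | seq


def unpack(word):
    """Split a 40-bit word back into (alu, seq) fields."""
    return (word >> SEQ_BITS) & ALU_MASK, word & SEQ_MASK


def slot_utilization(program):
    """Fraction of instruction slots (two per word) that are not NOPs."""
    if not program:
        raise ValueError("empty program")
    used = 0
    for word in program:
        alu, seq = unpack(word)
        used += (alu != ALU_NOP) + (seq != SEQ_NOP)
    return used / (2 * len(program))
```

For example, a two-word program in which one word carries only an ALU
operation and the other fills both halves uses three of its four slots;
the remaining slot is the kind of empty instruction part mentioned above.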

I am sure other companies used these parts in similar ways, though since
they were frequently embedded systems, they weren't publicly documented.

> (Rau and Fisher say "The earliest VLIW processors built were the
> so-called attached array processors ..." [IBM 2938/3838, AP-120B],
> "Instruction-level parallel processing: History, overview, and
> perspective," J. Supercomputing, 1993.)

I'm sure that depends upon the definition of VLIW.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

From: Greg Lindahl on 12 May 2007 01:26

In article <f22rat$o79$1(a)hubcap.clemson.edu>,
Mark Smotherman <mark(a)clemson.edu> wrote:

>I came across P.M. Melliar-Smith, "A design for a fast computer
>for scientific calculations," in 1969 AFIPS FJCC, pp. 201-208.
>He proposes "direct functional control" for inner loops in array
>processing applications, by which he means a noninterlocked VLIW
>design with exposed pipelining.

Sounds like ordinary microcode on a pipelined processor. The FPS
AP120B array processor (which didn't come along until the early 80s)
let you microcode it directly. But since it's just microcode, it's not
exactly a novel idea, it's just unusual today to think of directly
programming it that way.

-- greg

From: Nick Maclaren on 12 May 2007 05:13

In article <f22rat$o79$1(a)hubcap.clemson.edu>,
Mark Smotherman <mark(a)clemson.edu> writes:
|>
|> - various horizontal microprogramming approaches (e.g., the
|> 1953 paper suggesting horizontal microcode by Wilkes and
|> Stringer; some suggest Turing's ACE, ca. 1946)

And lots of microcode from then on - for example, many of the
IBM System/370 range (e.g. the 165) used VLIW for microcode.

Regards,
Nick Maclaren.

From: Eric Smith on 12 May 2007 06:01

Nick Maclaren wrote:
> And lots of microcode from then on - for example, many of the
> IBM System/370 range (e.g. the 165) used VLIW for microcode.

Do you mean that in some sense other than that it was
horizontal microcode?

I was under the impression that the 370/165 and 370/168 microarchitecture
was fairly similar to the 360/65 microarchitecture. Perhaps I'm
mistaken. I haven't seen non-trivial portions of the microcode for any
360 or 370 other than the 360/30.

Eric
