From: John Larkin on 9 Aug 2008 12:09

On Sat, 09 Aug 2008 09:02:53 -0700, JosephKK <quiettechblue(a)yahoo.com> wrote:

>On Wed, 06 Aug 2008 19:57:23 -0700, John Larkin
><jjlarkin(a)highNOTlandTHIStechnologyPART.com> wrote:
>
>>On Tue, 5 Aug 2008 12:54:14 -0700, "Chris M. Thomasson"
>><no(a)spam.invalid> wrote:
>>
>>>"John Larkin" <jjlarkin(a)highNOTlandTHIStechnologyPART.com> wrote in message
>>>news:rtrg9458spr43ss941mq9p040b2lp6hbgg(a)4ax.com...
>>>> On Tue, 5 Aug 2008 13:30:52 +0200, "Skybuck Flying"
>>>> <BloodyShame(a)hotmail.com> wrote:
>>>>
>>>>>As the number of cores goes up, the watt requirements go up too?
>>>>
>>>> Not necessarily, if the technology progresses and the clock rates are
>>>> kept reasonable. And one can always throttle down the CPUs that aren't
>>>> busy.
>>>>
>>>>>Will we need a zillion watts of power soon?
>>>>>
>>>>>Bye,
>>>>> Skybuck.
>>>>
>>>> I saw suggestions of something like 60 cores, 240 threads in the
>>>> reasonable future.
>>>
>>>I can see it now... a mega-core GPU chip that can dedicate 1 core per pixel.
>>>
>>>lol.
>>>
>>>> This has got to affect OS design.
>>>
>>>They need to completely rethink their multi-threaded synchronization
>>>algorithms. I have a feeling that efficient distributed non-blocking
>>>algorithms, which are comfortable running under a very weak cache-coherency
>>>model, will be all the rage. Getting rid of atomic RMW or StoreLoad-style
>>>memory barriers is the first step.
>>
>>Run one process per CPU. Run the OS kernel, and nothing else, on one
>>CPU. Never context switch. Never swap. Never crash.
>>
>>John
>
>OK. How do you deal with I/O devices, user input, and hot swap?

I/O and user interface, just like now: device drivers and GUIs. Just run them on separate CPUs, and have hardware control over anything that could crash the system, specifically global memory mapping.

There have been OSes that, for example, pre-qualified the rights of DMA controllers so that even a rogue driver couldn't punch holes in memory at random.

But hot swap? What do you mean? All the CPUs are on one chip.

John
From: JosephKK on 9 Aug 2008 12:15

On Fri, 08 Aug 2008 07:40:53 -0700, John Larkin <jjlarkin(a)highNOTlandTHIStechnologyPART.com> wrote:

>On Fri, 08 Aug 2008 11:30:04 GMT, Jan Panteltje
><pNaonStpealmtje(a)yahoo.com> wrote:
>
>>On a sunny day (Fri, 08 Aug 2008 13:02:15 +0200) it happened Bernd Paysan
>><bernd.paysan(a)gmx.de> wrote in <nmltm5-4hq.ln1(a)annette.zetex.de>:
>>
>>>Nick Maclaren wrote:
>>>> In theory, the kernel doesn't have to do I/O or networking, but
>>>> have you ever used a system where they were outside it? I have.
>>>
>>>Actually, doing I/O or networking in a "main" CPU is a waste of resources. Any
>>>sane architecture (CDC 6600, mainframes) has a bunch of multi-threaded I/O
>>>processors, which you program so that the main CPU has little effort to
>>>deal with I/O.
>>>
>>>This works well even when you do virtualization. The main CPU sends a
>>>pointer to an I/O processor program ("high-level abstraction", not the
>>>device-driver details) to the I/O processor, which in turn runs the device
>>>driver to get the data in or out. In a VM, the VM monitor has to
>>>sanity-check the command, and maybe rewrites it ("don't write to track 3 of
>>>disk 5; write it to the 16 sectors starting at sector 8819834 on disk 1,
>>>which is where the virtual volume of this VM sits").
>>>
>>>The fact that in PCs the main CPU is doing I/O (even down to the level of
>>>writing to individual I/O ports) is a consequence of saving CPUs - no money
>>>for an I/O processor; the 8088 could do it itself just fine. Why we'll soon
>>>have 32 x86 cores, but still no I/O processor, is beyond what I can
>>>understand.
>>>
>>>Basically all I/O in a modern PC is sending fixed- or variable-sized packets
>>>over some sort of network - via SATA/SCSI, USB, FireWire, Ethernet, etc.
>>
>>Do not forget: since the days of the 8088, when CPUs ran at maybe 13 MHz,
>>we have gone to 3.4 GHz; 3400 / 13 = 261x faster.
>>Faster still, because of better architectures.
>>This leaves plenty of time for a CPU to do normal I/O.
>>And in fact the I/O has always been hardware-supported.
>>For example, although you can poll a serial port bit by bit, there is a
>>hardware shift register, and a hardware FIFO too.
>>Although you can construct sectors for a floppy in software bit by bit,
>>there is a floppy controller with write precompensation etc., all in hardware.
>>Although you could do graphics in software, there is a graphics card with
>>hardware acceleration.
>>The first two are included in the chipset, maybe the graphics too.
>>The same goes for Ethernet: it is a dedicated chip, or included in the
>>chipset, taking the place of your 'I/O processor'.
>>Same for hard disks, which may even have on-board encryption; all you
>>have to do is specify a sector number and send the sector data.
>>
>>So... no real need for a separate I/O processor; in fact you'll likely find
>>a processor in all that dedicated hardware, or maybe an FPGA.
>
>That's the IBM "channel controller" concept: add complex, specialized
>DMA-based I/O controllers to take the load off the CPU. But if you
>have hundreds of CPUs, the strategy changes.
>
>John

Why would it? The design could also use hundreds or thousands of dedicated I/O controllers. If you want to talk about real bottlenecks, look at memory and data bus limitations.
From: John Larkin on 9 Aug 2008 12:28

On Sat, 09 Aug 2008 09:15:28 -0700, JosephKK <quiettechblue(a)yahoo.com> wrote:

>[snip]
>
>Why would it? The design could also use hundreds or thousands of
>dedicated I/O controllers. If you want to talk about real
>bottlenecks, look at memory and data bus limitations.

A lot of hardware sorts of stuff, like TCP/IP stack accelerators, could be done in a dedicated CPU. Sort of like using a PIC to blink an LED.

Part of the channel-controller thing was driven by not wanting to burden an expensive CPU with scut work and interrupt and context-switch overhead. All that stops mattering when CPUs are free.

Of course, disk controllers and graphics processors would still be needed, but simpler ones and fewer of them.

Multicore is especially interesting for embedded systems, where there are likely to be a modest number of processes and no dynamic add/drop of tasks. The most critical ones, like an important servo loop, could be dedicated and brutally simple.

Freescale is already going multicore on embedded chips, and I think others are, too. The RTOS boys are *not* going to like this.

John
From: Robert Myers on 9 Aug 2008 12:30

On Aug 9, 12:15 pm, JosephKK <quiettechb...(a)yahoo.com> wrote:
>
> Why would it? The design could also use hundreds or thousands of
> dedicated I/O controllers. If you want to talk about real
> bottlenecks look at memory and data bus limitations.

Mmhmm. Bandwidth per flop is headed toward zero.

Robert.
From: John Larkin on 9 Aug 2008 12:36
On Sat, 09 Aug 2008 09:15:28 -0700, JosephKK <quiettechblue(a)yahoo.com> wrote:

>[snip]
>
>Why would it? The design could also use hundreds or thousands of
>dedicated I/O controllers. If you want to talk about real
>bottlenecks, look at memory and data bus limitations.

What bottlenecks? Most PCs have speed to burn. What they don't have is security, reliability, or simplicity.

But more CPUs, each with a little local RAM, surrounding a shared cache, have got to be more efficient than a single CPU thrashing between 60 or so processes.

Or maybe things will never change, just like they never changed in past years.

John