From: Morten Reistad on 31 May 2010 13:16
In article <htobl4$lv1$1(a)usenet01.boi.hp.com>,
FredK <fred.nospam(a)dec.com> wrote:
>
>"Terje Mathisen" <"terje.mathisen at tmsw.no"> wrote in message
>news:89c4d7-5go.ln1(a)ntp.tmsw.no...
>> nmm1(a)cam.ac.uk wrote:
>>> In article<htmoal$u5$1(a)usenet01.boi.hp.com>,
>>> FredK<fred.nospam(a)dec.com> wrote:
>>>>
>
>snip
>
>>
>> With proper non-blocking queue handling, those working cores can run flat
>> out with no interrupts as long as there is any work at all to be done,
>> then go to sleep.
>>
>> Using an interrupt from an IO core to get out of sleep and start
>> processing again is a good idea from a power efficiency viewpoint.
>>
>
>The question being - how fast can you bring the CPU out of its "sleep"
>state, and how do you schedule servicing of the non-blocking queues without
>dedicating one or more cores strictly to handling them. The clock interrupt
>for example is typically the mechanism used for scheduling multiple
>processes competing for CPU time.

In this "USMP" (UnSymmetric MultiProcessing) Terje describes, let
the (many) "small" processors handle I/O: clocks, graphics rendering, etc.
If they are ISA-compatible with the (few) fast processors you can even
keep running on a few of the "small" ones if your CPU requirement is not
too big, and the system can power down a lot of processors when idle.

The interrupts sent to the "large" processors will then be mostly
"attention" interrupts, either to schedule a new process, wake up,
etc.

Since Linux already sends processors to sleep and wakes them with an
interrupt, the wakeup speed problem has been solved. This was a small issue
around the 80486 clones and Linux 1.2.something; it hasn't surfaced in a
decade.

My laptop has been to sleep more than 10k times since I started the
previous paragraph.

We should also address the worst bits of the von Neumann bottleneck.
Having queue support in hardware, handling a few k of data in each
instant, would be a huge help.

I cannot see that these ideas would require a lot of hardware to
implement.

-- mrr
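To make the "run flat out, then sleep until an attention interrupt" idea concrete, here is a minimal software sketch. It is only an analogy, not anything proposed in the thread: the class name `AttentionQueue` and its methods are hypothetical, a `threading.Event` stands in for the wake-up interrupt, and a `collections.deque` stands in for the non-blocking queue.

```python
import threading
from collections import deque


class AttentionQueue:
    """Work queue whose consumer sleeps when idle and is woken by a signal.

    Hypothetical illustration: the Event plays the role of the "attention"
    interrupt and the deque plays the role of the non-blocking queue.
    """

    def __init__(self):
        self._items = deque()                # deque appends/pops are thread-safe
        self._attention = threading.Event()  # stand-in for the wake-up interrupt
        self._stopping = False
        self.wakeups = 0                     # number of times the worker woke up

    def post(self, item):
        """Producer side (the "IO core"): enqueue, then raise attention."""
        self._items.append(item)
        self._attention.set()

    def shutdown(self):
        """Ask the worker to exit after handling everything already posted."""
        self._stopping = True
        self._attention.set()

    def serve(self, handle):
        """Consumer side (the "working core"): drain without blocking, then sleep."""
        while True:
            self._attention.wait()     # asleep until somebody raises attention
            self._attention.clear()    # clear BEFORE draining so no post is missed
            self.wakeups += 1
            stopping = self._stopping  # sample before draining: items posted
                                       # before shutdown() are still handled
            while True:
                try:
                    item = self._items.popleft()
                except IndexError:
                    break              # queue empty: go back to sleep
                handle(item)
            if stopping:
                return
```

A producer thread calls `post()` and a worker thread runs `serve(handler)`. While the worker is busy draining, further posts only set a flag that is already set, so the number of wake-ups can be smaller than the number of items posted.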
From: Morten Reistad on 31 May 2010 13:18
In article <009ddc0e-0446-48f2-985a-5a06f12e07f7(a)k31g2000vbu.googlegroups.com>,
Robert Myers <rbmyersusa(a)gmail.com> wrote:
>On May 28, 4:42 am, Terje Mathisen <"terje.mathisen at tmsw.no">
>wrote:
>
>> When/if we finally get lots of cores, some of which are really
>> low-power, in-order, with very fast context switching, then it makes
>> even more sense to allocate all IO processing to such cores and let the
>> big/power-hungry/OoO cores do the "real" processing.
>
>But it would likely take Microsoft to make such a step of any value in
>the desktop/notebook space, no?
>
>Servers not only have different workloads, they use different
>operating systems, and I'll take a wild guess that almost any server
>OS can take advantage of intelligent I/O better than Desktop Windows,
>which, I speculate, could take advantage of it hardly at all without a
>serious rewrite.

For the I/O handling we would probably have to make a hypervisor, to
be an "OS for operating systems", where the hypervisor presents services
to the OS, just like graphics processors do today. Windows does not
have to know about all the CPUs at all. Nor will Linux, for that matter.
Or you can have them coexist on the same machine.

-- mrr
From: Robert Myers on 31 May 2010 14:10
On May 31, 1:18 pm, Morten Reistad <fi...(a)last.name> wrote:
> In article <009ddc0e-0446-48f2-985a-5a06f12e0...(a)k31g2000vbu.googlegroups.com>,
> Robert Myers <rbmyers...(a)gmail.com> wrote:
>
> >On May 28, 4:42 am, Terje Mathisen <"terje.mathisen at tmsw.no">
> >wrote:
>
> >> When/if we finally get lots of cores, some of which are really
> >> low-power, in-order, with very fast context switching, then it makes
> >> even more sense to allocate all IO processing to such cores and let the
> >> big/power-hungry/OoO cores do the "real" processing.
>
> >But it would likely take Microsoft to make such a step of any value in
> >the desktop/notebook space, no?
>
> >Servers not only have different workloads, they use different
> >operating systems, and I'll take a wild guess that almost any server
> >OS can take advantage of intelligent I/O better than Desktop Windows,
> >which, I speculate, could take advantage of it hardly at all without a
> >serious rewrite.
>
> For the I/O handling we would probably have to make a hypervisor, to
> be an "OS for operating systems", where the hypervisor presents services
> to the OS, just like graphics processors do today. Windows does not
> have to know about all the CPUs at all. Nor will Linux, for that matter.
> Or you can have them coexist on the same machine.
>

Ok. Thanks.

Robert.
From: FredK on 31 May 2010 14:36
"Morten Reistad" <first(a)last.name> wrote in message
news:rf7dd7-r02.ln1(a)laptop.reistad.name...
> In article <htobl4$lv1$1(a)usenet01.boi.hp.com>,
> FredK <fred.nospam(a)dec.com> wrote:
>>
>>"Terje Mathisen" <"terje.mathisen at tmsw.no"> wrote in message
>>news:89c4d7-5go.ln1(a)ntp.tmsw.no...
>>> nmm1(a)cam.ac.uk wrote:
>>>> In article<htmoal$u5$1(a)usenet01.boi.hp.com>,
>>>> FredK<fred.nospam(a)dec.com> wrote:
>>>>>
>>
>>snip
>>
>>>
>>> With proper non-blocking queue handling, those working cores can run flat
>>> out with no interrupts as long as there is any work at all to be done,
>>> then go to sleep.
>>>
>>> Using an interrupt from an IO core to get out of sleep and start
>>> processing again is a good idea from a power efficiency viewpoint.
>>>
>>
>>The question being - how fast can you bring the CPU out of its "sleep"
>>state, and how do you schedule servicing of the non-blocking queues without
>>dedicating one or more cores strictly to handling them. The clock interrupt
>>for example is typically the mechanism used for scheduling multiple
>>processes competing for CPU time.
>
> In this "USMP" (UnSymmetric MultiProcessing) Terje describes, let
> the (many) "small" processors handle I/O: clocks, graphics rendering, etc.
> If they are ISA-compatible with the (few) fast processors you can even
> keep running on a few of the "small" ones if your CPU requirement is not
> too big, and the system can power down a lot of processors when idle.
>
> The interrupts sent to the "large" processors will then be mostly
> "attention" interrupts, either to schedule a new process, wake up,
> etc.
>

Why not "Asymmetric" SMP? Something sounds funny about "UnSymmetric".

What is large vs small? With cores apparently becoming "cheap", why
differentiate or build variations? Isn't it easier to stamp out many of the
same kind? Why the need for a new paradigm?

If the cores are all identical, then simply take N out of the scheduling
domain of executable user threads and direct interrupts to only those CPUs.
Now your applications are only interrupted by the clock and faults.

> Since Linux already sends processors to sleep and wakes them with an
> interrupt, the wakeup speed problem has been solved. This was a small issue
> around the 80486 clones and Linux 1.2.something; it hasn't surfaced in a
> decade.
>
> My laptop has been to sleep more than 10k times since I started the
> previous paragraph.
>

Well, in some p-state. Most OSes will enter a light sleep state in the idle
loop.

> We should also address the worst bits of the von Neumann bottleneck.
> Having queue support in hardware, handling a few k of data in each
> instant, would be a huge help.
>

You need to explain this one to me more fully.

> I cannot see that these ideas would require a lot of hardware to
> implement.
>

Or none at all.
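The "take N cores out of the scheduling domain and direct interrupts only to them" suggestion amounts to computing two disjoint CPU sets. A minimal sketch, assuming CPUs are numbered from 0 and that the lowest-numbered ones are reserved for I/O; the function names `partition_cpus` and `hex_mask` are hypothetical and not part of any OS interface.

```python
def partition_cpus(total_cpus, io_cpus):
    """Split CPU ids 0..total_cpus-1 into (io_set, compute_set).

    Assumption for illustration: the first io_cpus ids take interrupts,
    the rest run user threads.
    """
    if not 0 < io_cpus < total_cpus:
        raise ValueError("need at least one I/O CPU and one compute CPU")
    io_set = set(range(io_cpus))
    compute_set = set(range(io_cpus, total_cpus))
    return io_set, compute_set


def hex_mask(cpus):
    """Return a hex string bitmask with bit n set for every CPU n in cpus."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")
```

On Linux, one way to apply such a split would be to pass the compute set to `os.sched_setaffinity` for application processes and to write the I/O mask to `/proc/irq/<n>/smp_affinity` for each interrupt. That is an illustration of the idea rather than what the poster necessarily had in mind, and it does not remove the clock tick from the compute CPUs.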
From: FredK on 31 May 2010 14:43
"Morten Reistad" <first(a)last.name> wrote in message
news:bk7dd7-r02.ln1(a)laptop.reistad.name...
> In article
> <009ddc0e-0446-48f2-985a-5a06f12e07f7(a)k31g2000vbu.googlegroups.com>,
> Robert Myers <rbmyersusa(a)gmail.com> wrote:
>>On May 28, 4:42 am, Terje Mathisen <"terje.mathisen at tmsw.no">
>>wrote:
>>
>>> When/if we finally get lots of cores, some of which are really
>>> low-power, in-order, with very fast context switching, then it makes
>>> even more sense to allocate all IO processing to such cores and let the
>>> big/power-hungry/OoO cores do the "real" processing.
>>
>>But it would likely take Microsoft to make such a step of any value in
>>the desktop/notebook space, no?
>>
>>Servers not only have different workloads, they use different
>>operating systems, and I'll take a wild guess that almost any server
>>OS can take advantage of intelligent I/O better than Desktop Windows,
>>which, I speculate, could take advantage of it hardly at all without a
>>serious rewrite.
>
> For the I/O handling we would probably have to make a hypervisor, to
> be an "OS for operating systems", where the hypervisor presents services
> to the OS, just like graphics processors do today. Windows does not
> have to know about all the CPUs at all. Nor will Linux, for that matter.
> Or you can have them coexist on the same machine.
>

I hate hypervisors. Yet another scheduling and abstraction layer to make
things slower and less responsive.

The core of the Windows kernel (NT) and its IO subsystem are pretty much as
"modern" as it gets. The issues with "intelligent IO" really have more to do
with the software stack that it is trying to plug into - like TCP/IP - and
little to do with "interrupts". It's the old "wheel of reincarnation": where
do you push what functionality, and at what cost? I've seen server designs on
the boards for decades with all sorts of smart IO and high speed fabrics,
etc, etc.