From: nmm1 on 26 Jul 2010 03:05

In article <8b4eibF2ppU1(a)mid.individual.net>,
Andrew Reilly <areilly---(a)bigpond.net.au> wrote:
>On Sat, 24 Jul 2010 16:52:22 -0700, MitchAlsup wrote:
>
>> I think what Robert is getting at is that lumping everything under a
>> coherent cache is running into a vonNeumann wall.
>
>Coherence is clearly complicated, but it doesn't seem necessarily to be
>sequential. Are there theoretical limits to how parallelisable coherence
>can be? Is the main issue speed-of-light limits to round-trip
>communication between distributed cache controllers?

Yes, and no, respectively. However, the theoretical limits that I
know of in this area are much weaker than the practical ones.

Regards,
Nick Maclaren.
From: nmm1 on 26 Jul 2010 04:31

In article <70ath7-gh8.ln1(a)ntp.tmsw.no>,
Terje Mathisen <"terje.mathisen at tmsw.no"> wrote:
>Robert Myers wrote:
>> Maybe you want more programmable control over coherence domains. If
>> you're not going to scrap cache and cache snooping, maybe you can
>> wrestle some control away from the hardware and give it to the
>> software.
>
>That sounds like software-controlled distributed shared memory, a
>concept that generates a lot more research papers and PhDs than actual
>useful products, at least so far. :-(

I believe that tackling it as a "computer science" problem is a
large part of the reason that it has never got anywhere. The thesis
posted later is a fairly typical example of the better research -
let's skip over the worse research, holding our noses and averting
our gaze!

The killer isn't that it wouldn't work. The killer is how to map a
sufficient class of problems to it to make it worthwhile - the three
examples used are all well-known to be easily optimised by a wide
range of architectures (parallel and other). And, like Robert, I
don't see it doing so - AS IT STANDS - it might well be a starting
point for a viable design.

I believe that something COULD be done, but I don't believe that
anything WILL be done for the foreseeable future. Benchmarketing and
existing spaghetti code rule too much decision making. Also, as I
have posted ad tedium, the architecture is of little use without
tackling the programming paradigms used.

Regards,
Nick Maclaren.
From: Rick Jones on 26 Jul 2010 13:16

Brett Davis <ggtgp(a)yahoo.com> wrote:
> I have never in my life seen a compiler issue a PREFETCH instruction.
> I have several times mocked the usefulness of PREFETCH as implemented
> for CPUs in the embedded market. (Locking up one of the two read
> ports makes good performance impossible without resorting to assembly.)

> I would think that the fetch ahead engine on high end x86 and POWER
> would make PREFETCH just about as useless, except to prime the pump
> at the start of a new data set being streamed in.

> How is PREFETCH used by which compilers today?

Exactly how it is used I do not know, but this:

http://www.spec.org/cpu2006/results/res2010q1/cpu2006-20100301-09740.html

and the linked description of the -opt-prefetch flag:

http://www.spec.org/cpu2006/results/res2010q1/cpu2006-20100301-09740.flags.html#user_CXXbase_f-opt-prefetch

suggest that compilers do have that feature.

rick jones
--
portable adj, code that compiles under more than one compiler
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
From: nmm1 on 26 Jul 2010 13:47

In article <i2kft0$9p7$1(a)usenet01.boi.hp.com>,
Rick Jones <rick.jones2(a)hp.com> wrote:
>Brett Davis <ggtgp(a)yahoo.com> wrote:
>> I have never in my life seen a compiler issue a PREFETCH instruction.
>> I have several times mocked the usefulness of PREFETCH as implemented
>> for CPUs in the embedded market. (Locking up one of the two read
>> ports makes good performance impossible without resorting to assembly.)
>
>> I would think that the fetch ahead engine on high end x86 and POWER
>> would make PREFETCH just about as useless, except to prime the pump
>> at the start of a new data set being streamed in.
>
>> How is PREFETCH used by which compilers today?
>
>Exactly how it is used I do not know, but this:
>
>http://www.spec.org/cpu2006/results/res2010q1/cpu2006-20100301-09740.html
>
>and the linked description of the -opt-prefetch flag:
>
>http://www.spec.org/cpu2006/results/res2010q1/cpu2006-20100301-09740.flags.html#user_CXXbase_f-opt-prefetch
>
>suggest that compilers do have that feature.

Don't believe everything that you are told!

I have set such flags for several compilers on several architectures,
and looked for the inserted instructions. Sometimes they are inserted,
but often not. My tests included comparing the sizes of a large number
of modules of typical scientific code, and giving them trivial examples
which were ideally suited for the technique.

A suspicious and cynical old sod, aren't I?

Regards,
Nick Maclaren.
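A trivial example of the kind that ought to suit prefetching might look
like the sketch below. It is only an illustration: it assumes GCC or
Clang, whose __builtin_prefetch builtin is used for the hand-written
variant, and the function names and the prefetch distance of 16 elements
are hypothetical, not taken from any post in this thread. The plain loop
is what a compiler's prefetch option would have to recognise; the second
loop shows the hint written out by hand.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, in elements; a real value needs tuning. */
#define PREFETCH_AHEAD 16

/* Plain streaming sum: the sort of loop a prefetch option could target. */
double sum_plain(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Same loop with an explicit read prefetch hint (GCC/Clang builtin).
 * Second argument 0 = read access, third argument 0 = low temporal
 * locality, i.e. streamed data that is not reused. */
double sum_prefetch(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&a[i + PREFETCH_AHEAD], 0, 0);
        s += a[i];
    }
    return s;
}
```

Compiling with something like "gcc -O2 -S" and searching the assembly
output for prefetch mnemonics is one way to check whether any prefetch
instructions were actually emitted for either function.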
From: George Neuner on 26 Jul 2010 15:13

On Sun, 25 Jul 2010 10:42:12 +0200, Terje Mathisen
<"terje.mathisen at tmsw.no"> wrote:
>Robert Myers wrote:
>> Maybe you want more programmable control over coherence domains. If
>> you're not going to scrap cache and cache snooping, maybe you can
>> wrestle some control away from the hardware and give it to the
>> software.
>
>That sounds like software-controlled distributed shared memory, a
>concept that generates a lot more research papers and PhDs than actual
>useful products, at least so far. :-(

The hardware-controlled version, KSR-1, went belly up.

George