From: Chris Gray on 8 Nov 2009 18:34
Terje Mathisen <Terje.Mathisen(a)tmsw.no> writes:

> The usual programming paradigm for such a system is to have many
> threads running the same algorithm, which means that training
> information from one thread is likely to be useful for another, or at
> least not detrimental.

That doesn't mean that the ability to have multiple predictor states is bad. You need a way for the OS to tell the CPU "thread with key Y is going to be very similar to thread with key X". That means that the key Y state should be seeded with the key X state, or that the two states can be merged into one larger, more detailed state.

I guess it comes down to the question of just how much value is a more accurate predictor - how many gates can you afford, and is it worthwhile to need a few extra instructions to initialize it?

--
Experience should guide us, not rule us.
Chris Gray cg(a)GraySage.COM
http://www.Nalug.ORG/ (Lego)
http://www.GraySage.COM/cg/ (Other)
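A minimal sketch of the seeding and merging idea, as a software model only. Everything here is an assumption made for illustration: the names `PredictorState`, `KeyedPredictor`, `seed` and `merge` are hypothetical, the predictor is a plain table of 2-bit saturating counters indexed by the branch address, and "merging" is modelled as building one table twice the size, initialised from both originals, which both keys then share. No real CPU or OS exposes this interface.

```python
class PredictorState:
    """Table of 2-bit saturating counters: 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, counters=None, size=16):
        # Start every entry at "weakly not taken" unless counters are supplied.
        self.counters = list(counters) if counters is not None else [1] * size

    def predict(self, pc):
        return self.counters[pc % len(self.counters)] >= 2

    def update(self, pc, taken):
        i = pc % len(self.counters)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


class KeyedPredictor:
    """Holds one predictor state per OS-supplied key (hypothetical interface)."""

    def __init__(self):
        self.states = {}

    def state_for(self, key):
        return self.states.setdefault(key, PredictorState())

    def seed(self, new_key, from_key):
        # "Key Y is going to be very similar to key X": Y starts from a copy of X.
        self.states[new_key] = PredictorState(self.state_for(from_key).counters)

    def merge(self, key_a, key_b):
        # Build one larger table from both states; both keys then share it.
        a = self.state_for(key_a).counters
        b = self.state_for(key_b).counters
        size = len(a) + len(b)
        merged = [(a[i % len(a)] + b[i % len(b)]) // 2 for i in range(size)]
        shared = PredictorState(merged)
        self.states[key_a] = shared
        self.states[key_b] = shared
        return shared
```

After `seed("Y", "X")` the two keys train independently from the same starting point; after `merge("X", "Y")` they train a single shared table. Which of the two is worth its gates is exactly the open question in the post.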
From: Quadibloc on 8 Nov 2009 22:58
On Nov 8, 2:00 pm, Terje Mathisen <Terje.Mathi...(a)tmsw.no> wrote:

> The usual programming paradigm for such a system is to have many threads
> running the same algorithm, which means that training information from
> one thread is likely to be useful for another, or at least not detrimental.

Ah, I thought the usual operation of a multicore system is to have an operating system running multiple different applications at once, and the operating system itself, so that the system would have more different applications providing threads than there were cores.

Thus, on a Windows PC, when I look at Task Manager, under the Processes tab, I usually find more than four things listed there.

Admittedly, if I was using multicore chips in a supercomputer in order to do massively-parallel number-crunching, I probably _would_ be using the system as you describe. In fact, even on a PC, if I was playing certain graphics-intensive computer games, that may well be what I want. So the situation you describe, even if not "usual", is the one that applies... the only times when performance is critical.

John Savard
From: Robert Myers on 9 Nov 2009 02:32
On Nov 8, 10:58 pm, Quadibloc <jsav...(a)ecn.ab.ca> wrote:

>
> Ah, I thought the usual operation of a multicore system is to have an
> operating system running multiple different applications at once, and
> the operating system itself, so that the system would have more
> different applications providing threads than there were cores.
>
> Thus, on a Windows PC, when I look at Task Manager, under the
> Processes tab, I usually find more than four things listed there.
>
> Admittedly, if I was using multicore chips in a supercomputer in order
> to do massively-parallel number-crunching, I probably _would_ be using
> the system as you describe. In fact, even on a PC, if I was playing
> certain graphics-intensive computer games, that may well be what I
> want. So the situation you describe, even if not "usual", is the one
> that applies... the only times when performance is critical.
>

Windows, VNC, an embedded virtual Linux machine, and Chrome with many open tabs keep this i7 pretty busy. Add in bloated messengers, and it's sometimes not enough. Going back to anything less is sort of depressing, actually.

Most of what has been said here and elsewhere about the uselessness of multiple cores/multi-threading has been "all computing is like the computing I'm used to, and it always will be."

Robert.
From: Ken Hagan on 9 Nov 2009 05:03
On Sun, 08 Nov 2009 16:52:21 -0000, Quadibloc <jsavard(a)ecn.ab.ca> wrote:

> If one has a multithreaded core, branch predictor information should
> be labelled by thread, so that information gathered about the branches
> in one thread isn't used to control how branches in another thread are
> handled. The branch predictor should not simply ignore the fact that
> multiple different threads are being executed in the core.

I'm still slightly confused, perhaps as much by other people's responses as by your suggestion.

When we speak of "thread" here, are these CPU hyper-threads or OS threads (or indeed, some other OS-supplied tag, allowing for groups of behaviourally similar threads to learn from one another)? Since the subject came up in the context of multithreaded cores, I presumed the former, but possibly you were thinking of the latter. If so, would that be useful even on a single-threaded core?
From: Quadibloc on 9 Nov 2009 09:10
On Nov 9, 3:03 am, "Ken Hagan" <K.Ha...(a)thermoteknix.com> wrote:

> On Sun, 08 Nov 2009 16:52:21 -0000, Quadibloc <jsav...(a)ecn.ab.ca> wrote:
> > If one has a multithreaded core, branch predictor information should
> > be labelled by thread, so that information gathered about the branches
> > in one thread isn't used to control how branches in another thread are
> > handled. The branch predictor should not simply ignore the fact that
> > multiple different threads are being executed in the core.
> I'm still slightly confused, perhaps as much by other people's responses
> as by your suggestion.
> When we speak of "thread" here, are these CPU hyper-threads or OS threads
> (or indeed, some other OS-supplied tag, allowing for groups of
> behaviourally similar threads to learn from one another)? Since the
> subject came up in the context of multithreaded cores, I presumed the
> former, but possibly you were thinking of the latter. If so, would that be
> useful even on a single-threaded core?

My thinking was that at a given moment, there might be, say, 128 OS threads... and in a commercial CPU, eight of those threads might actually be executing at that moment - two in each core of a quad-core CPU.

I was thinking that since the threads were likely to be unrelated in a conventional Windows PC environment, if the cores are "hyper-threaded" with the grand total of two threads each, they should have two separate branch predictors each.

Allocating a different OS thread to a core thread slot, I figured, would take place "infrequently", say during a timer interrupt 60 times a second (that's how often they had timer interrupts on my grandpappy's IBM 360...) and so I viewed flushing the branch predictor, rather than trying to give it the ability to cope with the operating system's idea of what constitutes a thread, as an acceptable departure from optimization.

John Savard
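The scheme in that last post can be modelled in a few lines: one predictor per hardware thread slot, flushed whenever the OS puts a different thread into the slot. This is only a rough software sketch under stated assumptions. The names `SlotPredictor`, `Core` and `schedule` are hypothetical, the predictor is a simple table of 2-bit saturating counters, and the table size of 16 is arbitrary.

```python
class SlotPredictor:
    """Branch predictor private to one hardware thread slot (assumed 2-bit counters)."""

    def __init__(self, size=16):
        self.size = size
        self.flush()

    def flush(self):
        # Forget all training: every entry returns to "weakly not taken".
        self.counters = [1] * self.size

    def predict(self, pc):
        return self.counters[pc % self.size] >= 2

    def update(self, pc, taken):
        i = pc % self.size
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


class Core:
    """One core with a fixed number of thread slots, each with its own predictor."""

    def __init__(self, slots=2):
        self.predictors = [SlotPredictor() for _ in range(slots)]
        self.running = [None] * slots  # OS thread id currently in each slot

    def schedule(self, slot, os_thread_id):
        # Called when the OS assigns a thread to a slot, e.g. on a timer interrupt.
        # The hardware never learns what an OS thread is; it only flushes on change.
        if self.running[slot] != os_thread_id:
            self.predictors[slot].flush()
            self.running[slot] = os_thread_id


core = Core()
core.schedule(0, 101)   # OS thread 101 runs in slot 0
core.schedule(1, 202)   # unrelated OS thread 202 runs in slot 1
core.predictors[0].update(0x40, True)  # training in slot 0 does not touch slot 1
```

The cost of this design is that each newly scheduled thread starts from a cold predictor, which is the "acceptable departure from optimization" mentioned above.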