From: Paul Keinanen on 28 May 2010 07:21

On Fri, 28 May 2010 09:31:01 GMT, Jan Panteltje
<pNaonStpealmtje(a)yahoo.com> wrote:

>
>>For instance the HDTV 1920x1080 picture can be divided into 8100
>>macro-blocks of 16x16 each. With only 300 cores, each core would have
>>to handle a slice of macro-blocks :-).
>
>Thank you for the deep insight.
>Just a quick question:
> How do you transfer data between those 300 cores?

A camera generating 1920x1080p60 requires only 125 Mpix/s (8 ns/pixel);
assuming 3x10 bits/pixel, this would fit into a 32-bit bus.

Using the producer/consumer model, in which each node only picks the
data it is interested in, only a single 32-bit bus connected to all
cores would be required, running at 125 MHz. This does not seem too
hard these days.

With 16x16 macro-blocks, you need to communicate with 8 neighbor
macro-blocks (N, NE, E, SE, S, SW, W, NW). With slices, you only need
to communicate with the slice above and below.

This is a simple case that would benefit from a large number of
processors; making a general product that could use the large number of
cores effectively is of course much harder.
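[A back-of-the-envelope check of the figures in the post above. This is an editorial illustration, not part of the thread; the bus-width claim assumes the 3x10-bit pixel is packed into one 32-bit word per pixel clock.]

```python
# Sanity-check the 1080p60 bandwidth and macro-block figures.
width, height, fps = 1920, 1080, 60

pixels_per_sec = width * height * fps        # ~124.4 Mpix/s
ns_per_pixel = 1e9 / pixels_per_sec          # ~8 ns/pixel

bits_per_pixel = 3 * 10                      # 10-bit R, G, B: fits a 32-bit word
macroblocks = (width * height) // (16 * 16)  # 16x16 macro-blocks per frame

print(round(pixels_per_sec / 1e6, 1))  # 124.4
print(round(ns_per_pixel, 2))          # 8.04
print(macroblocks)                     # 8100
```

So "125 Mpix/s" and "8 ns/pixel" are rounded from 124.4 Mpix/s and 8.04 ns, and the 8100 macro-block count checks out.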
From: Jan Panteltje on 28 May 2010 11:32

On a sunny day (Fri, 28 May 2010 14:21:10 +0300) it happened Paul
Keinanen <keinanen(a)sci.fi> wrote in
<ck8vv5hc3l219kgp945smpu50v20sfahil(a)4ax.com>:

>On Fri, 28 May 2010 09:31:01 GMT, Jan Panteltje
><pNaonStpealmtje(a)yahoo.com> wrote:
>
>>
>>>For instance the HDTV 1920x1080 picture can be divided into 8100
>>>macro-blocks of 16x16 each. With only 300 cores, each core would have
>>>to handle a slice of macro-blocks :-).
>>
>>Thank you for the deep insight.
>>Just a quick question:
>> How do you transfer data between those 300 cores?
>
>A camera generating 1920x1080p60 requires only 125 Mpix/s (8 ns/pixel);
>assuming 3x10 bits/pixel, this would fit into a 32-bit bus.
>
>Using the producer/consumer model, in which each node only picks the
>data it is interested in, only a single 32-bit bus connected to all
>cores would be required, running at 125 MHz. This does not seem too
>hard these days.
>
>With 16x16 macro-blocks, you need to communicate with 8 neighbor
>macro-blocks (N, NE, E, SE, S, SW, W, NW).

Yes.
Do you remember the 'transputer'?
From: Tim Williams on 28 May 2010 17:33

"Jan Panteltje" <pNaonStpealmtje(a)yahoo.com> wrote in message
news:hto3cd$udv$1(a)news.albasani.net...
> Often video comes in ONE FRAME AT A TIME.
> So I also do real-time processing.
> In a real broadcast environment, say HD, you have several HD cameras
> streaming; encoding is needed, recording is needed.

Well, in that case you can't do better than transcoding 1 frame in
1/FPS seconds, so you only need a limited amount of processing power
regardless. And it's going to be a lot less power than e.g. transcoding
the Library of Congress in an hour.

> But not always easy to do, say you render a sequence with Blender?
> But for pure transcoding it could work.

What is significant about a "sequence with Blender" that can't be
evaluated for all time? Are there animations composed of difference
equations, rather than predefined equations (e.g. "live" physics
simulation)? Even if so, the geometry can be solved beforehand, or
evaluated to a certain point so that each process can evaluate its
section of time independently.

Lots of differential problems (think FEA) are subject to useful
parallelism (if not always as embarrassingly so as graphics tends to
be), so I fail to see how it ends here. Just like ever... it is subject
to the skill of the programmer, and how much foresight he has.

Tim

--
Deep Friar: a very philosophical monk.
Website: http://webpages.charter.net/dawill/tmoranwms
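[An editorial sketch of the time-slicing Tim describes: if each section of time can be evaluated independently, a clip's frames can be partitioned into contiguous segments, one per worker. The `segments` helper and the frame counts are illustrative, not from any real tool in the thread.]

```python
def segments(n_frames, n_workers):
    """Split n_frames into contiguous (start, stop) ranges, one per worker,
    differing in size by at most one frame."""
    base, extra = divmod(n_frames, n_workers)
    out, start = [], 0
    for i in range(n_workers):
        stop = start + base + (1 if i < extra else 0)
        out.append((start, stop))
        start = stop
    return out

# 9000 frames (6 minutes at 25 fps) spread across 300 cores:
segs = segments(9000, 300)
print(segs[0])    # (0, 30)   -> each core evaluates 30 frames
print(segs[-1])   # (8970, 9000)
```

Each worker then renders or transcodes only its own range; the skill, as Tim says, is in making the per-segment work actually independent.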
From: Jan Panteltje on 29 May 2010 06:19

On a sunny day (Fri, 28 May 2010 16:33:48 -0500) it happened "Tim
Williams" <tmoranwms(a)charter.net> wrote in
<htpcs3$188$1(a)news.eternal-september.org>:

>"Jan Panteltje" <pNaonStpealmtje(a)yahoo.com> wrote in message
>news:hto3cd$udv$1(a)news.albasani.net...
>> Often video comes in ONE FRAME AT A TIME.
>> So I also do real-time processing.
>> In a real broadcast environment, say HD, you have several HD cameras
>> streaming; encoding is needed, recording is needed.
>
>Well, in that case you can't do better than transcoding 1 frame in
>1/FPS seconds, so you only need a limited amount of processing power
>regardless.

That last remark I fail to see; we are moving towards higher and higher
resolutions, more complex encoding schemes, more frames per second with
3D, and more than one stream at a time.

>And it's going to be a lot less power than e.g. transcoding the Library
>of Congress in an hour.

Well, I dunno how big that is, but one day maybe it will all fit on a
postage-stamp-sized medium. But that is another subject, and an
interesting one; it reminds me of the Alien anecdote:

An Alien came to earth, looked around a bit, found it all very
interesting, and wanted to take the accumulated knowledge of the
earthlings home with him (it?) to the home planet. So he got the
Encyclopedia Britannica, but it was too big and too heavy to fit in the
flying saucer. (I have to change sentence construction constantly to
avoid the him / her / it dilemma with that alien; not sure how aliens
reproduce. Oh well, back to that problem with weight and size, hehe.)

So, anyway, the Alien writes down all of the characters in those books
as one long hex ASCII string. Very long. Then he did 1 / number, took a
stick, put a mark on it from one end to represent that ratio, and took
the stick home with the (hehe) flying wok or saucer or dishwasher or
whatever.

>> But not always easy to do, say you render a sequence with Blender?
>> But for pure transcoding it could work.
>
>What is significant about a "sequence with Blender" that can't be
>evaluated for all time? Are there animations composed of difference
>equations, rather than predefined equations (e.g. "live" physics
>simulation)?

One of the points and advantages of NOT going multi-core is that you
can use existing programs, www.blender.org for example; I can just have
it render an AVI movie from something I designed.

>Even if so, the geometry can be solved beforehand, or evaluated to a
>certain point so that each process can evaluate its section of time
>independently.

Sounds cryptic to me.

>Lots of differential problems (think FEA) are subject to useful
>parallelism (if not always as embarrassingly so as graphics tends to
>be), so I fail to see how it ends here. Just like ever... it is subject
>to the skill of the programmer, and how much foresight he has.

When thinking about your idea of chopping up, say, an MPEG file, and
'real time': I was thinking maybe GOPs of 15 frames, so at 25 fps that
makes .6 seconds per GOP, and the read-ahead for 300 cores makes 180
seconds latency (not counting any data transport delays). 3 minutes; it
could even be useful for live streams of politicians, as you can then
cut in time if they say something stupid, like 'The Internets'.

>Tim

Yea, it is all fun; you need to code some of that stuff just for the
honour, Klingon-like :-)
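[Checking Jan's latency arithmetic above; an editorial illustration using the figures from the post (15-frame GOPs, 25 fps, 300 cores).]

```python
# One GOP per core: the last core's GOP starts 299 GOP-durations into
# the stream, so the pipeline needs roughly cores * gop_seconds of
# read-ahead before all 300 cores have work.
gop_frames = 15
fps = 25
cores = 300

gop_seconds = gop_frames / fps            # 0.6 s of video per GOP
readahead_seconds = gop_seconds * cores   # 180 s = 3 minutes

print(gop_seconds)          # 0.6
print(readahead_seconds)    # 180.0
```

So the 3-minute figure follows directly, before any data-transport delay is added.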
From: JosephKK on 4 Jun 2010 17:16
On Wed, 26 May 2010 08:30:38 -0400, Phil Hobbs
<pcdhSpamMeSenseless(a)electrooptical.net> wrote:

>Jan Panteltje wrote:
>> The next step: a way to produce flexible gallium arsenide wafers in
>> quantity has been found:
>> http://beforeitsnews.com/news/48/149/Semiconductor_Gallium_Arsenide_Twice_As_Efficient_As_Silicon_in_Solar_Power_Applications_Say_Illinois_Researchers.html
>>
>> Intel considers producing CPUs on the stuff.
>>
>> Now the THz processor?
>> Multicore is dead ;-)
>>
>
>Not till someone figures out how to make P-channel GaAs FETs that are
>worth anything. Hole mobility in GaAs is pitiful. Building a modern
>processor out of NMOS would make for rather interesting power
>dissipation densities--i.e. the whole thing would turn to lava.
>
>Cheers
>
>Phil Hobbs

Naw, 80 GHz (U)LVPECL 8-bitters and maybe 12- or 16-bitters. Single
1.5 V supply.