From: Owen Shepherd on 9 Aug 2010 13:18

Jeremy Linton wrote:
> Well, they don't hit all of your bullet points, but Marvell Kirkwood
> processors are currently available. They have DDR2/DDR3, 2-3 GigE ports,
> SATA ports, DMA engines, a couple of PCIe lanes, etc.
>
> Having used a couple of these processors, they are more than capable.
> You probably don't want to run compute-intensive applications on them,
> but they can easily keep a couple of GigE ports busy (80%+ utilization)
> serving as file servers. With a little creativity I'm sure you could use
> them for web servers, or any number of other tasks. Plus, beyond the
> basics, they have numerous useful on-chip devices. For example, hardware
> encryption or XOR operations in the DMA controllers.

If only Marvell would stop with their belief that data-sheets are trade secrets...
From: Michael S on 9 Aug 2010 13:42

On Jul 28, 6:15 am, Andy Glew <"newsgroup at comp-arch.net"> wrote:
> On 7/27/2010 11:34 AM, j...(a)cix.compulink.co.uk wrote:
>
> > In article
> > <284da124-7934-42bb-a58c-899a935a0...(a)5g2000yqz.googlegroups.com>,
> > gni...(a)gmail.com (gnirre) wrote:
>
> >> Will Microsofts [sic] design an ARM processor?
>
> > I doubt it very much. They probably want to put an ARM with custom
> > peripherals onto a chip in some piece of equipment. They do sell quite a
> > lot of electronics in various forms: this could well go into a Zune
> > successor, for example.
>
> If that was what they wanted, they could have bought a much cheaper
> license, that allows them to use an existing ARM core design, and build
> an SOC out of it.
>

AFAIR, both Motorola/Freescale and TI (and Apple too?) own ARM architecture licenses despite not only never developing ARM cores of their own, but never really having concrete plans to do so. Sometimes big corporations buy expensive things just because big corporations buy expensive things.
From: Paul Gotch on 9 Aug 2010 14:46

Owen Shepherd <owen.shepherd(a)e43.eu> wrote:
> > Of course the real question is whether they added conditionals
> > for marketing reasons, or because it actually helps performance
> > and/or code size...

> I'd expect it does help code size and performance to a degree (since
> it takes load off the branch predictor, both reducing the probability
> of a misprediction and allowing it to profile the rest of the code
> better)

LOL

Back when the original 26 bit ARM architecture was designed I don't think Acorn had a marketing department to speak of.

Predicated instruction sets help codesize if you have a compiler which does if conversion. Branch prediction isn't something original ARMs had, this was 1983...

-p
--
Paul Gotch
--------------------------------------------------------------------
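To make the code-size point about if conversion concrete, here is a minimal sketch of the transformation on a toy instruction list. The ARM-style mnemonics are real, but the `if_convert` helper and the tuple representation are hypothetical, made up for illustration; a real compiler pass would also check that the body does not modify the flags.

```python
# Toy illustration only: instructions are (mnemonic, operands) pairs.
# INVERSE maps a branch condition to the condition under which the
# skipped-over body actually executes.
INVERSE = {"EQ": "NE", "NE": "EQ", "LT": "GE", "GE": "LT", "GT": "LE", "LE": "GT"}


def if_convert(branch_cond, body):
    """Replace 'B<cond> skip; body; skip:' with the body predicated on the
    inverse condition. Assumes no instruction in body sets the flags."""
    pred = INVERSE[branch_cond]
    return [(mnemonic + pred, operands) for mnemonic, operands in body]


# r0 = max(r0, r1), written with a branch around the move.
branchy = [("CMP", "r0, r1"), ("BGE", "skip"), ("MOV", "r0, r1")]

# The same logic with the branch removed and the move predicated.
converted = [branchy[0]] + if_convert("GE", branchy[2:])

for mnemonic, operands in converted:
    print(mnemonic, operands)
```

In this toy case the predicated form is one instruction shorter and contains no branch at all, which is the saving being described.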
From: Owen Shepherd on 9 Aug 2010 15:51

Paul Gotch wrote:
>
> LOL
>
> Back when the original 26 bit ARM architecture was designed I don't
> think Acorn had a marketing department to speak of.
>
> Predicated instruction sets help codesize if you have a compiler which
> does if conversion. Branch prediction isn't something original ARMs
> had, this was 1983...
>
> -p

I'm aware of the design of the ARM architecture; it is quite an interesting story! However, the above discussion was in the context of the AVR32.

(And to those who'll say "26 bit? How short sighted!", note that Intel had only just released the 286, and Acorn's previous computers had been 6502 based. 64MB was a lot of RAM back then, and it meant that the flags could be pushed into the same slot as the PC - a great performance win back in those days)

(And I've had the pleasure of using a RISC OS machine. They really were quite innovative for their time; for example, RISC OS had sub-pixel anti-aliasing before PCs even had vector fonts...)
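As a rough illustration of the "flags in the same slot as the PC" point, the sketch below packs a 26-bit program counter and the status bits into one 32-bit word. The bit layout is stated here as an assumption about the 26-bit ARM R15 (N, Z, C, V in bits 31-28, interrupt disables in bits 27-26, word-aligned PC in bits 25-2, mode in bits 1-0); the function names are hypothetical.

```python
# Assumed R15 layout on 26-bit ARM (treat as an assumption, not a spec quote):
#   bits 31-28: N, Z, C, V flags
#   bits 27-26: IRQ and FIQ disable
#   bits 25-2 : word-aligned program counter
#   bits 1-0  : processor mode
PC_MASK = 0x03FFFFFC
ADDRESS_SPACE = 1 << 26  # bytes reachable by a 26-bit PC


def pack_r15(pc, n, z, c, v, irq_off=0, fiq_off=0, mode=0):
    """Combine PC, flags, interrupt disables and mode into one 32-bit word."""
    if pc & ~PC_MASK:
        raise ValueError("PC must be word-aligned and below 64 MiB")
    return ((n << 31) | (z << 30) | (c << 29) | (v << 28)
            | (irq_off << 27) | (fiq_off << 26) | pc | mode)


def unpack_pc(r15):
    """Extract the program counter from the combined word."""
    return r15 & PC_MASK


def unpack_flags(r15):
    """Extract N, Z, C, V as a 4-bit value (N is the top bit)."""
    return (r15 >> 28) & 0xF


r15 = pack_r15(0x8000, n=1, z=0, c=1, v=0)
print(hex(r15), hex(unpack_pc(r15)), bin(unpack_flags(r15)))
print(ADDRESS_SPACE // 2**20, "MiB")
```

Because the return address and the flags travel in one word, saving or restoring a single register preserves both, which is the saving the post alludes to.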
From: nedbrek on 10 Aug 2010 08:23
Hello all,

"Brett Davis" <ggtgp(a)yahoo.com> wrote in message
news:ggtgp-776ECA.23240209082010(a)news.isp.giganews.com...
>
> So is CMOVE still implemented internally as a branch?
> (I know this is crazy sounding, but that is what both did...)

The biggest problem with CMOV is the renamer (so, it is easy to handle for an in-order machine). Given the sequence

    ld r4 = [r0]
    add r1 += r3
    cmov r1 = zf ? r1 : r4
    sub r6 -= r1

When you rename the subtract, you need to connect it to either the instruction producing r1 (the add) or the producer of r4 (based on the flags, which have [potentially] a third producer).

Your options are:
1) Stall on the condition codes at the producer or the consumer (producer is easier to implement, consumer gives better perf)
2) Predict at the cmov, the cmov then becomes the check, and flush if wrong
3) Hack the renamer to support two (or more!) producers and add support in the execution core to bypass multiple (ouch!)
4) Emit select uops

For case 4 (assuming 2 srcs per), you get

    cmov ->
    concat tmp = {flags,r1}
    select r1 = tmp, r4

The concat connects the producer of flags and the producer of r1. The select can then use the flags to select r1 or r4. Consumers of r1 depend on the select.

If you have 3 srcs per, you can do the select directly

    select r1 = flags, r1, r4

Ned
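Option 4 can be sketched with a toy renamer. This is only an illustrative model, not how any particular core does it: the uop tuples, `crack_cmov`, and `rename` are hypothetical names, and `zf` is treated as a live-in value rather than being produced by the add.

```python
# Toy model: a uop is (name, dest, srcs). The renamer records, for every
# architectural register, the index of the uop that last wrote it.

def crack_cmov(dest, flag, src_true, src_false, max_srcs=2):
    """Crack 'cmov dest = flag ? src_true : src_false' into select uops."""
    if max_srcs >= 3:
        return [("select", dest, [flag, src_true, src_false])]
    return [
        ("concat", "tmp", [flag, src_true]),   # pairs the flags with one value
        ("select", dest, ["tmp", src_false]),  # picks between the two values
    ]


def rename(uops):
    """Return, per uop, the producer index of each source (None = live-in)."""
    last_writer = {}
    deps = []
    for index, (_name, dest, srcs) in enumerate(uops):
        deps.append([last_writer.get(src) for src in srcs])
        last_writer[dest] = index
    return deps


program = [
    ("ld", "r4", ["r0"]),
    ("add", "r1", ["r1", "r3"]),
    *crack_cmov("r1", "zf", "r1", "r4"),
    ("sub", "r6", ["r6", "r1"]),
]
deps = rename(program)

for uop, producers in zip(program, deps):
    print(uop, "<-", producers)
```

In this model the subtract ends up with exactly one producer for r1 (the select), so the renamer never has to track two candidate producers; the choice between the add and the load is deferred to execution of the select.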