From: JJ on 8 May 2006 14:25

Andreas Ehliar wrote:
> On 2006-05-07, JJ <johnjakson(a)gmail.com> wrote:
> > I would say that if we were to see PCIe on chip, even if on a higher $
> > part, we would quickly see a lot more copro board activity, even just
> > plain vanilla PC boards.
>
> You might be interested in knowing that Lattice is doing just that in
> some of their LatticeSC parts. On the other hand, you are somewhat
> limited in the kinds of application you are going to accelerate, since
> LatticeSC do not have embedded multipliers IIRC. (Lattice are
> targeting communication solutions such as line cards that rarely need
> high performance multiplication in LatticeSC.)
>
> /Andreas

Yeah, I have been following Lattice more closely recently. It will take me some time to evaluate their specs more fully; I may get more interested if they have a free-use tool chain I can redo my work with.

Does anyone have PCIe on chip, though?

John Jakson
transputer guy
From: JJ on 8 May 2006 15:21

Piotr Wyderski wrote:
> Andreas Ehliar wrote:
>
> > One interesting application for most of the people on this
> > newsgroup would be synthesis, place & route and HDL simulation.
> > My guess would be that these applications could be heavily
> > accelerated by FPGA:s.
>
> A car is not the best tool to make another car.
> It's not a bees & butterflies story. :-) Same with FPGAs.

Well, xyz auto workers do eat their own, usually subsidised by the employer.

I disagree. In a situation where FPGAs developed relatively slowly and P&R jobs took many hours, there would be a good opportunity to use FPGAs for just such a job. But then again, FPGAs and the software are evolving too fast, and P&R jobs in my case have gone from 8-30 hrs a few years ago to a few minutes today, so the incentive has gone.

If I were paying $250K like the ASIC guys do for this and that, a hardware coprocessor might look quite cheap, and the EDA software is much more independent of the foundries. DAC usually has a few hardware copro vendors, most of them based on FPGAs. At one time some of those were even done in full-custom silicon; that was really eating your own.

> > My second guess that it is far from trivial to actually do this :)
>
> And who actually would need that?

I would be rather amazed if in a few years my 8-core Ulteron x86 chip were still running EDA tools on 1 core.

> Best regards
> Piotr Wyderski

John Jakson
transputer guy
From: Phil Tomson on 8 May 2006 20:09

In article <e3mq62$k21$2(a)news.lysator.liu.se>, Andreas Ehliar <ehliar(a)lysator.liu.se> wrote:
> On 2006-05-06, Piotr Wyderski
> <wyderski(a)mothers.against.spam-ii.uni.wroc.pl> wrote:
> > What could it accelerate? Modern PCs are quite fast beasts...
> > If you couldn't speed things up by a factor of, say, 300%, your
> > device would be useless. Modest improvements by several tens
> > of percents can be neglected -- Moore's law constantly works
> > for you. FPGAs are good for special-purpose tasks, but there
> > are not many such tasks in the realm of PCs.
>
> One interesting application for most of the people on this
> newsgroup would be synthesis, place & route and HDL simulation.
> My guess would be that these applications could be heavily
> accelerated by FPGA:s. My second guess that it is far from trivial
> to actually do this :)

Certainly on the simulation side of things, various companies like Ikos (are they still around?) have been doing stuff like this for years. To some extent this is what ChipScope and Synplicity's Identify are doing, only using more of a logic-analyzer metaphor: breakpoints are set and triggered through JTAG.

As far as synthesis itself and P&R, I would think that these could be accelerated in a highly parallel architecture like an FPGA. There are lots of algorithms that could be sped up in an FPGA - someone earlier in the thread said that the set of algorithms that could benefit from the parallelism available in FPGAs was small, but I suspect it's actually quite large.

Phil
From: Phil Tomson on 8 May 2006 20:12

In article <e3nvv7$h30$1(a)atlantis.news.tpi.pl>, Piotr Wyderski <wyderskiREMOVE(a)ii.uni.wroc.pl> wrote:
> Andreas Ehliar wrote:
>
> > One interesting application for most of the people on this
> > newsgroup would be synthesis, place & route and HDL simulation.
> > My guess would be that these applications could be heavily
> > accelerated by FPGA:s.
>
> A car is not the best tool to make another car.
> It's not a bees & butterflies story. :-) Same with FPGAs.

Err... well, cars aren't exactly reprogrammable for many different purposes, though, are they?

> > My second guess that it is far from trivial to actually do this :)
>
> And who actually would need that?

Possibly you? What if we could decrease your wait for P&R from hours to minutes? I suspect you'd find that interesting, no?

Phil
From: Phil Tomson on 8 May 2006 20:37
In article <1146975146.177800.163180(a)g10g2000cwb.googlegroups.com>, JJ <johnjakson(a)gmail.com> wrote:
> Jeremy Ralph wrote:
> > If one wanted to develop an FPGA-based hardware accelerator that could
> > attach to the average PC to process data stored in PC memory, what
> > options are there available?
> >
> > Decision factors are:
> > + ease of use (dev kit, user's guide, examples)
> > + ability to move data with minimal load on the host PC
> > + cost
> > + scalability (i.e. ability to upsize RAM and FPGA gates)
> > + ability to instantiate a 32 bit RISC (or equiv)
> >
> > Someone recommended the TI & Altera Cyclone II PCIe dev board, which is
> > said to be available soon. Any other recommendations?
> >
> > Also, what is the best way to move data between PC mem and FPGA? DMA?
> > What transfer rates should one realistically expect?
> >
> > Thanks,
> > Jeremy
>
> FPGAs and standard cpus are a bit like oil & water, don't mix very well:
> very parallel or very sequential.

Actually, that's what could make it the perfect marriage: general-purpose CPUs for the things they're good at, like data IO and displaying information; FPGAs for applications where parallelism is key.

I think the big problem right now is conceptual: we've been living in a serial, von Neumann world for so long that we don't know how to make effective use of parallelism in writing code - we have a hard time picturing it. Read some software engineering blogs: with the advent of things like multi-core processors, the Cell, etc. (and most of them are blissfully unaware of the existence of FPGAs), they're starting to wonder how they are going to model their problems to take advantage of that kind of parallelism. They're looking for new abstractions (remember, software engineering [and even hardware engineering these days] is all about creating and managing abstractions). They're looking for and creating new languages (Erlang is often mentioned in these sorts of conversations).
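[As a concrete illustration of the process-and-message-passing style Erlang popularized - and that HDL always-blocks resemble - here is a toy Python sketch. All names and values are invented for the example; it is a sketch of the idea, not anyone's actual tool.]

```python
import queue
import threading

def producer(out_q):
    # Acts like an HDL block driving a signal: emits a stream of samples.
    for sample in range(5):
        out_q.put(sample)
    out_q.put(None)  # end-of-stream marker

def doubler(in_q, out_q):
    # Acts like a downstream block: it wakes only when data arrives
    # on its input, knowing nothing about who produced it.
    while True:
        sample = in_q.get()
        if sample is None:
            out_q.put(None)
            break
        out_q.put(sample * 2)

# Wire the two "blocks" together with queues, like signals between modules.
a_to_b = queue.Queue()
b_to_main = queue.Queue()
threading.Thread(target=producer, args=(a_to_b,)).start()
threading.Thread(target=doubler, args=(a_to_b, b_to_main)).start()

results = []
while True:
    item = b_to_main.get()
    if item is None:
        break
    results.append(item)

print(results)  # [0, 2, 4, 6, 8]
```

Each stage owns no shared state and communicates only through its ports, which is exactly the dataflow discipline an HDL imposes.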
Funny thing is that it's the hardware engineers who hold part of the key: HDLs are very good at modelling parallelism and dataflow. Of course HDLs as they are now would be pretty crappy for building software, but it's pretty easy to see that some of the ideas inherent in HDLs could be usefully borrowed by software engineers.

> What exactly does your PC workload include?
>
> Most PCs that are fast enough to run Windows and web software like
> Flash are idle what, 99% of the time, and even under normal use still
> idle 90% of the time, maybe 50% idle while playing DVDs.
>
> Even if you have compute jobs like encoding video, it is now close
> enough to real time, or a couple of PCs can be tied together to get it
> done.
>
> Even if FPGAs were infinitely fast and cheap, they still don't have a
> way to get to the data unless you bring it to them directly; in a PC
> accelerator form, they are bandwidth starved compared to the cache &
> memory bandwidth the PC cpu has.

Well, there's that DRC computing product that puts a big FPGA in one slot of a dual-Opteron motherboard, passing data between the Opteron and FPGA at very high speed via the HyperTransport bus. It seems like the perfect combination. The transfer speeds are high enough to enable lots of types of FPGA accelerator applications that wouldn't have been practical before.

> There have been several DIMM based modules, one even funded by Xilinx
> VC a few years back; I suspect Xilinx probably scraped up the remains
> and any patents?
>
> That PCI bus is way too slow to be of much use except for problems that
> do a lot of compute on relatively little data, but then you could use
> distributed computing instead. PCIe will be better, but then again you
> have to deal with new PCIe interfaces, or use a bridge chip if you are
> building one.

Certainly there are classes of problems which require very little data transfer between FPGA and CPU that could work acceptably even in a PCI environment.
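[A back-of-envelope way to see which problems survive a slow bus: compare the time spent moving data against the compute time the accelerator saves. The bus figures below are the standard theoretical peaks (32-bit/33 MHz PCI is about 133 MB/s; a first-generation PCIe lane is about 250 MB/s per direction); the workload numbers and the speedup factor are invented purely for illustration.]

```python
# Peak theoretical bandwidths, in bytes/second (textbook figures).
PCI_32_33 = 133e6      # 32-bit, 33 MHz PCI
PCIE_X1_GEN1 = 250e6   # one PCIe 1.x lane, per direction

def worthwhile(data_bytes, cpu_seconds, fpga_speedup, bus_bw):
    """Crude model: offloading pays off only if transfer time plus
    the (sped-up) FPGA compute time beats doing it all on the CPU."""
    transfer = data_bytes / bus_bw
    fpga_compute = cpu_seconds / fpga_speedup
    return transfer + fpga_compute < cpu_seconds

# Compute-heavy, data-light job (hypothetical): 10 MB in,
# 100 s of CPU work, assumed 50x FPGA speedup.
print(worthwhile(10e6, 100.0, 50.0, PCI_32_33))   # True: even PCI is fine

# Data-heavy, compute-light job: 1 GB in, only 2 s of CPU work.
print(worthwhile(1e9, 2.0, 50.0, PCI_32_33))      # False: bus-bound
```

The first case transfers for ~0.075 s against 100 s of work, so PCI hardly matters; the second spends ~7.5 s on the bus to save under 2 s of compute, which is exactly the "lots of compute on relatively little data" boundary described above.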
> And that leaves the potential of HT connections for multi-socket (940 &
> other) Opteron systems as a promising route: lots of bandwidth to the
> caches, probably some patent walls already, but in reality very few
> users have multi-socket server boards.
>
> It is best to limit the scope of use of FPGAs to what they are actually
> good at and therefore economical to use; that means bringing the
> problem right to the pins: real-time continuous video, radar, imaging,
> audio, packet, signal processing, whatever, with some logging to a PC.
>
> If a processor can be in the FPGA, then you can have much more
> throughput to it, since it is in the fabric, rather than going
> through an external skinny pipe to a relatively infinitely faster
> serial cpu. Further, if your application is parallel, then you can
> possibly replicate blocks, each with a specialized processor, possibly
> with custom instructions or a coprocessor, till you run out of fabric or
> FPGAs. Eventually though, input & output will become limiting factors
> again: do you have acquisition of live signals and/or results that need
> to be saved?

One wonders how different history might be now if, instead of the serial von Neumann architectures that are now ubiquitous, we had started out with, say, cellular-automata-like architectures. CAs are one computing architecture perfectly suited to the parallelism of FPGAs (there are others, like neural nets and their derivatives). Our thinking is limited by our 'legos', is it not? If all you know is a general-purpose serial CPU, then everything starts looking very serial. (If I recall correctly, before he died von Neumann himself was looking into things like CAs and NNs because he wanted more of a parallel architecture.)

There are classes of biologically inspired algorithms like GAs, ant colony optimization, particle swarm optimization, etc. which could greatly benefit from being mapped into FPGAs.

Phil
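[The CA point is easy to make concrete: in an elementary cellular automaton, each cell's next state depends only on its immediate neighbourhood, so every cell can update simultaneously - exactly the kind of job an array of FPGA LUTs does in a single clock. A minimal sketch, using Wolfram's Rule 110 on a ring of cells; Python runs the loop serially, but note the loop body has no cross-cell dependency.]

```python
RULE = 110  # Wolfram rule number: 8 bits encode all 8 neighbourhood outcomes

def step(cells):
    """One synchronous update of the whole row. Each cell reads only
    its left/right neighbours, so every iteration is independent --
    in hardware, all cells would update in parallel in one clock."""
    n = len(cells)
    return [
        (RULE >> ((cells[(i - 1) % n] << 2)   # left neighbour
                  | (cells[i] << 1)           # self
                  | cells[(i + 1) % n])) & 1  # right neighbour
        for i in range(n)
    ]

row = [0, 0, 0, 1, 0, 0, 0]   # a single live cell
row = step(row)
print(row)  # [0, 0, 1, 1, 0, 0, 0]
```

An FPGA implementation would be one small lookup table per cell with nearest-neighbour wiring, which is why CAs (and similarly local algorithms) map onto the fabric so naturally.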