From: Del Cecchi` on 4 Jan 2010 23:29

Andy "Krazy" Glew wrote:
>
>> You could use the provided hardware scatter-gather if you were astute
>> enough to use InfiniBand interconnect. :-)
>>
>> del
>>
>> you can lead a horse to water but you can't make him give up ethernet.
>
> Del:
>
> What's the story on Infiniband?

I am retired now, but it seems to still be around. It seems to be used
some in enterprise land, but with a minuscule market share as compared
to ethernet/tcpip.

I was mostly teasing Nick for complaining about the cycles used by
scatter gather.
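[For readers who haven't met it, the "scatter gather" being teased about
is the gathering of several non-contiguous buffers into one transfer
(and scattering them back out on receive) -- cycles spent in software,
or offloaded to hardware such as an InfiniBand HCA's gather/scatter
lists. A minimal C sketch of the software form, using POSIX writev();
an illustration only, not code from the thread:]

/* Software scatter-gather: hand three non-contiguous buffers to the
 * kernel in one writev() call instead of copying them into a staging
 * buffer first.  A hardware offload would take an equivalent list of
 * (address, length) pairs instead.
 */
#include <stdio.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

int main(void)
{
    char hdr[]  = "header ";
    char body[] = "payload ";
    char trl[]  = "trailer\n";

    struct iovec iov[3];
    iov[0].iov_base = hdr;  iov[0].iov_len = strlen(hdr);
    iov[1].iov_base = body; iov[1].iov_len = strlen(body);
    iov[2].iov_base = trl;  iov[2].iov_len = strlen(trl);

    /* one system call gathers all three pieces onto stdout */
    ssize_t n = writev(STDOUT_FILENO, iov, 3);
    if (n < 0) {
        perror("writev");
        return 1;
    }
    return 0;
}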
From: Del Cecchi` on 4 Jan 2010 23:32

Anne & Lynn Wheeler wrote:
> Stephen Fuld <SFuld(a)alumni.cmu.edu.invalid> writes:
>
>> Do you want to know the history of Infiniband or some details of what
>> it was designed to do (and mostly does)?
>
> minor reference to SCI (being implementable subset of FutureBus)
> http://en.wikipedia.org/wiki/Scalable_Coherent_Interface
>
> eventually morphing into current InfiniBand
> http://en.wikipedia.org/wiki/InfiniBand

I don't recall any morphing at all from SCI to IB, and I was involved in
both. For openers, SCI was source-synchronous parallel and IB is byte
serial. SCI is coherent, IB is not.
From: Del Cecchi` on 4 Jan 2010 23:34

Robert Myers wrote:
> On Jan 3, 11:16 pm, Del Cecchi` <dcecchinos...(a)att.net> wrote:
>
>> Robert Myers wrote:
>>
>> (snip)
>>
>>> Yes, and if your goal is to claw your way to the top of the all-
>>> important Top 500 list, you're not going to waste money on
>>> communication that doesn't help with a linpack benchmark.
>>
>> (snip)
>>
>>> Robert.
>>
>> But of course no one pays the big bucks for Blue Gene or other ibm
>> cousins to get their name in the paper but to get a job done.
>>
>> It might be a job that will get done wrong or not need to be done at
>> all, but they want to get it done.
>
> I'm a tad more cynical than you are.
>
> It's true that there were issues that the labs claimed needed to be
> addressed urgently. The labs have at times come under ferocious
> criticism for slow or no progress on the problem you showed a graphic
> from, and the labs have come under ferocious criticism for some of the
> systems they've bought from IBM.
>
> At the time of the first hurry-up Blue Gene purchase, Japan had
> wiggled its way to the top of the Top 500 list. You can talk all you
> want, but I can show you briefings that talk about Earth Simulator
> being at the top of the Top 500 list. Of what *possible* relevance
> did that have to the Rayleigh-Taylor instability or the stockpile
> stewardship program? None whatsoever, of course.
>
> Clearing that black mark off America's preeminence in high technology
> was something that every senator and representative could understand.
> Continuing to emphasize American leadership on that score is something
> that political leaders continue to understand. Under the
> circumstances, a "we're working on it" from anyone less than IBM would
> not have done.
>
> I admire your restraint in responding, as it must feel to you that I
> want to belittle IBM. I'm not sure that IBM even *wants* that
> business. Want it or not, they've got it, and I am sure that a long
> string of administrations has leaned on IBM to make sure that it is
> given a priority that IBM otherwise would not. My real problem is not
> with IBM, but with the politics of big science.
>
> Robert.

IBM wants the business of making Blue Genes because it is prestigious
AND profitable. It is sold to others besides the bomb labs.
From: nmm1 on 5 Jan 2010 04:14

In article <7qfthqF646U2(a)mid.individual.net>, Del Cecchi`
<delcecchi(a)gmail.com> wrote:
>Andy "Krazy" Glew wrote:
>>
>>> You could use the provided hardware scatter-gather if you were astute
>>> enough to use InfiniBand interconnect. :-)
>>>
>>> you can lead a horse to water but you can't make him give up ethernet.
>>
>> What's the story on Infiniband?
>
>I am retired now, but it seems to still be around. It seems to be used
>some in enterprise land, but with a minuscule market share as compared
>to ethernet/tcpip.
>
>I was mostly teasing Nick for complaining about the cycles used by
>scatter gather.

Well, I wasn't complaining, so much as pointing out that the myth that
you can overlap computation and I/O was true on most mainframes, is true
for some specialist uses on some supercomputers, but otherwise is almost
entirely false. And, as usual, the reason that it doesn't work is the
software architecture.

Incidentally, I should be curious to know how much of the InfiniBand
hardware manual became generally available. It is a long time since I
looked at the software one, but my recollection/impression is that no
more than 25% of it has become generally available. If that.

As I said, perhaps half of SCSI isn't generally available, though few
people realise what it theoretically provides. But each InfiniBand
manual is c. 1,000 pages, and SCSI is only a few hundred in all.

Regards,
Nick Maclaren.
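[A minimal sketch of what "overlap computation and I/O" means when the
software stack does cooperate: double buffering with POSIX aio, assuming
a Linux-ish system and linking with -lrt. This illustrates the technique
being referred to, not code from the thread -- and the point above is
precisely that the usual stack rarely delivers this overlap in practice:]

#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define BUFSZ (1 << 20)

/* stand-in for the "computation" half of the overlap */
static void process(const char *buf, ssize_t n)
{
    volatile long sum = 0;
    for (ssize_t i = 0; i < n; i++)
        sum += buf[i];
}

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    static char buf[2][BUFSZ];
    struct aiocb cb;
    const struct aiocb *list[1] = { &cb };
    off_t off = 0;
    int cur = 0;

    /* prime the pipeline: queue the first read */
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf    = buf[cur];
    cb.aio_nbytes = BUFSZ;
    cb.aio_offset = off;
    if (aio_read(&cb) != 0) { perror("aio_read"); return 1; }

    for (;;) {
        /* wait for the read that is filling buf[cur] */
        while (aio_error(&cb) == EINPROGRESS)
            aio_suspend(list, 1, NULL);
        ssize_t got = aio_return(&cb);
        if (got <= 0)
            break;                  /* EOF or error */
        off += got;

        /* queue the next read into the other buffer ... */
        int nxt = 1 - cur;
        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf    = buf[nxt];
        cb.aio_nbytes = BUFSZ;
        cb.aio_offset = off;
        if (aio_read(&cb) != 0) { perror("aio_read"); return 1; }

        /* ... while computing on the buffer that just arrived */
        process(buf[cur], got);
        cur = nxt;
    }

    close(fd);
    return 0;
}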
From: Anne & Lynn Wheeler on 5 Jan 2010 09:43
a little SCI drift from long ago and far away

Date: Mon, 29 Jun 92 19:20:20 EST
From: wheeler
Subject: SCI meeting at slac

There was a presentation on SLAC's computational and data-storage
requirements over the next 3-5 years ... and how SCI would begin to
efficiently address some of the opportunities.

There was also mention that SCI was recently presented to the SCSI
standards committee and a SCSI protocol using 200mbit (maybe 100mbit)
SCI cable looks very promising. This appears to be along the same lines
as the HARRIER-II serial implementation running 80mbits (pushing to
160?).

.... snip ...

We had been trying to get HARRIER (9333) to interoperate with FCS ....
instead it morphed into SSA. old reference to Jan92 meeting:
http://www.garlic.com/~lynn/95.html#13

Both SCI and FCS groups went in to the SCSI committee with proposals for
serial SCSI.

random sample from a little later ... note that oo-operating systems
were all the rage; apple had pink (some of which morphed into taligent)
and sun had done doe/spring.

Date: Sat, 3 Feb 1996 10:17:19 -0800
From: wheeler
Subject: SCI

convex exemplar already does this ... SCI but with pa/risc chips.
sequent is exactly using the 4-way ... but with some industrial-strength
re-engineering of the components.

a year ago ... a sparc10 engineer had me in to talk about possibly doing
scaleup/commercializing of sun's doe/spring oo-operating system.

the original SCI was conceived by gustafson at SLAC and pushed thru the
IEEE committee ... where it became more of a general-purpose camel than
a race-horse. the sparc10 guy is now at xxxx and has developed an
optimized subset of the SCI protocol (looks a lot more like gustafson's
original proposal) ... and in a cmos chip can get about five times the
thruput of the dolphin GaAs chip.

In practical terms, the implication is that the subset chip could be
used to cost-effectively provide memory cache consistency between
100s-1000s of workstations spread around a building ... compared to
10-256 processors in a central complex (i.e. the dolphin GaAs sci design
point).

At the time, I asked why they couldn't get hennesey (i.e. lots of the
mips/sgi scaleup and the dash/flash projects at stanford). The comment
was that DOE (the gov agency, not the sun oo-opsys) has hennesey off
trying to bail out the DOE/intel teraflop computer.

in any case, we saw the doe/spring (oo-opsys) group at sun the other
afternoon ... looks like they've effectively all been moved over to java
... and as they mentioned ... their work on doe/spring had only been to
scale up into the tens of processors.

as per the attached ... with motorola effectively migrating all the 88k
to the power/pc ... data general was hung out for a processor. the
choices were pretty much to make the translation to power/pc ... or go
to intel.

note that SCI is NOT a one-gigabyte-per-second protocol. The dolphin
GaAs chip is a one-gigabyte implementation. SCI in fact is effectively a
set of synchronous bus protocols redone with encapsulated packets and
made asynchronous. SCI can run over serial copper at 250mbits/sec, slow
fiber at 1gbit/sec ... and/or scale up above 10gbytes/sec.

One of the problems the guy at motorola phoenix claimed as an SCI
problem was that it bottlenecked ... he effectively described an
implementation mapped to something akin to an FDDI ring architecture.
While entry-level versions of SCI can be done that way .... SCI can also
map to a 10gbyte/sec, scalable, non-blocking cross-bar switch (i.e. all
attachments capable of concurrent full 10gbyte/sec thruput).
In any case, we have some opportunity to see the sequent machine soon
... and everybody is going to have a hard time beating their
price/performance & scaleup (as per our deployment-platform analysis).

The board cache description is a little off ... it should be shared
4mbyte L2 (i.e. avg. 1mbyte/processor cache). The detailed
cycle-by-cycle hardware simulator that Oracle has developed for its
development platforms has been used to convince some vendors that 1mbyte
is sort of the entry-ante cache size for Oracle ... and that 4mbyte is
still on the sweet part of the curve. a quad board with 4mbyte L2 cache
should be a screamer for oracle .... and the sci scaleup to 64 boards
(256 processors) in a complex will be a really tough price/performance
point for everybody else.

.... snip ...

--
40+yrs virtualization experience (since Jan68), online at home since Mar1970