From: nmm1 on 9 Jan 2010 13:32

In article <hiae8b$pg5$1(a)news.eternal-september.org>,
Stephen Fuld <SFuld(a)Alumni.cmu.edu.invalid> wrote:
>
>I once worked with an IBM marketing guy who had some great "laws".
>Things like "There is always a worst bug". I once compiled a few and
>posted them on my office wall.

His name wasn't Zorn, was it?

Regards,
Nick Maclaren.
From: Stephen Fuld on 9 Jan 2010 14:34

nmm1(a)cam.ac.uk wrote:
> In article <hiae8b$pg5$1(a)news.eternal-september.org>,
> Stephen Fuld <SFuld(a)Alumni.cmu.edu.invalid> wrote:
>> I once worked with an IBM marketing guy who had some great "laws".
>> Things like "There is always a worst bug". I once compiled a few and
>> posted them on my office wall.
>
> His name wasn't Zorn, was it?

No, it was Bob(?) Hallem. He called them Hallem's Laws.

Another was an extension of Disraeli's "Mendacity Index":

"Lies, Damned Lies, Statistics, Development Schedules"

--
- Stephen Fuld
(e-mail address disguised to prevent spam)
From: nmm1 on 9 Jan 2010 15:02

In article <hialo8$rs6$1(a)news.eternal-september.org>,
Stephen Fuld <SFuld(a)Alumni.cmu.edu.invalid> wrote:
>
>>> I once worked with an IBM marketing guy who had some great "laws".
>>> Things like "There is always a worst bug". I once compiled a few and
>>> posted them on my office wall.
>>
>> His name wasn't Zorn, was it?
>
>No, it was Bob(?) Hallem. He called them Hallem's Laws.

It was a very bad joke, I admit.

>Another was an extension of Disraeli's "Mendacity Index"
>
>"Lies, Damned Lies, Statistics, Development Schedules"

Rather like the one I use:

"Lies, Damned Lies, Statistics, Official Publications"

Regards,
Nick Maclaren.
From: Anne & Lynn Wheeler on 10 Jan 2010 15:56

Kai Harrekilde-Petersen <khp(a)harrekilde.dk> writes:
> SCI started out as the grand be-all-end-all cache-coherent
> super-thing, but it could also do non-coherent transfers, and at
> Dolphin ICS, we sure had more success in attracting customers to use
> the noncoherent transfers for clustering (or as an IO extension bus)
> than doing the cache coherence stuff.
>
> On the cc-SCI side, we only had DG as a customer, with their Numaliine
> series.

a couple items from long ago and far away (following also includes Dolphin/Convex/Exemplar announcement):

Date: January 23, 1991
Subject: DOLPHIN SERVER ABANDONS ECL 88000 RISC PLAN

Computergram - Norsk Data A/S affiliate Dolphin Server Technology A/S is re-focussing its effort to build a 1,000 MIPS, multi-processing server based on an ECL version of the Motorola 88000 RISC chip. The project, known as Orion, was originally slated for completion late in 1992, but the ECL CPU development effort has run into trouble - similar problems have already bedevilled other RISC projects, most recently MIPS Computer Systems' R6000 ECL part. Dolphin and Motorola collaborated on the design of the ECL part, and National Semiconductor was to fabricate it.

Central to Dolphin's long-term plan has been the use of SCI, the Scalable Coherent Interface bus architecture, which is similar to the Futurebus+ system in concept, but has attracted a good deal less attention. Dolphin already has the 88000-based Triton 88 server under its belt, and now plans to introduce an interim Triton SCI system early in 1992. It will be a 300 MIPS multi-processor system combining SCI, cache and memory components from the Orion, with Motorola's much previewed 88110 RISC chip. The Orion - now not expected until 1993 - will use Motorola's post-88110 100MHz BiCMOS technology, rather than the Dolphin-designed CPU.
Dolphin, which says it has "peeped behind the curtain and seen what Motorola is up to," has both feet firmly in the Motorola camp, and expects single-chip, multi-processors with 100m transistors clocking at 300MHz from the firm by the late 1990s, with a 4,000 MIPS part by the year 2000.

Dolphin is awaiting final ratification of an SCI standard from the IEEE - expected later this year - and will then go straight into production of the Triton SCI. Dolphin is implementing SCI in a Token Ring-like formation, which, it claims, offers up to five times the throughput of Futurebus+. On the Triton SCI Dolphin will offer bridges to VME-based systems, to other types of SCI systems, and may also develop links to Futurebus+.

Enhancements planned for the Triton 88 this year include the addition of Unix System V.4, Novell and Banyan Vines networking support, increased storage options and a new plug-in CPU board with up to five 88000 processors. Following its OEM deal with Thomson-CSF SA subsidiary Cetia SA, Dolphin says it is now finalising a European distribution channel, and will also make a UK announcement soon. Dolphin claims an installed base of 225 Triton 88s.

.... snip ...

above mentions SCI in T/R-like ... offers five times the throughput of Futurebus+

following very long & heavily "snipped" ... totally unrelated, SLACVM was the original website outside of CERN:
http://www.slac.stanford.edu/history/earlyweb/history.shtml

Date: Tue, 23 Jun 1992 22:11 -0800 (PST)
From: DBG(a)SLACVM.SLAC.Stanford.EDU
Subject: Some online SCI documents
To: Distribution

Current status: the base standard is now approved by the IEEE as IEEE Std 1596-1992. It originally went out for official ballot in late January 91. Voters approved it by a 92% affirmative vote that ended April 15. Final corrections and polishing were done, and the revised draft was recirculated to the voters again and passed. Draft 2.00 was approved by the IEEE Standards board on 18 March 1992.
Pre-publication copies of the standard are available from the IEEE Service Center, Piscataway, NJ, (800)678-4333. Commercial products to support and use SCI are already in final design and simulation, so the support chips should be available soon, 3Q92.

SCI-related documents are available electronically via anonymous FTP from HPLSCI.HPL.HP.COM, except for a few documents which are paper only. Online formats are Macintosh Word 4 (Compacted,self expg) and PostScript. The PostScript includes Unix compressed and uncompressed forms. Paper documents can be ordered from Kinko's 24hr copy Service, Palo Alto, California, (415)328-3381. Various payment forms can be arranged. Newcomers should order the latest mailing plus the package NEW, which contains the most essential documents from previous mailings. SCI depends on the IEEE 1212 CSR Architecture as well, so you will also need a copy of that, which is available from the IEEE Service Ctr.

Send your name, mailing address, phone number, fax number, email address, to me and I will put you on a list of people to be notified when new mailings are available; you will also be listed in an occasional directory of people who are participating in or observing SCI development.

Contact:

David B. Gustavson
IEEE P1596 Chairman
Stanford Linear Accelerator Center
Computation Research Group
P.O.Box 4349, Bin 88
Stanford, CA 94309
415-926-2863 or dbg(a)slacvm.slac.stanford.edu

An SCI Extensions Study Group has been formed to consider what SCI-related extensions to pursue and how to organize them into standards.

Related standards projects:

1212: Control and Status Register Architecture. This specification defines the I/O architecture for SCI, Futurebus+ (896.x) and SerialBus (P1394). Chaired by David V. James, Apple Computer, dvj(a)apple.com, 408-974-1321, fax 408-974-0781. An approved standard as of December 1991. Being published by the IEEE.

P1596.1: SCI/VME Bridge.
This specification defines a bridge architecture for interfacing VME buses to an SCI node. This will provide early I/O support for SCI systems via VME. Products are likely to be available in 1992. Chaired by Bjorn Solberg, CERN, CH-1211 Geneva 23, Switzerland. bsolberg(a)dsy-srv3.cern.ch, ++41-22-767-2677, fax ++41-22-782-1820.

P1596.2: Cache Optimizations for Large Numbers of Processors using the Scalable Coherent Interface. Develop request combining, tree-structured coherence directories and fast data distribution mechanisms that may be important for systems with thousands of processors, compatible with the base SCI coherence mechanism. Chaired by Ross Johnson, U of Wisconsin, ross(a)cs.wisc.edu, 608-262-6617, fax 608-262-9777.

P1596.3: Low-Voltage Differential Interface for the Scalable Coherent Interface. Specify low-voltage (less than 1 volt) differential signals suitable for high speed communication between CMOS, GaAs and BiCMOS logic arrays used to implement SCI. The object is to enable low-cost CMOS chips to be used for SCI implementations in workstations and PCs, at speeds of at least 200 MBytes/sec. This work seems to have converged on a signal swing of 0.25 V centered on +1 V. Chairman is Stephen Kempainen, National Semiconductor, 408-721-2836, fax 408-721-7218. asdksc(a)tevm2.nsc.com

P1596.4: High-Bandwidth Memory Interface, based on SCI Signalling Technology. Define a high-bandwidth interface that will permit access to the large internal bandwidth already available in dynamic memory chips. The goal is to increase the performance and reduce the complexity of memory systems by using a subset of the SCI protocols. Started by Hans Wiggers of Hewlett Packard, current chairman is David Gustavson, Stanford Linear Accelerator Center, 415-961-3539, fax 415-961-3530.

P1596.5: Data Transfer Formats Optimized for SCI.
This working group has defined a set of data types and formats that will work efficiently on SCI for transferring data among heterogeneous processors in a multiprocessor SCI system. The working group has finished, voting to send the draft out for sponsor ballot. Chairman is David V. James, Apple Computer, dvj(a)apple.com, 408-974-1321, fax 408-974-0781.

CONVEX SELECTS DOLPHIN'S SCI INTERCONNECT TECHNOLOGY FOR USE IN FUTURE PROCESSORS.

According to a technology transfer agreement announced today, Dolphin SCI Technology A.S, a subsidiary of Dolphin Server Technology A.S, will share its Scalable Coherent Interface (SCI) technology with Convex Computer Corporation for use in Convex's future generation supercomputers currently under development. The agreement is initially worth several hundred thousand USD to Dolphin, and includes intentions of future cooperation between the two companies.

Convex is continuously working to develop new machines to strengthen the company's supercomputer market position. Convex manufactures systems which solve many of today's most demanding applications such as climate modelling, genetic sequencing and computational fluid dynamics. Convex recently announced a relationship with Hewlett-Packard which will result in Convex's adoption of HP's PA-RISC processor technology to build these future high performance machines.

The Scalable Coherent Interface is an enabling technology for multiprocessor systems. With the current rapidly increasing RISC microprocessor power, even the best of today's interconnect - or "bus" - systems can only support small multiprocessor configurations. Buses are inherently bottlenecks, because only one processor "talks" at a time, and clock rates are limited by the physics of tapped transmission lines with variable loading. Buses also scale poorly with system size because propagation delays limit handshake and arbitration speed.

.... snip ...

--
40+yrs virtualization experience (since Jan68), online at home since Mar1970
From: Steven G. Johnson on 13 Jan 2010 11:20

On Jan 4, 11:08 am, Thomas Womack <twom...(a)chiark.greenend.org.uk> wrote:
> The use-17-instead-of-16 tricks still work, since you can often get
> bank clashes inside the L2 cache; I did various benchmarks of
> [120..129]x[120..129]x[120..129] FFTs with FFTW, and was initially
> slightly surprised to find that 128x128x128 was among the slowest.
> As far as I know, FFTW doesn't let you specify the data layout so you
> can't tell it to do a 128x128x128 FFT on data stored in the top part
> of a 129x129x128 box;

Actually, FFTW does allow arbitrary data layouts. You can use the "advanced" interface with the "nembed" parameter to specify a smaller multidimensional array embedded in a larger one, or the guru interface for even more general layouts.

Even if you use a 128x128x128 array, FFTW will in some cases do a sequence of the discontiguous subtransforms by copying a few of them at a time to a contiguous buffer, and similar tricks to avoid cache-line conflicts.

> I believe an early version of the manual said
> that they had often found performance improvements by doing this but
> couldn't figure out how to exploit them in a product with a
> comprehensible interface.

What you're referring to is that at one point we found performance improvements (I believe on an IBM RS/6000, if I remember correctly), by inserting padding into the middle of a *one-dimensional* array (again to avoid cache conflicts), but it didn't seem like there was a sane interface for specifying a 1d array with padding in the middle.

Regards,
Steven G. Johnson
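
The embedded layout discussed in this post can be sketched without FFTW itself. The snippet below is a rough illustration, not FFTW code: it computes the row-major offset of element (i, j, k) when the logical 128x128x128 array sits inside a larger physical box, which is the kind of layout the advanced interface's "nembed" parameter describes. The helper names (`embedded_offset`, `distinct_sets`), the 16-byte complex element size, and the cache geometry (64-byte lines, 512 sets) are all assumptions chosen for illustration, not measurements of any real machine.

```python
def embedded_offset(index, nembed, stride=1):
    """Row-major element offset of `index` inside a physical array whose
    dimensions are `nembed` (hypothetical helper, not an FFTW function).
    The first entry of `nembed` never affects the result."""
    offset = 0
    for i, n in zip(index, nembed):
        offset = offset * n + i
    return offset * stride


def distinct_sets(nembed, n=128, elem_bytes=16, line_bytes=64, num_sets=512):
    """Count the cache sets touched while walking the slowest dimension.
    The cache geometry here is an assumed toy model."""
    sets = set()
    for i in range(n):
        byte_addr = embedded_offset((i, 0, 0), nembed) * elem_bytes
        sets.add((byte_addr // line_bytes) % num_sets)
    return len(sets)


dense = (128, 128, 128)   # logical array stored with no padding
padded = (129, 129, 128)  # same logical array inside a 129x129x128 box

print(embedded_offset((1, 0, 0), dense), embedded_offset((1, 0, 0), padded))
print(distinct_sets(dense), distinct_sets(padded))
```

Under these assumed parameters, stepping along the slowest dimension of the dense array lands in a single cache set every time, while the 129x129x128 box spreads the same walk over 16 sets. That is the effect the "use 17 instead of 16" trick relies on.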