From: Robert Myers on 3 Mar 2010 12:01

On Mar 3, 11:17 am, timcaff...(a)aol.com (Tim McCaffrey) wrote:
>
> Ok, now I'm confused. Your original post implied to me you did
> not like Cray or what he did.
>
> To be clear myself, I have a great deal of respect for Mr. Cray.
> Although he made some mistakes, they are really only obvious in
> hindsight.
>

There was an important mistype in my original post. Since I've talked here frequently about how lame, from a user's point of view, current "supercomputers" are compared to the Cray, I assumed that the mistype would be obvious.

I also got back a post recently to the effect that "obviously" current computers don't have the architecture of the Cray-1, so you "obviously" can't expect computers to do similar things, and my suggesting that they should be able to was "almost moronic."

Well, some of the architectural changes being discussed *could* make a current computer feel a lot more like a Cray-1.

Robert.
From: MitchAlsup on 3 Mar 2010 12:18

On Mar 2, 1:48 pm, Robert Myers <rbmyers...(a)gmail.com> wrote:
> On Mar 2, 2:12 pm, Stephen Fuld <SF...(a)alumni.cmu.edu.invalid> wrote:
> On 3/2/2010 9:56 AM, Robert Myers wrote:
>
> > Well, some of the details of Mitch's proposal aren't clearly specified.
> > I can't tell if he intends the off chip DRAM to be part of the
> > processor's address space or not. If it is, then the on-chip DRAM is
> > essentially a level 4 cache. But it could be that it isn't, in which
> > case, it is more like a fast paging device with some extra features.
>
> Help me out here. Isn't the page file part of the processor's address
> space? When something is paged out, you don't have to worry about
> coherence because you can't touch it.

I intend that the ECS be used as a 'paging' area. That is, the memory is not accessible by a load or a store, but is addressable as if the ECS were a disk with zero rotational latency, and performs transfers in page-sized units with about the delay of a cache line transfer of the current era. Done this way, the size of the coherent domain is small enough that coherence checking does not increase memory access time, but large memory is accessible because the paging is so fast. Much of the other detail is to make the memory management updates as fast as the data transfers.

The thing that differs from ECS-of-olde is that the I/O devices are attached to the ECS and not directly to main memory. This is not a paging pack--this is more like the index cache of a 300TB database (or maybe the resident cache of the 300TB database).

I originally considered this system to have an FBDIMM-like multiplexer with 4*4 channels, and each CPU chip on the motherboard also had 4 FBDIMM-like channels. With 4 such chips on the motherboard, and a desire to build systems with as many as 16 motherboards in a system, one needs a way to provide relatively uniform access to very large memories.

Consider that the motherboards are positioned horizontally, and that the memory boards are positioned vertically. At each intersection between the motherboards and the memory boards there is an FBDIMM-like channel connection. With such an arrangement and 2 layers of the FBDIMM multiplexer on the motherboards and two layers on the memory boards, one has concurrent access to as many as 4096 FBDIMMs (several TBytes).

The reason to use an FBDIMM-like multiplexer is that the FBDIMM channels have low sequential route latency. One does not wait for the whole request/response to show up before routing it forward. The proposed format has all routing information in the first DDR beat of the message, so that one can get from input pin to output pin in less than 3 ns. Pin speeds will be 6 GT/s+, a la FBDIMM.

Of the 200ns access time, 50ns is spent routing, 100ns is spent in the DRAM access, and 50ns is spent waiting for one of several conflicting messages to complete its route through the needed channel.

With the arrangement described above, up to 64 processing chips have access to up to 64 FBDIMM channels with as many as 8 FBDIMMs on each channel, all running concurrently. Each FBDIMM-like channel has 6 GB/s peak throughput and all 64 channels can operate simultaneously.

By the time such a system could be constructed, each processing chip will have on the order of 16 threads; while up to 1024 threads would be available if the CPU architecture was more Niagara-like.

One big reason to punt the I/O to ECS is that you really don't want I/O requests from the 1024-4096 SATA disks in a system of the aforementioned scale to swamp the memory interconnect on the coherent side of things. It is expected that only a moderate portion of the total available BW into the ECS is used by the computation amalgam, leaving a significant amount of BW to the I/O devices.

There are a "few" software issues to resolve also.

Mitch
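The headline figures in Mitch's post can be cross-checked with a few lines of arithmetic. This is only a back-of-envelope sketch: the variable names are made up for illustration, and only the numbers themselves come from the post.

```python
# Back-of-envelope check of figures quoted in the post above.
# Variable names are illustrative; only the numbers come from the post.

chips_per_motherboard = 4
motherboards = 16
channels = 64                   # FBDIMM-like channels, all usable at once
peak_gb_per_s_per_channel = 6   # "6 GB/s peak throughput" per channel

processing_chips = chips_per_motherboard * motherboards
aggregate_peak_gb_per_s = channels * peak_gb_per_s_per_channel

# Latency budget for one ECS access, in nanoseconds.
routing_ns, dram_ns, contention_ns = 50, 100, 50
access_ns = routing_ns + dram_ns + contention_ns

print(processing_chips)         # 64
print(aggregate_peak_gb_per_s)  # 384
print(access_ns)                # 200
```

The aggregate is a peak number only; the post itself expects the computation side to use a moderate share of it, with the rest left for I/O.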
From: Robert Myers on 3 Mar 2010 12:43

On Mar 3, 10:17 am, n...(a)cam.ac.uk wrote:
> In article <1d917fa8-0be1-47ba-8863-4a10d0817...(a)t20g2000yqe.googlegroups.com>,
> Robert Myers <rbmyers...(a)gmail.com> wrote:
>
> >On Mar 3, 1:40 am, Terje Mathisen <"terje.mathisen at tmsw.no"> wrote:
>
> >> So Robert, do I satisfy your prejudices?
> >> :-)
>
> >Prejudices are just prejudices, and I labeled mine as such.
>
> No, only a few of them.
>

I am a veritable seething cauldron of prejudices.

> >The point about Fortran is that you really have to know what the
> >computer is doing in considerably more detail than the language
> >interface describes.
>
> Eh? If you were to say that about programming in general, it would
> be debatable. You might, JUST, be able to say that about Fortran
> versus Python or Java. But it's a bizarre statement to make about
> Fortran without qualification.
>
> What do you mean by it?
>

No one has ever come to you and said, "I only added a print statement and now my program doesn't work?"

Fortran works a lot like C in handling arrays, and, if you don't understand that a lot of what Fortran does is some offset from an address (which could be the base address of a common block), you can get into serious trouble, or at least have a hard time understanding what's happening.

As Terje has pointed out, not understanding cache can be a serious handicap. The language interface offers no clue. If you are a Cray programmer, the language manual does tell you about special considerations, but the reason for them isn't so easy to understand without looking at the architecture.

How many examples do you want?

Robert.
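The "offset from an address" point in the post above can be illustrated with a toy model. The sketch below is Python, not Fortran, and everything in it (the layout, the names `storage`, `BASE_A`, `BASE_B`, `store`) is hypothetical. It assumes a common block laid out as one flat run of storage with no bounds checking; in real Fortran an out-of-bounds subscript is non-conforming and what happens depends on the compiler and its options.

```python
# Toy model, not real Fortran: a common block as one flat run of storage.
# The layout and all names here are hypothetical, chosen for illustration.

storage = [0.0] * 4      # models COMMON /BLK/ A(3), B : four consecutive cells
BASE_A = 0               # A(1) sits at offset 0
BASE_B = 3               # B sits right after A(3)

def store(base, index, value):
    """Write element `index` (1-based, unchecked) of the array at `base`."""
    storage[base + (index - 1)] = value

storage[BASE_B] = 42.0   # B = 42.0
store(BASE_A, 4, -1.0)   # A(4) = -1.0, one past the declared bound of A

print(storage[BASE_B])   # -1.0: B was silently overwritten
```

In this model the stray write lands on whatever happens to sit next to the array, which is one way an unrelated change to a program can alter which variable gets clobbered.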
From: nmm1 on 3 Mar 2010 13:22

In article <093bcd39-12ae-48a4-9add-5ff041c5c9e2(a)a18g2000yqc.googlegroups.com>,
Robert Myers <rbmyersusa(a)gmail.com> wrote:
>
>> >The point about Fortran is that you really have to know what the
>> >computer is doing in considerably more detail than the language
>> >interface describes.
>>
>> Eh? If you were to say that about programming in general, it would
>> be debatable. You might, JUST, be able to say that about Fortran
>> versus Python or Java. But it's a bizarre statement to make about
>> Fortran without qualification.
>>
>> What do you mean by it?
>>
>No one has ever come to you and said, "I only added a print statement
>and now my program doesn't work?"

Sometimes. It's more often the other way round.

>Fortran works a lot like C in
>handling arrays, and, if you don't understand that a lot of what
>Fortran does is some offset from an address (which could be the base
>address of a common block), you can get into serious trouble, or at
>least have a hard time understanding what's happening.

Eh? Fortran operates nothing like C in this area. The only aspect where it could be said to is sequence association, and that is clearly specified in the standard. Fortran has no equivalent of C's pointer morass, unless you explicitly use its C interoperability features and shoot yourself in your foot.

I really can't see why you are singling out Fortran. You don't need to know what the computer is doing in any more detail than for almost all other languages, and considerably less than you do for C, C++ or Perl.

>As Terje has pointed out, not understanding cache can be a serious
>handicap. The language interface offers no clue.

And it makes no difference to whether a Fortran program will work, only to how fast it runs. Again, why Fortran? The same is true of ALL other languages!

Regards,
Nick Maclaren.
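The distinction drawn above, that traversal order affects speed but not correctness, can be sketched using Fortran's column-major array element order. The Python below is an illustration only: `offset` and the other names are invented here, and it models storage offsets rather than measuring any cache behaviour.

```python
# Sketch of Fortran's array element order (column-major) for A(n1, n2).
# Function and variable names are illustrative, not from any library.

def offset(i, j, n1):
    """Zero-based storage offset of A(i, j), with 1-based i and j."""
    return (i - 1) + (j - 1) * n1

n1, n2 = 3, 4
flat = list(range(n1 * n2))   # stand-in for the array's storage

# Inner loop over the first index: consecutive offsets, stride 1.
col_order = [offset(i, j, n1) for j in range(1, n2 + 1) for i in range(1, n1 + 1)]
# Inner loop over the second index: offsets jump by n1 at each step.
row_order = [offset(i, j, n1) for i in range(1, n1 + 1) for j in range(1, n2 + 1)]

sum_col = sum(flat[k] for k in col_order)
sum_row = sum(flat[k] for k in row_order)

print(col_order[:4])          # [0, 1, 2, 3]
print(row_order[:4])          # [0, 3, 6, 9]
print(sum_col == sum_row)     # True
```

Both loop orders touch the same elements and give the same result; only the stride through storage differs, and on a cached machine that stride is what tends to decide how fast the loop runs.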
From: Robert Myers on 3 Mar 2010 15:00
On Mar 3, 1:22 pm, n...(a)cam.ac.uk wrote:
> In article <093bcd39-12ae-48a4-9add-5ff041c5c...(a)a18g2000yqc.googlegroups.com>,
> Robert Myers <rbmyers...(a)gmail.com> wrote:
>
> >
> >> >The point about Fortran is that you really have to know what the
> >> >computer is doing in considerably more detail than the language
> >> >interface describes.
>
> >> Eh? If you were to say that about programming in general, it would
> >> be debatable. You might, JUST, be able to say that about Fortran
> >> versus Python or Java. But it's a bizarre statement to make about
> >> Fortran without qualification.
>
> >> What do you mean by it?
>
> >No one has ever come to you and said, "I only added a print statement
> >and now my program doesn't work?"
>
> Sometimes. It's more often the other way round.
>
> >Fortran works a lot like C in
> >handling arrays, and, if you don't understand that a lot of what
> >Fortran does is some offset from an address (which could be the base
> >address of a common block), you can get into serious trouble, or at
> >least have a hard time understanding what's happening.
>
> Eh? Fortran operates nothing like C in this area. The only aspect
> where it could be said to is sequence association, and that is
> clearly specified in the standard. Fortran has no equivalent of
> C's pointer morass, unless you explicitly use its C interoperability
> features and shoot yourself in your foot.
>
> I really can't see why you are singling out Fortran. You don't
> need to know what the computer is doing in any more detail than
> for almost all other languages, and considerably less than you
> do for C, C++ or Perl.
>
> >As Terje has pointed out, not understanding cache can be a serious
> >handicap. The language interface offers no clue.
>
> And it makes no difference to whether a Fortran program will work,
> only to how fast it runs. Again, why Fortran? The same is true
> of ALL other languages!
>

Now I see why you've bristled.

I singled out Fortran because, when I was in school, it was the language that all engineers used or expected to use, and, for all practical purposes, it was the only language I used professionally.

In practice, I did fairly reckless things with Fortran and made use of pointer extensions, but, of course, you didn't have to do that sort of thing, and plenty around me got in trouble with vanilla arrays and common blocks.

I think Fortran is still a pretty good language and I'm sorry that it has fallen into disfavor in so many places.

Robert.