Prev: Looking for Sponsorship
Next: Processors stall on OLTP workloads about half the time--almostno matter what you do
From: Quadibloc on 30 Apr 2010 10:58

On Apr 30, 5:57 am, Anne & Lynn Wheeler <l...(a)garlic.com> wrote:
> from above (2006) article:
>
> is that the price per MIPS today is approximately six times higher than
> the $165 per MIPS that the traditional technology/price decline link
> would have produced

Part of this is the cost of RAS features (Reliability, Availability, Serviceability; others have substituted Scalability or Security for the last one), and part is a hidden charge for access to IBM's quality software.

With the Nehalem-EX, the clock is ticking on part of that.

HP owns OpenVMS, a decent mainframe-quality operating system. It should really look into giving IBM some competition.

John Savard
From: MitchAlsup on 30 Apr 2010 12:46

On Apr 28, 2:36 pm, George Neuner <gneun...(a)comcast.net> wrote:
> What remains mostly is research into ways of recognizing repetitious
> patterns of data access in linked data structures (lists, trees,
> graphs, tries, etc.) and automatically prefetching data in advance of
> its use. I haven't followed this research too closely, but my
> impression is that it remains a hard problem.

There is a pattern-recognizing prefetcher in GreyHound (Opteron Rev-G) and later that can lock onto non-sequential and non-monotonic access patterns. One of the bad SPECfp benchmarks had an access pattern that looked something like this (in cache line addresses):

loop:
    prefetch[n+1]'address = prefetch[n]'address+4
    prefetch[n+2]'address = prefetch[n+1]'address+4
    prefetch[n+3]'address = prefetch[n+2]'address-1
    repeat at loop/break on crossing of physical page boundary

That is, the loop concerns 3 cache lines: two with step sizes of +4 (or was it +3), and the next with a step size of -1.

My DRAM controller locks onto this non-linear stride and prefetches the lines at high efficiency. Here up to 7 (or was it 8) different strides could be 'followed' if found in a repetitive situation. {However, the credit is not due to me, but to another engineer who discovered a means to encode these non-linear strides in an easy-to-access table.}

It's easy to see a compiler figuring this out also.

Mitch
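To make the access pattern above concrete, here is a minimal software sketch of locking onto a repeating sequence of strides and predicting the next cache lines. It is not the hardware table encoding mentioned in the post, which the thread does not describe. The function names, the requirement of two full repetitions before trusting a pattern, and the figure of 64 cache lines per physical page (4 KiB pages, 64-byte lines) are all assumptions made for illustration.

```python
def find_stride_pattern(lines, max_period=8):
    """Return the shortest repeating list of strides in `lines`, or None.

    `lines` is a list of cache line addresses in access order.
    `max_period` caps how many distinct strides are followed (assumed 8).
    """
    deltas = [b - a for a, b in zip(lines, lines[1:])]
    for period in range(1, max_period + 1):
        # Assumption: require two full repetitions before trusting a pattern.
        if len(deltas) < 2 * period:
            break
        if all(deltas[i] == deltas[i - period]
               for i in range(period, len(deltas))):
            return deltas[:period]
    return None


def predict_next(lines, count, max_period=8, lines_per_page=64):
    """Predict up to `count` further line addresses, stopping at a page edge."""
    pattern = find_stride_pattern(lines, max_period)
    if pattern is None:
        return []
    deltas_seen = len(lines) - 1
    page = lines[-1] // lines_per_page
    predicted = []
    addr = lines[-1]
    for k in range(count):
        addr += pattern[(deltas_seen + k) % len(pattern)]
        if addr // lines_per_page != page:
            break  # break on crossing of a physical page boundary
        predicted.append(addr)
    return predicted


# The +4, +4, -1 pattern from the post, seen twice.
observed = [100, 104, 108, 107, 111, 115, 114]
pattern = find_stride_pattern(observed)
upcoming = predict_next(observed, 6)
```

With the observed lines above, the detector settles on a period of three strides, and prediction stops once the next address would leave the current page.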
From: MitchAlsup on 30 Apr 2010 14:43

On Apr 30, 1:05 pm, George Neuner <gneun...(a)comcast.net> wrote:
> On Fri, 30 Apr 2010 09:46:47 -0700 (PDT), MitchAlsup
> Yes. The example seems to be a list traversal, although I'm not sure
> what the negative offset represents - possibly a pointer to node data
> in a spined list.

Agreed that this is probably some kind of list traversal. The negative number is an aspect of how the list was built.

<snip>

> I see prefetching as desirable for something like a map function where
> a) the entire list is traversed, and b) there will (typically) be some
> nontrivial computation per node. But many list algorithms involve
> only simple processing per node and, on average, only half the list
> will be traversed. It doesn't make sense to me to prefetch a bunch
> of nodes that may never be touched ... that's just cache pollution.

What you failed to see is that the DRAM prefetcher places the prefetched line in a DRAM read buffer and pollutes no cache in the system. If no demand request for the line arrives, it is silently discarded. If a demand or interior request arrives, the line is transferred back without any DRAM latency. You still incur the latency of getting out to the DRAM controller and back (6-8ns), but save the DRAM access latency (24-60ns). And you don't pollute any of the caches!

Mitch
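The latency argument above can be put into numbers with a small model. The 6-8 ns and 24-60 ns ranges are taken from the post. Everything else is an assumption for illustration: the class name `DramReadBuffer`, a capacity of 8 lines, oldest-first discard, removal of a line once it has been claimed, and the simplification that the two latencies simply add.

```python
CONTROLLER_ROUND_TRIP_NS = (6, 8)   # out to the DRAM controller and back
DRAM_ACCESS_NS = (24, 60)           # DRAM access itself


class DramReadBuffer:
    """Toy model of a read buffer that sits outside the cache hierarchy."""

    def __init__(self, capacity=8):
        self.capacity = capacity    # assumed size, not from the post
        self.lines = []             # oldest first

    def prefetch(self, line):
        """Hold a prefetched line; no cache is touched."""
        if line in self.lines:
            return
        if len(self.lines) == self.capacity:
            self.lines.pop(0)       # unclaimed line is silently discarded
        self.lines.append(line)

    def demand(self, line):
        """Return the (min, max) latency in ns for a demand request."""
        lo, hi = CONTROLLER_ROUND_TRIP_NS
        if line in self.lines:
            self.lines.remove(line)  # assumed: line leaves the buffer on a hit
            return lo, hi
        # Assumption: controller and DRAM latencies add with no overlap.
        return lo + DRAM_ACCESS_NS[0], hi + DRAM_ACCESS_NS[1]


buf = DramReadBuffer()
buf.prefetch(118)
hit_latency = buf.demand(118)    # line was waiting in the read buffer
miss_latency = buf.demand(122)   # line was never prefetched
```

Under these assumptions a buffer hit costs only the 6-8 ns controller round trip, while a request that goes all the way to DRAM costs 30-68 ns.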
From: MitchAlsup on 1 May 2010 14:54

On Apr 30, 2:55 pm, George Neuner <gneun...(a)comcast.net> wrote:
> On Fri, 30 Apr 2010 11:43:22 -0700 (PDT), MitchAlsup
> <MitchAl...(a)aol.com> wrote:
> >On Apr 30, 1:05 pm, George Neuner <gneun...(a)comcast.net> wrote:
> >
> >> I see prefetching as desirable for something like a map function where
> >> a) the entire list is traversed, and b) there will (typically) be some
> >> nontrivial computation per node. But many list algorithms involve
> >> only simple processing per node and, on average, only half the list
> >> will be traversed. It doesn't make sense to me to prefetch a bunch
> >> of nodes that may never be touched ... that's just cache pollution.
> >
> >What you failed to see is that the DRAM prefetcher places the
> >prefetched line in a DRAM read buffer and pollutes no cache in the
> >system. If no demand request for the line arrives, it is silently
> >discarded. If a demand or interior request arrives, the line is
> >transferred back without any DRAM latency. You still incur the latency
> >of getting out to the DRAM controller and back (6-8ns), but save the
> >DRAM access latency (24-60ns). And you don't pollute any of the caches!
>
> Which is better than fetching into multiple levels of cache, but still
> has the effect of tying up resources: the memory controller, the chips
> responding to the read, the read buffer line, etc. - unavailability of
> any or all of which might delay some other computation (depending on
> the architecture).

In this prefetcher, no resources were tied up that were or are usable by memory requests from the coherent or incoherent requestors. Nor were any cycles taken on the DRAM busses that would have been usable by requests sitting around waiting for DRAM accesses. This prefetcher watched for those periods in time where no requests were present and banks were already open, and used those 'free' cycles.

The only downside is that, when prefetches were made, closing of DRAM pages might be delayed while the prefetch plays out.

Mitch
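The issue condition described above (no requests present, bank already open) can be written as a one-line predicate. This is only a sketch of the stated policy, not the controller's actual logic: the function name, the use of a dict mapping bank number to its currently open row, and the parameter names are hypothetical.

```python
def may_issue_prefetch(pending_requests, open_rows, bank, row):
    """Return True if a prefetch to (bank, row) would use only 'free' cycles.

    pending_requests: number of demand requests waiting for DRAM access
    open_rows: dict mapping bank number -> row currently open in that bank
    """
    no_demand_traffic = pending_requests == 0
    bank_already_open = open_rows.get(bank) == row
    return no_demand_traffic and bank_already_open


open_rows = {0: 17, 1: 42}
idle_and_open = may_issue_prefetch(0, open_rows, 0, 17)
busy = may_issue_prefetch(2, open_rows, 0, 17)
wrong_row = may_issue_prefetch(0, open_rows, 1, 7)
```

A prefetch that fails this check is simply not issued, which matches the claim that no demand request ever waits behind a prefetch. The model does not capture the one downside noted in the post, that closing the open page may be delayed while a prefetch plays out.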
From: Quadibloc on 2 May 2010 22:36

On May 2, 6:04 pm, Del Cecchi <delcec...(a)gmail.com> wrote:
> Quadibloc wrote:
> > HP owns OpenVMS, a decent mainframe-quality operating system. It
> > should really look into giving IBM some competition.
> Why should HP try to reintroduce VMS into the market place? Do you
> really think that this is a financially beneficial or viable action?

It's true that I can't be certain this would be a sensible thing to do.

But I do think that there is a need for more operating systems that are reliable and offer the security that real mainframe operating systems do. Microsoft Windows doesn't cut it. Neither does Linux. Even commercial versions of Unix, while they serve their intended purposes better than Linux can as a substitute for them, are still derived from what began as an extremely minimalist operating system.

OpenVMS might well not be viable as part of an attempt by HP to compete directly with IBM's mainframe offerings. But the market has a lot of other places where HP could direct a system with a port of OpenVMS. They could, for example, make it an alternative to Windows Server.

John Savard