From: Anne & Lynn Wheeler on 30 Apr 2010 13:32

Quadibloc <jsavard(a)ecn.ab.ca> writes:
> Part of this is the cost of RAS features (Reliability, Availability,
> Serviceability: others have substituted Scalability or Security for
> the last one), and part a hidden charge for access to IBM's quality
> software.

re:
http://www.garlic.com/~lynn/2010i.html#0 Processors stall on OLTP workloads about half the time--almost no matter what you do

the "Financial Matters: Mainframe Processor Pricing History" article
http://www.zjournal.com/index.cfm?section=article&aid=346
was tracking mainframe (mip) pricing during the 70s, 80s, and early part of the 90s when there were (similar) clone mainframes ... and the (ibm) mainframe price/mip system pricing curve changed after clone mainframes left the market in the 90s (i.e. the comment was that if the 70s, 80s, & 90s curve had continued up thru the date of the article, a mainframe selling for $18m would instead have been selling for $3m ... aka mainframe-to-mainframe pricing). some number of complaints in the ibm-main mainframe mailing list are that (regardless of high mainframe hardware pricing), mainframe software pricing is dominating costs.

A 25+ yr old RAS story: the product manager for the 3090 mainframe tracked me down after 3090s had been in customer shops for a year. There is a mainframe industry reporting service that collects customer mainframe EREP reports and publishes regular monthly summaries (at the time including the various clone vendors). The problem was that the 3090 was designed to have something like an aggregate 3-5 "channel errors" per annum in total across all installed machines. The reporting service turned up closer to 20 total "channel errors" that had occurred in aggregate across all installed 3090s.

I had done an operating system driver for HYPERChannel ... allowing mainframe controllers and devices at remote locations, using HYPERChannel as a form of mainframe channel extension (for internal installations). In some cases, when I had an unrecoverable error, I would reflect an emulated "channel check", which would result in the various recovery and retry operations by the standard operating system RAS. I then tried to get the HYPERChannel driver released to customers, but various corporate factions objected. As a result, the HYPERChannel vendor effectively had to do a re-implementation. In any case, the 15 "extra" 3090 channel errors (aggregate across all installed 3090s for the first year) were from some HYPERChannel installations (reflecting emulated channel checks). So I did some research and selected emulated IFCC (interface control check) to be substituted in place of CC (channel check) ... it turns out that IFCC follows an effectively identical path thru error recovery as CC (but wouldn't show up as a channel error in the industry reports). The point is that there doesn't seem to be anything similar in other markets (i.e. industry monthly error/RAS reports across all customer installed machines).

as an aside ... when we were doing ha/cmp in the early 90s
http://www.garlic.com/~lynn/subtopic.html#hacmp
I was asked to write a section for the corporate continuous availability strategy document. The section got pulled because both Rochester (as/400) and POK (mainframe) complained (that they couldn't meet the availability criteria in my section).
I had coined the terms disaster survivability and geographic survivability when out marketing ha/cmp
http://www.garlic.com/~lynn/submain.html#available
that was separate/independent from the work involving cluster scaleup in ha/cmp ... aka the project started out as ha/6000 ... but I changed the name to ha/cmp to also reflect the work on cluster scaleup. when the cluster scaleup part of the effort was transferred and we were told that we couldn't work on anything with more than four processors, they didn't bother to change the product name.

recent thread in this n.g. on the cluster scaleup subject:
http://www.garlic.com/~lynn/2010.html#6 Larrabee delayed: anyone know what's happening?
http://www.garlic.com/~lynn/2010.html#31 Larrabee delayed: anyone know what's happening?
http://www.garlic.com/~lynn/2010.html#41 Larrabee delayed: anyone know what's happening?
http://www.garlic.com/~lynn/2010.html#44 Larrabee delayed: anyone know what's happening?
http://www.garlic.com/~lynn/2010f.html#50 Handling multicore CPUs; what the competition is thinking
http://www.garlic.com/~lynn/2010f.html#52 Handling multicore CPUs; what the competition is thinking
http://www.garlic.com/~lynn/2010f.html#55 Handling multicore CPUs; what the competition is thinking
http://www.garlic.com/~lynn/2010f.html#56 Handling multicore CPUs; what the competition is thinking
http://www.garlic.com/~lynn/2010f.html#57 Handling multicore CPUs; what the competition is thinking
http://www.garlic.com/~lynn/2010f.html#58 Handling multicore CPUs; what the competition is thinking
http://www.garlic.com/~lynn/2010f.html#60 Handling multicore CPUs; what the competition is thinking
http://www.garlic.com/~lynn/2010f.html#61 Handling multicore CPUs; what the competition is thinking
http://www.garlic.com/~lynn/2010f.html#63 Handling multicore CPUs; what the competition is thinking
http://www.garlic.com/~lynn/2010f.html#64 Handling multicore CPUs; what the competition is thinking
http://www.garlic.com/~lynn/2010f.html#70 Handling multicore CPUs; what the competition is thinking
http://www.garlic.com/~lynn/2010g.html#4 Handling multicore CPUs; what the competition is thinking
http://www.garlic.com/~lynn/2010g.html#8 Handling multicore CPUs; what the competition is thinking
http://www.garlic.com/~lynn/2010g.html#48 Handling multicore CPUs; what the competition is thinking

--
42yrs virtualization experience (since Jan68), online at home since Mar1970
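[A minimal sketch of the IFCC-for-CC reflection described in the post above -- purely illustrative, with made-up names (fake_csw, reflect_link_failure) standing in for the real channel status word layout and the actual HYPERChannel driver, neither of which is shown in the post:]

#include <stdint.h>

/* Illustrative only: the point is that an unrecoverable link error is
 * reflected to the operating system's standard I/O error recovery as an
 * interface control check (IFCC) rather than a channel check (CC).
 * Both take effectively the same retry path, but only CC is counted as
 * a "channel error" in the EREP-based industry reports. */

#define STATUS_CHANNEL_CHECK  0x0002u   /* CC   */
#define STATUS_IFC_CHECK      0x0004u   /* IFCC */

struct fake_csw {                /* stand-in for the channel status word */
    uint16_t unit_status;
    uint16_t channel_status;
};

/* Called by the hypothetical channel-extension driver when the remote
 * link fails unrecoverably; builds the status handed to the standard
 * operating system RAS/retry machinery. */
void reflect_link_failure(struct fake_csw *csw)
{
    csw->unit_status    = 0;
    csw->channel_status = STATUS_IFC_CHECK;  /* IFCC, not STATUS_CHANNEL_CHECK */
}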
From: George Neuner on 30 Apr 2010 14:05

On Fri, 30 Apr 2010 09:46:47 -0700 (PDT), MitchAlsup <MitchAlsup(a)aol.com> wrote:

>On Apr 28, 2:36 pm, George Neuner <gneun...(a)comcast.net> wrote:
>> What remains mostly is research into ways of recognizing repetitious
>> patterns of data access in linked data structures (lists, trees,
>> graphs, tries, etc.) and automatically prefetching data in advance of
>> its use. I haven't followed this research too closely, but my
>> impression is that it remains a hard problem.
>
>There is a pattern-recognizing prefetcher in GreyHound (Opteron Rev-G)
>and later that can lock onto non-sequential and non-monotonic access
>patterns. One of the bad SpecFP benchmarks had an access pattern that
>looked something like (in cache line addresses):
>
>loop:
>prefetch[n+1]'address = prefetch[n]'address+4
>prefetch[n+2]'address = prefetch[n+1]'address+4
>prefetch[n+3]'address = prefetch[n+2]'address-1
>repeat at loop/break on crossing of physical page boundary
>
>That is, the loop concerns 3 cache lines, two having step sizes of +4
>(or was it +3) and the next has a step size of -1. My DRAM controller
>locks onto this non-linear stride and prefetches the lines at high
>efficiency. Here up to 7 (or was it 8) different strides could be
>'followed' if found in a repetitive situation. {However the credit is
>not due to me, but to another engineer who discovered a means to
>encode these non-linear strides in an easy-to-access table.}
>
>It's easy to see a compiler figuring this out also.
>
>Mitch

Yes. The example seems to be a list traversal, although I'm not sure what the negative offset represents - possibly a pointer to node data in a spined list.

As I mentioned to Robert, linked list following is one area where I know there has been success (another is in search trees). The issue with lists is whether to bother prefetching at all, because many list algorithms do little computation per node and prefetch would need to keep several nodes ahead to avoid stall ... which IMO doesn't seem reasonable for most cases.

I see prefetching as desirable for something like a map function where a) the entire list is traversed, and b) there will (typically) be some nontrivial computation per node. But many list algorithms involve only simple processing per node and, on average, only half the list will be traversed. It doesn't make sense to me to prefetch a bunch of nodes that may never be touched ... that's just cache pollution.

George
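[A minimal sketch of the map-style traversal being discussed, using GCC/Clang's __builtin_prefetch to stay one node ahead; the node layout and function names are invented for illustration, not taken from any of the posts:]

#include <stddef.h>

struct node {
    struct node *next;
    double value;
};

/* Map fn over a singly linked list, prefetching the next node while the
 * current one is being processed.  This only hides latency when fn does
 * enough work per node; keeping several nodes ahead would mean chasing
 * the same dependent next pointers, which is exactly the hard part. */
void list_map(struct node *head, double (*fn)(double))
{
    for (struct node *p = head; p != NULL; p = p->next) {
        if (p->next != NULL)
            __builtin_prefetch(p->next, 0, 1);  /* read, low temporal locality */
        p->value = fn(p->value);                /* nontrivial per-node work */
    }
}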
From: George Neuner on 30 Apr 2010 15:55

On Fri, 30 Apr 2010 11:43:22 -0700 (PDT), MitchAlsup <MitchAlsup(a)aol.com> wrote:

>On Apr 30, 1:05 pm, George Neuner <gneun...(a)comcast.net> wrote:
>
>> I see prefetching as desirable for something like a map function where
>> a) the entire list is traversed, and b) there will (typically) be some
>> nontrivial computation per node. But many list algorithms involve
>> only simple processing per node and, on average, only half the list
>> will be traversed. It doesn't make sense to me to prefetch a bunch
>> of nodes that may never be touched ... that's just cache pollution.
>
>What you failed to see is that the DRAM prefetcher places the
>prefetched line in a DRAM read buffer and pollutes no cache in the
>system. If no demand request for the line arrives, it is silently
>discarded. If a demand or interior request arrives, the line is
>transferred back without any DRAM latency. You still incur the latency
>of getting out to the DRAM controller and back (6-8ns), but save the
>DRAM access latency (24-60ns). And you don't pollute any of the caches!

Which is better than fetching into multiple levels of cache, but it still has the effect of tying up resources: the memory controller, the chips responding to the read, the read buffer line, etc. - unavailability of any or all of which might delay some other computation (depending on the architecture). There isn't any free lunch.

George
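[A back-of-the-envelope model of that tradeoff, plugging in the latency figures quoted above; the hit rates are arbitrary, just to show the range:]

#include <stdio.h>

/* Rough model of average demand latency with a DRAM-side read buffer:
 * a hit still pays the trip to the controller, a miss pays the full
 * DRAM access as well.  Numbers are the illustrative figures from the
 * post, not measurements. */
int main(void)
{
    double controller_ns = 7.0;    /* ~6-8 ns to the DRAM controller and back */
    double dram_ns       = 40.0;   /* ~24-60 ns DRAM access */

    for (double hit = 0.0; hit <= 1.0; hit += 0.25) {
        double avg = controller_ns + (1.0 - hit) * dram_ns;
        printf("read-buffer hit rate %.2f -> avg demand latency %5.1f ns\n",
               hit, avg);
    }
    return 0;
}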
From: Anne & Lynn Wheeler on 30 Apr 2010 17:07

Quadibloc <jsavard(a)ecn.ab.ca> writes:
> HP owns OpenVMS, a decent mainframe-quality operating system. It
> should really look into giving IBM some competition.

re:
http://www.garlic.com/~lynn/2010i.html#0 Processors stall on OLTP workloads about half the time--almost no matter what you do
http://www.garlic.com/~lynn/2010i.html#2 Processors stall on OLTP workloads about half the time--almost no matter what you do

from this post (in the ibm-main mailing list)
http://www.garlic.com/~lynn/2010i.html#1 25 reasons why hardware is still hot at IBM

IBM's Unix poaching slows in Q1
http://www.theregister.co.uk/2010/04/29/ibm_unix_takeouts/

from above:

In November 2008, HP was perfectly happy to crow that it had converted more than 250 IBM mainframe shops to Integrity machines in the prior two years - which prompted IBM to retaliate about the 5,000 HP and Sun takeouts it had done in the prior four years.

.... snip ...

Integrity (Itanium2) servers
http://en.wikipedia.org/wiki/HP_Integrity_Servers
http://h20341.www2.hp.com/integrity/us/en/systems/integrity-systems-overview.html

--
42yrs virtualization experience (since Jan68), online at home since Mar1970
From: Morten Reistad on 1 May 2010 07:43
In article <m3wrvops2r.fsf(a)garlic.com>, Anne & Lynn Wheeler <lynn(a)garlic.com> wrote:

>Quadibloc <jsavard(a)ecn.ab.ca> writes:
>> HP owns OpenVMS, a decent mainframe-quality operating system. It
>> should really look into giving IBM some competition.

Nowadays it is not so much about the OS itself. It is about scaling the application and the database.

>from above:
>
>In November 2008, HP was perfectly happy to crow that it had converted
>more than 250 IBM mainframe shops to Integrity machines in the prior two
>years - which prompted IBM to retaliate about the 5,000 HP and Sun
>takeouts it had done in the prior four years.

HP sells good hardware (finally), but IBM has found a very profitable "niche": helping all the successful not-quite-Google, but still very aggressively growing, internet operations deliver and scale their systems. They will take good care of their wallets, too.

--
mrr