From: Anonymous on 26 Sep 2007 05:33

In article <uq8jf3pd3rq48eqio0hdtqo172nv2c16is(a)4ax.com>,
Robert <no(a)e.mail> wrote:
>On Tue, 25 Sep 2007 22:45:12 +0000 (UTC), docdwarf(a)panix.com () wrote:
>
>>In article <regif3d0b34nreavsckap09omqjhptnik8(a)4ax.com>,
>>Robert <no(a)e.mail> wrote:
>>>On Tue, 25 Sep 2007 09:25:04 +0000 (UTC), docdwarf(a)panix.com () wrote:
>>>
>>>>Now, Mr Wagner... is one to expect another dreary series of repetitions
>>>>about how mainframers who said that indices were faster than subscripts
>>>>were, in fact, right about something?
>>>
>>>I expected I-told-you-so from the mainframe camp.
>>
>>It may be interesting to see if you get one; my point - and pardon the
>>obscure manner of its making - was that you made a series of repetitions
>>which a demonstration has disproved and it may be interesting to see if an
>>equally lengthy series of repetitions follows... or if it just Goes Away
>>until you next get an idea about something... and begin another, similar
>>series of repetitions.
>
>We saw that subscript and index run at the same speed on three CPU
>families -- HP PA (SuperDome), DEC Alpha (Cray) and Richard's undisclosed
>machine, possibly Intel.

I can barely speak for myself, Mr Wagner, let alone some 'we'... but I
recall seeing post after post where you indicated, rather pointedly, that
the speed superiority of indices over subscripts was something maintained
by mainframers and was, according to your test, an obsolete belief.

Results were then posted, purporting to be from a mainframe run, which
appeared to verify this obsolete belief.

>I am
>confident we'd see the same on Intel, PowerPC (pseries, iseries, Mac)
>and SPARC, based on tests I ran a few years ago. Thus the generalization.
>I was surprised to see zSeries did not follow the pattern of the others.

A wonderful world it is, Mr Wagner... and perhaps this inconsistency of
performance might work itself into your own consistency of performance.
It might be an interesting exercise, saying, in the future, 'this-and-that
is quite obviously the case... but remember, when I said that-and-this was
the case I was, quite soundly and publicly, shown an example to the
contrary.'

>
>My previous idea, that memory alignment no longer matters, turned out to
>be wrong. It does matter on modern RISC machines.
>
>There's a good chance I'll get another idea.

That's to be hoped for... and even more so that it will be shaped by one's
previous ideas, both proven right *and* proven wrong.

DD

From: Anonymous on 26 Sep 2007 05:38

In article <5lu49rFa5hnvU1(a)mid.individual.net>,
Pete Dashwood <dashwood(a)removethis.enternet.co.nz> wrote:

[snip]

>A few days ago I
>was running a test on a P4 notebook that had to create a couple of million
>rows on an ACCESS database.

Why, Mr Dashwood... how interesting! Keep at it, you'll be up to sixty
million and change in no time!

DD

From: Judson McClendon on 26 Sep 2007 07:26

"Pete Dashwood" <dashwood(a)removethis.enternet.co.nz> wrote:
>
> It is things like this that make me wonder why we even bother about
> performance and have heated discussions about things like indexes and
> subscripts, when the technology is advancing rapidly enough to simply
> take care of it.

Consider this. If Microsoft had put performance at a premium, Windows would
boot in 1 second, you could start any Office application and it would be
ready for input in the blink of an eye, and your Access test would have run
in a few seconds. How many thousand man-years have been spent cumulatively
all over the planet waiting on these things? :-)
--
Judson McClendon judmc(a)sunvaley0.com (remove zero)
Sun Valley Systems http://sunvaley.com
"For God so loved the world that He gave His only begotten Son, that
whoever believes in Him should not perish but have everlasting life."

From: Roger While on 26 Sep 2007 08:20

I really, really tried to keep away from this subject but ...

One of the problems with the speed2 prog is the attempt
to deduce the perform cost.
Now OC produces exactly the C code that reflects the statements
eg.

    /* speed2.cob:63: PERFORM */
    {
      for (n0 = ((int)COB_BSWAP_32(*(int *)(b_18 + 30))); n0 > 0; n0--)
        {
          {
            /* speed2.cob:64: EXIT */
            {
              goto l_5;
            }
          }
          /* EXIT PERFORM CYCLE 5: */
          l_5:;
        }
    }

BUT gcc (in current versions) is far more clever and deletes
the whole thing :-)

Revised test prog -
(This should be compatible with most compilers)

    identification division.
    program-id. speed5.
    data division.
    working-storage section.
    01 comp5-number comp-5 pic s9(09).
    01 s-subscript binary pic s9(09).
    01 repeat-factor value 900000000 comp-5 pic s9(09).
    01 test-byte pic x(01).
    01 misaligned-area.
       05 array-element occurs 4096 indexed x-index.
          10 misaligned-number comp-5 pic s9(09).
          10 to-cause-misalignment pic x(01).
       05 byte-element occurs 4096 indexed x-index-1 pic x.
    procedure division.
        initialize misaligned-area
        display "Start prog " function current-date

        set x-index to 1000
        display "Index start " function current-date
        perform repeat-factor times
            if x-index = 1000
                set x-index up by 1
            else
                set x-index down by 1
            end-if
            move array-element (x-index) to test-byte
        end-perform
        display "Index end " function current-date

        move 1000 to s-subscript
        display "COMP start " function current-date
        perform repeat-factor times
            if s-subscript = 1000
                add 1 to s-subscript
            else
                subtract 1 from s-subscript
            end-if
            move array-element (s-subscript) to test-byte
        end-perform
        display "COMP end " function current-date

        move 1000 to comp5-number
        display "COMP-5 start " function current-date
        perform repeat-factor times
            if comp5-number = 1000
                add 1 to comp5-number
            else
                subtract 1 from comp5-number
            end-if
            move array-element (comp5-number) to test-byte
        end-perform
        display "COMP-5 end " function current-date

        set x-index-1 to 1000
        display "Index start " function current-date
        perform repeat-factor times
            if x-index-1 = 1000
                set x-index-1 up by 1
            else
                set x-index-1 down by 1
            end-if
            move byte-element (x-index-1) to test-byte
        end-perform
        display "Index end " function current-date

        move 1000 to s-subscript
        display "COMP start " function current-date
        perform repeat-factor times
            if s-subscript = 1000
                add 1 to s-subscript
            else
                subtract 1 from s-subscript
            end-if
            move byte-element (s-subscript) to test-byte
        end-perform
        display "COMP end " function current-date

        move 1000 to comp5-number
        display "COMP-5 start " function current-date
        perform repeat-factor times
            if comp5-number = 1000
                add 1 to comp5-number
            else
                subtract 1 from comp5-number
            end-if
            move byte-element (comp5-number) to test-byte
        end-perform
        display "COMP-5 end " function current-date
        stop run.

Note that the repeat count is pushed up, otherwise the results
are statistically meaningless.
Tests repeated 5 times with +- 1/100 second difference.
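
Each loop in the program above is bracketed by DISPLAYs of FUNCTION CURRENT-DATE, so a timing is the difference between two 21-character stamps of the form YYYYMMDDhhmmsscc+hhmm. A minimal sketch of that subtraction follows, in Python rather than COBOL purely for illustration. The function names are hypothetical, and it assumes both stamps carry the same UTC offset (true of every stamp in this post). The sample values are the first "Index start"/"Index end" pair of the MF run.

```python
from datetime import datetime, timedelta


def parse_current_date(stamp: str) -> datetime:
    """Parse the YYYYMMDDhhmmsscc part of a CURRENT-DATE value.

    The trailing +hhmm offset is ignored (assumption: both stamps
    being compared were taken in the same time zone).
    """
    base = datetime.strptime(stamp[:14], "%Y%m%d%H%M%S")
    hundredths = int(stamp[14:16])  # cc = hundredths of a second
    return base + timedelta(milliseconds=10 * hundredths)


def elapsed_seconds(start: str, end: str) -> float:
    """Elapsed wall-clock seconds between two CURRENT-DATE stamps."""
    return (parse_current_date(end) - parse_current_date(start)).total_seconds()


if __name__ == "__main__":
    seconds = elapsed_seconds("2007092612363397+0200", "2007092612363681+0200")
    print(f"{seconds:.2f}")  # prints 2.84
```

Because the clock only resolves hundredths of a second, short loops would be dominated by rounding, which is consistent with the note about pushing the repeat count up.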

Results from Linux boxen (in single-user mode)
(As all benchmarks should be done on 'nix systems)
(32 bit is P4 prescott with 3.2 GHz)
(64 bit is P4

MF SE 2.2 (Linux x86 32 bit)
cob -u -O -C notrunc -C sourceformat=free speed5.cob
cobrun speed5
Start prog    2007092612363397+0200
Index start   2007092612363397+0200
Index end     2007092612363681+0200
COMP start    2007092612363681+0200
COMP end      2007092612364047+0200
COMP-5 start  2007092612364047+0200
COMP-5 end    2007092612364361+0200
Index start   2007092612364361+0200
Index end     2007092612364672+0200
COMP start    2007092612364672+0200
COMP end      2007092612365034+0200
COMP-5 start  2007092612365034+0200
COMP-5 end    2007092612365357+0200

OC 0.33 current -
cobc -x -O2 -std=mf -free speed5.cob
../speed5
Start prog    2007092612311407+0200
Index start   2007092612311407+0200
Index end     2007092612311690+0200
COMP start    2007092612311690+0200
COMP end      2007092612312044+0200
COMP-5 start  2007092612312044+0200
COMP-5 end    2007092612312326+0200
Index start   2007092612312326+0200
Index end     2007092612312609+0200
COMP start    2007092612312609+0200
COMP end      2007092612312963+0200
COMP-5 start  2007092612312963+0200
COMP-5 end    2007092612313246+0200

OC 0.33 current on Linux x86_64 (64 bit)
cobc -x -O2 -std=mf -free speed5.cob
../speed5
Start prog    2007092612285455+0200
Index start   2007092612285455+0200
Index end     2007092612285602+0200
COMP start    2007092612285602+0200
COMP end      2007092612285855+0200
COMP-5 start  2007092612285855+0200
COMP-5 end    2007092612290004+0200
Index start   2007092612290004+0200
Index end     2007092612290135+0200
COMP start    2007092612290135+0200
COMP end      2007092612290366+0200
COMP-5 start  2007092612290366+0200
COMP-5 end    2007092612290497+0200

Now as to what has all been said in this thread,
then I have the following comments -

COMP (aka BINARY) is stored as big-endian by all compilers
these days. Therefore there is a penalty on little-endian
machines (or better the OS/firmware for eg. bi-endian) to
byte-swap, operate and re-byteswap results.
This, of course, affects eg. x86(_64).
However, see below

Alignment -
There are in fact not that many alignment tolerant machines there.
Intel x86(_64) and Power PC are known. (The Itanium is not)
This means that any reference to a COMP/COMP-5 item must
be moved to an intermediate area unless it can be proved at
compile time that it is appropriately aligned. (eg. at 01 level)

So we have to look at a bisection of the above two attributes.

Generally speaking, for performance, (other than INDEX) one should
use COMP-5 (aka BINARY-LONG SIGNED/UNSIGNED) for subscripts/counters etc.
and define them at the 01 level.

Not only that, a particular compiler implementation has its own INDEX
definition which is somewhat difficult to ascertain.
(And which is not necessarily a C-5 item)

Roger
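
The two attributes discussed in this post, byte order and alignment, can be sketched roughly as follows. This is an illustration in Python, not what any COBOL compiler actually emits. The helper names are hypothetical, the 4-byte size for a pic s9(09) binary item is an assumption, and offsets are counted from the start of misaligned-area on the assumption that the 01 level itself is aligned.

```python
import struct

# Assumed layout of one array-element in the test program:
# 4-byte misaligned-number followed by 1-byte to-cause-misalignment.
ELEMENT_SIZE = 5


def add_one_big_endian(raw: bytes) -> bytes:
    """ADD 1 to a 4-byte field held big-endian, as COMP is described above.

    On a little-endian host this means: swap to native, add, swap back.
    """
    (value,) = struct.unpack(">i", raw)   # decode from big-endian storage
    return struct.pack(">i", value + 1)   # re-encode to big-endian storage


def add_one_native(raw: bytes) -> bytes:
    """ADD 1 to a 4-byte field held in host byte order, as COMP-5 is."""
    (value,) = struct.unpack("=i", raw)   # no byte swap needed
    return struct.pack("=i", value + 1)


def element_offset(subscript: int) -> int:
    """Byte offset of array-element (subscript); COBOL subscripts start at 1."""
    return (subscript - 1) * ELEMENT_SIZE


def is_word_aligned(offset: int) -> bool:
    """True when a 4-byte item at this offset sits on a 4-byte boundary."""
    return offset % 4 == 0


if __name__ == "__main__":
    print(add_one_big_endian(struct.pack(">i", 1000)).hex())
    for subscript in (1000, 1001):
        offset = element_offset(subscript)
        print(subscript, offset, is_word_aligned(offset))
```

With a 5-byte element only every fourth misaligned-number falls on a 4-byte boundary, which is why a compiler that cannot prove alignment at compile time has to go through an intermediate area on machines that do not tolerate misaligned access.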

From: Pete Dashwood on 26 Sep 2007 09:04

"Arnold Trembley" <arnold.trembley(a)worldnet.att.net> wrote in message
news:zjmKi.137851$ax1.11998(a)bgtnsc05-news.ops.worldnet.att.net...
> Pete Dashwood wrote:
>> (snip) Here are the results of "Speed2" from a genuine Intel Celeron Core
>> 2 Duo Vaio AR250G notebook with 2 GB of main memory, running under
>> Windows XP with SP2 applied, using your code (with the following
>> amendments: all asterisks and comments removed, exit perform cycle
>> removed), compiled with no options other than the defaults (which
>> includes "Optimize"), with the Fujitsu NetCOBOL version 6 compiler,
>> compiled to .EXE:
>>
>> Null test             1
>> Index                 3
>> Subscript            25
>> Subscript comp-5      3
>> Index 1               3
>> Subscript 1          22
>> Subscript 1 comp-5    3
>>
>> As you can see, indexing is between 7 and 8 times more efficient than
>> subscripting, unless you use optimized subscripts, in this environment.
>
> Here are the results of "Speed2" using a 2.60 GHz Pentium 4 with 512 MB of
> main memory, running under Windows XP with SP2 applied, using Robert's
> code with EXIT PERFORM CYCLE commented out, compiled with a 1990 education
> version of Realia COBOL (equivalent to Realia 3):
>
> Null test             5
> Index                 2
> Subscript             8
> Subscript comp-5      8
> Index 1               2
> Subscript 1           7
> Subscript 1 comp-5    7

That is SOOO cool!

Obviously, generated code makes all the difference. Here's code from 2
compilers running in the same OS Environment, yet look at the figures for
subscripts; the P4 creams the Core 2, although the Core 2 is theoretically
faster. In fact, the P4 is faster on everything except the null test :-)

And both systems are way faster than IBM mainframes. (That still hasn't
quite sunk in yet; after working on mainframes for decades it is hard for
me to realize that a notebook costing < .01% of what a mainframe costs,
could be orders of magnitude faster...)

Again, to me at least, this just completely confirms that it is not
possible to make meaningful statements about performance unless you run
actual tests.

Pete.
--
"I used to write COBOL...now I can do anything."
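
The ratios being compared in this post are simple divisions of the quoted figures. A small sketch, assuming nothing beyond the numbers as posted (the thread does not state their unit); the dictionary and function names are hypothetical.

```python
ROWS = [
    "Null test", "Index", "Subscript", "Subscript comp-5",
    "Index 1", "Subscript 1", "Subscript 1 comp-5",
]

# Figures exactly as quoted in the post above.
FUJITSU_CORE2 = dict(zip(ROWS, [1, 3, 25, 3, 3, 22, 3]))
REALIA_P4 = dict(zip(ROWS, [5, 2, 8, 8, 2, 7, 7]))


def ratio(results: dict, row: str, baseline: str) -> float:
    """How many times larger the figure for `row` is than for `baseline`."""
    return results[row] / results[baseline]


if __name__ == "__main__":
    for name, results in (("Fujitsu/Core 2", FUJITSU_CORE2),
                          ("Realia/P4", REALIA_P4)):
        first = ratio(results, "Subscript", "Index")
        second = ratio(results, "Subscript 1", "Index 1")
        print(f"{name}: {first:.2f} {second:.2f}")
```

On the posted figures the plain Subscript rows favour the Realia run (8 and 7 against 25 and 22), while the comp-5 rows favour the Fujitsu run (3 against 8 and 7).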