From: o//annabee on 16 Mar 2006 11:21

On Thu, 16 Mar 2006 16:57:09 +0100, randyhyde(a)earthlink.net <randyhyde(a)earthlink.net> wrote:

> I see you've got your head buried in the same whole in the sand that
> Rene does. Ignoring reality just because you don't like it is a sign of
> insanity, you know?

Well. Don't know, but I think it's spelled "hole" anyway. "Whole" means something more like "complete".

Which reminds me. Where can we download the 6 non-trivial masterful applications you have written in assembly? I looked at Webster, but couldn't find anything but Christian resources, which I thought were odd. And some book reviews. I keep asking in case your holiness missed that post.

>> That is only you. So no problem.
>
> So why are you complaining if you actually believe this?

I'm not complaining. But I think when you have such a vivid imagination, you should make this daydream more realistic. Unless of course this "breaking others code component" is important to you.

> Cheers,
> Randy Hyde
>

From: randyhyde@earthlink.net on 16 Mar 2006 11:57

o//annabee wrote:
> On Wed, 15 Mar 2006 17:49:51 +0100, randyhyde(a)earthlink.net
> <randyhyde(a)earthlink.net> wrote:
>
> > Closer, but still no cigar.
> > Care to try again?
>
> I am glad you noticed the error. I have corrected the code.

Still errors, see below.

>
> When doing the alignment properly, we get a speedup, of both loops.
> As you can see, yours is still slower.
> Also, as a by-product I get actually IDENTICAL timings for several runs.
> Which I haven't really seen before.

Well, I don't have your particular CPU, but when I run this code on a PIV, here are the results that I get:

    My code    Your code
    8418       c76c
    8600       c76c
    8590       c7d0
    8568       c9b4
    8648       c970

Your version seems to be about 50% slower than mine on a PIV. Again, I don't have access to your CPU, so I can't verify your numbers, but if you look at the actual code generated by RosAsm for the two routines:

My version, disassembled from your RosAsm code:

    start:
    .text:00404000                 cpuid
    .text:00404002                 rdtsc
    .text:00404004                 push    eax
    .text:00404005                 mov     ecx, dword_403000
    .text:0040400B                 xor     eax, eax
    .text:0040400D                 jecxz   short loc_404010
    .text:0040400F                 nop
    .text:00404010
    .text:00404010 loc_404010:                     ; CODE XREF: .text:0040400Dj
    .text:00404010                                 ; .text:00404013j
    .text:00404010                 add     eax, ecx
    .text:00404012                 dec     ecx
    .text:00404013                 jnz     short loc_404010
    .text:00404015                 rdtsc
    .text:00404017                 pop     ebx
    .text:00404018                 sub     eax, ebx
    .text:0040401A                 int     3       ; Trap to Debugger

Your version, disassembled from your RosAsm code:

                                   cpuid
    .text:0040401D                 rdtsc
    .text:0040401F                 push    eax
    .text:00404020                 mov     ecx, 2710h
    .text:00404025                 xor     eax, eax
    .text:00404027                 jmp     short loc_404030
    .text:00404027 ; ---------------------------------------------------------------------------
    .text:00404029                 align 8
    .text:00404030
    .text:00404030 loc_404030:                     ; CODE XREF: .text:00404027j
    .text:00404030                                 ; .text:00404038j
    .text:00404030                 cmp     ecx, 0
    .text:00404033                 jbe     short loc_40403A
    .text:00404035                 add     eax, ecx
    .text:00404037                 dec     ecx
    .text:00404038                 jmp     short loc_404030
    .text:0040403A ; ---------------------------------------------------------------------------
    .text:0040403A
    .text:0040403A loc_40403A:                     ; CODE XREF: .text:00404033j
    .text:0040403A                 rdtsc
    .text:0040403C                 pop     ebx
    .text:0040403D                 sub     eax, ebx
    .text:0040403F                 int     3       ; Trap to Debugger

Well, the difference becomes pretty obvious. What you're trying to tell me is that a loop with 50% more instructions, that is,

    .text:00404030                 cmp     ecx, 0
    .text:00404033                 jbe     short loc_40403A
    .text:00404035                 add     eax, ecx
    .text:00404037                 dec     ecx
    .text:00404038                 jmp     short loc_404030

versus

    .text:00404010 loc_404010:                     ; CODE XREF: .text:0040400Dj
    .text:00404010                                 ; .text:00404013j
    .text:00404010                 add     eax, ecx
    .text:00404012                 dec     ecx
    .text:00404013                 jnz     short loc_404010

is actually *faster*? Hmmm... It sure seems like *my* measurements are a lot more intuitive. That is, the code with 50% more instructions (yours) runs 50% slower. That AMD CPU is quite amazing indeed, if this is really the case.

>
> The difference _is_ in favor of my code. Whereas in the original post, you
> claimed mine (or rather RosAsm's) while macro to be slow, because it has
> the test at the top. Even if the timings were in your favor, would not
> change the fact that the RosAsm macro does very well.

My measurements, and an inspection of the actual code that RosAsm generates behind your back, seem to bear out my original claims.

>
> So even if it is a small point, it proves you boasted out some definite
> error, in your attempt to scare people from using the RosAsm macros. This
> test definitely proves that there is no reason not to use the RosAsm While
> macro.

If you look at the two pieces of disassembled code, I think that this alone should scare people away from using macros if they want the fastest possible code. And, btw, I want to emphasize *macros*, not *RosAsm macros*.
You get the same problem whether the macro was written for RosAsm, MASM, HLA, FASM, or whatever.

What I *have* claimed is that MASM's implementation of "if" statements is *better* than the macros that come with RosAsm. This is because MASM is a bit smarter about this stuff. You will also discover that HLA's "while" loop generates the "test for loop at the end" form rather than the same code that RosAsm generates. Now perhaps that fails to be better code on your particular AMD CPU; I cannot verify that, as I do not have access to that CPU. But an inspection of the code and the measurements that I've made suggest that putting the branch at the bottom of the loop and removing an extra jump is *much* better coding indeed.

>
> Actually, since the timings are both steady, and the difference very small,
> this might even be due to an initial payment in your routine, because of a
> few misinterpreted jumps by the CPU.

Yes, not to mention your failure to serialize before the second rdtsc in each example. But that still doesn't explain the 50% difference that *I* see on a PIV. And the difference I see is right in line with the number of instructions. Imagine that.

>
> hmm. I do not know why but seems a few branch mispredictions may be at the
> cause of this. And they happen in _your_ code. Not in the RosAsm While
> macro.
----

On *your* CPU, things like pairable instructions and branch prediction *could* be why the two loops execute in a similar amount of time. It's not like the PIV is a paragon of great microcoding. But it *really* smells like you've made an error somewhere. I'd suggest that you try putting several additional instructions in the loop and see what happens then. That would counter any bizarre instruction-pairing phenomenon that is going on.

>
> This was tested on an AMD 64, 3700+ running win2000,

And mine was tested on a PIV running XP.
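To make the instruction-count side of this argument concrete, here is a minimal Python model of the two loop shapes shown in the disassembly. It is only a sketch: the function names and the `executed` counter are illustrative inventions, the guard in front of the bottom-tested loop is left uncounted, and counting instructions says nothing about pairing, branch prediction, or cache behaviour on any real CPU.

```python
def bottom_tested(n):
    """Model of: L: add eax, ecx / dec ecx / jnz L  (test at the bottom)."""
    eax, ecx, executed = 0, n, 0
    if ecx == 0:            # guard that skips the loop entirely; not counted
        return eax, executed
    while True:
        eax += ecx          # add eax, ecx
        ecx -= 1            # dec ecx
        executed += 3       # add, dec, jnz
        if ecx == 0:        # jnz falls through once ecx reaches zero
            break
    return eax, executed


def top_tested(n):
    """Model of: L: cmp ecx, 0 / jbe done / add / dec / jmp L  (test at the top)."""
    eax, ecx, executed = 0, n, 0
    while True:
        executed += 2       # cmp ecx, 0 and jbe
        if ecx == 0:        # jbe leaves the loop once ecx is zero
            break
        eax += ecx          # add eax, ecx
        ecx -= 1            # dec ecx
        executed += 3       # add, dec, jmp
    return eax, executed


if __name__ == "__main__":
    for loop in (bottom_tested, top_tested):
        total, executed = loop(10000)
        print(loop.__name__, total, executed)
```

For n = 10000 the model counts 30000 executed loop instructions for the bottom-tested form and 50002 for the top-tested form, with both producing the same sum, 50005000.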
> TestProc:
>
>     cpuid
>     rdtsc | push eax
>     mov ecx D$n
>     xor eax eax

;You've just discovered the problem with
; relative local labels here. Do you see
; the problem in this code? This is
; *exactly* why I refused to put this
; lame form of local labels into HLA.
; Earlier assemblers I'd written
; had relative local labels and I
; saw this problem *far* too often.

>     jecxz L0>
>     Align 16
> L0:
>
>     add eax ecx
>     dec ecx
>     jnz L0<
> L0:
>
>     rdtsc | pop ebx        ;Is this rdtsc serialized?
>     sub eax ebx
>     int 3
> ;/4EBA
>
>
>     cpuid
>     rdtsc | push eax
>     mov ecx 10000          ;This is different from above.
>     xor eax eax
>     Align 16
>     while ecx > 0
>         add eax ecx
>         dec ecx
>     End_While
>     rdtsc | pop ebx        ; is this rdtsc serialized?
>     sub eax ebx
>     int 3
> ;4E45
>

Another issue: caching effects are not allowed for in this code. The way you executed it, by running and then stopping, guarantees that the code will *not* be in the cache when you run it. What you should *really* do is run each code fragment in a loop a couple of times and then use the last measurement. That way, everything is in cache and you'll get more realistic readings. Indeed, the reason your timings may be so close is that the memory subsystem on your PC is sub-par and what you're really measuring is the amount of time it takes to read data from main memory.

Cheers,
Randy Hyde

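The "run it a couple of times and keep the last measurement" advice can be sketched as a small harness. This is an assumption-laden illustration, not a cycle-accurate benchmark: `time_runs` and `sum_down` are hypothetical names, and Python's `time.perf_counter_ns` stands in for a serialized rdtsc, so it shows only the warm-up-then-measure pattern.

```python
import time


def time_runs(fragment, runs=5):
    """Run `fragment` `runs` times and return each elapsed time in nanoseconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        fragment()
        timings.append(time.perf_counter_ns() - start)
    return timings


def sum_down(n=10000):
    """Stand-in workload: the same countdown sum as the loops in the thread."""
    total = 0
    while n > 0:
        total += n
        n -= 1
    return total


if __name__ == "__main__":
    timings = time_runs(sum_down)
    print("all runs:", timings)                       # early runs include cold-start cost
    print("warm measurement (last run):", timings[-1])
```

Keeping every reading rather than only the last one also makes it easy to see whether the first run is an outlier, which is the effect the cold-cache objection predicts.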
From: randyhyde@earthlink.net on 16 Mar 2006 12:05

sevagK wrote:
> >
> > I am not sure 9 is the true limit, but if it isn't it should be.

Yes. As usual, 'bee, you try to turn a limitation of RosAsm into some bizarre kind of advantage. You're the one who keeps telling us that assembly language removes all the limitations. Why would you accept an assembler that places limitations on you?

>
> A couple of questions:
>
> Is this limit just for if..endif macros nesting or does the limit
> include nesting a combination of macros?
> eg:
> if...
>   if...
>     while...
>       forever....
>         if..
>           ...
>         endif
>       endfor
>     endwhile
>   endif
> endif

Sevag-
No. Not the way the current RosAsm macros are written. Rene has used a *different* local symbol for each set of macros, e.g., something like "I" for IF, "W" for WHILE, etc.

However, the story that you're missing is that because RosAsm doesn't support true local symbols in macros, you can get into a *lot* of trouble if you try something like this:

    if eax > 0
        cmp ebx 1
        jne >I0
        ...
    I0:
        ...
    endif

Unfortunately, the IF macros have already defined I0 (which might be at the endif clause). Alas, because of the nature of RosAsm macros (no local macro symbols), the IF statement above transfers control to the I0 label rather than to the endif. Rene will tell you that it's up to the programmer to realize this and not use "I" labels; I say that's a crock. None of the other assemblers have this problem.

>
>
> If one macro uses a variable &&0, and another macro also uses that same
> symbol, does Rosasm generate unique local symbols for each macro that
> uses the &&0 variable or does the whole thing get arsed when some macro
> nested somewhere in another macro overwrites an &&x variable?

&&x are global entities. There are no local macro symbols. And, in particular, there is no way to correctly carry over information (such as symbol names) from one macro to another. This is why Rene suggests that people use the if, .if, ..if, ...if, etc. scheme: his macro system cannot figure out whether you're missing an endif or have an extra endif, etc., when using the same if/else/endif scheme everyone else does.

Cheers,
Randy Hyde

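The label collision described in this post can be modelled with a toy resolver. Everything here is an assumption made for illustration: the tuple-based `PROGRAM` encoding, the `resolve_forward` helper, and the idea that the IF expansion emits a forward jump to I0 while ENDIF emits the I0 label. The one rule taken from the thread is that a forward reference binds to the nearest later label of that name.

```python
def resolve_forward(program, index):
    """Return the position of the nearest later label targeted by the jump at `index`."""
    kind, name = program[index]
    if kind != "jump_forward":
        raise ValueError("not a forward jump")
    for target in range(index + 1, len(program)):
        if program[target] == ("label", name):
            return target
    raise LookupError(f"no later label named {name}")


PROGRAM = [
    ("jump_forward", "I0"),   # 0: emitted by the hypothetical IF expansion, meant for the endif
    ("insn", "cmp ebx 1"),    # 1
    ("jump_forward", "I0"),   # 2: the programmer's own jne >I0
    ("insn", "..."),          # 3
    ("label", "I0"),          # 4: the programmer's own I0:
    ("insn", "..."),          # 5
    ("label", "I0"),          # 6: emitted by the hypothetical ENDIF expansion
]

if __name__ == "__main__":
    print("IF jump lands at", resolve_forward(PROGRAM, 0), "but was meant for 6")
    print("user jump lands at", resolve_forward(PROGRAM, 2))
```

In this model the jump generated for the IF binds to position 4, the programmer's label, instead of position 6. An assembler that gives each macro expansion its own fresh label name would not have the two I0 definitions compete in the first place.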
From: randyhyde@earthlink.net on 16 Mar 2006 12:10

o//annabee wrote:
>
> > Rene, you should learn the difference between "arbitrary" and "up to 10
> > levels".
> > As I said in my original post, 10 levels for an IF statement is
> > probably sufficient for most people, but it is *not* arbitrary.
>
> arbitrary, in this context is insane.

Perhaps to you. But arbitrary in the sense of being able to write a recursive macro that handles an arbitrary depth (subject to reasonable memory allocation, of course) is not insane. The fact that *you* haven't reached the level where you can imagine what a recursive macro might be used for doesn't mean that the whole world is stuck at your level. Gee, you seem to have trouble figuring out what conditional assembly is used for (or so I assume from reading your posts elsewhere). I suspect you wouldn't even know what a recursive macro is, much less what you would use it for.

>
> > And
> > although 10 levels may be sufficient for an IF, it most certainly is
> > not sufficient for other applications.
>
> Give an example.

Recursive macros. Such as the pattern matching macros in the HLA Standard Library.

>
> > The fact that assemblers like
> > MASM, TASM, and HLA can handle an arbitrary depth (well, subject to
> > internal memory constraints) and RosAsm cannot suggests that RosAsm is
> > less powerful than these other assemblers in this respect.
>
> Give an example of arbitrary levels please.

Pattern matching macros in the HLA standard library.

Cheers,
Randy Hyde

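As a rough picture of what "arbitrary depth" buys you, here is a small sketch of an expander that nests IF/ENDIF to any depth, limited only by memory. It is not how MASM, TASM, or HLA implement their macros, and it uses an explicit stack rather than a recursive macro. The `expand` function, the `jump_if_not(...)` output format, and the `endif_N` label names are all invented for the example.

```python
import itertools


def expand(lines):
    """Expand IF/ENDIF pseudo-statements into jumps and unique labels."""
    counter = itertools.count()   # source of fresh label numbers
    stack = []                    # one pending label per open IF
    out = []
    for line_no, line in enumerate(lines, 1):
        text = line.strip()
        if text.lower().startswith("if "):
            label = f"endif_{next(counter)}"
            stack.append(label)
            out.append(f"jump_if_not({text[3:]}) {label}")
        elif text.lower() == "endif":
            if not stack:
                raise SyntaxError(f"line {line_no}: endif without matching if")
            out.append(f"{stack.pop()}:")
        else:
            out.append(text)
    if stack:
        raise SyntaxError(f"{len(stack)} if block(s) missing an endif")
    return out


if __name__ == "__main__":
    source = ["if eax > 0", "if ebx > 0", "nop", "endif", "endif"]
    for emitted in expand(source):
        print(emitted)
```

Because each open IF pushes its own label, the same `if`/`endif` spelling works at every level, and an extra or missing `endif` is reported instead of silently binding to the wrong label.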
From: Betov on 16 Mar 2006 12:15

o//annabee <fack(a)szmyggenpv.com> wrote in news:op.s6ik2cnqce7g4q(a)bonus:

> Which reminds me. Where can we download the 6 non-trivial masterful
> applications you have written in assembly? I looked at Webster, but
> couldn't find anything but Christian resources, which I thought were
> odd. And some book reviews. I keep asking in case your holiness missed
> that post.

Courage: Kill him ! Kill him !

At the end, he will point you to the pathetic "HLA Advantures" game, that we have suffered for, now,... how many _years_ exactly?...

:]]]]]

Betov.

< http://rosasm.org >