From: Jim Mack on 13 Mar 2010 18:46

MM wrote:
> Jim Mack wrote:
>>
>> Are you doing this compiled to native code with all optimizations?
>
> No, I was running in the IDE when I ran the comparison speed tests.
> I ran them each 3x and each time the result was 18s for the shifts
> and 19s for the SwapEndian08 (per million iterations = For/Next
> loop).

That explains it. I think you'll find that the version with QPP shifts is considerably slower than the Swap08 code when running compiled. Running in the IDE isn't representative of how well compiled code will run.

>> Just for comparison, try this:
>>
>> http://www.microdexterity.com/other/bendian.dll
>
> I will tomorrow. I'm too tired now. It's bedtime!

A reminder: I modified that link to http://www.microdexterity.com/other/bendian.zip

--
Jim Mack
Twisted tees at http://www.cafepress.com/2050inc
"We sew confusion"

From: Mike Williams on 14 Mar 2010 03:44

"Jim Mack" <jmack(a)mdxi.nospam.com> wrote in message
news:uskvqzvwKHA.5940(a)TK2MSFTNGP02.phx.gbl...

> As far as the Declare not working... I have no trouble with
> it here. Let me cut and paste the actual Declare I used rather
> than the one I typed into the text file. I see that one got
> word-wrapped along the way:

I had already actually dealt with the word wrapping and it still bombed out. Part of the problem was that I was running it only as a compiled exe and so I did not see any IDE errors. I later ran it in the IDE to look for the problem and I got a "bad DLL calling convention" error, which I could see was because of a problem in the declaration, where the text file has:

    Private Declare Function Endian4Dec lib "bendian.dll" _
        Alias "Endian4" (ByVal LongToSwap) As Long

. . . so the problem was due to the missing "As Long" in "(ByVal LongToSwap)". I had not noticed that when I pasted in the code and, now that I've fixed it by adding "As Long", the test code using that declaration runs fine.

> > [Mike W's previously posted results]
> > 0.058 microseconds for Nobody's VB code method
> > 0.026 microseconds for wsock32.dll method
> > 0.018 microseconds for Donald Lessau's VB code
> > 0.005 microseconds for Jim Mack's DLL
>
> Hmmm. Well, I wonder if I could track down the earlier thread
> where people were seeing slightly slower times for the ASM
> code vs optimized VB code. I thought it was the same vbspeed
> code we're using here, but maybe there's something even better
> out there.

Actually I think that in those previous tests you might have been using your ASM code when used as a standard Declaration rather than when exposed as a TypeLib, and that the overhead of the call to the declared function might have been swamping out the underlying speed of your ASM code, resulting in the vbspeed optimised VB code "pipping it to the post". If that is the case, and if at the time you had used the "exposed as a TypeLib" version instead, then I suspect your ASM solution would have "walked it" as far as beating the compiled VB code is concerned. In fact the overhead of a call to a declaration is what I was alluding to a couple of days ago when I said in my response to MM, ". . but generally any code that calls a declared function to perform a very small task is slowed down by the overhead of the call itself."

Now that I've got the declared function working I've run the tests again, on the same laptop, and this is what I am getting (all approximately the same as before except there is now another result, which is for your own function when used as a standard declaration):

    0.057 microseconds for Nobody's VB code method
    0.026 microseconds for hton in wsock32.dll method
    0.023 microseconds for Jim Mack's ASM as a standard declaration
    0.018 microseconds for Donald Lessau's VB code method
    0.005 microseconds for Jim Mack's ASM exposed as a TypeLib

So, your own ASM code when exposed as a TypeLib runs away with it, beating "all comers" hands down. The overhead of an empty "For Next" loop in my test loop itself is about 0.001 microseconds per iteration, so the call to your ASM itself takes only about 0.004 microseconds and, quite frankly, I cannot see any other solution beating a time of just 4 nanoseconds (at least not on my little old laptop!).

Nice one. Do you mind if I hang on to your DLL to use myself if I should ever find the need to perform such "endian" swaps at a rapid rate?

Mike

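[Editor's note] For reference, the fix described in the post above (adding the missing "As Long" to the parameter) would give a declaration along these lines. This is only a sketch: the names Endian4Dec, Endian4, LongToSwap and bendian.dll are taken from the post, while the CheckEndian4Dec routine and its test value are hypothetical and assume that Endian4 simply reverses the four bytes of a Long. In VB6 a parameter with no type defaults to Variant, so the original "(ByVal LongToSwap)" most likely pushed a Variant rather than a 4-byte Long, which would explain the "bad DLL calling convention" error.

```vb
Option Explicit

' Corrected declaration: the parameter is now explicitly a Long.
' bendian.dll must be somewhere the loader can find it (for example
' in the application folder or on the system path).
Private Declare Function Endian4Dec Lib "bendian.dll" _
    Alias "Endian4" (ByVal LongToSwap As Long) As Long

' Hypothetical sanity check, assuming Endian4 reverses the byte order
' of a 32-bit value: &H4B3B2B1B should come back as &H1B2B3B4B.
Private Sub CheckEndian4Dec()
    Debug.Assert Endian4Dec(&H4B3B2B1B) = &H1B2B3B4B
End Sub
```

Running the check in the IDE first is worthwhile, since a mismatched declaration is reported there as a trappable error but may simply crash a compiled exe.
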
From: MM on 14 Mar 2010 04:37

On Sat, 13 Mar 2010 23:40:44 -0000, "Mike Williams" <Mike(a)WhiskyAndCoke.com> wrote:

>"MM" <kylix_is(a)yahoo.co.uk> wrote in message
>news:h45op5te0ksqhdm93451apmuqjkvsejbbh(a)4ax.com...
>
>> No, I was running in the IDE when I ran the comparison
>> speed tests. I ran them each 3x and each time the result
>> was 18s for the shifts and 19s for the SwapEndian08
>> (per million iterations = For/Next loop).
>
>You're not going to get much in the way of speed when running "top heavy"
>code in the IDE or as pcode. You need to be running it as a native code
>compiled exe.

It was the *comparison* that was significant, not the absolute speed.

MM

From: MM on 14 Mar 2010 06:10

On Sat, 13 Mar 2010 18:46:41 -0500, "Jim Mack" <jmack(a)mdxi.nospam.com> wrote:

>MM wrote:
>> Jim Mack wrote:
>>>
>>> Are you doing this compiled to native code with all optimizations?
>>
>> No, I was running in the IDE when I ran the comparison speed tests.
>> I ran them each 3x and each time the result was 18s for the shifts
>> and 19s for the SwapEndian08 (per million iterations = For/Next
>> loop).
>
>That explains it. I think you'll find that the version with QPP shifts
>is considerably slower than the Swap08 code when running compiled.
>Running in the IDE isn't representative of how well compiled code will
>run.

No, the comparisons are very similar. Here are my final results, done just now with the COMPILED app (compiled to native code, optimise for fast code, advanced optimisations all unchecked):

Each test done three times. All times in secs per 1 million iterations:

    1. CopyMem        19  19  19
    2. Shifts         17  16  17
    3. SwapEndian08   17  17  17
    4. Bendian        18  17  17
    5. String         22  22  22

Notes (line spacings removed):

1.
    Function Endian32_UsingCopyMem(ByVal n As Long) As Long
        Dim big(3) As Byte
        Dim little(3) As Byte
        CopyMemory big(0), n, 4
        little(3) = big(0)
        little(2) = big(1)
        little(1) = big(2)
        little(0) = big(3)
        CopyMemory n, little(0), 4
        Endian32_UsingCopyMem = n
    End Function

2.
    Function Endian32_UsingShift(ByVal n As Long) As Long
        Endian32_UsingShift = ((ShiftLL(n And &HFF&, 24)) _
            Or (ShiftLL(n And &HFF00&, 8)) _
            Or (ShiftLR(n And &HFF0000, 8)) _
            Or (ShiftLR(n And &HFF000000, 24)))
    End Function

3.
    Function SwapEndian08(ByVal dw As Long) As Long
        SwapEndian08 = _
            (((dw And &HFF000000) \ &H1000000) And &HFF&) Or _
            ((dw And &HFF0000) \ &H100&) Or _
            ((dw And &HFF00&) * &H100&) Or _
            ((dw And &H7F&) * &H1000000)
        If (dw And &H80&) Then SwapEndian08 = SwapEndian08 Or &H80000000
    End Function

4.
    Function Endian32_UsingBendian(ByVal n As Long) As Long
        Endian32_UsingBendian = Endian4(n)
    End Function

5.
    Function Endian32_UsingString(ByVal n As Long) As Long
        Dim s As String
        Dim d As String
        Dim i As Integer
        s = Right$("0000000" & Hex$(n), 8)
        For i = 1 To 7 Step 2
            d = Mid$(s, i, 2) & d
        Next
        Endian32_UsingString = CLng("&H" & d)
    End Function

In each case I entered the string 4B3B2B1B into txtNumber. The same calling routine was used in all cases:

    Dim start As Date
    Dim n As Long
    Dim count As Long
    start = Now
    For count = 1 To 1000000
        n = "&H" & txtNumber
        txtResult = Hex$(Endian32_UsingCopyMem(n)) ' This line changes as per method used - see below
    Next
    MsgBox "CopyMem: " & DateDiff("s", start, Now)

    1. txtResult = Hex$(Endian32_UsingCopyMem(n))
    2. txtResult = Hex$(Endian32_UsingShift(n))
    3. txtResult = Hex$(SwapEndian08(n))
    4. txtResult = Hex$(Endian32_UsingBendian(n))
    5. txtResult = Hex$(Endian32_UsingString(n))

So, all in all, it's been an interesting exercise, and the difference is most marked between my original string approach (5) and the others. I should note that the bendian method won't work in the IDE (file bendian.dll not found).

MM

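[Editor's note] The calling routine above reads txtNumber, converts a string and writes to txtResult on every iteration, and it times the run with Now/DateDiff at one-second resolution. At roughly 17 seconds per million iterations that is about 17 microseconds per pass, against the 0.005 to 0.057 microseconds per call reported earlier in the thread, so the textbox and string work is probably what is being measured. A minimal sketch of a tighter harness follows. It is an illustration under stated assumptions, not the method anyone in the thread said they used: the Currency-based QueryPerformanceCounter declarations are a common VB6 idiom, TimeSwapEndian08 and ITERATIONS are hypothetical names, and SwapEndian08 is assumed to be the function from the post above, placed in the same module.

```vb
Option Explicit

' High-resolution timer API. Passing a Currency variable is a common
' VB6 idiom for receiving the 64-bit counter value.
Private Declare Function QueryPerformanceCounter Lib "kernel32" _
    (lpPerformanceCount As Currency) As Long
Private Declare Function QueryPerformanceFrequency Lib "kernel32" _
    (lpFrequency As Currency) As Long

Private Const ITERATIONS As Long = 1000000

' Returns the average time per call of SwapEndian08 in microseconds,
' with the cost of an empty For/Next loop subtracted.
Private Function TimeSwapEndian08() As Double
    Dim freq As Currency
    Dim t0 As Currency, t1 As Currency, t2 As Currency
    Dim i As Long, n As Long, r As Long

    n = &H4B3B2B1B                  ' fixed input, set once outside the loops
    QueryPerformanceFrequency freq

    QueryPerformanceCounter t0
    For i = 1 To ITERATIONS         ' empty loop: loop overhead only
    Next
    QueryPerformanceCounter t1
    For i = 1 To ITERATIONS
        r = SwapEndian08(n)         ' only the call under test is in the loop
    Next
    QueryPerformanceCounter t2

    ' (t2 - t1) is the test loop and (t1 - t0) the empty loop. The
    ' Currency scaling cancels when dividing by freq, leaving seconds.
    TimeSwapEndian08 = (((t2 - t1) - (t1 - t0)) / freq) _
        * 1000000# / ITERATIONS
End Function
```

It could be called as MsgBox Format$(TimeSwapEndian08(), "0.000") & " microseconds per call", with the one line inside the second loop swapped for each method in turn. As discussed in the thread, the figures are only meaningful when the project is compiled to native code.
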
From: Mike Williams on 14 Mar 2010 06:22

"MM" <kylix_is(a)yahoo.co.uk> wrote in message
news:5m7pp550144ecifs36v56nvejibf0t3vci(a)4ax.com...

>> [Mike W said] You're not going to get much in the way of speed
>> when running "top heavy" code in the IDE or as pcode. You
>> need to be running it as a native code compiled exe.
>
> [MM said] It was the *comparison* that was significant,
> not the absolute speed.

Yes, but the comparison is not valid either. Regardless of the absolute timing values, the relationship between the speed of the different methods is in most cases totally different for code run in the IDE than it is for code run as a native code compiled exe, because the "pcode" overhead of the individual VB statements is in many cases massive when compared to the "under the hood" execution time, especially when dealing with statements that have a very small compiled execution time. This gives the code with the fewer VB statements a very large advantage over code with more VB statements when run in the IDE, so much so that the code with the fewer VB statements can beat the code with the larger number of VB statements when run in the IDE even though in a compiled exe completely the opposite may be true and the code with the larger number of VB statements might actually be faster.

In order to get a real comparison of the true speed of the different methods you really do need to run the code in the way it would normally be run in your distributed app, which is as a compiled exe. Also, when dealing with code that has "under the hood" execution times that are very small compared to the pcode overhead, you really do need to compile as a native code compiled exe and not as a pcode compiled exe. Running such code in the IDE, or as a pcode compiled exe, simply will not give you a true comparison of the speed of the different methods.

As a specific example, here again are the results of the native code compiled exe tests that I recently posted for two of the methods tested on my own laptop:

    0.023 microseconds for Jim Mack's ASM as a standard declaration
    0.018 microseconds for Donald Lessau's VB code method

As you can see, Donald Lessau's straight VB code method beats Jim Mack's DLL (where the DLL function is declared in the usual way) and Donald Lessau's method is about 30 per cent faster. However, here are the results of exactly the same test on exactly the same machine when compiled as a pcode compiled exe (which will be similar to the speed when run in the IDE):

    0.041 microseconds for Jim Mack's ASM as a standard declaration
    0.320 microseconds for Donald Lessau's VB code method

You will see that the results are completely different. In both cases they are slower of course, as would be expected, but in addition to being slower the relationship between the two has totally changed, with Donald Lessau's VB code method (which was previously the faster of the two) now being very much the slower, so the result is reversed and Jim Mack's code is massively faster, almost eight times faster, than Donald Lessau's code.

Essentially, to get a true comparison of the speed of different methods you need to run your code in the condition it will be in when you have distributed it to your customers, which in most cases, and particularly in the case of "top heavy" code, is as a native code compiled exe.

Mike