Which Endian routine is better? [Visual Basic]

From: Jim Mack on 13 Mar 2010 18:46

MM wrote:
> Jim Mack wrote:
>>
>> Are you doing this compiled to native code with all optimizations?
>
> No, I was running in the IDE when I ran the comparison speed tests.
> I ran them each 3x and each time the result was 18s for the shifts
> and 19s for the SwapEndian08 (per million iterations = For/Next
> loop).

That explains it. I think you'll find that the version with QPP shifts
is considerably slower than the Swap08 code when running compiled.
Running in the IDE isn't representative of how well compiled code will
run.

>> Just for comparison, try this:
>>
>> http://www.microdexterity.com/other/bendian.dll
>
> I will tomorrow. I'm too tired now. It's bedtime!

A reminder: I modified that link to

http://www.microdexterity.com/other/bendian.zip

--
Jim Mack
Twisted tees at http://www.cafepress.com/2050inc
"We sew confusion"

From: Mike Williams on 14 Mar 2010 03:44

"Jim Mack" <jmack(a)mdxi.nospam.com> wrote in message
news:uskvqzvwKHA.5940(a)TK2MSFTNGP02.phx.gbl...

> As far as the Declare not working... I have no trouble with
> it here. Let me cut and paste the actual Declare I used rather
> than the one I typed into the text file. I see that one got
> word-wrapped along the way:

I had already actually dealt with the word wrapping and it still bombed out.
Part of the problem was that I was running it only as a compiled exe and so
I did not see any IDE errors. I later ran it in the IDE to look for the
problem and I got a "bad DLL calling convention" error, which I could see
was because of a problem in the declaration, where the text file has:

Private Declare Function Endian4Dec lib "bendian.dll" _
Alias "Endian4" (ByVal LongToSwap) As Long

... so the problem was due to the missing "As Long" in "(ByVal
LongToSwap)". I had not noticed that when I pasted in the code and, now that
I've fixed it by adding "As Long", the test code using that declaration runs
fine.
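
For reference, here is a sketch of the declaration with the fix described above applied; the explicit "As Long" on the parameter is the only change from the text file version. Without a type, a VB6 parameter defaults to Variant, so a ByVal argument would be pushed as a 16-byte Variant where the DLL presumably expects a 4-byte Long. That stack mismatch is the usual cause of a "bad DLL calling convention" error. The Endian4 signature is assumed from this thread, not from any documentation for the DLL.

```vb
' Corrected declaration: the parameter is explicitly typed As Long,
' so VB passes 4 bytes on the stack instead of a 16-byte Variant.
' (Signature assumed from the thread; bendian.dll must be on the DLL search path.)
Private Declare Function Endian4Dec Lib "bendian.dll" _
    Alias "Endian4" (ByVal LongToSwap As Long) As Long
```

The IDE checks the stack after each declared call and raises the error, whereas a compiled exe does no such check, which would be consistent with the compiled test simply bombing out.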

> > [Mike W's previously posted results]
> > 0.058 microseconds for Nobody's VB code method
> > 0.026 microseconds for wsock32.dll method
> > 0.018 microseconds for Donald Lessau's VB code
> > 0.005 microseconds for Jim Mack's DLL
>
> Hmmm. Well, I wonder if I could track down the earlier thread
> where people were seeing slightly slower times for the ASM
> code vs optimized VB code. I thought it was the same vbspeed
> code we're using here, but maybe there's something even better
> out there.

Actually I think that in those previous tests you might have been using your
ASM code when used as a standard Declaration rather than when exposed as a
TypeLib, and that the overhead of the call to the declared function might
have been swamping out the underlying speed of your ASM code, resulting in
the vbspeed optimised VB code "pipping it to the post". If that is the case,
and if at the time you had used the "exposed as a TypeLib" version instead,
then I suspect your ASM solution would have "walked it" as far as beating
the compiled VB code is concerned. In fact the overhead of a call to a
declaration is what I was alluding to a couple of days ago when I said in my
response to MM, "... but generally any code that calls a declared
function to perform a very small task is slowed down by the overhead of the
call itself."

Now that I've got the declared function working I've run the tests again, on
the same laptop, and this is what I am getting (all approximately the same
as before except there is now another result, which is for your own function
when used as a standard declaration):

0.057 microseconds for Nobody's VB code method
0.026 microseconds for htonl in wsock32.dll method
0.023 microseconds for Jim Mack's ASM as a standard declaration
0.018 microseconds for Donald Lessau's VB code method
0.005 microseconds for Jim Mack's ASM exposed as a TypeLib

So, your own ASM code when exposed as a TypeLib runs away with it, beating
"all comers" hands down. The overhead of an empty "For Next" loop in my test
loop itself is about 0.001 microseconds per iteration, so the call to your
ASM itself takes only about 0.004 microseconds and, quite frankly, I cannot
see any other solution beating a time of just 4 nanoseconds (at least not on
my little old laptop!). Nice one. Do you mind if I hang on to your DLL to
use myself if I should ever find the need to perform such "endian" swaps at
a rapid rate?

Mike
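
The measurement described above (time the loop with the call, then subtract the cost of an empty loop) can be sketched roughly as follows. This is not the test code actually used in the thread, which is not shown here. SwapUnderTest is a hypothetical placeholder for whichever routine is being measured, and the iteration count is an arbitrary assumption. QueryPerformanceCounter and QueryPerformanceFrequency are standard kernel32 functions, declared here with Currency parameters as a common VB6 idiom for receiving 64-bit values.

```vb
Option Explicit

Private Declare Function QueryPerformanceCounter Lib "kernel32" _
    (lpPerformanceCount As Currency) As Long
Private Declare Function QueryPerformanceFrequency Lib "kernel32" _
    (lpFrequency As Currency) As Long

' Hypothetical placeholder: replace the body with the routine under test.
Private Function SwapUnderTest(ByVal n As Long) As Long
    SwapUnderTest = n
End Function

Private Sub TimeSwap()
    Const ITERATIONS As Long = 10000000   ' arbitrary; adjust to taste
    Dim freq As Currency
    Dim t0 As Currency, t1 As Currency, t2 As Currency
    Dim i As Long, r As Long
    Dim emptyUs As Double, testUs As Double

    QueryPerformanceFrequency freq

    ' Pass 1: empty loop, to measure the loop overhead alone.
    QueryPerformanceCounter t0
    For i = 1 To ITERATIONS
    Next i
    QueryPerformanceCounter t1

    ' Pass 2: the same loop with the call inside it.
    For i = 1 To ITERATIONS
        r = SwapUnderTest(i)
    Next i
    QueryPerformanceCounter t2

    ' Currency scales every value by the same factor, so it cancels in the ratio.
    emptyUs = (t1 - t0) / freq * 1000000# / ITERATIONS
    testUs = (t2 - t1) / freq * 1000000# / ITERATIONS

    MsgBox "Per call, loop overhead removed: " & _
        Format$(testUs - emptyUs, "0.000") & " microseconds"
End Sub
```

As discussed in the thread, figures like these are only meaningful when the project is compiled to native code.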

From: MM on 14 Mar 2010 04:37

On Sat, 13 Mar 2010 23:40:44 -0000, "Mike Williams"
<Mike(a)WhiskyAndCoke.com> wrote:

>"MM" <kylix_is(a)yahoo.co.uk> wrote in message
>news:h45op5te0ksqhdm93451apmuqjkvsejbbh(a)4ax.com...
>
>> No, I was running in the IDE when I ran the comparison
>> speed tests. I ran them each 3x and each time the result
>> was 18s for the shifts and 19s for the SwapEndian08
>> (per million iterations = For/Next loop).
>
>You're not going to get much in the way of speed when running "top heavy"
>code in the IDE or as pcode. You need to be running it as a native code
>compiled exe.

It was the *comparison* that was significant, not the absolute speed.

MM

From: MM on 14 Mar 2010 06:10

On Sat, 13 Mar 2010 18:46:41 -0500, "Jim Mack" <jmack(a)mdxi.nospam.com>
wrote:

>MM wrote:
>> Jim Mack wrote:
>>>
>>> Are you doing this compiled to native code with all optimizations?
>>
>> No, I was running in the IDE when I ran the comparison speed tests.
>> I ran them each 3x and each time the result was 18s for the shifts
>> and 19s for the SwapEndian08 (per million iterations = For/Next
>> loop).
>
>That explains it. I think you'll find that the version with QPP shifts
>is considerably slower than the Swap08 code when running compiled.
>Running in the IDE isn't representative of how well compiled code will
>run.

No, the comparisons are very similar. Here are my final results, done
just now with the COMPILED app (compiled to native code, optimise for
fast code, advanced optimisations all unchecked):

Each test done three times. All times in secs per 1 million
iterations:

                   Run 1  Run 2  Run 3
1. CopyMem           19     19     19
2. Shifts            17     16     17
3. SwapEndian08      17     17     17
4. Bendian           18     17     17
5. String            22     22     22

Notes (line spacings removed):
1. Function Endian32_UsingCopyMem(ByVal n As Long) As Long
       Dim big(3) As Byte
       Dim little(3) As Byte
       CopyMemory big(0), n, 4
       little(3) = big(0)
       little(2) = big(1)
       little(1) = big(2)
       little(0) = big(3)
       CopyMemory n, little(0), 4
       Endian32_UsingCopyMem = n
   End Function

2. Function Endian32_UsingShift(ByVal n As Long) As Long
       Endian32_UsingShift = ((ShiftLL(n And &HFF&, 24)) _
           Or (ShiftLL(n And &HFF00&, 8)) _
           Or (ShiftLR(n And &HFF0000, 8)) _
           Or (ShiftLR(n And &HFF000000, 24)))
   End Function

3. Function SwapEndian08(ByVal dw As Long) As Long
       SwapEndian08 = _
           (((dw And &HFF000000) \ &H1000000) And &HFF&) Or _
           ((dw And &HFF0000) \ &H100&) Or _
           ((dw And &HFF00&) * &H100&) Or _
           ((dw And &H7F&) * &H1000000)
       If (dw And &H80&) Then SwapEndian08 = SwapEndian08 Or &H80000000
   End Function

4. Function Endian32_UsingBendian(ByVal n As Long) As Long
       Endian32_UsingBendian = Endian4(n)
   End Function

5. Function Endian32_UsingString(ByVal n As Long) As Long
       Dim s As String
       Dim d As String
       Dim i As Integer
       s = Right$("0000000" & Hex$(n), 8)
       For i = 1 To 7 Step 2
           d = Mid$(s, i, 2) & d
       Next
       Endian32_UsingString = CLng("&H" & d)
   End Function

In each case I entered the string 4B3B2B1B into txtNumber. The same
calling routine was used in all cases:

Dim start As Date
Dim n As Long
Dim count As Long
start = Now
For count = 1 To 1000000
    n = "&H" & txtNumber
    ' This line changes as per method used - see below
    txtResult = Hex$(Endian32_UsingCopyMem(n))
Next
MsgBox "CopyMem: " & DateDiff("s", start, Now)

1. txtResult = Hex$(Endian32_UsingCopyMem(n))
2. txtResult = Hex$(Endian32_UsingShift(n))
3. txtResult = Hex$(SwapEndian08(n))
4. txtResult = Hex$(Endian32_UsingBendian(n))
5. txtResult = Hex$(Endian32_UsingString(n))

So, all in all, it's been an interesting exercise, and the difference
is most marked between my original string approach (5) and the others.
I should note that the bendian method won't work in the IDE (file
bendian.dll not found).

MM
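
One possible reason the five figures above are so close: at 17 to 22 seconds per million iterations, each pass through the loop costs roughly 17 to 22 microseconds, whereas the swap routines themselves were measured elsewhere in this thread at a few hundredths of a microsecond (on a different machine). On that reading, reading txtNumber, converting the strings and writing txtResult inside the loop would dominate the timing. Below is a minimal sketch of the same test with the text box work moved outside the timed loop. The button name cmdTest and the iteration count are assumptions, and SwapEndian08 stands in for whichever method is under test.

```vb
Private Sub cmdTest_Click()   ' hypothetical button name
    Const ITERATIONS As Long = 100000000   ' arbitrary; adjust for your machine
    Dim n As Long
    Dim result As Long
    Dim count As Long
    Dim start As Single
    Dim elapsed As Single

    ' Read and convert the input once, before timing starts.
    n = CLng("&H" & txtNumber.Text)

    start = Timer
    For count = 1 To ITERATIONS
        result = SwapEndian08(n)   ' substitute the routine under test
    Next
    elapsed = Timer - start        ' note: Timer resets at midnight

    ' Update the display once, after timing has stopped.
    txtResult.Text = Hex$(result)
    MsgBox "SwapEndian08: " & Format$(elapsed, "0.000") & " s for " & _
        ITERATIONS & " iterations"
End Sub
```

Timer has fairly coarse resolution, so the loop count needs to be large enough that the total run time is at least a second or two.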

From: Mike Williams on 14 Mar 2010 06:22

"MM" <kylix_is(a)yahoo.co.uk> wrote in message
news:5m7pp550144ecifs36v56nvejibf0t3vci(a)4ax.com...
>> [Mike W said] You're not going to get much in the way of speed
>> when running "top heavy" code in the IDE or as pcode. You
>> need to be running it as a native code compiled exe.
>
> [MM said] It was the *comparison* that was significant,
> not the absolute speed.

Yes, but the comparison is not valid either. Regardless of the absolute
timing values, the relationship between the speed of the different methods
is in most cases totally different for code run in the IDE than it is for
code run as a native code compiled exe because the "pcode" overhead of the
individual VB statements is in many cases massive when compared to the
"under the hood" execution time, especially when dealing with statements
that have a very small compiled execution time. This gives code with fewer
VB statements a very large advantage over code with more VB statements when
run in the IDE, so much so that it can win in the IDE even though in a
compiled exe completely the opposite may be true and the code with the
larger number of VB statements might actually be faster.
In order to get a real comparison of the true speed of the different methods
you really do need to run the code in the way it would normally be run in
your distributed app, which is as a compiled exe. Also, when dealing with
code that has "under the hood" execution times that are very small compared
to the pcode overhead, you really do need to compile as a native code
compiled exe and not as a pcode compiled exe. Running such code in the IDE,
or as a pcode compiled exe, simply will not give you a true comparison of
the speed of the different methods. As a specific example, here again are
the results of the native code compiled exe tests that I recently posted for
two of the methods tested on my own laptop:

0.023 microseconds for Jim Mack's ASM as a standard declaration
0.018 microseconds for Donald Lessau's VB code method

As you can see, Donald Lessau's straight VB code method beats Jim Mack's DLL
(where the DLL function is declared in the usual way) and Donald Lessau's
method is about 30 per cent faster. However, here are the results of exactly
the same test on exactly the same machine when compiled as a pcode compiled
exe (which will be similar to the speed when run in the IDE):

0.041 microseconds for Jim Mack's ASM as a standard declaration
0.320 microseconds for Donald Lessau's VB code method

You will see that the results are completely different. In both cases they
are slower of course, as would be expected, but in addition to being slower
the relationship between the two has totally changed, with Donald Lessau's
VB code method (which was previously the faster of the two) now being very
much the slower, so the result is reversed and Jim Mack's code is massively
faster, nearly eight times faster, than Donald Lessau's code.

Essentially, to get a true comparison of the speed of different methods you
need to run your code in the condition it will be in when you have
distributed it to your customers, which in most cases, and particularly in
the case of "top heavy" code, is as a native code compiled exe.

Mike
