From: Karl E. Peterson on
Jim Mack wrote:
> Karl E. Peterson wrote:
>>
>> So, confidentially <g>, do Mike's results make sense to you???
>
> I'm just getting back to this after a few days at SxSW (incredible
> scene, but I was working),

Lucky dog!

> and I glossed over a lot of messages. So
> I'm not positive which results you're talking about.

He says your routine is 3-4x faster when called via typelib rather than
a Declare.

> But if you mean the ones where my (typelib) bendian.dll beat
> SwapEndian08, which in turn edged out my (Declared) bendian.dll, then
> I'm not terribly surprised or suspicious.
>
> It's always been true that Declared functions are slower than
> tlb-exposed ones (to avoid GetLast etc). But my recollection, as I
> expressed upthread, was that we'd explored these paths before and that
> an external function just couldn't beat Swap08. I never did the tests
> myself, I just went by others' reports.

Well, I think I'm gonna have to try, now. :-)

--
.NET: It's About Trust!
http://vfred.mvps.org


From: Karl E. Peterson on
Mike Williams wrote:
> "Karl E. Peterson" <karl(a)exmvps.org> wrote in message
> news:eBvYhPtxKHA.3536(a)TK2MSFTNGP06.phx.gbl...
>
>> I hope to have an hour to throw at this, this afternoon! Thanks...
>
> Okay.

Initial results:

VB code http://www.xbeat.net/vbspeed/c_SwapEndian.htm
F1F2F3F4, F4F3F2F1 Time: 10.3 nanoseconds

Jim Mack's DLL declared as a function
F1F2F3F4, F4F3F2F1 Time: 15.1 nanoseconds

Jim Mack's DLL exposed in Project / References
F1F2F3F4, F4F3F2F1 Time: 3.1 nanoseconds

VB code http://www.xbeat.net/vbspeed/c_SwapEndian.htm
F1F2F3F4, F4F3F2F1 Time: 10.4 nanoseconds

Jim Mack's DLL declared as a function
F1F2F3F4, F4F3F2F1 Time: 15.0 nanoseconds

Jim Mack's DLL exposed in Project / References
F1F2F3F4, F4F3F2F1 Time: 3.1 nanoseconds

Now, it appears, my task is to figure out why. :-/

> One other thing I should mention (since I'm basically an honest person
> ;-)) is something that has only just occurred to me in that I can squeeze
> quite a few more nanoseconds out of the vbspeed SwapEndian08 function in the
> test code I sent you by unwrapping it, which does speed it up quite a bit,
> although even then it is still about 60 per cent slower than Jim Mack's DLL
> when exposed in Project / References. Maybe you could check the results you
> are getting at your end from the code I recently posted, and then look at the
> other thing.

Yeah, I know that could definitely be inlined to make it a bit fairer.

What I totally don't understand is the 3-5x speed difference between
the declared and undeclared external calls. That's just huge.

--
.NET: It's About Trust!
http://vfred.mvps.org


From: dpb on
Karl E. Peterson wrote:
...

> What I totally don't understand is the 3-5x speed difference between the
> declared and undeclared external calls. That's just huge.

Just a complete WAG -- ability to do a global vs purely local
optimization, maybe?

--


From: Karl E. Peterson on
dpb wrote:
> Karl E. Peterson wrote:
> ...
>
>> What I totally don't understand is the 3-5x speed difference between the
>> declared and undeclared external calls. That's just huge.
>
> Just a complete WAG -- ability to do a global vs purely local optimization,
> maybe?

There's some kind of wild optimization going on, lemme tell ya! See my
upcoming reply to Mike...

--
.NET: It's About Trust!
http://vfred.mvps.org


From: Karl E. Peterson on
Mike Williams wrote:
> "Karl E. Peterson" <karl(a)exmvps.org> wrote in message
> news:eBvYhPtxKHA.3536(a)TK2MSFTNGP06.phx.gbl...
>
>> I hope to have an hour to throw at this, this afternoon! Thanks...
>
> Okay. One other thing I should mention (since I'm basically an honest person
> ;-)) is something that has only just occurred to me in that I can squeeze
> quite a few more nanoseconds out of the vbspeed SwapEndian08 function in the
> test code I sent you by unwrapping it, which does speed it up quite a bit,
> although even then it is still about 60 per cent slower than Jim Mack's DLL
> when exposed in Project / References. Maybe you could check the results you
> are getting at your end from the code I recently posted, and then look at the
> other thing.

Be prepared to be blown away. I *cannot* explain what I'm seeing here.

First off, running your code, what I got was very similar, as I said in
another post.

So, I decided I'd drop your timing method in favor of my own CStopWatch
(http://vb.mvps.org/samples/StopWatch), and also rolled all the tests
into a single routine. I also added another test to do the vbSpeed
routine inline. Here's the full form module, to which I added a
multiline textbox for the output:

Option Explicit

Private Declare Function Endian4Declared Lib "bendian.dll" _
    Alias "Endian4" (ByVal LongToSwap As Long) As Long

Private Sub Form_Load()
    Text1.Text = ""
End Sub

Private Function SwapEndian08(ByVal dw As Long) As Long
    ' http://www.xbeat.net/vbspeed/c_SwapEndian.htm
    SwapEndian08 = _
        (((dw And &HFF000000) \ &H1000000) And &HFF&) Or _
        ((dw And &HFF0000) \ &H100&) Or _
        ((dw And &HFF00&) * &H100&) Or _
        ((dw And &H7F&) * &H1000000)
    If (dw And &H80&) Then SwapEndian08 = _
        SwapEndian08 Or &H80000000
End Function

Private Sub Command1_Click()
    Dim stp As CStopWatch
    Dim n As Long, dw As Long, wd As Long
    Dim loops As Long
    Dim t(0 To 4) As Long
    Dim msg As String

    ' Initialize stuff
    loops = 100000000
    Set stp = New CStopWatch

    ' Loop overhead
    stp.Reset
    For dw = 1 To loops
    Next dw
    t(0) = stp.Elapsed

    ' Best routine from vbSpeed, inline
    stp.Reset
    For dw = 1 To loops
        wd = _
            (((dw And &HFF000000) \ &H1000000) And &HFF&) Or _
            ((dw And &HFF0000) \ &H100&) Or _
            ((dw And &HFF00&) * &H100&) Or _
            ((dw And &H7F&) * &H1000000)
        If (dw And &H80&) Then wd = wd Or &H80000000
    Next dw
    t(1) = stp.Elapsed

    ' Best routine from vbSpeed, as function
    stp.Reset
    For dw = 1 To loops
        wd = SwapEndian08(dw)
    Next dw
    t(2) = stp.Elapsed

    ' Jim Mack's DLL declared as a function
    stp.Reset
    For dw = 1 To loops
        wd = Endian4Declared(dw)
    Next dw
    t(3) = stp.Elapsed

    ' Jim Mack's DLL exposed in Project / References
    stp.Reset
    For dw = 1 To loops
        wd = Endian4(dw)
    Next dw
    t(4) = stp.Elapsed

    ' Output
    For n = 0 To UBound(t)
        msg = "t(" & CStr(n) & ") = " & t(n)
        If n > 0 Then
            msg = msg & " - " & CStr(t(0)) _
                & " = " & CStr(t(n) - t(0))
        End If
        Dump msg
    Next n
    Dump "Loops = " & CStr(loops)
End Sub

Private Sub Dump(Optional txt As String)
    With Text1
        .SelStart = Len(.Text)
        .SelText = txt & vbCrLf
    End With
End Sub

Note that I also used the loop counter as the value to be
endian-swapped. I did this because I simply could not account for the
results I saw when I used a fixed value, as you did. As it turned out,
that did not matter. I am still getting the same *incomprehensible*
results. Example, with 100 million loops, where all times are in
milliseconds:

t(0) = 86
t(1) = 86 - 86 = 0
t(2) = 987 - 86 = 901
t(3) = 1426 - 86 = 1340
t(4) = 302 - 86 = 216
Loops = 100000000

t(0) = 86
t(1) = 86 - 86 = 0
t(2) = 987 - 86 = 901
t(3) = 1425 - 86 = 1339
t(4) = 302 - 86 = 216
Loops = 100000000
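
To compare these totals with the per-call nanosecond figures quoted
earlier in the thread, each net time can be divided by the loop count.
A minimal sketch of that conversion follows; NsPerCall is a
hypothetical helper introduced here for illustration and is not part
of the posted module:

```vb
' Hypothetical helper (not from the original post): converts a net
' elapsed time in milliseconds, measured over a number of loop
' iterations, into nanoseconds per iteration.
Private Function NsPerCall(ByVal netMs As Long, ByVal loops As Long) As Double
    ' 1 ms = 1,000,000 ns. Work in Double to avoid Long overflow.
    NsPerCall = (CDbl(netMs) * 1000000#) / CDbl(loops)
End Function
```

Applied to the first run's net figures, this gives 9.01 ns for the
SwapEndian08 function call (901 ms), 13.4 ns for the Declared DLL call
(1340 ms), and 2.16 ns for the typelib call (216 ms), so the Declared
call costs roughly six times as much as the typelib one.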

Now, why on Earth t(1) isn't showing *any* time over the loop
overhead, I just cannot say. Please check my work, and tell me I made
a typo somewhere? I really don't see it. I even moved that block
around, and swapped out the index in the t() entry.
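
One possible check, offered as a sketch only: if the optimizer is
discarding the inline computation because wd is never read afterwards
(an assumption, nothing in this thread confirms it), then folding each
result into a value that is displayed after the loop should bring the
time back. The routine name, the chk variable, and the use of Timer
and MsgBox in place of CStopWatch and the textbox are all illustrative
choices, not part of the original test:

```vb
Option Explicit

' Sketch only, not from the original post. Runs the same inline swap,
' but Xors every result into a checksum that is shown afterwards, so
' the work inside the loop has an observable effect.
Public Sub TimeInlineSwapWithChecksum()
    Const LOOP_COUNT As Long = 100000000
    Dim dw As Long, wd As Long, chk As Long
    Dim t0 As Single

    t0 = Timer
    For dw = 1 To LOOP_COUNT
        wd = _
            (((dw And &HFF000000) \ &H1000000) And &HFF&) Or _
            ((dw And &HFF0000) \ &H100&) Or _
            ((dw And &HFF00&) * &H100&) Or _
            ((dw And &H7F&) * &H1000000)
        If (dw And &H80&) Then wd = wd Or &H80000000
        chk = chk Xor wd    ' keep every result live
    Next dw

    ' MsgBox rather than Debug.Print, because Debug.Print statements
    ' are dropped from a compiled EXE.
    MsgBox "Elapsed (s): " & CStr(Timer - t0) & vbCrLf & _
           "Checksum: " & Hex$(chk)
End Sub
```

The extra Xor adds a little work per iteration, so the result is not
directly comparable with t(2) through t(4). It would only show whether
the inline loop still measures as zero once its result is used.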

Tests were run on the native compiled EXE, Pentium Pro favored, and all
advanced optimizations turned on. System is Windows 7 x64 with 10GB
RAM and dual Xeon processors.

I expected that the native code results would be fastest. In no way
did I expect them to be immeasurable! Thoughts?

--
.NET: It's About Trust!
http://vfred.mvps.org

