From: Karl E. Peterson on 18 Mar 2010 18:27

Jim Mack wrote:
> Karl E. Peterson wrote:
>>
>> So, confidentially <g>, do Mike's results make sense to you???
>
> I'm just getting back to this after a few days at SxSW (incredible
> scene, but I was working),

Lucky dog!

> and I glossed over a lot of messages. So
> I'm not positive which results you're talking about.

He says your routine is 3-4x faster when called via typelib rather than a Declare.

> But if you mean the ones where my (typelib) bendian.dll beat
> SwapEndian08, which in turn edged out my (Declared) bendian.dll, then
> I'm not terribly surprised or suspicious.
>
> It's always been true that Declared functions are slower than
> tlb-exposed ones (to avoid GetLast etc). But my recollection, as I
> expressed upthread, was that we'd explored these paths before and that
> an external function just couldn't beat Swap08. I never did the tests
> myself, I just went by others' reports.

Well, I think I'm gonna have to try, now. :-)

--
.NET: It's About Trust!
http://vfred.mvps.org

From: Karl E. Peterson on 18 Mar 2010 19:33

Mike Williams wrote:
> "Karl E. Peterson" <karl(a)exmvps.org> wrote in message
> news:eBvYhPtxKHA.3536(a)TK2MSFTNGP06.phx.gbl...
>
>> I hope to have an hour to throw at this, this afternoon! Thanks...
>
> Okay.

Initial results:

VB code http://www.xbeat.net/vbspeed/c_SwapEndian.htm
F1F2F3F4, F4F3F2F1
Time: 10.3 nanoseconds
Jim Mack's DLL declared as a function
F1F2F3F4, F4F3F2F1
Time: 15.1 nanoseconds
Jim Mack's DLL exposed in Project / References
F1F2F3F4, F4F3F2F1
Time: 3.1 nanoseconds

VB code http://www.xbeat.net/vbspeed/c_SwapEndian.htm
F1F2F3F4, F4F3F2F1
Time: 10.4 nanoseconds
Jim Mack's DLL declared as a function
F1F2F3F4, F4F3F2F1
Time: 15.0 nanoseconds
Jim Mack's DLL exposed in Project / References
F1F2F3F4, F4F3F2F1
Time: 3.1 nanoseconds

Now, it appears, my task is to figure out why. :-/

> One other thing I should mention (since I'm basically an honest person
> ;-)) is something that has only just occurred to me in that I can squeeze
> quite a few more nanoseconds out of the vbspeed SwapEndian08 function in the
> test code I sent you by unwrapping it, which does speed it up quite a bit,
> although even then it is still about 60 per cent slower than Jim Mack's DLL
> when exposed in Project / References. Maybe you could check the results you
> are getting at your end from the code I recently posted, and then look at the
> other thing.

Yeah, I know that could definitely be inlined to make it a bit fairer. What I totally don't understand is the 3-5x speed difference between the declared and undeclared external calls. That's just huge.

--
.NET: It's About Trust!
http://vfred.mvps.org

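For reference, the "3-5x" figure can be checked directly against the first run of timings above. The sketch below only does that arithmetic; the constant and procedure names are hypothetical and not part of anyone's test code. With these inputs the ratios come out at roughly 4.9x (Declare vs. typelib) and 3.3x (VB code vs. typelib).

```vb
Option Explicit

' Timings in nanoseconds per call, copied from the first run above.
' The constant names are illustrative only.
Private Const NS_VB_CODE As Double = 10.3
Private Const NS_DECLARED As Double = 15.1
Private Const NS_TYPELIB As Double = 3.1

Private Sub ShowRatios()
    ' How many times slower each variant is than the typelib-referenced call.
    MsgBox "Declare vs typelib: " & _
           Format$(NS_DECLARED / NS_TYPELIB, "0.0") & "x" & vbCrLf & _
           "VB code vs typelib: " & _
           Format$(NS_VB_CODE / NS_TYPELIB, "0.0") & "x"
End Sub
```
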
From: dpb on 18 Mar 2010 19:47

Karl E. Peterson wrote:
...
> What I totally don't understand is the 3-5x speed difference between the
> declared and undeclared external calls. That's just huge.

Just a complete WAG -- ability to do a global vs purely local optimization, maybe?

--

From: Karl E. Peterson on 18 Mar 2010 20:05

dpb wrote:
> Karl E. Peterson wrote:
> ...
>
>> What I totally don't understand is the 3-5x speed difference between the
>> declared and undeclared external calls. That's just huge.
>
> Just a complete WAG -- ability to do a global vs purely local optimization,
> maybe?

There's some kind of wild optimization going on, lemme tell ya! See my upcoming reply to Mike...

--
.NET: It's About Trust!
http://vfred.mvps.org

From: Karl E. Peterson on 18 Mar 2010 20:23

Mike Williams wrote:
> "Karl E. Peterson" <karl(a)exmvps.org> wrote in message
> news:eBvYhPtxKHA.3536(a)TK2MSFTNGP06.phx.gbl...
>
>> I hope to have an hour to throw at this, this afternoon! Thanks...
>
> Okay. One other thing I should mention (since I'm basically an honest person
> ;-)) is something that has only just occurred to me in that I can squeeze
> quite a few more nanoseconds out of the vbspeed SwapEndian08 function in the
> test code I sent you by unwrapping it, which does speed it up quite a bit,
> although even then it is still about 60 per cent slower than Jim Mack's DLL
> when exposed in Project / References. Maybe you could check the results you
> are getting at your end from the code I recently posted, and then look at the
> other thing.

Be prepared to be blown away. I *cannot* explain what I'm seeing here.

First off, running your code, what I got was very similar, as I said in another post. So, I decided I'd drop your timing method in favor of my own CStopWatch (http://vb.mvps.org/samples/StopWatch), and also rolled all the tests into a single routine. I also added another test to do the vbSpeed routine inline.

Here's the full form module, to which I added a multiline textbox for the output:

   Option Explicit

   Private Declare Function Endian4Declared Lib "bendian.dll" _
      Alias "Endian4" (ByVal LongToSwap As Long) As Long

   Private Sub Form_Load()
      Text1.Text = ""
   End Sub

   Private Function SwapEndian08(ByVal dw As Long) As Long
      ' http://www.xbeat.net/vbspeed/c_SwapEndian.htm
      SwapEndian08 = _
         (((dw And &HFF000000) \ &H1000000) And &HFF&) Or _
         ((dw And &HFF0000) \ &H100&) Or _
         ((dw And &HFF00&) * &H100&) Or _
         ((dw And &H7F&) * &H1000000)
      If (dw And &H80&) Then SwapEndian08 = _
         SwapEndian08 Or &H80000000
   End Function

   Private Sub Command1_Click()
      Dim stp As CStopWatch
      Dim n As Long, dw As Long, wd As Long
      Dim loops As Long
      Dim t(0 To 4) As Long
      Dim msg As String

      ' Initialize stuff
      loops = 100000000
      Set stp = New CStopWatch

      ' Loop overhead
      stp.Reset
      For dw = 1 To loops
      Next dw
      t(0) = stp.Elapsed

      ' Best routine from vbSpeed, inline
      stp.Reset
      For dw = 1 To loops
         wd = _
            (((dw And &HFF000000) \ &H1000000) And &HFF&) Or _
            ((dw And &HFF0000) \ &H100&) Or _
            ((dw And &HFF00&) * &H100&) Or _
            ((dw And &H7F&) * &H1000000)
         If (dw And &H80&) Then wd = wd Or &H80000000
      Next dw
      t(1) = stp.Elapsed

      ' Best routine from vbSpeed, as function
      stp.Reset
      For dw = 1 To loops
         wd = SwapEndian08(dw)
      Next dw
      t(2) = stp.Elapsed

      ' Jim Mack's DLL declared as a function
      stp.Reset
      For dw = 1 To loops
         wd = Endian4Declared(dw)
      Next dw
      t(3) = stp.Elapsed

      ' Jim Mack's DLL exposed in Project / References
      stp.Reset
      For dw = 1 To loops
         wd = Endian4(dw)
      Next dw
      t(4) = stp.Elapsed

      ' Output
      For n = 0 To UBound(t)
         msg = "t(" & CStr(n) & ") = " & t(n)
         If n > 0 Then
            msg = msg & " - " & CStr(t(0)) _
               & " = " & CStr(t(n) - t(0))
         End If
         Dump msg
      Next n
      Dump "Loops = " & CStr(loops)
   End Sub

   Private Sub Dump(Optional txt As String)
      With Text1
         .SelStart = Len(.Text)
         .SelText = txt & vbCrLf
      End With
   End Sub

Note that I also used the loop counter as the variable to be endian-swapped. I did this because I simply could not account for the results I saw when I used a fixed value, as you did. As it turned out, that did not matter. I am still getting the same, *incomprehensible* results. Example, with 100 million loops, where all times are in milliseconds:

   t(0) = 86
   t(1) = 86 - 86 = 0
   t(2) = 987 - 86 = 901
   t(3) = 1426 - 86 = 1340
   t(4) = 302 - 86 = 216
   Loops = 100000000

   t(0) = 86
   t(1) = 86 - 86 = 0
   t(2) = 987 - 86 = 901
   t(3) = 1425 - 86 = 1339
   t(4) = 302 - 86 = 216
   Loops = 100000000

Now, why on Earth t(1) isn't showing *any* time over the loop overhead, I just cannot say. Please check my work, and tell me I made a typo somewhere? I really don't see it. I even moved that block around, and swapped out the index in the t() entry.

Tests were run on the native compiled EXE, Pentium Pro favored, and all advanced optimizations turned on. System is Windows 7 x64 with 10MB RAM and dual Xeon processors.

I expected that the native code results would be fastest. In no way did I expect them to be immeasurable!

Thoughts?

--
.NET: It's About Trust!
http://vfred.mvps.org
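
One possibility, not confirmed anywhere in this thread, is that the native-code optimizer drops the inline computation because wd is never read after the loop. A minimal sketch of a way to test that guess follows: fold every result into an accumulator and display it afterward, so the work has an observable effect. The acc variable and the TimeInlineSwap name are hypothetical additions, and GetTickCount (a real kernel32 API, with a resolution of roughly 10-16 ms) stands in for CStopWatch only so the sketch has no external class dependency.

```vb
Option Explicit

' Real kernel32 API; used here in place of CStopWatch (an assumption
' made only to keep this sketch self-contained).
Private Declare Function GetTickCount Lib "kernel32" () As Long

Private Sub TimeInlineSwap()
    Const LOOPS As Long = 100000000
    Dim dw As Long, wd As Long
    Dim acc As Long          ' hypothetical accumulator, not in the original test
    Dim t0 As Long, elapsedMs As Long

    t0 = GetTickCount
    For dw = 1 To LOOPS
        ' Same inline SwapEndian08 expression as in the posted test
        wd = _
            (((dw And &HFF000000) \ &H1000000) And &HFF&) Or _
            ((dw And &HFF0000) \ &H100&) Or _
            ((dw And &HFF00&) * &H100&) Or _
            ((dw And &H7F&) * &H1000000)
        If (dw And &H80&) Then wd = wd Or &H80000000
        acc = acc Xor wd     ' fold each result into a value that is used below
    Next dw
    elapsedMs = GetTickCount - t0

    ' MsgBox rather than Debug.Print, because Debug statements are
    ' stripped from a compiled EXE and acc would again go unused.
    MsgBox "acc = " & Hex$(acc) & vbCrLf & _
           "elapsed ms = " & elapsedMs & vbCrLf & _
           "ns per iteration = " & (CDbl(elapsedMs) * 1000000# / LOOPS)
End Sub
```

If the inline test then takes measurable time over the loop overhead, the zero in t(1) was most likely eliminated work rather than a typo. If it still reads zero, the explanation lies elsewhere. Note that the extra Xor adds a small cost of its own, so the figure is an upper bound on the swap itself. The last line also shows the conversion from milliseconds to nanoseconds per call; applied to the net figures above, 901 ms over 100 million loops is about 9.0 ns per call, 1340 ms is 13.4 ns, and 216 ms is about 2.2 ns.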