Prev: How To Know
Next: Array Problem
From: mayayana on 25 Dec 2009 22:05 A bit of an edit to make my air code slightly less airy: Public Function SReplace(StrToRep As String) As String Dim Lens As Long, LNew As Long, iPos As Long, iPosNew As Long, iPosStart As Long, Len2 As Long Dim SA1 As SAFEARRAY1 Dim iChar As Integer Dim A1() As Integer Dim Boo1 As Boolean Lens = Len(StrToRep) LNew = Lens * 1.5 SReplace = String$(LNew, 0) With SA1 '-- set up array for t1.text string. .cbElements = 2 .cElements = Lens + 1 .cDims = 1 .pvData = StrPtr(StrToRep) End With CopyMemory ByVal VarPtrArray(A1), VarPtr(SA1), 4 iPosNew = 1 '-- offset into new string iPos = 0 Do While iPos < Lens iPosStart = iPos + 1 iChar = A1(iPos) Select Case iChar Case 0 Exit Do Case 42 If A1(iPos + 1) = 42 Then If A1(iPos + 2) = 42 Then Mid$(SReplace, iPosNew) = "**" iPosNew = iPosNew + 2 iPos = iPos + 2 Boo1 = True End If End If If Boo1 = False Then Mid$(SReplace, iPosNew) = Mid$(StrToRep, (iPosStart), 1) iPosNew = iPosNew + 1 End If Boo1 = False Case Else Mid$(SReplace, iPosNew) = Mid$(StrToRep, (iPosStart), 1) iPosNew = iPosNew + 1 End Select iPos = iPos + 1 Loop ZeroMemory ByVal VarPtrArray(A1), 4 Len2 = InStr(1, SReplace, Chr$(0)) If Len2 > 1 Then SReplace = Left$(SReplace, (Len2 - 1)) End Function '------------------------------------- 1) Added $ to all Mid calls. I've never been sure about the Mid statement. It works as Mid$ but is only documented as Mid. On the other hand, the VB object browser doesn't show the Mid statement at all. And MSDN, while listing the Mid and Mid$ function but only the Mid statement, actually only has one page pointed to by both Mid Function and Mid$ Function. It's strange that the VB docs are so cavalier about distinguishing between data types. I haven't tested whether Mid$ statement might be slower than Mid staement for some reason. It works, so I figure it's the safer option for optimization. 2) At "Case 42" I realized that if a match was not found then the character still needed to be dealt with. My first version would have dropped out any * not followed by two asterisks. (It can get tricky as the function gets more complex and customized, but a simple Replace function would avoid that problem. Also, with this method the incoming string can be LCase-d to do a text-compare operation. And of course it would be simple to add a counter.) I don't know if the method I'm using is the fastest, but it was the best option I came up with when I needed to write a complex string-rebuilding function. By pointing an array at the incoming string one avoids Asc/Chr conversions. Of course the new string could also be built as an integer array, but that quickly becomes unwieldy if there are long strings being replaced. In other words, if you want to replace "a" with "b" it would be easy, but if you want to replace "here" with "anywhere in the world" it gets awkward. The VBSpeed samples are interesting, and Olaf's code seems to be the clear winner there, but they seem to be only testing short strings. And the results vary so much for different calls that I wonder what value the tests have. In real-world usage a few microseconds difference is an absurd measure, especially given that the function is seldom called more than a few times. I don't see how those samples could be useful unless they were repeated on a typical string of perhaps 100 KB.
From: Schmidt on 26 Dec 2009 01:53 "mayayana" <mayaXXyana(a)rcXXn.com> schrieb im Newsbeitrag news:OHhzDgdhKHA.1236(a)TK2MSFTNGP04.phx.gbl... > The VBSpeed samples are interesting, and Olaf's > code seems to be the clear winner there, but they > seem to be only testing short strings. And the results > vary so much for different calls that I wonder what > value the tests have. In real-world usage a few > microseconds difference is an absurd measure, especially > given that the function is seldom called more than a few > times. I don't see how those samples could be useful > unless they were repeated on a typical string of perhaps > 100 KB. The practical reason I wrote this highly optimized Replace-routine was serverside code for dynamically "glued together" WebPage-snippets (templates), some of them below 1 or 2kByte, but some of them larger (10-30kByte). Building up such an "ready to post back" Webpage involved quite an amount of such replacements - and this was a few years back in time, where a "fast CPU" was one of the Pentium III-class (with only about 500MHz-1GHz clock-freq). And the performance gain, compared with the builtin VB6-Replace-function was really worth it. Nowadays the faster CPUs with their larger caches seem to make its usage obsolete, and you're right, the Mid-statement is not all that bad in comparison, just look at Jost Schwiders entries into the list - though there's a factor 2-4 anyways (on average), compared with the higher optimized (Integer-Array-mapped) string routines which follow his contributions. From a short test, I can measure about the same difference of factor 3-4, if I time the performance of your posted routine with my approach (corrected and tuned somewhat by Guido Beckmann). Factor 4, when tested on my older Pentium III 500MHz, and reduced to about factor 3, when compared on a modern CPU. And that is comparing a generically working Replace- routine with a currently "somewhat specialized" code, which acts on a "limited input-range" currently. Since you already work with the Integer-array-mapping, it would be consequent, to work "entirely within array-space", also for the "copy-over" of the appropriate parts. You could speedup your routine by a reasonable amount, doing so. Here comes a Class (based on Guidos latest corrections), which I've just adapted a bit, to work without the Typelib he was using - so that the code is ready for copy and paste: '***Into a Class Option Explicit Private Type SafeArray1D cDims As Integer fFeatures As Integer cbElements As Long cLocks As Long pvData As Long cElements1d As Long lLBound As Long End Type Private Declare Sub BindArray Lib "kernel32" Alias "RtlMoveMemory" _ (PArr() As Any, PSrc&, Optional ByVal cb& = 4) Private Declare Sub ReleaseArray Lib "kernel32" Alias "RtlMoveMemory" _ (PArr() As Any, Optional PSrc& = 0, Optional ByVal cb& = 4) Private Declare Sub RtlMoveMemory Lib "kernel32" _ (dst As Any, src As Any, ByVal nBytes&) Private Declare Function CharLowerBuffA& Lib "user32" _ (lpsz As Any, ByVal cchLength&) Private Declare Function CharLowerBuffW& Lib "user32" _ (lpsz As Any, ByVal cchLength&) Private aSrc%(), saSrc As SafeArray1D Private aNew%(), saNew As SafeArray1D Private aOld%(), saOld As SafeArray1D Private aDst%(), saDst As SafeArray1D Private aPosFnd&(), ubPosFnd& Private aLowChars%(&H8000 To &H7FFF) Friend Function Replace(Text As String, sOld As String, sNew As String, _ Optional ByVal Start As Long = 1, _ Optional ByVal Count As Long = 2147483647, _ Optional ByVal Compare As VbCompareMethod = vbBinaryCompare _ ) As String Dim c&, i&, j&, cntCpy& Dim cntFnd&, ptrSrc&, ptrDst& Dim lenFnd&, lenSrc&, lenNew&, lenNewB& Dim posFnd&, posOut&, posIn& Dim ubFnd&, Fnd0%, fSameLen As Boolean lenSrc = Len(Text) lenNew = Len(sNew) lenFnd = Len(sOld) ubFnd = lenFnd - 1 ptrSrc = StrPtr(Text) If lenSrc = 0 Then Exit Function If lenFnd = 0 Then Replace = Text: Exit Function If Start > 0 Then i = Start - 1 saSrc.pvData = ptrSrc saOld.pvData = StrPtr(sOld) saNew.pvData = StrPtr(sNew) If lenFnd = lenNew Then fSameLen = True Replace = Text saDst.pvData = StrPtr(Replace) ' ptrDst = StrPtr(Replace11) ' saDst.pvData = ptrDst End If c = lenSrc - lenFnd If Compare = vbBinaryCompare Then Fnd0 = aOld(0) For i = i To c 'Inline-Cascading for first Char If aSrc(i) <> Fnd0 Then i = i + 1: _ If aSrc(i) <> Fnd0 Then i = i + 1: _ If aSrc(i) <> Fnd0 Then i = i + 1: _ If aSrc(i) <> Fnd0 Then i = i + 1: _ If aSrc(i) <> Fnd0 Then i = i + 1: _ If aSrc(i) <> Fnd0 Then i = i + 1: _ If aSrc(i) <> Fnd0 Then i = i + 1: _ If aSrc(i) <> Fnd0 Then GoTo loopNxt If i > c Then Exit For 'Search all others j = ubFnd Do While j If aSrc(i + j) <> aOld(j) Then GoTo loopNxt j = j - 1 Loop cntFnd = cntFnd + 1 'Found at Position i (0 based) If fSameLen Then j = lenNew Do: j = j - 1: aDst(i + j) = aNew(j): Loop While j Else If cntFnd > ubPosFnd Then ubPosFnd = ubPosFnd + 512: ReDim Preserve aPosFnd(ubPosFnd) End If aPosFnd(cntFnd) = i * 2 End If If cntFnd = Count Then Exit For i = i + ubFnd loopNxt: Next i Else 'vbStringCompare Fnd0 = aLowChars(aOld(0)) For i = i To c If aLowChars(aSrc(i)) <> Fnd0 Then i = i + 1: _ If aLowChars(aSrc(i)) <> Fnd0 Then i = i + 1: _ If aLowChars(aSrc(i)) <> Fnd0 Then i = i + 1: _ If aLowChars(aSrc(i)) <> Fnd0 Then i = i + 1: _ If aLowChars(aSrc(i)) <> Fnd0 Then i = i + 1: _ If aLowChars(aSrc(i)) <> Fnd0 Then i = i + 1: _ If aLowChars(aSrc(i)) <> Fnd0 Then i = i + 1: _ If aLowChars(aSrc(i)) <> Fnd0 Then GoTo loopNxt2 If i > c Then Exit For 'Search all others j = ubFnd Do While j If aLowChars(aSrc(i + j)) <> aLowChars(aOld(j)) Then GoTo loopNxt2 End If j = j - 1 Loop 'Found at Position i (0 based) cntFnd = cntFnd + 1 If fSameLen Then j = lenNew Do: j = j - 1: aDst(i + j) = aNew(j): Loop While j Else If cntFnd > ubPosFnd Then ubPosFnd = ubPosFnd + 512: ReDim Preserve aPosFnd(ubPosFnd) End If aPosFnd(cntFnd) = i * 2 End If If cntFnd = Count Then Exit For i = i + ubFnd loopNxt2: Next i End If 'Generate Output If fSameLen Then Exit Function If cntFnd = 0 Then Replace = Text Else c = lenSrc + (lenNew - lenFnd) * cntFnd Replace = Space(c) ptrDst = StrPtr(Replace) saDst.pvData = ptrDst lenFnd = lenFnd * 2 If lenNew Then lenNewB = lenNew * 2 For i = 1 To cntFnd posFnd = aPosFnd(i) cntCpy = posFnd - posIn If cntCpy > 50 Then RtlMoveMemory ByVal saDst.pvData, ByVal saSrc.pvData, cntCpy saDst.pvData = saDst.pvData + cntCpy ElseIf cntCpy > 0 Then j = cntCpy \ 2 Do: j = j - 1: aDst(j) = aSrc(j): Loop While j saDst.pvData = saDst.pvData + cntCpy End If posIn = posFnd + lenFnd saSrc.pvData = ptrSrc + posIn If lenNew > 50 Then RtlMoveMemory ByVal saDst.pvData, ByVal saNew.pvData, lenNewB Else j = lenNew Do: j = j - 1: aDst(j) = aNew(j): Loop While j End If saDst.pvData = saDst.pvData + lenNewB Next i Else For i = 1 To cntFnd posFnd = aPosFnd(i) cntCpy = posFnd - posIn If cntCpy > 50 Then RtlMoveMemory ByVal saDst.pvData, ByVal saSrc.pvData, cntCpy saDst.pvData = saDst.pvData + cntCpy ElseIf cntCpy > 0 Then j = cntCpy \ 2 Do: j = j - 1: aDst(j) = aSrc(j): Loop While j saDst.pvData = saDst.pvData + cntCpy End If posIn = posFnd + lenFnd saSrc.pvData = ptrSrc + posIn Next i End If c = lenSrc * 2 - posIn If c > 50 Then RtlMoveMemory aDst(0), aSrc(0), c ElseIf c > 0 Then c = c \ 2 Do: c = c - 1: aDst(c) = aSrc(c): Loop While c End If End If End Function Private Sub Class_Initialize() Dim c& ubPosFnd = 512: ReDim aPosFnd(ubPosFnd) saSrc.cDims = 1 saSrc.cbElements = 2 saSrc.cElements1d = &H7FFFFFFF saNew = saSrc saOld = saSrc saDst = saSrc BindArray aSrc, VarPtr(saSrc) BindArray aNew, VarPtr(saNew) BindArray aOld, VarPtr(saOld) BindArray aDst, VarPtr(saDst) For c = -32768 To 32767: aLowChars(c) = c: Next c If CharLowerBuffW(aLowChars(-32768), &H10000) = 0 Then CharLowerBuffA aLowChars(65), (223 - 65) * 2 End If ' added by donald, 20011210 ' patch the stooges ' S 138/352 s 154/353 ' O 140/338 o 156/339 ' Z 142/381 z 158/382 ' Y 159/376 � 255/255 aLowChars(352) = 353 aLowChars(338) = 339 aLowChars(381) = 382 aLowChars(376) = 255 End Sub Private Sub Class_Terminate() ReleaseArray aSrc ReleaseArray aNew ReleaseArray aOld ReleaseArray aDst End Sub
From: mayayana on 26 Dec 2009 09:43 Thank you. That's very informative and good to know for future reference.
From: David Kaye on 26 Dec 2009 19:28 "Donald Lessau" <don(a)oflex.com> wrote: >Check at VBspeed. All functions here work the way you like it, and are >faster than Vb6 Replace: It's funny. Last night I tried several methods of replacing unwanted characters in a text file. I used Replace and InStr/Mid$, both as text and using the Asc function. I also tried doing a Select Case routine. What I found was that a For Next loop using Replace was the fastest for eliminating characters in a text file. What I did was write a bunch of routines all with the same basic things such as starting a timer and adding to a list box. Except for the string replacement part all were identical. I called them from a Command button and the results showed on my screen as they ran. The extra Dims were in there because they were used in other routines and I wanted to be sure I was doing the exact same thing each time in each routine except for the actual replacement. NOTE: I replaced characters < "A" only because I wanted to replace a bunch of stuff in a typical text file, not because I plan to use this exact replacement scenario. The Text1.Text box held 28k of text I copied from a text file. I think the most important parts are to put the entire text into a variable and to use Replace to replace the contents of that variable, that is b$ = the new b$. I'm assuming that VB is not creating a new memory location for b$, but is reusing the space. This probably accounts for the speed. This was the winning routine: Sub MyReplace() Dim i%, j%, a$, b$, timex timex = Timer b$ = Text1.Text For j% = 1 To 64 b$ = Replace(b$, Chr$(j%), "") Next For j% = 128 To 255 b$ = Replace(b$, Chr$(j%), "") Next List1.AddItem CDec(Timer - timex) End Sub There was one thing faster and that was a line by line Replace function for each and every character. That was so fast that the timer couldn't even register a change.
From: Larry Serflaten on 27 Dec 2009 02:30
"David Kaye" <sfdavidkaye2(a)yahoo.com> wrote > I think the most important parts are to put the entire text into a variable > and to use Replace to replace the contents of that variable, that is b$ = the > new b$. I'm assuming that VB is not creating a new memory location for b$, > but is reusing the space. This probably accounts for the speed. That couldn't happen if the replacement was larger than what was replaced. You have to remember that Replace is to be used in a wide variety of situations. If the scope of your problem is very limited, you should be able to beat VB's Replace statement hands down using knowlege of the particular situation. See what you get for this method which does the same job as your post: Function Scrub(Text As String) As String Dim inc() As Byte Dim txt() As Byte Dim src As Long, dst As Long txt = Text inc = StrConv(String$(32, 0) & String$(96, 2) & Chr$(0), vbFromUnicode) Do While src < UBound(txt) txt(dst) = txt(src) src = src + 2 dst = dst + inc(txt(dst)) Loop Do While dst < UBound(txt) txt(dst) = 32 dst = dst + 2 Loop Scrub = Trim$(txt) End Function |