From: David Kaye on 27 Dec 2009 03:14

"Larry Serflaten" <serflaten(a)usinternet.com> wrote:

>That couldn't happen if the replacement was larger than what was replaced.

Okay, I changed things a bit. Instead of copying from Text1.Text, I opened a file for read access and input a blob of 80k from a text file. So, now b$ is 80k instead of 28k.

Using the routine I showed originally, it takes 0.091+ seconds (that is, about 9/100ths of a second) to replace unwanted characters with empty strings (""). Interesting.

Well, I tried replacing chr$(j%) with chr$(j%) & chr$(j%), thereby adding one character to each replacement, which one would think forces VB to make another copy of b$ somewhere. Oddly enough, the time does not go up significantly. It changes to 0.093+. Amazing.

In my situation, Replace is not slow at all. Let me try your example and see how it works.

From: David Kaye on 27 Dec 2009 03:26

"Larry Serflaten" <serflaten(a)usinternet.com> wrote:

>See what you get for this method which does the same job as your post:
>
>Function Scrub(Text As String) As String
>Dim inc() As Byte
>Dim txt() As Byte
>Dim src As Long, dst As Long
etc

The function takes about half the time of my Replace function, but it's not doing the exact same thing. I'll try it doing exactly the same and see if it's faster. It does show promise, though.

From: David Kaye on 27 Dec 2009 04:31

"Larry Serflaten" <serflaten(a)usinternet.com> wrote:

>See what you get for this method which does the same job as your post:
[....]
> inc = StrConv(String$(32, 0) & String$(96, 2) & Chr$(0), vbFromUnicode)
[....]

I'm a bit confused about what the above string conversion does. It appears to define a variable called "inc" as a Unicode string containing 32 bytes of a null string, 96 bytes of ASCII 2, and a single byte of the nullstring and convert the whole mess into plain ASCII.

But then "inc" appears to be used later as a function.

Can you explain this to me?

From: Nobody on 27 Dec 2009 05:39

"David Kaye" <sfdavidkaye2(a)yahoo.com> wrote in message news:hh79h9$rn9$3(a)news.eternal-september.org...

> "Larry Serflaten" <serflaten(a)usinternet.com> wrote:
>
>>See what you get for this method which does the same job as your post:
> [....]
>> inc = StrConv(String$(32, 0) & String$(96, 2) & Chr$(0), vbFromUnicode)
> [....]
>
> I'm a bit confused about what the above string conversion does. It appears to
> define a variable called "inc" as a Unicode string containing 32 bytes of a
> null string, 96 bytes of ASCII 2, and a single byte of the nullstring and
> convert the whole mess into plain ASCII.
>
> But then "inc" appears to be used later as a function.
>
> Can you explain this to me?

"inc" is a byte array, not a function. See the declaration. The name is short for "increment". Larry used it to specify increment values: for byte values 0 to 31 the increment is 0, so characters in that range are ignored, and for byte values 32 to 127 (96 characters) the count is incremented by 2 bytes, which is the size of one Unicode character. Essentially, it's another way of writing this code:

    If x >= 32 And x <= 127 Then
        dst = dst + 2
    End If

The table has only 129 entries (indexes 0 to 128, the last one being the trailing Chr$(0)), so the code doesn't accept byte values 129 to 255; indexing the array with one of those would generate a runtime error. You can avoid that by changing 96 to 224.

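The lookup-table idea described above can be sketched as a complete function. This is an illustrative reconstruction, not Larry's original routine (most of which is elided in the quotes): the name ScrubSketch, the full 256-entry table, and the choice to drop every character above 127 are assumptions made for the example.

```vb
' Hypothetical sketch of the increment-table technique (VB6).
' Not the original Scrub routine; name and table layout are assumptions.
Function ScrubSketch(Text As String) As String
    Dim inc() As Byte   ' increment per character code, indexes 0..255
    Dim txt() As Byte   ' UTF-16 bytes of the input, two per character
    Dim src As Long, dst As Long

    If Len(Text) = 0 Then Exit Function

    ' 0-31 -> 0 (dropped), 32-127 -> 2 (kept), 128-255 -> 0 (dropped)
    inc = StrConv(String$(32, 0) & String$(96, 2) & String$(128, 0), _
                  vbFromUnicode)
    txt = Text          ' a String assigned to a Byte array yields its raw bytes

    For src = 0 To UBound(txt) Step 2
        If txt(src + 1) = 0 Then        ' high byte 0: code is 0..255
            txt(dst) = txt(src)         ' always copy the low byte...
            txt(dst + 1) = 0
            dst = dst + inc(txt(src))   ' ...but advance only if it is kept
        End If                          ' codes above 255 are simply skipped
    Next src

    If dst > 0 Then
        ReDim Preserve txt(dst - 1)     ' trim to the bytes actually written
        ScrubSketch = txt               ' Byte array back to String
    End If
End Function
```

Because the table covers all 256 byte values, no input byte can index past its end. Changing which characters survive only requires changing the String$ runs that build the table.
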
From: Schmidt on 27 Dec 2009 07:42

"David Kaye" <sfdavidkaye2(a)yahoo.com> wrote in message news:hh69nu$ugl$1(a)news.eternal-september.org...

[String-Cleanup (advanced Trim-functionality)]

> NOTE: I replaced characters < "A" only because
> I wanted to replace a bunch of stuff in a typical text file, ...

Careful with the range below Asc("A"), since you would also clean up all "Number-Digits" and many "wanted punctuations" .... not sure if that was your intent.

> This was the winning routine:
>
> Sub MyReplace()
> Dim i%, j%, a$, b$, timex
> timex = Timer
> b$ = Text1.Text
> For j% = 1 To 64
> b$ = Replace(b$, Chr$(j%), "")
> Next
> For j% = 128 To 255
> b$ = Replace(b$, Chr$(j%), "")
> Next
> List1.AddItem CDec(Timer - timex)
> End Sub

Repeated calls to Replace (scanning the string over and over again, to replace different single chars) are "horribly inefficient" ... ;-)

Larry's approach with the lookup-array for the increments is a nice (and much better) idea, but as Nobody also pointed out, a direct check per If ... or defining the replace-char-ranges in a "Select Case Line" also works very well and fast, and can be adapted just as easily to other ranges.

Here comes a Sub which does some cleanup (all chars between Asc 0-32 are removed) directly on (within) the given Input-String, thereby avoiding *any* additional memory-allocations in the routine, which is what slows Larry's routine down somewhat (about factor 3 in comparison).

You will need to compile it to native-code (all advanced options set) to get "full-speed". It is best to place such stuff in a Dll-Class; in that case (having the Routine in a Class) you could move the aSrc/saSrc array-mapping-pair out of the routine and declare it at the class-level (Binding and releasing could then be done in Class-Initialize/Terminate).

'***Into a Form, then click the Form
Option Explicit

Private Type SafeArray1D
  cDims As Integer
  fFeatures As Integer
  cbElements As Long
  cLocks As Long
  pvData As Long
  cElements1d As Long
  lLBound1d As Long
End Type

Private Declare Sub BindArray Lib "kernel32" Alias "RtlMoveMemory" _
  (PArr() As Any, PSrc&, Optional ByVal cb& = 4)
Private Declare Sub ReleaseArray Lib "kernel32" Alias "RtlMoveMemory" _
  (PArr() As Any, Optional PSrc& = 0, Optional ByVal cb& = 4)
Private Declare Sub RtlMoveMemory Lib "kernel32" _
  (dst As Any, src As Any, ByVal nBytes&)

Private Declare Function QueryPerformanceFrequency& Lib "kernel32" (x@)
Private Declare Function QueryPerformanceCounter& Lib "kernel32" (x@)

Private Sub Form_Click()
Dim i&, T#, S$
  S = " abc" & vbTab & "123" & vbCrLf & "ABC" & Chr(0) & "123 "
  For i = 1 To 16
    S = S & S 'results in an about 1.2MB test-string
  Next i

  T = HPTimer
    Cleanup S
  Caption = CLng((HPTimer - T) * 1000)
End Sub

Sub Cleanup(Text As String)
Dim i&, j&, aSrc%(), saSrc As SafeArray1D
  saSrc.cDims = 1
  saSrc.cbElements = 2 'the width of an 16Bit-Integer
  saSrc.cElements1d = Len(Text) + 2 'two more, to reflect the LBound
  saSrc.lLBound1d = -2 'include the 4 Len-Info-Bytes of the BSTR
  saSrc.pvData = StrPtr(Text) - 4 'adapt to the real start of the BSTR
  If saSrc.cElements1d = 2 Then Exit Sub 'nothing to replace

  BindArray aSrc, VarPtr(saSrc)

  For i = 0 To UBound(aSrc)
    Select Case aSrc(i)
      Case 0 To 32 '<-- define the ignored Char-ranges here...
      Case Else: aSrc(j) = aSrc(i): j = j + 1
    End Select
  Next i

  RtlMoveMemory aSrc(-2), CLng(j + j), 4 'adjust the new Len-Info

  ReleaseArray aSrc
End Sub

Private Function HPTimer#()
Dim x@: Static Frq@
  If Frq = 0 Then QueryPerformanceFrequency Frq
  If QueryPerformanceCounter(x) Then HPTimer = x / Frq
End Function

Olaf
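
For comparison, the same single-pass Select Case filter can be sketched without any API declarations or array mapping. This is only an illustration, not a replacement for the routine above: CleanupSketch is a hypothetical name, and because it allocates a result buffer and goes through Mid$ for every character, it gives up the no-allocation property and will likely be slower.

```vb
' Hypothetical plain-VB6 sketch of the single-pass Select Case filter.
' Returns a new string instead of modifying the input in place.
Function CleanupSketch(Text As String) As String
    Dim i As Long, j As Long
    Dim buf As String

    buf = Space$(Len(Text))             ' one allocation, worst-case size

    For i = 1 To Len(Text)
        Select Case AscW(Mid$(Text, i, 1))
            Case 0 To 32                ' ignored char range, same as above
            Case Else
                j = j + 1
                Mid$(buf, j, 1) = Mid$(Text, i, 1)  ' overwrite in the buffer
        End Select
    Next i

    CleanupSketch = Left$(buf, j)       ' keep only the chars written
End Function
```

The string is still scanned exactly once, regardless of how many character ranges are listed in the Case line, which is the point made against the repeated Replace calls.
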