From: Helmut Meukel on 1 May 2010 17:17 "ralph" <nt_consulting64(a)yahoo.net> schrieb im Newsbeitrag news:m9uot5tccbou117c6br28ftuahld03r6n6(a)4ax.com... > On Sat, 01 May 2010 17:29:59 +0100, MM <kylix_is(a)yahoo.co.uk> wrote: > >>Details: Access 97 mdb, 1.7 million records. I'm searching a text >>field (indexed) for a given phrase. Access 97 takes 15 secs to >>execute: >> >>SELECT Data.DX_MsgNum, Data.DX_Subject, Data.DX_Date >>FROM Data >>WHERE (((Data.DX_Subject) Like "*ford prefect*")); >> >> >>However, in VB6 I have: >> >>Dim Conn As ADODB.Connection >>Dim rs As ADODB.Recordset >>Dim SQL as String >>Dim s as String >> >>s = "ford prefect" >> >>SQL = "SELECT DX_MsgNum, DX_Lines, DX_Date, " _ >>& "DX_Subject, DX_From FROM Data Where " _ >>& "DX_Subject like '%" & s & "%'" >> >>Set Conn = New ADODB.Connection >>OpenConnection Conn, DatabaseName >> >>Set rs = New ADODB.Recordset >>rs.CursorLocation = adUseClient >>rs.CursorType = adOpenForwardOnly >>rs.LockType = adLockReadOnly >>rs.Open SQL, Conn >> >>********************************** >>This takes 43 secs! >>********************************** >> >>Why does it take so much longer in VB6? Should I specify a different >>CursorLocation, CursorType and/or LockType? I'm reading William R >>Vaughan's "ADO Examples and Best Practices" at the moment, but haven't >>seen anything obvious yet that explains the considerable difference in >>performance. >> >>BTW, both tests are being run on the same PC. No network involved. >> >>MM > > I suspect the first example, in "Access 97", is using DAO. Your VB6 > application is using ADO. > > DAO is always faster than ADO with a local database. > > -ralph DAO is always faster than ADO when used against an Access97 (Jet) database, even when the mdb resides on a remote PC with a fast network connection (Gigabit, slight traffic). I would suspect VB6 using DAO will still be slightly slower than Access. In this case I would create the SQL as a parameter query and store it in the MDB. Using an existing Query and just passing the actual value for the LIKE part speeds up execution considerably. Your VB program can check if the query exists in the Querydefs collection and use it, otherwise create it. BTW, the DBs most of my customers use are still Access97, but some years ago I switched to DAO 3.6 after some Dupe opened a database with Access2000 and allowed Access2000 to convert the database. Afterwards my programs were unable to open the converted mdb file. I recompiled all my apps with DAO 3.6 and the next time the same thing happened with another customer my programs were not affected while all other users still using Access97 complained. <g> Helmut.
From: MM on 1 May 2010 17:42 On Sat, 1 May 2010 21:42:01 +0200, "Schmidt" <sss(a)online.de> wrote: >But only guesses here on my side - we would get a clearer >picture, if MM would create a "varWChar-Typed" In Access 97 I only have the following Data Types available: Text Memo Number Date/Time Currency AutoNumber Yes/No OLE Object Hyperlink Lookup Wizard... > additional >Column in the DB (with a copy of the ANSI-ColumnContent) >and then performs the same Like-query twice, once against the >ANSI-column (as before), and then against the WChar-Column. ....so I'm not sure what you mean here? MM
From: MM on 1 May 2010 17:58 On Sat, 1 May 2010 21:12:44 +0200, "Schmidt" <sss(a)online.de> wrote: > >"MM" <kylix_is(a)yahoo.co.uk> schrieb im Newsbeitrag >news:j6lot5th85beksiedeqar3poeintdp3t2q(a)4ax.com... >> Details: Access 97 mdb, 1.7 million records. I'm searching a text >> field (indexed) for a given phrase. Access 97 takes 15 secs to >> execute: >> >> SELECT Data.DX_MsgNum, Data.DX_Subject, Data.DX_Date >> FROM Data >> WHERE (((Data.DX_Subject) Like "*ford prefect*")); > >The reason why the same query per VB6+ADO >takes roughly factor 2.5 longer is (IMO) due to >Unicode-Conversion-stuff which takes place under >the hood (whilst performing the "Contains-SubString"- >operation per Like-Operator in the full-table-scan with >the JET 4-engine). In the "15sec case" (if this is done directly >within a running Access-Process) an probably older >DAO-Version will be used, which is then able to make >use of an ANSI-comparison-function against ANSI-Column- >Content (without any special conversion-overhead). > > >Normally one would think, that comparing longer 16Bit- >Unicode-Strings (instead of 8-Bit-ANSI) would then take >(not more than) factor 2 longer, but the now larger allocations >for temporarily created Memory-Structures + the apparently >needed ColumnFromAnsiToUnicode-conversions introduce >(in sum) an overhead which comes near factor 3... >Maybe you can reduce the query-times to only about >factor 1.5 longer ones, if you convert the Access97-DB- >Format to an *.mdb-format, which allows to define the >Column-Field in question as varWChar, so that within >the DB-File the Columns StringValues already reside >as 16Bit wide ones - which would mean less conversion- >overhead within the Like-comparisons themselfes. > >Another thought... >If you want to reduce the time of such "find-words-within- >column-content" to only miliseconds, then you could also >make a try with a conversion of your *.mdb to an SQLite-DB, >and there you would have Fulltext-Search-capabilities which >do make use of a pre-indexing of the content in question over >so called "Virtual Tables". After such preparations on your >SQLite-DB you could then query the Content-Column in >question in the same way as you are used to with a "Google- >SearchField" (including AND, NOT and OR-operations for >multiple SearchWords). > >If you have an interest in that, I could give you advise, how to >achieve this goal with the RichClient3-stuff you perhaps already >have in use (IIRC). > >Olaf > What I have already thought of doing (I've been pondering speed-up approaches for days) is this: - Split each subject into words. - Discard stop words, hyphens, punctuation etc. - Add each word to a database (doesn't have to be Access). Each word is added once only (acting as unique key). - Against each word, store a long pointer that points to a list of record numbers associated with the subject text that the word came from. The record numbers would be stored as 4-byte (i.e. longs) in a binary file. The pointers in the list of unique words would act as Seek arguments for VB's Get statement. So the look-up process would be: - Find word in word list, either by binary chop (if list is sorted), binary tree, or via an Access 'helper'database. - Retrieve the pointer. - Seek to the pointer in the binary file containing all the record numbers, - Retrieve the records via the record numbers. - Do any further matching only in the found records. If one wanted to search for two terms connected by AND, e.g. "Fred AND Sally", one could initially retrieve the two blocks of records (Fred's block and Sally's block) and compare the record numbers. Only record numbers common to both blocks would need be considered. MM
From: Helmut Meukel on 1 May 2010 18:23 "MM" <kylix_is(a)yahoo.co.uk> schrieb im Newsbeitrag news:068pt5p21c3dn8tn3cvqk5sgf05o25raoc(a)4ax.com... > > What I have already thought of doing (I've been pondering speed-up > approaches for days) is this: > > - Split each subject into words. > - Discard stop words, hyphens, punctuation etc. > - Add each word to a database (doesn't have to be Access). Each word > is added once only (acting as unique key). > - Against each word, store a long pointer that points to a list of > record numbers associated with the subject text that the word came > from. The record numbers would be stored as 4-byte (i.e. longs) in a > binary file. The pointers in the list of unique words would act as > Seek arguments for VB's Get statement. > > So the look-up process would be: > - Find word in word list, either by binary chop (if list is sorted), > binary tree, or via an Access 'helper'database. > - Retrieve the pointer. > - Seek to the pointer in the binary file containing all the record > numbers, > - Retrieve the records via the record numbers. > - Do any further matching only in the found records. > > If one wanted to search for two terms connected by AND, e.g. "Fred AND > Sally", one could initially retrieve the two blocks of records (Fred's > block and Sally's block) and compare the record numbers. Only record > numbers common to both blocks would need be considered. > > MM Why not just use DAO ? If the 15 secs in Access are sufficient, your VB app will perform similiar using DAO. You can create the Query with Access and store it or create it using VB code. BTW, DAO is easier to use than ADO. Helmut.
From: Henning on 1 May 2010 18:36
"MM" <kylix_is(a)yahoo.co.uk> skrev i meddelandet news:j6lot5th85beksiedeqar3poeintdp3t2q(a)4ax.com... > Details: Access 97 mdb, 1.7 million records. I'm searching a text > field (indexed) for a given phrase. Access 97 takes 15 secs to > execute: > > SELECT Data.DX_MsgNum, Data.DX_Subject, Data.DX_Date > FROM Data > WHERE (((Data.DX_Subject) Like "*ford prefect*")); > > > However, in VB6 I have: > > Dim Conn As ADODB.Connection > Dim rs As ADODB.Recordset > Dim SQL as String > Dim s as String > > s = "ford prefect" > > SQL = "SELECT DX_MsgNum, DX_Lines, DX_Date, " _ > & "DX_Subject, DX_From FROM Data Where " _ > & "DX_Subject like '%" & s & "%'" > > Set Conn = New ADODB.Connection > OpenConnection Conn, DatabaseName > > Set rs = New ADODB.Recordset > rs.CursorLocation = adUseClient > rs.CursorType = adOpenForwardOnly > rs.LockType = adLockReadOnly > rs.Open SQL, Conn > > ********************************** > This takes 43 secs! > ********************************** > > Why does it take so much longer in VB6? Should I specify a different > CursorLocation, CursorType and/or LockType? I'm reading William R > Vaughan's "ADO Examples and Best Practices" at the moment, but haven't > seen anything obvious yet that explains the considerable difference in > performance. > > BTW, both tests are being run on the same PC. No network involved. > > MM It would be best when comparing speed, to keep the queries equal. Not that I think it will make a big difference in this case. /Henning |