From: ralph on 1 May 2010 18:39 On Sat, 1 May 2010 21:42:01 +0200, "Schmidt" <sss(a)online.de> wrote: > >"ralph" <nt_consulting64(a)yahoo.net> schrieb im Newsbeitrag >news:m9uot5tccbou117c6br28ftuahld03r6n6(a)4ax.com... > >> I suspect the first example, in "Access 97", is using DAO. >> Your VB6 application is using ADO. >> >> DAO is always faster than ADO with a local database. > >Whilst the above is true in nearly all cases, MMs problem >is not with the "different Recordset-creation and transfer- >mechanisms" (which are different in DAO and ADO) - >the resulting recordcount should be low in this case - and >therefore less significant regarding performance. > >The "most time consuming loop" in this case is one layer below IMO, >so this is more a "query-engine-thing" (JET-engine-Version related), >and the query-engine always needs to scan the full table, to >get access to the (about to be searched *within*) TextColumn - >and if the TextColumn-Definition in the DB-Schema has forced >a storage of the Columns content as ANSI-Text, then the >function which finally performs the internal Like-comparison >in the "table-scan-loop" would need to work in an ANSI- >version too, to achieve an optimal throughput. > >If (for whatever reason) the "current JET-engine-version in use" >does not choose different (maybe because only a "generalized >Unicode" Like-comparison-function is available), optimal matching >LikeW or LikeA implementations (according to the current >storage-format of the scanned text-column), then the effect >becomes explainable I'd say. > >But only guesses here on my side - we would get a clearer >picture, if MM would create a "varWChar-Typed" additional >Column in the DB (with a copy of the ANSI-ColumnContent) >and then performs the same Like-query twice, once against the >ANSI-column (as before), and then against the WChar-Column. > >Olaf > > Ha. I almost wander into that area (increasing performance of text comparisons) myself, but checked myself and decided to keep to a simple, less wordy, reply. A real effort for me. lol (Note: also I didn't see Nobody's reply or I would have kept my mouth shut even tighter. <g>) Overall, when it comes to using a Jet-formatted database, the fact that DAO is in-proc compared to ADO which is out-of-proc, and the fact that DAO and Jet grew-up together will account for a greater share of the performance differences. (DAO is practically the native client interface for JET, the two are joined at the hip from birth.) However, when it comes to text-comparison, as handy as 'Like' is, if performance is the objective - a little re-working of the data and re-phrasing of the question - can often go a long way towards improvement. -ralph
From: Henning on 1 May 2010 18:52 "Helmut Meukel" <NoSpam(a)NoProvider.de> skrev i meddelandet news:e1hmtzX6KHA.5848(a)TK2MSFTNGP06.phx.gbl... > "MM" <kylix_is(a)yahoo.co.uk> schrieb im Newsbeitrag > news:068pt5p21c3dn8tn3cvqk5sgf05o25raoc(a)4ax.com... >> >> What I have already thought of doing (I've been pondering speed-up >> approaches for days) is this: >> >> - Split each subject into words. >> - Discard stop words, hyphens, punctuation etc. >> - Add each word to a database (doesn't have to be Access). Each word >> is added once only (acting as unique key). >> - Against each word, store a long pointer that points to a list of >> record numbers associated with the subject text that the word came >> from. The record numbers would be stored as 4-byte (i.e. longs) in a >> binary file. The pointers in the list of unique words would act as >> Seek arguments for VB's Get statement. >> >> So the look-up process would be: >> - Find word in word list, either by binary chop (if list is sorted), >> binary tree, or via an Access 'helper'database. >> - Retrieve the pointer. >> - Seek to the pointer in the binary file containing all the record >> numbers, >> - Retrieve the records via the record numbers. >> - Do any further matching only in the found records. >> >> If one wanted to search for two terms connected by AND, e.g. "Fred AND >> Sally", one could initially retrieve the two blocks of records (Fred's >> block and Sally's block) and compare the record numbers. Only record >> numbers common to both blocks would need be considered. >> >> MM > > Why not just use DAO ? > If the 15 secs in Access are sufficient, your VB app will perform similiar > using DAO. > You can create the Query with Access and store it or create it using VB > code. > BTW, DAO is easier to use than ADO. > > Helmut. > > Just be aware that if using CreateWorkspace that is undocumented depending on MSRDO20.dll, wich comes with Office. /Henning
From: David Kaye on 1 May 2010 21:20 MM <kylix_is(a)yahoo.co.uk> wrote: >Details: Access 97 mdb, 1.7 million records. I'm searching a text >field (indexed) for a given phrase. Access 97 takes 15 secs to >execute: ADO versus DAO. ADO is an "out of process" connection, meaning that a communication link is established between it and your VB6 program. DAO runs within VB6, so it has a lot less overhead. Use DAO and you should see similar speed. Also, don't forget to use the forward-only option when you can, and be sure to index the keys you'll be using in your SQLs.
From: MM on 2 May 2010 04:31 On Sun, 2 May 2010 00:23:26 +0200, "Helmut Meukel" <NoSpam(a)NoProvider.de> wrote: > >"MM" <kylix_is(a)yahoo.co.uk> schrieb im Newsbeitrag >news:068pt5p21c3dn8tn3cvqk5sgf05o25raoc(a)4ax.com... >> >> What I have already thought of doing (I've been pondering speed-up >> approaches for days) is this: >> >> - Split each subject into words. >> - Discard stop words, hyphens, punctuation etc. >> - Add each word to a database (doesn't have to be Access). Each word >> is added once only (acting as unique key). >> - Against each word, store a long pointer that points to a list of >> record numbers associated with the subject text that the word came >> from. The record numbers would be stored as 4-byte (i.e. longs) in a >> binary file. The pointers in the list of unique words would act as >> Seek arguments for VB's Get statement. >> >> So the look-up process would be: >> - Find word in word list, either by binary chop (if list is sorted), >> binary tree, or via an Access 'helper'database. >> - Retrieve the pointer. >> - Seek to the pointer in the binary file containing all the record >> numbers, >> - Retrieve the records via the record numbers. >> - Do any further matching only in the found records. >> >> If one wanted to search for two terms connected by AND, e.g. "Fred AND >> Sally", one could initially retrieve the two blocks of records (Fred's >> block and Sally's block) and compare the record numbers. Only record >> numbers common to both blocks would need be considered. >> >> MM > >Why not just use DAO ? >If the 15 secs in Access are sufficient, Well, it's not really. I was surprised to discover that Access itself is so much faster than VB6 using ADO, but 15 secs is still abysmally slow for any search function. I want a result much faster than that, and I reckon my approach that I outlined above would reduce the time down to 1 to 2 secs on average. The word(s) one is searching on would be found instantly (binary chop is amazingly fast, even on a huge database) and thereafter it would only be the time needed to actually fetch the records via the record numbers. Okay, any additional comparison with the actual subject in the record would slow it down, but I reckon it would be a lot faster overall. MM
From: MM on 2 May 2010 04:37
On Sun, 02 May 2010 01:20:20 GMT, sfdavidkaye2(a)yahoo.com (David Kaye) wrote: >MM <kylix_is(a)yahoo.co.uk> wrote: >>Details: Access 97 mdb, 1.7 million records. I'm searching a text >>field (indexed) for a given phrase. Access 97 takes 15 secs to >>execute: > >ADO versus DAO. ADO is an "out of process" connection, meaning that a >communication link is established between it and your VB6 program. DAO runs >within VB6, so it has a lot less overhead. Use DAO and you should see similar >speed. Also, don't forget to use the forward-only option when you can, and be >sure to index the keys you'll be using in your SQLs. I'm already using rs.CursorType = adOpenForwardOnly and the text field in question (in the Access 97 mdb) is indexed. However, such an index is probably pretty useless for this project, because I am searching for words *within* the text field, not the whole text. Probably the text field's index would be useful if one were searching for the whole field as a single phrase, but the index that Access builds when one sets up an index on a text field isn't going to know about individual words. MM |