Prev: ADO: Mass updating a database with unique key
Next: add initial MSHFLexGrid row number to adodb. recordset
From: Tony Toews [MVP] on 2 Feb 2010 14:47

"Ivar" <ivar.ekstromer000(a)ntlworld.com> wrote:

>If it's speed you're looking for then my first thought would be to keep
>communication with the ADO to a minimum. Run a query on the ADO to return a
>single field recordset of the unique keys that already exist in the
>database. Create a VB collection and add each record of the recordset to it.
>Create a second VB collection. Try to add the new keys to the first
>collection; if a duplicate key error happens when adding to the first
>collection then add the new key to the second collection. The second
>collection will then contain only the unique keys of the records to add to
>the database. I'm sure this would be a lot faster than using ADO error
>checking.

Interesting idea. That would depend on the ratio of the number of records
already present to the number to be inserted. If 10M existing and 20 new then
this solution wouldn't be practical. If 1M and 1M then it would make a lot
of sense.

Tony
--
Tony Toews, Microsoft Access MVP
Tony's Main MS Access pages - http://www.granite.ab.ca/accsmstr.htm
Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/
For a convenient utility to keep your users FEs and other files updated see
http://www.autofeupdater.com/
Granite Fleet Manager http://www.granitefleet.com/
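[Ivar's two-collection sieve could be sketched in VB6 roughly as follows. All
variable and recordset names here are assumed for illustration, and the
sieve is written so that the second collection receives the keys that add
*without* a duplicate error, i.e. the genuinely new ones:]

' Load the existing keys once with a single query, then sieve the
' new keys locally. A Collection raises error 457 on a duplicate key.
Dim colExisting As Collection, colToInsert As Collection
Set colExisting = New Collection
Set colToInsert = New Collection

' 1) One round trip: single-field recordset of all existing keys.
Do Until rsKeys.EOF
    colExisting.Add True, CStr(rsKeys.Fields(0).Value)
    rsKeys.MoveNext
Loop

' 2) Sieve the candidate keys without touching the DB again.
Dim v As Variant
For Each v In NewKeys
    On Error Resume Next
    colExisting.Add True, CStr(v)
    If Err.Number = 0 Then
        colToInsert.Add v      ' added cleanly, so the key is new
    End If
    Err.Clear
    On Error GoTo 0
Next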
From: Schmidt on 2 Feb 2010 17:02

"Tony Toews [MVP]" <ttoews(a)telusplanet.net> schrieb im Newsbeitrag
news:s50hm5ditqornk7uo3g4mankv2t618tuqp(a)4ax.com...
> "Ivar" <ivar.ekstromer000(a)ntlworld.com> wrote:
>
> >If it's speed you're looking for then my first thought would
> >be to keep communication with the ADO to a minimum.
> >Run a query on the ADO to return a single field recordset
> >of the unique keys that already exist in the database.
> >Create a VB collection and add each record of the recordset
> >to it...
>
> Interesting idea. That would depend on the ratio of the
> number of records already present and to be inserted.
> If 10M and 20 then this solution wouldn't be practical.

This solution becomes relatively *more* performant the more records already
exist in the DB. The key point (as Ivar already said) is to avoid inefficient
actions against the DB on a per-record basis, all the more so if each of
these single "record-exists checks" is additionally buffered (or wrapped) by
a VB error handler. A DB engine is very good at well-sized set operations,
but not that good at high-frequency single DB actions.

An exists check against a Dictionary, which offers such an Exists method
directly (avoiding the On Error wrapping), or maybe against a lightweight,
self-written B-tree structure, cannot be beaten IMO when compared with
DB-based checks, which have to go through too many layers.

To reduce the amount of "hashed record keys to check against", one can first
perform a fast MinKey/MaxKey determination on the (as Faraz writes) about
2000 new records to be inserted. This can be done in one simple loop, with no
sorting needed on the 2000 insert candidates: just check and extend a Min and
a Max variable in a single pass. Though pre-sorting would not hurt.
Then the Select against the already existing record keys in the huge table
can be reduced to that range (or to several ranges, if there are large gaps
in between). This way you end up with one to three smaller (single key field)
recordsets, which then need to be transferred into the "fast key-lookup"
container structures.

I've already had to implement such a scenario and ended up with something
like that; everything else I tested was slower, including something like
Larry's suggestion based on temp tables, which would already work faster than
the "error-buffered #2 approach", since it puts much less stress on the
DB-layer *interface* and works more "DB-engine-internally".

That was based on my SortedDictionary implementation and SQLite, but the
principle should work just as well with ADO/JET (or DAO/JET), I'm quite sure
about that.

Olaf
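[The Dictionary exists check plus Min/Max range reduction described above
could look roughly like this in VB6. Table, field and variable names are
assumed for illustration; `NewKeys` is taken to be a zero-based array of
numeric insert candidates and `cn` an open ADO connection:]

' 1) One pass over the insert candidates to find the key range.
Dim i As Long, MinKey As Long, MaxKey As Long
MinKey = NewKeys(0): MaxKey = NewKeys(0)
For i = 1 To UBound(NewKeys)
    If NewKeys(i) < MinKey Then MinKey = NewKeys(i)
    If NewKeys(i) > MaxKey Then MaxKey = NewKeys(i)
Next

' 2) Pull only the existing keys in that range - one set operation.
Dim dicExisting As Object
Set dicExisting = CreateObject("Scripting.Dictionary")
Dim rs As ADODB.Recordset
Set rs = New ADODB.Recordset
rs.Open "SELECT KeyField FROM BigTable WHERE KeyField BETWEEN " & _
        MinKey & " AND " & MaxKey, cn
Do Until rs.EOF
    dicExisting(rs.Fields("KeyField").Value) = True
    rs.MoveNext
Loop
rs.Close

' 3) The Exists check needs no On Error wrapping at all.
For i = 0 To UBound(NewKeys)
    If Not dicExisting.Exists(NewKeys(i)) Then
        ' append NewKeys(i) to the batch of records to insert
    End If
Next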
From: Faraz Azhar on 4 Feb 2010 00:08

On Feb 1, 3:53 pm, "Ivar" <ivar.ekstromer...(a)ntlworld.com> wrote:
> If it's speed you're looking for then my first thought would be to keep
> communication with the ADO to a minimum. Run a query on the ADO to return a
> single field recordset of the unique keys that already exist in the
> database. Create a VB collection and add each record of the recordset to it.
> Create a second VB collection. Try to add the new keys to the first
> collection; if a duplicate key error happens when adding to the first
> collection then add the new key to the second collection. The second
> collection will then contain only the unique keys of the records to add to
> the database. I'm sure this would be a lot faster than using ADO error
> checking.
>
> Ivar

I tried the #2 option. It was pretty slow. I'll try this one as well. Yes,
speed is a big concern for me. On every synchronization I run about 3k SQL
queries to achieve the #2 option. That really pulls down the speed. I'll try
the buffered approach using collections (or dictionaries). Let's see if that
increases the throughput.
From: David Youngblood on 4 Feb 2010 06:54

"Faraz Azhar" <itzfaraz(a)gmail.com> wrote...
> I tried the #2 option. It was pretty slow. I'll try this one as well.
> Yes, speed is a big concern for me. On every synchronization I run
> about 3k SQL queries to achieve #2 option. That really pulls down the
> speed. I'll try the buffered approach by using collections (or
> dictionaries). Lets see if that increases the throughput.

Here are a couple of things you might try.

DAO - generally faster than ADO against an Access db. It also has a Seek
method that provides for faster lookup of an indexed field.

BeginTrans/CommitTrans - can be used to buffer your data, improving speed.

I haven't used it since VB3, but something like this. This is an outline,
not code.

BeginTrans
Do While Not EOF(ff)
    Get next record
    i = i + 1

    .Seek "=", keyvalue
    If .NoMatch Then
        .AddNew
        Fill in new record with data
        .Update
    End If

    If i Mod 100 = 0 Then
        CommitTrans
        BeginTrans
    End If
Loop
CommitTrans

This may not work though, if your new data has duplicates within itself.

David
From: Helmut Meukel on 4 Feb 2010 10:49

"David Youngblood" <dwy(a)flash.net> schrieb im Newsbeitrag
news:OPRNvDZpKHA.5588(a)TK2MSFTNGP02.phx.gbl...
>
> Here are a couple of things you might try.
>
> DAO - generally faster than ADO against an access db. It also has a Seek
> method that provides for faster lookup of an indexed field.

I found Seek unbeatably fast, compared to the Find methods or SQL.

> BeginTrans/CommitTrans - can be used to buffer your data, improving speed
>
> [outline snipped]
>
> This may not work though, if your new data has duplicates within itself.

Hmm, I never tested whether Seek will find a new entry while it's still part
of an open transaction, but otherwise it will find the new entry and the code
will skip the duplicate in your new data just fine. Worst case, you can
always add the key values of the not-yet-committed new records to an array
and check each new key value against the values in that array first. Clear
the array after the CommitTrans.

Helmut.
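[David's outline, combined with Helmut's safeguard for duplicates within the
new data, could be fleshed out roughly as below. A Dictionary is used here in
place of Helmut's array for the pending keys; all names (`ws`, `db`, table
and index names, `keyvalue`) are assumed for illustration, and DAO's Seek
requires a table-type recordset with an index set:]

' Sketch: DAO Seek inside batched transactions, with an in-memory
' check for keys added since the last CommitTrans.
Dim ws As DAO.Workspace, db As DAO.Database, rs As DAO.Recordset
Set ws = DBEngine.Workspaces(0)
Set db = ws.OpenDatabase("C:\data\mydb.mdb")
Set rs = db.OpenRecordset("BigTable", dbOpenTable)
rs.Index = "PrimaryKey"

Dim dicPending As Object, i As Long
Set dicPending = CreateObject("Scripting.Dictionary")

ws.BeginTrans
Do While Not EOF(ff)
    ' ... read the next source record into keyvalue and field values ...
    i = i + 1
    rs.Seek "=", keyvalue
    If rs.NoMatch And Not dicPending.Exists(keyvalue) Then
        rs.AddNew
        ' ... fill in the new record's fields ...
        rs.Update
        dicPending.Add keyvalue, True   ' remember until committed
    End If
    If i Mod 100 = 0 Then
        ws.CommitTrans
        dicPending.RemoveAll            ' committed rows are visible again
        ws.BeginTrans
    End If
Loop
ws.CommitTrans
rs.Close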