From: Banana on 17 Jan 2010 23:57

Marshall Barton wrote:
> Side note. There was a cdma thread on this issue many years
> ago where David Fenton posted the results of extensive
> performance testing of using anything that boils down to
> WHERE False. In my mind, he conclusively demonstrated that
> Jet query optimization is not smart enough to recognize that
> no records will be retrieved and consequently does a full
> table scan.
>
> An alternative that can use indexes to optimize data
> retrieval is to compare a unique indexed field to an
> impossible value. For example, while only 99.99...%
> guaranteed safe,
> WHERE {increment autonumber PK field} = 0
> will be orders of magnitude faster than WHERE False. On a
> large table, this can make a HUGE difference.

Thanks for sharing. That certainly piqued my curiosity, and I decided to try it for myself, mainly to see whether it is still as true today as it was 'many years ago'.

On 64-bit Access 2010 and a Win2008 OS with 2 GB RAM and a 320 GB 7200 RPM HD, I imported 2.8 million rows of data (about 110 MB worth) into a table and set up a VBA routine to perform several trials of 3 criteria:

WHERE PK = 0
WHERE PK = 0.1
WHERE 1 = 0

Looking at the ShowPlan output, it would seem that the first two should perform better than the last one, since ShowPlan states that an index is used for them while the last one does a table scan. I threw in 0.1, reasoning that since autonumbers are integers, 0.1 would actually be a better guarantee should someone dare to insert 0 into the primary key, but this obviously means an implicit conversion, so... Anyway, that one was just to satisfy my curiosity.

However, I was unable to show that the time required to open an empty recordset was consistently in favor of any one of those expressions. Sometimes 1=0 is the fastest, sometimes it's 0.1, and sometimes it's 0. In all cases, none of those expressions exceeded 100 milliseconds, and the timings generally fell into 3 ranges (15-ish, 30-40-ish and 60-70-ish), spread nearly equally among the three criteria.

The next thing I did was try to cross-reference David's results, but unfortunately I have been unsuccessful in locating the CDMA thread cited. I would love to see what he did and verify that I did not miss anything important.

It's too early to conclude anything, but from the trials I did, I am not quite convinced that 1=0 (which gets transformed into "Not 0=0" in ShowPlan, BTW... other variations get the same transformation) is going to be much slower than PK = 0 or PK = 0.1. ShowPlan may say 1=0/False/Not 0=0 forces a table scan, but I think we have to question whether ShowPlan is being completely honest or detailed here, especially considering that we were working with 2.8 million records, where I certainly think a table scan would be noticeable if it were actually done.

If I could get a link to that CDMA thread, I would be quite delighted.
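For anyone who wants to repeat this kind of trial, here is a minimal sketch of the timing harness described above. It is not the VBA routine from the post (that was not shown), and it runs against SQLite rather than Jet, so the numbers say nothing about Jet's optimizer. The table name `t`, the column `pk`, and the helper names `build_table` and `time_criteria` are all hypothetical. What it illustrates is the method: run each criterion several times in rotation and look at the spread, not at a single run.

```python
import sqlite3
import time

# Assumption: SQLite stands in for Jet purely to show the measurement method.
CRITERIA = ["pk = 0", "pk = 0.1", "1 = 0"]

def build_table(n_rows):
    """Create an in-memory table whose integer primary key starts at 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (pk INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO t (pk, payload) VALUES (?, ?)",
        ((i, "x" * 20) for i in range(1, n_rows + 1)),
    )
    conn.commit()
    return conn

def time_criteria(conn, criteria, trials=5):
    """Return {criterion: [elapsed_ms, ...]} for fetching an empty result set."""
    results = {c: [] for c in criteria}
    for _ in range(trials):
        # Rotate through the criteria inside each trial so that caching
        # effects are spread across all of them instead of favoring one.
        for c in criteria:
            start = time.perf_counter()
            rows = conn.execute("SELECT * FROM t WHERE " + c).fetchall()
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            assert rows == []  # every criterion is expected to match nothing
            results[c].append(elapsed_ms)
    return results

if __name__ == "__main__":
    conn = build_table(100_000)
    for criterion, times in time_criteria(conn, CRITERIA).items():
        print(criterion, [round(t, 3) for t in times])
```

Interleaving the criteria within each trial is a deliberate choice: if all runs of one criterion came first, that criterion alone would pay for the cold cache.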
From: Tom Wickerath AOS168b AT comcast DOT on 18 Jan 2010 02:50

I have also run JET ShowPlan tests on the 1=0 criterion, and indeed, I noticed the table scans in the Showplan.out files. But, like Banana, I found that the performance *seemed* very fast. None of the tables in the databases I work on at work are all that huge, so I wasn't ready to draw any conclusions from those results. At my place of work we are using WANs whether we like it or not, since most of the servers for the shared folders I use are located in Bellevue, WA, and I work in Seattle. Still, the performance is very fast and rock solid.

Tom Wickerath
Microsoft Access MVP
http://www.accessmvp.com/TWickerath/
__________________________________________
From: Banana on 18 Jan 2010 10:41

Banana wrote:
> However, I was unable to show that the time required to open an empty
> recordset was consistently in favor of any one of those expressions.
> Sometimes 1=0 is the fastest, sometimes it's 0.1, and sometimes it's 0.
> In all cases, none of those expressions exceeded 100
> milliseconds, and the timings generally fell into 3 ranges (15-ish,
> 30-40-ish and 60-70-ish), spread nearly equally among the three criteria.

A follow-up. I wanted to cover two more bases:

1) I tested how much time it would take to evaluate "Not 0=0" by running a For..Next loop with as many iterations as there were records in the table. While this is not an apples-to-apples comparison, it should give us some idea of how much time it takes to evaluate Not 0=0 once per row, and thus whether table scanning is actually being done. In my first tests the loop averaged roughly 90-100 milliseconds, which is certainly slower than my reported trials for Not 0=0, but I did not consider the difference statistically significant, so...

2) I decided to be really safe and expanded my table from 2.8 million to 11.4 million rows (about 435 MB), then re-ran the trials, reasoning that if table scanning were actually being done, the time needed to execute 1=0 should increase. In those trials it did not; it remained at the same averages I reported last night, as did the other two criteria. The For..Next loop did increase to 400-500 milliseconds, which is now significant enough to warrant questioning whether ShowPlan is telling the complete story when it claims that it does a table scan for 1=0/False/Not 0=0.
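The scaling check in the follow-up above can be sketched as follows. Again, this is an illustration of the reasoning and not the original VBA: it uses SQLite in place of Jet, and the names `make_table`, `median_ms`, `loop_baseline` and `scaling_report` are hypothetical. The idea is that a per-row loop grows with the row count, so if the constant-false query does not grow with it, a full scan is unlikely to be happening.

```python
import sqlite3
import time

def make_table(n_rows):
    """Build an in-memory table with n_rows integer keys (SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (pk INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO t (pk) VALUES (?)", ((i,) for i in range(1, n_rows + 1))
    )
    conn.commit()
    return conn

def median_ms(fn, trials=5):
    """Run fn several times and return the median elapsed time in ms."""
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return samples[len(samples) // 2]

def loop_baseline(n_rows):
    """Evaluate 'Not 0=0' once per row, as a rough floor for a real scan."""
    hits = 0
    for _ in range(n_rows):
        if not (0 == 0):
            hits += 1
    return hits

def scaling_report(sizes):
    """Return [(rows, query_ms, loop_ms), ...] for each table size."""
    report = []
    for n in sizes:
        conn = make_table(n)
        query_ms = median_ms(
            lambda: conn.execute("SELECT * FROM t WHERE 1 = 0").fetchall()
        )
        loop_ms = median_ms(lambda: loop_baseline(n))
        report.append((n, query_ms, loop_ms))
        conn.close()
    return report

if __name__ == "__main__":
    for rows, query_ms, loop_ms in scaling_report([50_000, 200_000]):
        print(rows, round(query_ms, 3), round(loop_ms, 3))
```

The median is used instead of the mean so that one slow first run (cold cache) does not dominate the comparison.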
From: Marshall Barton on 18 Jan 2010 11:00

Banana wrote:
>If I could get a link to that CDMA thread, I would be quite delighted.

I tried to search Google Groups, but that seems to have lost its ability to find stuff in the archives. David, if you are following this thread, we would appreciate it if you could dredge up that test and repost your findings.

Banana, the performance results you and Tom are reporting are very interesting, but I am not totally convinced at this time. Performance testing can be very difficult and tricky, especially with things that involve I/O. When caching is used, the first run after a boot can be drastically different from subsequent runs. Today's multi-core processors and seriously fast multi-gigabyte memory may very well make a cached table scan quick enough, maybe nearly as fast as an index scan.

Then there is the issue of what other active processes are also using the system's resources. A full table scan in cached memory on a lightly loaded system might be pretty fast, but when there are other processes making large demands for processor cycles and memory, the result could be a completely different story. Tom's performance over a WAN (how fast?) is interesting and warrants further analysis under varying user scenarios.

More than a little intriguing is the show plan use of Not 0=0 when the query used some other expression for False. This strongly implies that the query optimizer can recognize expressions that evaluate to False. Why it would then say it will use a full table scan seems contradictory to me.

--
Marsh
MVP [MS Access]
From: John W. Vinson on 18 Jan 2010 12:51
On Mon, 18 Jan 2010 10:00:33 -0600, Marshall Barton <marshbarton(a)wowway.com> wrote:

>More than a little intriguing is the show plan use of Not
>0=0 when the query used some other expression for False.
>This strongly implies that the query optimizer can recognize
>expressions that evaluate to False. Why it would then say
>it will use a full table scan seems contradictory to me.

I wonder if it's the same effect that causes references to Rnd() in a query to return the same value for all rows: the query engine determines that no table field is included in the expression and evaluates it only once. Might the same be the case with 0=0 or False as a criterion?

--
John W. Vinson [MVP]
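The "evaluate it only once" idea above can be pictured with a toy model. This is only an illustration of the hypothesis, not a description of how Jet actually works: `run_query` and its `column_refs` argument are hypothetical, and the caller has to declare by hand which fields the expression reads.

```python
import random

def run_query(rows, expr, column_refs):
    """Toy model of a query engine computing one expression per row.

    expr is a function taking a row (a dict); column_refs lists the field
    names the expression reads. If it reads no fields, the result cannot
    differ between rows, so it is computed once and reused.
    """
    if not column_refs:
        value = expr(None)  # evaluated a single time for the whole query
        return [value for _ in rows]
    return [expr(row) for row in rows]

if __name__ == "__main__":
    table = [{"id": 1}, {"id": 2}, {"id": 3}]
    # No field referenced: every row gets the same random number.
    print(run_query(table, lambda row: random.random(), []))
    # A field is referenced: the expression is evaluated for each row.
    print(run_query(table, lambda row: row["id"] * 2, ["id"]))
```

Under this model a criterion such as 0=0 or False would also be computed once, which is the question raised above; whether Jet then skips reading the rows is exactly what the timing trials in this thread are trying to find out.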