trying to minimize Write Conflicts in a multi-user database [General]

Prev: não acho meu orkut
Next: maike yordano pirela vera

From: Banana on 17 Jan 2010 23:57

Marshall Barton wrote:
> Side note. There was a cdma thread on this issue many years
> ago where David Fenton posted the results of extensive
> performance testing of using anything that boils down to
> WHERE False. In my mind, he conclusively demonstrated that
> Jet query optimization is not smart enough to recognize that
> no records will be retrieved and consequently does a full
> table scan.
>
> An alternative that can use indexes to optimize data
> retrieval is to compare a unique indexed field to an
> impossible value. For example, while only 99.99...%
> guaranteed safe,
> WHERE {increment autonumber PK field} = 0
> will be orders of magnitude faster than WHERE False. On a
> large table, this can make a HUGE difference.
>

Thanks for sharing. That certainly did piqued my curiosity and decided
to try for myself, mainly to see whether it was still true today as it
was 'many years ago'.

On a 64-bit Access 2010 & Win2008 OS with 2 GB RAM & 320 GB 7200 RPM HD,
I imported into a table containing 2.8 million rows of data or about 110
MB worth, and set up a VBA routine to perform several trials of 3 criteria:

WHERE PK = 0
WHERE PK = 0.1
WHERE 1 = 0

Looking at the ShowPlan output, it would seem that the first two would
perform better than last one since it states index is used while last
one does a table scan. I threw in 0.1, reasoning that since autonumber
were integers, 0.1 would be actually a better guarantee should someone
dare to insert 0 into the primary key but this obviously means an
implicit conversion, so... Anyway, just to pique my curiosity.

However, I was unable to show that the time required to open a empty
recordset was consistently in favor of any one of those expression.
Sometimes 1=0 is the fastest, sometimes, it's 0.1 then sometimes it's 0.
However, in all cases, none of those expressions exceeded 100
milliseconds and generally floats between 3 ranges; 15-ish, 30-40-ish
and 60-70ish nearly equally among three possible criteria.

The next thing I did was to cross reference David's results, but
unfortunately I have been unsuccessful in locating the CDMA thread
cited. I would love to see what he did and verify I did not miss
anything important.

It's too early to conclude anything but from those trials I did, I am
not quite convinced that 1=0 (which get transformed into "Not 0=0" in
ShowPlan, BTW... other variations gets the same transformation) is going
to be much slower than PK = 0 or PK = 0.1. The ShowPlan may say
1=0/False/Not 0=0 forces a table scan, but I think we have to question
whether it is a case of ShowPlan not being completely honest or
detailed, especially considering that we were working with 2.8 million
records, which I certainly think would be noticeable if table scan were
actually done.

If I could get a link to that CDMA thread, I would be quite delighted.

From: Tom Wickerath AOS168b AT comcast DOT on 18 Jan 2010 02:50

I have also run JET ShowPlan tests on the 1= 0 criteria, and indeed, I
noticed the table scans in the Showplan.out files. But, like Banana, the
performance *seemed* very fast. None of the tables in databases I work on at
work are all that huge, so I wasn't ready to make any conclusions based on
those results. But, at my place of work, we are using WANs whether we like it
or not, since most of the servers for shared folders that I use are located
in Bellevue, WA. and I work in Seattle. Still, the performance is very fast
and rock solid.

Tom Wickerath
Microsoft Access MVP
http://www.accessmvp.com/TWickerath/
__________________________________________

"Banana" wrote:

> Marshall Barton wrote:
> > Side note. There was a cdma thread on this issue many years
> > ago where David Fenton posted the results of extensive
> > performance testing of using anything that boils down to
> > WHERE False. In my mind, he conclusively demonstrated that
> > Jet query optimization is not smart enough to recognize that
> > no records will be retrieved and consequently does a full
> > table scan.
> >
> > An alternative that can use indexes to optimize data
> > retrieval is to compare a unique indexed field to an
> > impossible value. For example, while only 99.99...%
> > guaranteed safe,
> > WHERE {increment autonumber PK field} = 0
> > will be orders of magnitude faster than WHERE False. On a
> > large table, this can make a HUGE difference.
> >
>
> Thanks for sharing. That certainly did piqued my curiosity and decided
> to try for myself, mainly to see whether it was still true today as it
> was 'many years ago'.
>
> On a 64-bit Access 2010 & Win2008 OS with 2 GB RAM & 320 GB 7200 RPM HD,
> I imported into a table containing 2.8 million rows of data or about 110
> MB worth, and set up a VBA routine to perform several trials of 3 criteria:
>
> WHERE PK = 0
> WHERE PK = 0.1
> WHERE 1 = 0
>
> Looking at the ShowPlan output, it would seem that the first two would
> perform better than last one since it states index is used while last
> one does a table scan. I threw in 0.1, reasoning that since autonumber
> were integers, 0.1 would be actually a better guarantee should someone
> dare to insert 0 into the primary key but this obviously means an
> implicit conversion, so... Anyway, just to pique my curiosity.
>
> However, I was unable to show that the time required to open a empty
> recordset was consistently in favor of any one of those expression.
> Sometimes 1=0 is the fastest, sometimes, it's 0.1 then sometimes it's 0.
> However, in all cases, none of those expressions exceeded 100
> milliseconds and generally floats between 3 ranges; 15-ish, 30-40-ish
> and 60-70ish nearly equally among three possible criteria.
>
> The next thing I did was to cross reference David's results, but
> unfortunately I have been unsuccessful in locating the CDMA thread
> cited. I would love to see what he did and verify I did not miss
> anything important.
>
> It's too early to conclude anything but from those trials I did, I am
> not quite convinced that 1=0 (which get transformed into "Not 0=0" in
> ShowPlan, BTW... other variations gets the same transformation) is going
> to be much slower than PK = 0 or PK = 0.1. The ShowPlan may say
> 1=0/False/Not 0=0 forces a table scan, but I think we have to question
> whether it is a case of ShowPlan not being completely honest or
> detailed, especially considering that we were working with 2.8 million
> records, which I certainly think would be noticeable if table scan were
> actually done.
>
> If I could get a link to that CDMA thread, I would be quite delighted.

From: Banana on 18 Jan 2010 10:41

Banana wrote:
> However, I was unable to show that the time required to open a empty
> recordset was consistently in favor of any one of those expression.
> Sometimes 1=0 is the fastest, sometimes, it's 0.1 then sometimes it's 0.
> However, in all cases, none of those expressions exceeded 100
> milliseconds and generally floats between 3 ranges; 15-ish, 30-40-ish
> and 60-70ish nearly equally among three possible criteria.

A follow-up-

I wanted to cover two more bases:

1) I tested how much time it would take to evaluate the "Not 0=0" by
doing a For..Next loop with as many iterations as there were records in
the table. While this is not an apple to apple comparison, it should
give us some idea of how much time it takes to evaluate the Not 0=0 and
thus give us some idea of whether table scanning is actually being
done. In my first tests, I got roughly average of about 90-100
milliseconds, which is certainly slower than my reported trials for the
Not 0=0 but I do not consider this to be statistically significant so...

2) I decided to be really safe and expand my table from 2.8 millions
into 11.4 millions or about 435 MB and re-run the trials. Reasoning that
if the table scanning actually were being done, the time needed to
execute 1=0 should increase. In those trials, it did not and remained at
the same averages I reported last night with the other two criteria. The
For..Loop did increase to 400-500 milliseconds which is now significant
enough to warrant questioning whether ShowPlan is telling the complete
story when it claims that it does a table scan for 1=0/False/Not 0 = 0.

From: Marshall Barton on 18 Jan 2010 11:00

Banana wrote:

>Marshall Barton wrote:
>> Side note. There was a cdma thread on this issue many years
>> ago where David Fenton posted the results of extensive
>> performance testing of using anything that boils down to
>> WHERE False. In my mind, he conclusively demonstrated that
>> Jet query optimization is not smart enough to recognize that
>> no records will be retrieved and consequently does a full
>> table scan.
>>
>> An alternative that can use indexes to optimize data
>> retrieval is to compare a unique indexed field to an
>> impossible value. For example, while only 99.99...%
>> guaranteed safe,
>> WHERE {increment autonumber PK field} = 0
>> will be orders of magnitude faster than WHERE False. On a
>> large table, this can make a HUGE difference.
>>
>
>Thanks for sharing. That certainly did piqued my curiosity and decided
>to try for myself, mainly to see whether it was still true today as it
>was 'many years ago'.
>
>On a 64-bit Access 2010 & Win2008 OS with 2 GB RAM & 320 GB 7200 RPM HD,
>I imported into a table containing 2.8 million rows of data or about 110
>MB worth, and set up a VBA routine to perform several trials of 3 criteria:
>
>WHERE PK = 0
>WHERE PK = 0.1
>WHERE 1 = 0
>
>Looking at the ShowPlan output, it would seem that the first two would
>perform better than last one since it states index is used while last
>one does a table scan. I threw in 0.1, reasoning that since autonumber
>were integers, 0.1 would be actually a better guarantee should someone
>dare to insert 0 into the primary key but this obviously means an
>implicit conversion, so... Anyway, just to pique my curiosity.
>
>However, I was unable to show that the time required to open a empty
>recordset was consistently in favor of any one of those expression.
>Sometimes 1=0 is the fastest, sometimes, it's 0.1 then sometimes it's 0.
>However, in all cases, none of those expressions exceeded 100
>milliseconds and generally floats between 3 ranges; 15-ish, 30-40-ish
>and 60-70ish nearly equally among three possible criteria.
>
>The next thing I did was to cross reference David's results, but
>unfortunately I have been unsuccessful in locating the CDMA thread
>cited. I would love to see what he did and verify I did not miss
>anything important.
>
>It's too early to conclude anything but from those trials I did, I am
>not quite convinced that 1=0 (which get transformed into "Not 0=0" in
>ShowPlan, BTW... other variations gets the same transformation) is going
>to be much slower than PK = 0 or PK = 0.1. The ShowPlan may say
>1=0/False/Not 0=0 forces a table scan, but I think we have to question
>whether it is a case of ShowPlan not being completely honest or
>detailed, especially considering that we were working with 2.8 million
>records, which I certainly think would be noticeable if table scan were
>actually done.
>
>If I could get a link to that CDMA thread, I would be quite delighted.

I tried to search Google Groups, but that seems to have lost
its ability to find stuff in the archives. David, if you
are following this thread, we would appreciate it if you can
dredge up that test and re post your findings.

Banana, he performance results you and Tom are reporting are
very interesting, but I am not totally convinced at this
time. Performance testing can be very difficult and tricky,
especially with things that involve I/O. When caching is
used, the first run after a boot can be drastically
different than subsequent runs. Today's multi core
processors and seriously fast multi gigabyte memory may very
well make a cached table scan quick enough, maybe nearly as
fast as an index scan.

Then there is the issue of what other active processes are
also using the system's resources. A full table scan in
cached memory on a lightly loaded system might be pretty
fast, but when there are other processes making large
demands for processor cycles and memory, the result could be
a completely different story.

Tom's performance over a WAN (how fast?) is interesting and
warrants further analysis under varying user scenarios.

More than a little intriguing is the show plan use of Not
0=0 when the query used some other expression for False.
This strongly implies that the query optimizer can recognize
expressions that evaluate to False. Why it would then say
it will use a full table scan seems contradictory to me.

--
Marsh
MVP [MS Access]

From: John W. Vinson on 18 Jan 2010 12:51

On Mon, 18 Jan 2010 10:00:33 -0600, Marshall Barton <marshbarton(a)wowway.com>
wrote:

>More than a little intriguing is the show plan use of Not
>0=0 when the query used some other expression for False.
>This strongly implies that the query optimizer can recognize
>expressions that evaluate to False. Why it would then say
>it will use a full table scan seems contradictory to me.

I wonder if it's the same effect that causes references to Rnd() in a query to
return the same value for all rows: the query engine determines that no table
field is included in the expression and evaluates it only once. Might the same
be the case with 0=0 or False as a criterion?
--

John W. Vinson [MVP]

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10 11 12 13
Prev: não acho meu orkut
Next: maike yordano pirela vera