Prev: não acho meu orkut
Next: maike yordano pirela vera
From: Marshall Barton on 18 Jan 2010 15:58 Banana wrote: >Marshall Barton wrote: >> Banana, he performance results you and Tom are reporting are >> very interesting, but I am not totally convinced at this >> time. Performance testing can be very difficult and tricky, >> especially with things that involve I/O. When caching is >> used, the first run after a boot can be drastically >> different than subsequent runs. > >I don't doubt there will be many variables that could skew the tests >which was why I wanted to cross reference David's test to check what I >could have had missed. > >FWIW- using OLE automation to start up an Access application and run >only one query using CurrentDb.Execute (e.g. no saved query), then >immediately closing, quitting and cleaning up does not appear to >materially affect the time. The Showplan seems to suggest that a new >instance of engine was started up via OLE Automation, so this should >reflect the boot time/cold start/no caching, I'd think. I would not make that assumption because I have no idea how vaeious versions of Windows handle its cache when a file closed. I have seen many/most chacing schemes that kept the file in cache until the space was needed and then chose the least recently used to abandon. Some schemes even go so far as to keep usage statistics and will drop the file with the lowest usage stats regardless of its age. > >> Today's multi core processors and seriously fast multi gigabyte memory may very >> well make a cached table scan quick enough, maybe nearly as >> fast as an index scan. > >I'm not so sure about multi-cores/CPU processing. AFAIK, Access >continues to be single-threaded so it'd always use one core so the >benefit of mulitple cores/CPU would be indirect in that OS may be able >to allocate competing resources to other cores/CPUs but beyond that, not >much difference. Note that while JET/ACE has a setting for numbers of >threads, KB articles suggests that thread safety was achieved only if >you use DAO or OLEDB provider and even so, thread-safety is limited. I was thinking more in terms of the OS and other processes competing for CPU cycles. > >I also tried tweaking the threads setting from 1 to 4 (default is 3) and >the timing was not affected at least slightest, suggesting to me that >threading was not relevant here. > >KB Article: >http://support.microsoft.com/kb/169395 >http://support.microsoft.com/kb/299973 I looks to me that these articles are referring to using Jet outside Access so I don't see their relevance to our discussion. > >As for processor & memory enabling a table scan to be as fast as index >scan, this was reason why I did extra check of doing a For...Next Loop >to get a rough idea of how much time it would take to do the false >comparisons that many time. If table scan actually were being done, the >time should have had increased when I expanded the tables. It didn't >while the For...Next loop increased just as much as the increase of >records. As a further comparison, doing a "Not PK = PK" was ridiculously >slow, around 40,000 milliseconds. It's also unfair because of comparing >two variables rather than two hard coded values, but I think the point >remains- if 1=0 actually did do a table scan, it should have been just >as slow as the For...Next loop at least and certainly four times slower >after the table expanded fourfold. "Not PK = PK was ridiculously slow" is very telling, if your test environment was nearly the same in both cases. The drastic difference seems to clearly imply that a "real" table scan is defierent from the WHERE False case so maybe the oprimizer (in Access/Jet version ??) is now using the information to skip the table scan. While Not PK = PK is not as trivial as False or 1=0, it is still something that can be reduced to False, so it does appear that WHERE False has been dealt with in some version post A97. This is very good to know and I think I can forget about posting comments like the one that started this subthread. > >> Then there is the issue of what other active processes are >> also using the system's resources. A full table scan in >> cached memory on a lightly loaded system might be pretty >> fast, but when there are other processes making large >> demands for processor cycles and memory, the result could be >> a completely different story. > >I suppose I could try the test again by re-allocating only 512 MB to the >VM and see what happens. If your test table were wide as well as tall, that might(?) provide useful information, but with 2 GB ram, it may not. > >> More than a little intriguing is the show plan use of Not >> 0=0 when the query used some other expression for False. >> This strongly implies that the query optimizer can recognize >> expressions that evaluate to False. Why it would then say >> it will use a full table scan seems contradictory to me. > >Well, I've kind of found Showplan to be quite lacking in the details. >For instance, it still does not support showing plans when subqueries >are used, despite having had went through 2 more JET versions and 2 ACE >versions. Another telltale sign is that when the ShowPlan starts up, it >records the Date of 0, suggesting to me that any bugs, issues and/or >enhancements associated with ShowPlan wasn't considered high-priority >and for those reasons, I think it is prudent to take the Showplan's >output with a grain of salt. That strikes me as odd. I thought Showplan was originally a debugging tool used by the query optimizer developers. Either they have a different tool now or they are very confident of their ability to make code changes ;-) -- Marsh MVP [MS Access]
From: Marshall Barton on 18 Jan 2010 16:57 Tom Wickerath <AOS168b AT comcast DOT net> wrote: >I sent a private e-mail to David late last night, inquiring about any past >test results. He replied as follows: > > "I don't recall anything of the sort, and don't see it in Google > Groups. Nor do I have archives of my posts from way back when. > > Sorry I can't help. Maybe Marshall has a better reference?" Not with the Google archives in such disarray. I definitely remember the gist of David's tests because I tried it on a client's A97 db that was kind of slow and it made a big difference. >> Performance testing can be very difficult and tricky, >> especially with things that involve I/O. > >I agree. When I am doing such testing at work, I typically reboot several >times, between each test, and I make sure to shut down as many applications >as I can, including Outlook, to try to make a somewhat stable baseline. On my >work PC, I cannot shut off the antivirus, so I just have to live with the >possibility that it may be adding some network load. > >I think a good way to test the 1=0 question might be to start with a really >large table, like Banana did, and monitor the amount of data transferred for: > >1.) An intentional table scan >Run some query that forces all records to be brought over a network wire, >perhaps an aggregrate query to sum a numeric field. > >2.) WHERE conditions that include 1=0, WHERE False, etc. > >One needs to first display the network monitor icon in the System Tray. In >Windows XP, one does this via: > > Control Panel | Network Connections > >Double-click the connection of interest, and then left-click on Properties. >Select the check box for the option that reads "Show icon in notification >area when connected". When you double-click the resulting System Tray icon, >you will see Packets Sent and Packets Received values. With a baseline that >is as repeatable as possible (ie. Outlook and any other networked >applications shut down), read the Packets Received value before and after >each individual test. The difference (delta) represents how many packets of >data was transferred to solve each query. Of course, one must keep in mind >that some other application that you may not be able to shut down may have >caused some of the traffic for a given test. So, one can run the test several >times, in an attempt to make sure there is not a contributing influence from >some other application. You need a split application, with the BE on a >network share, in order to use this method. Several years ago, John Viescas >recommended this method to me as a way of testing how much data actually >comes over the network wire. Good to know. I think that means the test needs to use a wide table so you can tell the difference between index retrieval and data retrieval. -- Marsh MVP [MS Access]
From: Banana on 18 Jan 2010 18:46 Marshall Barton wrote: > I would not make that assumption because I have no idea how > vaeious versions of Windows handle its cache when a file > closed. I have seen many/most chacing schemes that kept the > file in cache until the space was needed and then chose the > least recently used to abandon. Some schemes even go so far > as to keep usage statistics and will drop the file with the > lowest usage stats regardless of its age. I could be dead wrong but I would think that how OS caches file is irrelevant because the caching JET would do would be in memory and thus destroyed when JET is disposed of ?? > I looks to me that these articles are referring to using Jet > outside Access so I don't see their relevance to our > discussion. Those are the only two places where you could find 'Jet' and 'multi-threading' / 'thread safety' together, and yes they deal with non-Access uses, but what I didn't explicitly state was whether Jet was capable of breaking a query up between threads for parallel execution. If my tests tweaking threads are reliable, it doesn't seem to be the case which was what I wanted to rule out. > "Not PK = PK was ridiculously slow" is very telling, if your > test environment was nearly the same in both cases. The > drastic difference seems to clearly imply that a "real" > table scan is defierent from the WHERE False case so maybe > the oprimizer (in Access/Jet version ??) is now using the > information to skip the table scan. While Not PK = PK is > not as trivial as False or 1=0, it is still something that > can be reduced to False, so it does appear that WHERE False > has been dealt with in some version post A97. It seems to suggest this, but I should try Tom's idea of monitoring the network traffic to be sure we didn't miss anything here. > This is very good to know and I think I can forget about posting comments > like the one that started this subthread. Well, actually I was glad you posted it. It's too easy to get in a rut and repeating the old truism. Indeed, I was quite shocked when I was told that for SQL Server, existence check runs equally well if not slightly better than the frustrated join while I've been sprouting this "truism" for a while. (With JET, I'm inclined to say 'it depends.') Anyway, it's always good to check the facts as time passes. > If your test table were wide as well as tall, that might(?) > provide useful information, but with 2 GB ram, it may not. Well, the table contains five columns, 3 longs and 2 dates, or 28 bytes per row. There's only one index, and that's the primary key. It would seem to me that an index page would contains 7 as much records than a data page for the same table. > That strikes me as odd. I thought Showplan was originally a > debugging tool used by the query optimizer developers. > Either they have a different tool now or they are very > confident of their ability to make code changes ;-) But that's exactly my point - If it was a tool they were using, they would have had enhanced it over time. It hasn't, which lead me to think they moved on to something else for their internal tests. Indeed, I hope the latter isn't true! ;) On the flip side, though, I _think_ there wasn't any serious performance enhancement between JET 3.0 and ACE 14.0 (Access 2010). It seemed to me that the changes in between were usually more features rather than enhancement or fixing bugs. At least, I've yet to hear of such enhancement in between.
From: Paul on 18 Jan 2010 20:02 John, Here are both my office and home addresses in case one of them filters out the attachment. Work - my first and last names (below) separated by a dot, followed by the "at." The remainder is dgs ca gov, but with dots instead of spaces between the three strings. Home - my last name only, followed by the "at. The remainder is surewest net, again with a period instead of the space. I'm looking forward to receiving your Email. Thanks so much. Paul Ponzelli Staff Real Estate Officer DGS Central Leasing
From: Marshall Barton on 19 Jan 2010 00:10
Banana wrote: >Marshall Barton wrote: >> I would not make that assumption because I have no idea how >> vaeious versions of Windows handle its cache when a file >> closed. I have seen many/most chacing schemes that kept the >> file in cache until the space was needed and then chose the >> least recently used to abandon. Some schemes even go so far >> as to keep usage statistics and will drop the file with the >> lowest usage stats regardless of its age. > >I could be dead wrong but I would think that how OS caches file is >irrelevant because the caching JET would do would be in memory and thus >destroyed when JET is disposed of ?? This is getting deeper than my knowledge, but there may very well be multiple layers of caching going on. > >> I looks to me that these articles are referring to using Jet >> outside Access so I don't see their relevance to our >> discussion. > >Those are the only two places where you could find 'Jet' and >'multi-threading' / 'thread safety' together, and yes they deal with >non-Access uses, but what I didn't explicitly state was whether Jet was >capable of breaking a query up between threads for parallel execution. >If my tests tweaking threads are reliable, it doesn't seem to be the >case which was what I wanted to rule out. > >> "Not PK = PK was ridiculously slow" is very telling, if your >> test environment was nearly the same in both cases. The >> drastic difference seems to clearly imply that a "real" >> table scan is defierent from the WHERE False case so maybe >> the oprimizer (in Access/Jet version ??) is now using the >> information to skip the table scan. While Not PK = PK is >> not as trivial as False or 1=0, it is still something that >> can be reduced to False, so it does appear that WHERE False >> has been dealt with in some version post A97. > >It seems to suggest this, but I should try Tom's idea of monitoring the >network traffic to be sure we didn't miss anything here. > >> This is very good to know and I think I can forget about posting comments >> like the one that started this subthread. > >Well, actually I was glad you posted it. It's too easy to get in a rut >and repeating the old truism. Indeed, I was quite shocked when I was >told that for SQL Server, existence check runs equally well if not >slightly better than the frustrated join while I've been sprouting this >"truism" for a while. (With JET, I'm inclined to say 'it depends.') > >Anyway, it's always good to check the facts as time passes. > > >> If your test table were wide as well as tall, that might(?) >> provide useful information, but with 2 GB ram, it may not. > >Well, the table contains five columns, 3 longs and 2 dates, or 28 bytes >per row. There's only one index, and that's the primary key. It would >seem to me that an index page would contains 7 as much records than a >data page for the same table. Yes but - if the table was already in RAM the time to scan the indexes would be the same as scanning the table. However, they probably use a Btree or hashing to search the indexes instead of scanning them but a million cycles to loop through the table in RAM is still only a millisecond compared to less than a microsecond to check the index. Big difference, but a human would probably not notice it. Way back when I tried David's idea in a real table with about 50K records of about 1K each on a 1GH machine with 256MB, the difference went from about 2 or 3 seconds to near 0, very noticeable. > >> That strikes me as odd. I thought Showplan was originally a >> debugging tool used by the query optimizer developers. >> Either they have a different tool now or they are very >> confident of their ability to make code changes ;-) > >But that's exactly my point - If it was a tool they were using, they >would have had enhanced it over time. It hasn't, which lead me to think >they moved on to something else for their internal tests. Indeed, I hope >the latter isn't true! ;) > >On the flip side, though, I _think_ there wasn't any serious performance >enhancement between JET 3.0 and ACE 14.0 (Access 2010). It seemed to me >that the changes in between were usually more features rather than >enhancement or fixing bugs. At least, I've yet to hear of such >enhancement in between. I don't know the details, but there were a lot of bug fixes in A2007, maybe some even in Jet. ACE had a lot of changes from Jet and who knows what they might have done with performance while adding all those new multi value field, attachments, ... thingies. -- Marsh MVP [MS Access] |