From: Le Chaud Lapin on 22 Oct 2009 01:00

Hi All,

I doth seek to remain a sloth...

I have an application that does a bunch of ReadFile's against a 470 MB
file. I have to read the entire file from end to end, always
sequentially. Each read can be anywhere from a few bytes to several
kilobytes. Process Explorer shows roughly 41,000,000 reads during an
8-minute run. The mean read size is roughly 50 bytes, with a standard
deviation of maybe 10 bytes (guessing).

I realize that performance will improve dramatically when I eliminate
so many user/kernel (U/K) transitions by doing block reads, but in the
meantime, I was wondering how much improvement to expect from a kind of
priming, where I pump all blocks into RAM at least once. I just found
the following link for setting the cache size on Windows:

http://support.microsoft.com/kb/837331

I would essentially write a bit of code that slammed the entire 470 MB
into RAM before doing my ReadFiles, which, btw, are supporting a kind
of serialization.

I'd like to know what I can expect for improvement (roughly, of course).

TIA,

-Le Chaud Lapin-
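For concreteness, the kind of block-read wrapper I have in mind would look
roughly like this (untested sketch; BlockReader is just an illustrative name,
not an existing class). The idea is to serve the many small logical reads from
a user-space buffer that is refilled with one large ReadFile per 64 KB:

#include <windows.h>
#include <cstring>

// Untested sketch: small logical reads are copied out of a user-space
// buffer; ReadFile is only called once per 64 KB block.
class BlockReader {
public:
    explicit BlockReader(HANDLE file) : file_(file), pos_(0), avail_(0) {}

    // Copies up to 'count' bytes into 'dest'; returns bytes actually copied.
    DWORD Read(void* dest, DWORD count) {
        char* out = static_cast<char*>(dest);
        DWORD copied = 0;
        while (copied < count) {
            if (pos_ == avail_) {   // buffer exhausted: refill with one big read
                if (!ReadFile(file_, buf_, (DWORD)sizeof(buf_), &avail_, NULL) ||
                    avail_ == 0)
                    break;          // error or end of file
                pos_ = 0;
            }
            DWORD chunk = count - copied;
            if (chunk > avail_ - pos_)
                chunk = avail_ - pos_;
            memcpy(out + copied, buf_ + pos_, chunk);
            pos_ += chunk;
            copied += chunk;
        }
        return copied;
    }

private:
    HANDLE file_;
    char buf_[64 * 1024];   // 64 KB block buffer
    DWORD pos_;             // next unread byte in buf_
    DWORD avail_;           // number of valid bytes in buf_
};

If the ~50-byte average holds, routing every small read through something like
Read() should cut the syscall count by roughly three orders of magnitude.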
From: Mihai N. on 22 Oct 2009 04:37

> I realized that performance will improve dramatically when I eliminate
> so many U/K transitions by doing block reads,

Somewhere at a low level there are already block reads, so I don't expect
you will gain much. It depends a lot on the read pattern. Is it sequential,
or do you have to jump back and forth? Do you read "records" of known
length, or do you read lines (looking for "\n")?

First rule of optimization: measure, to make sure you know what to
optimize. (OK, it is not the first rule; I think it is the 3rd :-)

--
Mihai Nita [Microsoft MVP, Visual C++]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
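For the measuring itself, a minimal timing harness around the read loop could
look like this (sketch only; the placeholder comment marks where the actual
ReadFile loop would go, and error handling is omitted):

#include <windows.h>
#include <stdio.h>

// Rough timing harness using the high-resolution performance counter.
int main(void)
{
    LARGE_INTEGER freq, t0, t1;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&t0);

    // ... do the 470 MB worth of ReadFile calls here ...

    QueryPerformanceCounter(&t1);
    printf("elapsed: %.3f s\n",
           (double)(t1.QuadPart - t0.QuadPart) / (double)freq.QuadPart);
    return 0;
}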
From: Uwe Sieber on 22 Oct 2009 06:58

http://support.microsoft.com/kb/837331 says "There is no limit to the
physical cache size", and this seems to be true. I have no idea what
"Virtual cache size" means. The values 512 MB and 960 MB sound familiar;
these seem to be the cache's maximum working set sizes.

To get the desired 'all in RAM' effect, open the file with
FILE_FLAG_RANDOM_ACCESS and read it from start to end. If there is enough
free memory, it will be held completely in RAM.

Uwe

Le Chaud Lapin wrote:
> [...]
> I was wondering how much improvement to expect from a kind of
> priming, where I pump all blocks into RAM at least once. I just found
> the following link for setting the cache size on Windows:
>
> http://support.microsoft.com/kb/837331
> [...]
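A rough sketch of that priming pass (open with FILE_FLAG_RANDOM_ACCESS and
read once from start to end; the file name is just a placeholder, the 1 MB
chunk size is arbitrary, and error handling is omitted):

#include <windows.h>

// Priming pass: read the whole file once so the cache manager pulls it
// into RAM; the data itself is discarded.
int main(void)
{
    HANDLE h = CreateFileA("C:\\data\\big.dat", GENERIC_READ, FILE_SHARE_READ,
                           NULL, OPEN_EXISTING, FILE_FLAG_RANDOM_ACCESS, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return 1;

    static char scratch[1024 * 1024];
    DWORD got = 0;
    while (ReadFile(h, scratch, (DWORD)sizeof(scratch), &got, NULL) && got != 0)
        ;   // discard the data; the point is only to pull it into the cache

    CloseHandle(h);
    return 0;
}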
From: Paul Baker [MVP, Windows Desktop Experience] on 22 Oct 2009 09:03

If it's sequential, consider using FILE_FLAG_SEQUENTIAL_SCAN. If it's
random access, consider using FILE_FLAG_RANDOM_ACCESS.

For performance reasons, I wrote a class many years ago that uses a buffer,
by default 64 KB, when reading sequentially. At the time it seemed to
improve performance a little on Windows 9x hard drives and a lot on Windows
NT 4.0 floppy drives. But in other cases there was no noticeable
difference, and on modern systems I doubt there is any difference at all.
Mihai is correct: the system is caching reads and ought to be able to take
care of this for you. I never saw any improvement under any circumstances
with a buffer over 64 KB.

Don't make the mistake of using a huge buffer in a single call to ReadFile.
This can actually hurt performance as well as system resources and
stability. We saw this problem recently. On some systems I could use a
buffer of hundreds of megabytes without issue, whereas on others I could
not use a buffer size over about 64 MB. After digging through low-level
documentation, I am unable to explain it fully, but it has something to do
with the fact that the device driver needs contiguous physical memory and,
depending on the type of buffer management used, may allocate a new buffer
equal in size to the caller's. And it may come from kernel memory (nonpaged
pool), which is a scarce resource.

However, the only way to truly know what the performance is, is to measure
it, as Mihai said. See if any of the above things help. Don't forget to
reboot between tests and to repeat them multiple times to ensure you are
getting consistent results. I believe SysInternals has a tool that clears
the system cache, but rebooting is easy and eliminates some other possible
causes of interference too.

If you want to "slam" the entire file into memory, why not use a memory
mapped file?

I don't think I would mess with system-wide caching settings. The defaults
should be fine.

Paul

"Le Chaud Lapin" <jaibuduvin(a)gmail.com> wrote in message
news:74647255-58cc-4a84-b8af-bdc6a93e166a(a)d4g2000vbm.googlegroups.com...
> I have an application that does a bunch of ReadFile's against a 470 MB
> file. I have to read the entire file from end to end, always
> sequentially. Each read can be anywhere from a few bytes to several
> kilobytes. Process Explorer shows roughly 41,000,000 reads during an
> 8-minute run.
> [...]
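A minimal sketch of the sequential-read suggestion above (sequential-scan
hint plus a modest 64 KB read size rather than one huge ReadFile; the file
name is just a placeholder and error handling is minimal):

#include <windows.h>

// Sketch only: read the file in 64 KB chunks with the sequential-scan hint.
int main(void)
{
    HANDLE h = CreateFileA("C:\\data\\big.dat", GENERIC_READ, FILE_SHARE_READ,
                           NULL, OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return 1;

    static char buf[64 * 1024];
    DWORD got = 0;
    while (ReadFile(h, buf, (DWORD)sizeof(buf), &got, NULL) && got != 0) {
        // parse the 'got' valid bytes in buf here
    }

    CloseHandle(h);
    return 0;
}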
From: Chris M. Thomasson on 22 Oct 2009 13:51
"Le Chaud Lapin" <jaibuduvin(a)gmail.com> wrote in message news:74647255-58cc-4a84-b8af-bdc6a93e166a(a)d4g2000vbm.googlegroups.com... > Hi All, > > I doth seek to remain a sloth... > > I have an application that does a bunch of ReadFile's against a 470MB > file. I have to read the entire file from end to end, always > sequentially. Each read can be anywhere from a few bytes to several > kilobytes. Process Explorer shows roughly 41,000,000 reads during an 8- > minute run. Mean is roughly 50 bytes, with stand-dev I'm guessing > maybe 10 bytes. > > I realized that performance will improve dramatically when I eliminate > so many U/K transitions by doing block reads, but in the meantime, I > was wondering how much improvement to expect by doing a kind of > priming, where I pump all blocks to RAM at least once. I just found > the following link for setting cache size on Windows: > > http://support.microsoft.com/kb/837331 > > I would essentially write a bit of code that slammed the entire 470MB > into RAM before doing my ReadFiles, which, btw, is supporting a kind > of serialization. > > I'd like to know what I can expect for improvement (roughly of > course). Use a memory mapped file, and process the memory in sequential order (e.g., base to base + size_of_file). BTW, what type of processing are you doing? Does processing of one part of the file always depend on the processing results of a previous portion of the file? |