From: Tom Serface on 15 Jan 2010 01:40

I think you need to be really careful not to use up all the real memory as well, so that you don't start swapping to disk. That is a killer you don't even see coming, although 50MB shouldn't be a problem on most modern computers.

Tom

"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message news:686vk59l8chun24uceekvdc8pt2uj4n811(a)4ax.com...
> By the way, did anyone really notice that ReadFile and WriteFile in Win64 cannot read or
> write more than 4.2GB? Seems really, really strange the length and bytes read did not
> become DWORD_PTR values...
> joe
>
> On Thu, 14 Jan 2010 16:37:26 -0500, Joseph M. Newcomer <newcomer(a)flounder.com> wrote:
>
>> Yes, but the file size was given as 50MB.
>>  joe
>>
>> On Thu, 14 Jan 2010 14:24:30 -0600, Stephen Myers <""StephenMyers\"@discussions(a)microsoft.com"> wrote:
>>
>>> Just to verify my (admittedly limited) understanding...
>>>
>>> I assume that the code posted will fail for files greater than 2GB or so
>>> with a 32-bit OS due to available address space.
>>>
>>> Steve
>>>
>>> Joseph M. Newcomer wrote:
>>>> See below...
>>>> On Thu, 14 Jan 2010 09:01:47 -0600, "Peter Olcott" <NoSpam(a)SeeScreen.com> wrote:
>>>>
>>>>> "Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message news:OzySgEPlKHA.2132(a)TK2MSFTNGP05.phx.gbl...
>>>>>> Peter Olcott wrote:
>>>>>>
>>>>>>> "Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message news:%23OQCOfNlKHA.1824(a)TK2MSFTNGP04.phx.gbl...
>>>>>>>> Peter Olcott wrote:
>>>>>>>>
>>>>>>>> By File Copy, you mean DOS copy command or the CopyFile() API?
>>>>>>> I am using the DOS command prompt's copy command. This is fast.
>>>>>>>
>>>>>>> The problem is the contradiction formed by the fact that reading and writing
>>>>>>> the file is fast, while reading and not writing this same file is slow.
>>>>>>> I am currently using fopen() and fread(); I am using Windows XP.
>>>>>> True, if the DOS copy command is fast, then I believe the code you are using
>>>>>> is not optimal. The DOS Copy is using the same CreateFile() API which fopen()
>>>>>> also finally uses in the RTL. So you should be able to match the same
>>>>>> performance of the DOS Copy command.
>>>>>>
>>>>>> Have you tried using setvbuf to set a buffer cache?
>>>>>>
>>>>>> Here is a small test code that opens a 50 meg file:
>>>>>>
>>>>>> // File: V:\wc7beta\testbufsize.cpp
>>>>>> // Compile with: cl testbufsize.cpp
>>>>>>
>>>>>> #include <stdio.h>
>>>>>> #include <windows.h>
>>>>>>
>>>>>> void main(char argc, char *argv[])
>>>>>> {
>>>>>>     char _cache[1024*16] = {0}; // 16K cache
>>>>>>     BYTE buf[1024*1] = {0};     // 1K buffer
>>>> ****
>>>> Reading a 50MB file, why such an incredibly tiny buffer?
>>>> ****
>>>>>>     FILE *fv = fopen("largefile.dat","rb");
>>>>>>     if (fv) {
>>>>>>         int res = setvbuf(fv, _cache, _IOFBF, sizeof(_cache));
>>>>>>         DWORD nTotal = 0;
>>>>>>         DWORD nDisks = 0;
>>>>>>         DWORD nLoops = 0;
>>>>>>         DWORD nStart = GetTickCount();
>>>>>>         while (!feof(fv)) {
>>>>>>             nLoops++;
>>>>>>             memset(&buf,sizeof(buf),0);
>>>> ****
>>>> The memset is silly. Wastes time, accomplishes nothing. You are setting a buffer
>>>> to 0 right before completely overwriting it! This is like writing
>>>>
>>>>    int a;
>>>>
>>>>    a = 0; // make sure a is 0 before assigning b
>>>>    a = b;
>>>> ****
>>>>>>             int nRead = fread(buf,1,sizeof(buf),fv);
>>>>>>             nTotal += nRead;
>>>>>>             if (nRead > 0 && !fv->_cnt) nDisks++;
>>>>>>         }
>>>>>>         fclose(fv);
>>>>>>         printf("Time: %d | Size: %d | Reads: %d | Disks: %d\n",
>>>>>>             GetTickCount()-nStart, nTotal, nLoops, nDisks);
>>>>>>     }
>>>>>> }
>>>> ****
>>>> If I were reading a small 50MB file, I would do
>>>>
>>>> int _tmain(int argc, _TCHAR * argv[])
>>>> {
>>>>     HANDLE h = CreateFile(_T("largefile.dat"), GENERIC_READ, 0, NULL,
>>>>                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
>>>>
>>>>     LARGE_INTEGER size;
>>>>
>>>>     GetFileSizeEx(h, &size);
>>>>
>>>>     // This code assumes file is < 4.2GB!
>>>>     LPVOID p = VirtualAlloc(NULL, (SIZE_T)size.LowPart, MEM_COMMIT, PAGE_READWRITE);
>>>>     DWORD bytesRead;
>>>>     ReadFile(h, p, size.LowPart, &bytesRead, NULL);
>>>>     ... process data
>>>>     VirtualFree(p, (SIZE_T)size.LowPart, MEM_DECOMMIT);
>>>>     return 0;
>>>> }
>>>>
>>>> Note that the above does not do any error checking; the obvious error checking
>>>> is left as an Exercise For The Reader. No read loops, no gratuitous memsets,
>>>> just simple code that does exactly ONE ReadFile.
>>>>  joe
>>>>
>>>>>> What this basically shows is the number of disk hits it makes by checking the
>>>>>> fv->_cnt value. It shows that as long as the cache size is larger than the
>>>>>> read buffer size, you get the same number of disk hits. I also spit out the
>>>>>> milliseconds. Subsequent runs, of course, are faster since the OS API
>>>>>> CreateFile() is used by the RTL in buffer mode.
>>>>>>
>>>>>> Also do you know what protocol you have Samba using?
>>>>> I am guessing that the code above will work with a file of any size?
>>>>> If that is the case, then you solved my problem.
>>>>> The only Samba protocol that I am aware of is smb.
>>>>>>
>>>>>> --
>>>>>> HLS
>>>> Joseph M. Newcomer [MVP]
>>>> email: newcomer(a)flounder.com
>>>> Web: http://www.flounder.com
>>>> MVP Tips: http://www.flounder.com/mvp_tips.htm
From: Tom Serface on 15 Jan 2010 01:42

I still had the problem on XP and Vista. I haven't tried testing it on Win7 yet, but I should do that, since managing mapped drives is a holy nightmare, especially if you are using services that can't access them.

Tom

"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message news:#230xRPlKHA.5728(a)TK2MSFTNGP06.phx.gbl...
> Tom Serface wrote:
>
>> Not sure if this helps or not, but I've also noticed that network access speeds up
>> if you map a drive letter to the share rather than using a UNC. I've seen as much
>> as 3x speed up. I'm not sure why.
>
> That's a good point. That was definitely something I had to do under NT, and I
> always keep that in mind with other OSes. But after running out of drive letters
> and using UNC in certain new scenarios, I have not seen a slowdown. I figured MS
> addressed this old problem. I seem to recall this was explained once back in the day.
>
> --
> HLS
From: Tom Serface on 15 Jan 2010 01:44

Well, as someone pointed out, it may not be that big of a difference these days. I know that Microsoft is no longer a big fan of supporting mapped drives, so they may not be putting a lot of effort into improving it. It was worth mentioning.

Tom

"Peter Olcott" <NoSpam(a)SeeScreen.com> wrote in message news:AOydnRtcXMeGutLWnZ2dnUVZ_sKdnZ2d(a)giganews.com...
>
> "Tom Serface" <tom(a)camaswood.com> wrote in message news:uWT%23vsOlKHA.1536(a)TK2MSFTNGP06.phx.gbl...
>> Not sure if this helps or not, but I've also noticed that network access speeds up
>> if you map a drive letter to the share rather than using a UNC. I've seen as much
>> as 3x speed up. I'm not sure why.
>>
>> Tom
>
> I am already doing that.
>
>> "Peter Olcott" <NoSpam(a)SeeScreen.com> wrote in message news:ut-dnZaUeubM69PWnZ2dnUVZ_vWdnZ2d(a)giganews.com...
>>> I can copy a file to and from my Samba network in 20 seconds, yet it takes 80
>>> seconds to compute its MD5 value. The delay is not related to the MD5 processing
>>> time because it can compute this in 12 seconds from the local drive.
>>>
>>> What is going on here?
From: Tom Serface on 15 Jan 2010 01:46

That's what my test showed me as well. To be fair, I am often copying files from optical media (CD, DVD, Blu-ray), so my buffer (16K) works well in that environment.

Tom

"Peter Olcott" <NoSpam(a)SeeScreen.com> wrote in message news:yt2dnXnWr6fyANLWnZ2dnUVZ_h2dnZ2d(a)giganews.com...
>
> "Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message news:blruk59utgoph0al9saun1gu93dooccr60(a)4ax.com...
>> See below...
>> On Wed, 13 Jan 2010 23:55:56 -0600, "Peter Olcott" <NoSpam(a)SeeScreen.com> wrote:
>>
>>> "Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message news:%23OQCOfNlKHA.1824(a)TK2MSFTNGP04.phx.gbl...
>>>> Peter Olcott wrote:
>>>>
>>>>> I am doing block I/O, and it is very fast on the local drive and much slower
>>>>> on the 1.0 gb LAN, yet file copies to and from the LAN are still fast.
>>>>>
>>>>> (1) File copy to and from the LAN is faster than local drive copies, 20
>>>>> seconds for LAN, 25 seconds for local.
>>>>> (2) Block I/O is fast on the local drive, 12 seconds for 632 MB.
>>>>> (3) Block I/O is slow on the LAN, 80 seconds for 632 MB.
>>>>> I also tried changing the block size from 4K to 1500 bytes and 9000 bytes
>>>>> (consistent with the Ethernet frame size); this did not help.
>>>>
>>>> By File Copy, you mean DOS copy command or the CopyFile() API?
>>> I am using the DOS command prompt's copy command. This is fast.
>>>
>>>> To me, the above appears to be consistent with a caching issue that your code
>>>> is not enabling when the file is first opened. The "File Copy" is doing it,
>>>> but you are not. Probably showing how you are opening the file will help,
>>>> i.e. the CreateFile() function or fopen().
>>>>
>>>> Another thing is maybe to check google
>>>>
>>>> Search: SAMBA Slow File Copy
>>>
>>> The problem is the contradiction formed by the fact that reading and writing
>>> the file is fast, while reading and not writing this same file is slow.
>>> I am currently using fopen() and fread(); I am using Windows XP.
>> ****
>> Use of fopen/fread would certainly indicate that you are not doing this in
>> anything like an optimal fashion.
>>
>> If you want to read a file that is under 100MB, it is usually best just to
>> allocate a buffer the size of the file,
>
> I tested this, and increasing the buffer size beyond 64K and 4K respectively does
> not measurably increase speed. Also, in my case I must process files of arbitrary
> sizes to compute their MD5.
>
>> CreateFile, do a single ReadFile, do your computation, do a WriteFile, and you
>> are done. You are comparing two completely unrelated concepts: fopen/fread and a
>> copy command; what you didn't ask was "what is the fastest way to read a file";
>> instead, you observe that two completely different technologies have different
>> performance. You did not actually state this in your original question; you just
>> used a generic concept of "copy". Details matter!
>>
>> Note that fread, called thousands of times, is amazingly slow in comparison to a
>> single ReadFile.
>>
>> By failing to supply all the critical information, you essentially asked "Why is
>> it that I can get from city A to city B in 20 minutes, but my friend takes two
>> hours?" and neglected to mention you took the high-speed train while your friend
>> went by bicycle.
>>  joe
>> ****
>>>> There are some interesting hits, in particular if you are using Vista, this MS
>>>> Samba hotfix for Vista,
>>>>
>>>> http://support.microsoft.com/kb/931770
>>>>
>>>> There was another 2007 thread where the original poster said turning off
>>>> indexing improved the speed.
>>>>
>>>> --
>>>> HLS
>> Joseph M. Newcomer [MVP]
>> email: newcomer(a)flounder.com
>> Web: http://www.flounder.com
>> MVP Tips: http://www.flounder.com/mvp_tips.htm
From: Joseph M. Newcomer on 15 Jan 2010 13:11
Of course, this clearly represents a failure of imagination during the original design.
 joe

On Thu, 14 Jan 2010 19:43:07 -0800, "Alexander Grigoriev" <alegr(a)earthlink.net> wrote:

> Historically, I/O sizes in the kernel drivers have been stored in ULONG, and Length
> in IO_STACK_LOCATION is ULONG, too. That would be a bit too much hassle to convert
> everything to SIZE_T...
>
> "Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message news:686vk59l8chun24uceekvdc8pt2uj4n811(a)4ax.com...
>> By the way, did anyone really notice that ReadFile and WriteFile in Win64 cannot
>> read or write more than 4.2GB? Seems really, really strange the length and bytes
>> read did not become DWORD_PTR values...
>> joe
>>
>> On Thu, 14 Jan 2010 16:37:26 -0500, Joseph M. Newcomer <newcomer(a)flounder.com> wrote:
>>
>>> Yes, but the file size was given as 50MB.
>>>  joe
>>>
>>> [rest of quoted thread snipped; see the earlier messages above]

Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm