From: Hector Santos on 22 Mar 2010 12:06

Peter Olcott wrote:
> Since my process (currently) requires unpredictable access
> to far more memory than can fit into the largest cache, I
> see no possible way that adding 1000-fold slower disk access
> could possibly speed things up. This seems absurd to me.

And I would agree that it would seem absurd to inexperienced people.

But you need to TRUST the power of your multi-processor computer, because YOU are most definitely under-utilizing it by a long shot. The code I posted is the proof!

Your issue is akin to having a pickup truck, overloading the back, piling things on top of each other, overweight beyond the safety levels recommended by the manufacturer (and by city/state ordinances). Now your handling, speed, and visibility are all compromised. Your truck won't go as fast, and even if it could, things can fall, people can die, crashes can happen.

You have two choices:

  - You can stop and unload stuff, then come back and pick it up
    on a second trip; your total travel time doubles.

  - You can get a second pickup truck, split the load, and get on
    a four-lane highway and drive side by side; sometimes one
    creeps ahead, sometimes the other, and both reach the
    destination at nearly the same time.

Same thing! You are overloading your machine to the point where it is working very, very hard to satisfy the needs of your single-threaded process. You may "believe" it is working at optimal speed because it has uninterrupted, exclusive access, but that is not reality. You are under-utilizing the power of your machine.

Whether you realize it or not, the overloaded pickup truck is smart: it is stopping you every X milliseconds to check whether you have a second pickup truck to offload some work and do some moving for you!!

You need to change your thinking. However, at this point, I don't think you have any coding skills, because if you did, you would be EAGERLY JUMPING at the code I provided to see for yourself.

--
HLS
From: Hector Santos on 22 Mar 2010 12:20

Peter Olcott wrote:
>> If you can see that in the code, then quite honestly, you
>> don't know how to program or understand the concept of
>> programming.
>
> I am telling you the truth, I am almost compulsive about
> telling the truth. When the conclusions are final I will
> post a link here.

What GROUP is this? No one will trust your SUMMARY unless you cite the group. Until you do so, you're lying and making things up.

I repeat: if you can't see that the code I posted proves your thinking is incorrect, then you don't know what you are talking about, and it's becoming obvious now that you don't have any kind of programming or even engineering skills.

--
HLS
From: Peter Olcott on 22 Mar 2010 13:15

"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
news:uGVpjmdyKHA.5948(a)TK2MSFTNGP06.phx.gbl...
> Peter Olcott wrote:
>
>> Since my process (currently) requires unpredictable
>> access to far more memory than can fit into the largest
>> cache, I see no possible way that adding 1000-fold slower
>> disk access could possibly speed things up. This seems
>> absurd to me.
>
> And I would agree that it would seem absurd to
> inexperienced people.
>
> But you need to TRUST the power of your multi-processor
> computer, because YOU are most definitely under-utilizing
> it by a long shot.
>
> The code I posted is the proof!

If it requires essentially nothing besides random access to entirely different places in 100 MB of memory, then (and only then) would it be reasonably representative of my process. Nearly all my process does is look up in memory the next place to look up in memory.

> [rest of Hector's post, the pickup-truck analogy, snipped]
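A minimal sketch of the access pattern Peter describes here, assuming a table where each entry holds the next index to visit (this is an illustration, not code from the thread):

#include <cstdio>
#include <vector>

int main()
{
    // ~100 MB of 4-byte entries: each slot stores the index of the
    // next slot to visit, so every load depends on the previous one.
    const unsigned N = 25000000;
    std::vector<unsigned> next(N);
    for (unsigned i = 0; i < N; ++i)
        next[i] = (i * 2654435761u + 12345u) % N;  // pseudo-random successor;
                                                   // a real test would use a
                                                   // random permutation
    unsigned j = 0;
    for (unsigned step = 0; step < N; ++step)
        j = next[j];      // dependent load: defeats the prefetcher, and with
                          // N far beyond cache size, most steps miss the cache
    printf("%u\n", j);    // keep the chase from being optimized away
    return 0;
}

The defining property is the data dependency: the address for step k+1 is the value read at step k, so memory latency, not bandwidth, dominates the running time.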
From: Peter Olcott on 22 Mar 2010 13:21

"Hector Santos" <sant9442(a)nospam.gmail.com> wrote in message
news:%23o9WRudyKHA.3304(a)TK2MSFTNGP06.phx.gbl...
> Peter Olcott wrote:
>
>>> If you can see that in the code, then quite honestly,
>>> you don't know how to program or understand the concept
>>> of programming.
>
>> I am telling you the truth, I am almost compulsive about
>> telling the truth. When the conclusions are final I will
>> post a link here.
>
> What GROUP is this? No one will trust your SUMMARY unless
> you cite the group. Until you do so, you're lying and
> making things up.
>
> I repeat: if you can't see that the code I posted proves
> your thinking is incorrect, then you don't know what you
> are talking about, and it's becoming obvious now that you
> don't have any kind of programming or even engineering
> skills.
>
> --
> HLS

I did not examine the code because I did not want to spend time looking at something that is not representative of my process. Look at the criteria in my other post, and if you agree that your code meets those criteria, then I will look at it.

You keep bringing up memory-mapped files. Although this may very well be a very good way to use disk as RAM, or to load RAM from disk, I do not see any possible reasoning that could ever show that a hybrid combination of disk and RAM could exceed the speed of pure RAM alone. If you can, then please show me the reasoning that supports this.

Reasoning is the ONLY source of truth that I trust; all other sources of truth are subject to errors. Reasoning is also subject to errors, but these errors can be readily discerned because they break one or more of the rules of correct reasoning.
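For concreteness, the Win32 file-mapping pattern Hector keeps bringing up looks roughly like this (a minimal sketch; the file name "data.bin" and the DWORD element type are assumptions, not details from his test program):

#include <windows.h>
#include <cstdio>

int main()
{
    // Open an existing data file for read-only access.
    HANDLE hFile = CreateFileA("data.bin", GENERIC_READ, FILE_SHARE_READ,
                               NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) return 1;

    // Create a mapping object covering the whole file.
    HANDLE hMap = CreateFileMappingA(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
    if (hMap == NULL) { CloseHandle(hFile); return 1; }

    // Map a view: the pointer behaves like an ordinary array, and the
    // OS pages data in from disk on first touch (and caches it after).
    const DWORD* fmdata = (const DWORD*) MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0);
    if (fmdata != NULL) {
        printf("first element: %lu\n", fmdata[0]);  // page fault loads this page
        UnmapViewOfFile(fmdata);
    }
    CloseHandle(hMap);
    CloseHandle(hFile);
    return 0;
}

The point of contention in the thread is exactly this: once the pages are resident, reads from a mapped view are reads from RAM (the system file cache), not from disk.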
From: Hector Santos on 22 Mar 2010 13:27
On Mar 22, 11:02 am, "Peter Olcott" <NoS...(a)OCR4Screen.com> wrote:
> (2) When a process requires essentially random (mostly
> unpredictable) access to far more memory than can possibly
> fit into the largest cache, then actual memory access time
> becomes a much more significant factor in determining actual
> response time.

As a follow-up, here is the simulator's ProcessData() function:

void ProcessData()
{
    KIND num;
    for (DWORD r = 0; r < nRepeat; r++) {
        Sleep(1);
        for (DWORD i = 0; i < size; i++) {
            //num = data[i];    // heap array
            num = fmdata[i];    // file-mapping array view
        }
    }
}

This is serialized access to the data. It's not random. When you have multiple threads, you approach an empirical boundary condition where multiple accessors are requesting the same memory. So on the one hand (the Peter viewpoint), you have contention issues, hence slowdowns. On the other hand, you have a CACHING effect, where the reading done by one thread benefits all the others.

Now we can alter ProcessData() by adding random-access logic:

void ProcessData()
{
    KIND num;
    for (DWORD r = 0; r < nRepeat; r++) {
        Sleep(1);
        for (DWORD i = 0; i < size; i++) {
            DWORD j = (rand() % size);  // random index (but see the
                                        // RAND_MAX note below)
            //num = data[j];    // heap array
            num = fmdata[j];    // file-mapping array view
        }
    }
}

One would suspect higher pressure to move virtual memory into the process working set in random fashion. But in reality, that randomness may not apply as much pressure as you expect.

Let's test this randomness. First, a test with serialized access, two threads, and a 1.5 GB file map:

V:\wc5beta>testpeter3t /r:2 /s:3000000 /t:2
- size         : 3000000
- memory       : 1536000000 (1500000K)
- repeat       : 2
- Memory Load  : 22%
- Allocating Data .... 0
* Starting threads
- Creating thread 0
- Creating thread 1
* Resuming threads
- Resuming thread# 0 in 743 msecs.
- Resuming thread# 1 in 868 msecs.
* Wait For Thread Completion
- Memory Load: 95%
* Done
---------------------------------------
0 | Time: 5734 | Elapsed: 0
1 | Time: 4906 | Elapsed: 0
---------------------------------------
Total Time: 10640

Notice the MEMORY LOAD climbed to 95%; that's because the entire spectrum of the data was read in. Now let's try unpredictable random access. I added a /j switch to enable the random indexing:

V:\wc5beta>testpeter3t /r:2 /s:3000000 /t:2 /j
- size         : 3000000
- memory       : 1536000000 (1500000K)
- repeat       : 2
- Memory Load  : 22%
- Allocating Data .... 0
* Starting threads
- Creating thread 0
- Creating thread 1
* Resuming threads
- Resuming thread# 0 in 116 msecs.
- Resuming thread# 1 in 522 msecs.
* Wait For Thread Completion
- Memory Load: 23%
* Done
---------------------------------------
0 | Time: 4250 | Elapsed: 0
1 | Time: 4078 | Elapsed: 0
---------------------------------------
Total Time: 8328

BEHOLD, it is even faster with the randomness. The memory load didn't climb because the entire 1.5 GB never needed to be loaded into the process working set.

So once again, your engineering philosophy (or lack thereof) is completely off base. You are under-utilizing the power of your machine.

--
HLS
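One caveat on the /j run: in the Microsoft C runtime, RAND_MAX is 32767, so rand() % size with size = 3000000 only ever produces indices into the first ~32K elements, which is consistent with the Memory Load staying at 23%. Below is a sketch of a full-range index that would sample the whole view (an illustration, not part of Hector's program):

#include <windows.h>
#include <cstdio>
#include <cstdlib>

// Combine two rand() calls into 30 bits so that indices beyond
// RAND_MAX (32767 in the MS C runtime) become reachable. The
// small modulo bias is ignored here.
DWORD RandomIndex(DWORD size)
{
    DWORD r = ((DWORD)rand() << 15) | (DWORD)rand();
    return r % size;
}

int main()
{
    DWORD j = RandomIndex(3000000);  // what ProcessData() would call
    printf("index: %lu\n", j);       // in place of rand() % size
    return 0;
}

With full-range indices, the working set would grow toward the 95% seen in the serialized run, so the timing comparison might well look different.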