From: Moi on 5 Apr 2010 16:15

On Mon, 05 Apr 2010 14:43:29 -0500, Peter Olcott wrote:

> "Nicolas George" <nicolas$george(a)salle-s.org> wrote in message
> news:4bba35de$0$29842$426a74cc(a)news.free.fr...
>> "Peter Olcott" wrote in message
>> <j-qdnXbpHJorrSfWnZ2dnUVZ_jWdnZ2d(a)giganews.com>:
>>> Not if you force every access to be at least (size * 0.25) distance
>>> from the prior access.
>>
>> Distance has nothing to do with it. You can cycle between the first
>> and last element of the array, that makes size - epsilon, but it
>> still is a very small cycle.
>
> I am trying to find an algorithm that makes the cache as ineffective
> as possible. I want to get an accurate measure of the worst-case
> performance of my deterministic finite automaton. The only thing that
> I can think of is to make sure that each memory access is more than
> the maximum cache size away from the prior one. This should eliminate
> spatial locality of reference.

I once did a similar thing for a disk exerciser: fetch each disk block
exactly once, but in pseudo-random order, and sufficiently distant that
read-ahead is ineffective.

For N disk blocks, I chose the increment M to be around N/3 and
relatively prime to N (this guarantees exactly one cycle). Something
like:

for (x = m; x != 0; x = (x + m) % n) { fetch block x }

I found M by starting at N/3 and counting down until GCD(N, M) == 1,
IIRC.

HTH,
AvK
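For concreteness, here is a minimal C sketch of that stride walk; the
gcd() helper, the fetch_block() stub, and the block count are
illustrative stand-ins, not taken from the post above:

#include <stdio.h>
#include <stdint.h>

/* Greatest common divisor; used to check that the stride and the
   block count are relatively prime. */
static uint64_t gcd(uint64_t a, uint64_t b)
{
    while (b != 0) {
        uint64_t t = b;
        b = a % b;
        a = t;
    }
    return a;
}

/* Hypothetical stand-in for reading one disk block. */
static void fetch_block(uint64_t x)
{
    printf("fetching block %llu\n", (unsigned long long)x);
}

int main(void)
{
    uint64_t n = 1000003;   /* number of blocks; example value */
    uint64_t m;

    /* Start near n/3 and count down until gcd(n, m) == 1, which
       guarantees the walk below forms a single cycle. */
    for (m = n / 3; gcd(n, m) != 1; m--)
        ;

    /* Visits m, 2m, 3m, ... (mod n), stopping when the sequence
       returns to 0: every nonzero block is fetched exactly once
       (block 0 itself is never fetched, as in the sketch above). */
    for (uint64_t x = m; x != 0; x = (x + m) % n)
        fetch_block(x);

    return 0;
}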
From: Ersek, Laszlo on 5 Apr 2010 16:16

On Mon, 5 Apr 2010, Peter Olcott wrote:

> "Nicolas George" <nicolas$george(a)salle-s.org> wrote in message
> news:4bba35de$0$29842$426a74cc(a)news.free.fr...
>> "Peter Olcott" wrote in message
>> <j-qdnXbpHJorrSfWnZ2dnUVZ_jWdnZ2d(a)giganews.com>:
>>> Not if you force every access to be at least (size * 0.25) distance
>>> from the prior access.
>>
>> Distance has nothing to do with it. You can cycle between the first
>> and last element of the array, that makes size - epsilon, but it
>> still is a very small cycle.
>
> I am trying to find an algorithm that makes the cache as ineffective
> as possible. I want to get an accurate measure of the worst-case
> performance of my deterministic finite automaton. The only thing that
> I can think of is to make sure that each memory access is more than
> the maximum cache size away from the prior one. This should eliminate
> spatial locality of reference.

Caches exploit temporal locality too.

Will you give my code a chance? I'm really curious how the CPU times
would compare if you matched the array sizes and the number of accesses
in both programs, especially whether the worst-of-ten result of your
implementation pessimizes the average access time as much as my
implementation of Sattolo's algorithm does.

Just a fleeting thought.

lacos
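For readers unfamiliar with it, here is a generic C sketch of Sattolo's
algorithm; it is not lacos's actual code (which is linked later in the
thread), and the pointer-chasing harness is illustrative. Sattolo's
variant of Fisher-Yates produces a permutation consisting of a single
cycle, so following p[i] from index 0 visits every element exactly once
before returning, defeating both spatial and temporal locality:

#include <stdlib.h>
#include <stddef.h>

/* Sattolo's variant of Fisher-Yates: the result is a permutation
   that forms exactly one cycle of length n (n >= 2 assumed). */
static void sattolo(size_t *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;  /* j < i, never j == i;
                                           modulo bias ignored here */
        size_t tmp = p[i];
        p[i] = p[j];
        p[j] = tmp;
    }
}

/* Pointer-chasing walk: p[i] names the next index to visit. Because
   the permutation is one cycle, this touches all n entries exactly
   once before arriving back at index 0. */
static size_t walk(const size_t *p)
{
    size_t sum = 0;
    size_t i = 0;
    do {
        i = p[i];
        sum += i;   /* keeps the compiler from eliding the loads */
    } while (i != 0);
    return sum;
}

Sizing the array well past the largest cache level then makes nearly
every step of walk() a miss, which is the point of the comparison.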
From: David Schwartz on 5 Apr 2010 16:18

On Apr 5, 6:52 am, "Peter Olcott" <NoS...(a)OCR4Screen.com> wrote:

> It is a "given" (something I can't change) that the web server will
> have one thread per HTTP request. It is also a "given" architectural
> decision, to provide optimal performance, that I will have at least
> one thread per processor core that does the OCR processing. I might
> have two threads per processor core to do OCR processing, because I
> will have two levels of OCR processing priority, and the first level
> of priority has absolute priority over the second level.

I would not recommend using priorities this way. It usually doesn't
work. If you want some requests to have priority over others, do those
requests first. Attempting to do them with higher-priority threads
tends to lead to extreme pain unless you have expertise in dealing with
thread priorities. (Search 'priority inversion' for just one of the
horrible things that can happen to you.)

DS
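One common way to implement "do those requests first" without touching
thread priorities is a two-level work queue drained by equal-priority
workers. A rough sketch, assuming POSIX threads; the request type, the
queue layout, and all names are hypothetical, not from this thread:

#include <stddef.h>
#include <pthread.h>

/* Hypothetical request record; payload omitted. */
struct request {
    struct request *next;
};

static struct request *high_head;   /* first-level (urgent) requests */
static struct request *low_head;    /* second-level requests         */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

/* All workers run at the same scheduling priority; the ordering
   policy lives entirely in the queue, so there is no priority
   inversion to worry about. The enqueue side (not shown) links a
   request into one of the lists and calls pthread_cond_signal(). */
static struct request *get_next_request(void)
{
    struct request *r;

    pthread_mutex_lock(&lock);
    while (high_head == NULL && low_head == NULL)
        pthread_cond_wait(&nonempty, &lock);
    if (high_head != NULL) {        /* drain the urgent list first */
        r = high_head;
        high_head = r->next;
    } else {
        r = low_head;
        low_head = r->next;
    }
    pthread_mutex_unlock(&lock);
    return r;
}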
From: Scott Lurndal on 5 Apr 2010 17:10

"Peter Olcott" <NoSpam(a)OCR4Screen.com> writes:
>
> "Mark Hobley" <markhobley(a)hotpop.donottypethisbit.com> wrote in
> message news:v4nl87-ula.ln1(a)neptune.markhobley.yi.org...
>> In comp.unix.programmer Peter Olcott <NoSpam(a)ocr4screen.com> wrote:
>>> I will be receiving up to 100 web requests per second (that is the
>>> maximum capacity) and I want to begin processing them as soon as
>>> they arrive, hopefully without polling for them.
>>
>> That will happen. If 100 web requests come down the pipe, the
>> receiving process will get them.
>>
>> It is only when no requests come down the pipe that the receiving
>> process will have to wait for a request to come in. This is no big
>> deal.
>>
>> Mark.
>>
>> --
>> Mark Hobley
>> Linux User: #370818 http://markhobley.yi.org/
>
> So it is in an infinite loop eating up all of the CPU time only when
> there are no requests to process? I don't think that I want this
> either, because I will have two priorities of requests. If there are
> no high-priority requests, I want it to begin working on the
> low-priority requests.

man poll
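A minimal sketch of what poll(2) buys here, assuming a single listening
descriptor set up elsewhere (the function name is illustrative): the
process sleeps in the kernel, consuming no CPU, until a request
arrives. There is no busy loop.

#include <poll.h>

/* Block until listen_fd is readable. Returns 0 when a request is
   waiting, -1 on error. */
static int wait_for_request(int listen_fd)
{
    struct pollfd pfd;

    pfd.fd = listen_fd;
    pfd.events = POLLIN;

    /* Timeout of -1: wait indefinitely; the thread is descheduled
       and uses no CPU time while idle. */
    if (poll(&pfd, 1, -1) < 0)
        return -1;
    return (pfd.revents & POLLIN) ? 0 : -1;
}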
From: Scott Lurndal on 5 Apr 2010 17:13

"Peter Olcott" <NoSpam(a)OCR4Screen.com> writes:
>
> "Ersek, Laszlo" <lacos(a)caesar.elte.hu> wrote in message
> news:Pine.LNX.4.64.1004051406400.9882(a)login01.caesar.elte.hu...
>> On Mon, 5 Apr 2010, Paul wrote:
>>
>>> Sorry to interrupt the thread. You had a question about variability
>>> in performance of a small test program. (Your "memory bandwidth"
>>> test...) Maybe you've figured it out already. It isn't a problem
>>> with TLB performance. The notion of a random walk is wrong.
>>>
>>> In other words, for about the first (roughly) 4143 steps, the
>>> program is doing a random walk. Then, the array walk gets stuck in
>>> a repeating 23-step loop.
>>>
>>> In any event, the random walk is not random. It is getting stuck in
>>> a (short) loop. The TLB might still be a factor, but you'll need a
>>> better test design to evaluate that.
>>
>> Would you comment on the code herein?
>>
>> http://groups.google.com/group/comp.os.linux.development.apps/msg/aaf49b54b2501740?dmode=source
>>
>> Thanks,
>> lacos
>
> uint32 GetRandom(uint32 size) {
>     return (rand() * (RAND_MAX + 1) + rand()) % size;
> }
>
> The only thing that I care about is making this function produce
> sequences of values that are more widely distributed. I want to force
> a cache-busting pattern of memory accesses.

http://en.wikipedia.org/wiki/Mersenne_twister
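A side note on the quoted snippet: rand() * (RAND_MAX + 1) overflows
int on platforms where RAND_MAX == INT_MAX, which is undefined
behavior. Here is a sketch of the apparent intent with the arithmetic
widened to 64 bits; the result is still slightly biased by the modulo
reduction, and still built on rand(), hence the Mersenne twister
suggestion:

#include <stdlib.h>
#include <stdint.h>

/* Combine two rand() results into one value before reducing modulo
   size; doing the arithmetic in uint64_t avoids the signed overflow
   in the original. */
static uint32_t GetRandom(uint32_t size)
{
    uint64_t r = (uint64_t)rand() * ((uint64_t)RAND_MAX + 1)
               + (uint64_t)rand();
    return (uint32_t)(r % size);
}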