From: Tom Lane on 5 Mar 2007 12:44

"Pavan Deolasee" <pavan(a)enterprisedb.com> writes:
> Isn't the size of the shared buffer pool itself acting as a performance
> penalty in this case ? May be StrategyGetBuffer() needs to make multiple
> passes over the buffers before the usage_count of any buffer is reduced
> to zero and the buffer is chosen as replacement victim.

I read that and thought you were onto something, but it's not acting
quite the way I expect. I made a quick hack in StrategyGetBuffer() to
count the number of buffers it looks at before finding a victim.
Running it with just 32 buffers on a large count(*), the behavior after
the initial startup transient is quite odd:

got buffer 2 after 4 tries
got buffer 3 after 1 tries
got buffer 4 after 1 tries
got buffer 5 after 1 tries
got buffer 6 after 1 tries
got buffer 7 after 1 tries
got buffer 8 after 1 tries
got buffer 12 after 4 tries
got buffer 14 after 2 tries
got buffer 21 after 7 tries
got buffer 26 after 5 tries
got buffer 27 after 1 tries
got buffer 31 after 4 tries
got buffer 0 after 1 tries
got buffer 1 after 1 tries
got buffer 9 after 8 tries
got buffer 10 after 1 tries
got buffer 11 after 1 tries
got buffer 13 after 2 tries
got buffer 15 after 2 tries
got buffer 16 after 1 tries
got buffer 17 after 1 tries
got buffer 18 after 1 tries
got buffer 19 after 1 tries
got buffer 20 after 1 tries
got buffer 22 after 2 tries
got buffer 23 after 1 tries
got buffer 24 after 1 tries
got buffer 25 after 1 tries
got buffer 28 after 3 tries
got buffer 29 after 1 tries
got buffer 30 after 1 tries
got buffer 2 after 4 tries
got buffer 3 after 1 tries
got buffer 4 after 1 tries
got buffer 5 after 1 tries
got buffer 6 after 1 tries
got buffer 7 after 1 tries
got buffer 8 after 1 tries
got buffer 12 after 4 tries
got buffer 14 after 2 tries
got buffer 21 after 7 tries
got buffer 26 after 5 tries
got buffer 27 after 1 tries
got buffer 31 after 4 tries
got buffer 0 after 1 tries
got buffer 1 after 1 tries
got buffer 9 after 8 tries
got buffer 10 after 1 tries
got buffer 11 after 1 tries
got buffer 13 after 2 tries
got buffer 15 after 2 tries
got buffer 16 after 1 tries
got buffer 17 after 1 tries
got buffer 18 after 1 tries
got buffer 19 after 1 tries
got buffer 20 after 1 tries
got buffer 22 after 2 tries
got buffer 23 after 1 tries
got buffer 24 after 1 tries
got buffer 25 after 1 tries
got buffer 28 after 3 tries
got buffer 29 after 1 tries
got buffer 30 after 1 tries
got buffer 2 after 4 tries
got buffer 3 after 1 tries
got buffer 4 after 1 tries
got buffer 5 after 1 tries
got buffer 6 after 1 tries
got buffer 7 after 1 tries
got buffer 8 after 1 tries
got buffer 12 after 4 tries
got buffer 14 after 2 tries
got buffer 21 after 7 tries
got buffer 26 after 5 tries
got buffer 27 after 1 tries
got buffer 31 after 4 tries
got buffer 0 after 1 tries
got buffer 1 after 1 tries
got buffer 9 after 8 tries
got buffer 10 after 1 tries
got buffer 11 after 1 tries
got buffer 13 after 2 tries
got buffer 15 after 2 tries
got buffer 16 after 1 tries
got buffer 17 after 1 tries
got buffer 18 after 1 tries
got buffer 19 after 1 tries
got buffer 20 after 1 tries
got buffer 22 after 2 tries
got buffer 23 after 1 tries
got buffer 24 after 1 tries
got buffer 25 after 1 tries
got buffer 28 after 3 tries
got buffer 29 after 1 tries
got buffer 30 after 1 tries
got buffer 2 after 4 tries
got buffer 3 after 1 tries
got buffer 4 after 1 tries
got buffer 5 after 1 tries
got buffer 6 after 1 tries
got buffer 7 after 1 tries
got buffer 8 after 1 tries
got buffer 12 after 4 tries
got buffer 14 after 2 tries
got buffer 21 after 7 tries
got buffer 26 after 5 tries
got buffer 27 after 1 tries
got buffer 31 after 4 tries
got buffer 0 after 1 tries
got buffer 1 after 1 tries
got buffer 9 after 8 tries
got buffer 10 after 1 tries
got buffer 11 after 1 tries
got buffer 13 after 2 tries
got buffer 15 after 2 tries
got buffer 16 after 1 tries
got buffer 17 after 1 tries
got buffer 18 after 1 tries
got buffer 19 after 1 tries
got buffer 20 after 1 tries
got buffer 22 after 2 tries
got buffer 23 after 1 tries
got buffer 24 after 1 tries
got buffer 25 after 1 tries
got buffer 28 after 3 tries
got buffer 29 after 1 tries
got buffer 30 after 1 tries
got buffer 2 after 4 tries
got buffer 3 after 1 tries
got buffer 4 after 1 tries
got buffer 5 after 1 tries
got buffer 6 after 1 tries
got buffer 7 after 1 tries
got buffer 8 after 1 tries
got buffer 12 after 4 tries
got buffer 14 after 2 tries
got buffer 21 after 7 tries
got buffer 26 after 5 tries
got buffer 27 after 1 tries
got buffer 31 after 4 tries
got buffer 0 after 1 tries
got buffer 1 after 1 tries
got buffer 9 after 8 tries
got buffer 10 after 1 tries
got buffer 11 after 1 tries
got buffer 13 after 2 tries
got buffer 15 after 2 tries
got buffer 16 after 1 tries
got buffer 17 after 1 tries
got buffer 18 after 1 tries
got buffer 19 after 1 tries
got buffer 20 after 1 tries
got buffer 22 after 2 tries
got buffer 23 after 1 tries
got buffer 24 after 1 tries
got buffer 25 after 1 tries
got buffer 28 after 3 tries
got buffer 29 after 1 tries
got buffer 30 after 1 tries
got buffer 2 after 4 tries
got buffer 3 after 1 tries
got buffer 4 after 1 tries
got buffer 5 after 1 tries
got buffer 6 after 1 tries
got buffer 7 after 1 tries
got buffer 8 after 1 tries
got buffer 12 after 4 tries
got buffer 14 after 2 tries
got buffer 21 after 7 tries
got buffer 26 after 5 tries
got buffer 27 after 1 tries
got buffer 31 after 4 tries
got buffer 0 after 1 tries
got buffer 1 after 1 tries
got buffer 9 after 8 tries
got buffer 10 after 1 tries
got buffer 11 after 1 tries
got buffer 13 after 2 tries
got buffer 15 after 2 tries
got buffer 16 after 1 tries
got buffer 17 after 1 tries
got buffer 18 after 1 tries
got buffer 19 after 1 tries
got buffer 20 after 1 tries
got buffer 22 after 2 tries
got buffer 23 after 1 tries
got buffer 24 after 1 tries
got buffer 25 after 1 tries
got buffer 28 after 3 tries
got buffer 29 after 1 tries
got buffer 30 after 1 tries
got buffer 2 after 4 tries
got buffer 3 after 1 tries
got buffer 4 after 1 tries
got buffer 5 after 1 tries
got buffer 6 after 1 tries
got buffer 7 after 1 tries
got buffer 8 after 1 tries
got buffer 12 after 4 tries
got buffer 14 after 2 tries
got buffer 21 after 7 tries
got buffer 26 after 5 tries
got buffer 27 after 1 tries
got buffer 31 after 4 tries
got buffer 0 after 1 tries
got buffer 1 after 1 tries
got buffer 9 after 8 tries
got buffer 10 after 1 tries
got buffer 11 after 1 tries
got buffer 13 after 2 tries
got buffer 15 after 2 tries
got buffer 16 after 1 tries
got buffer 17 after 1 tries
got buffer 18 after 1 tries
got buffer 19 after 1 tries
got buffer 20 after 1 tries
got buffer 22 after 2 tries
got buffer 23 after 1 tries
got buffer 24 after 1 tries
got buffer 25 after 1 tries
got buffer 28 after 3 tries
got buffer 29 after 1 tries
got buffer 30 after 1 tries
got buffer 2 after 4 tries
got buffer 3 after 1 tries
got buffer 4 after 1 tries
got buffer 5 after 1 tries
got buffer 6 after 1 tries
got buffer 7 after 1 tries
got buffer 8 after 1 tries
got buffer 12 after 4 tries
got buffer 14 after 2 tries
got buffer 21 after 7 tries
got buffer 26 after 5 tries
got buffer 27 after 1 tries
got buffer 31 after 4 tries
got buffer 0 after 1 tries
got buffer 1 after 1 tries
got buffer 9 after 8 tries
got buffer 10 after 1 tries
got buffer 11 after 1 tries
got buffer 13 after 2 tries
got buffer 15 after 2 tries
got buffer 16 after 1 tries
got buffer 17 after 1 tries
got buffer 18 after 1 tries
got buffer 19 after 1 tries
got buffer 20 after 1 tries
got buffer 22 after 2 tries
got buffer 23 after 1 tries
got buffer 24 after 1 tries
got buffer 25 after 1 tries
got buffer 28 after 3 tries
got buffer 29 after 1 tries
got buffer 30 after 1 tries
got buffer 2 after 4 tries

Yes, autovacuum is off, and bgwriter shouldn't have anything useful to
do either, so I'm a bit at a loss what's going on --- but in any case,
it doesn't look like we are cycling through the entire buffer space
for each fetch.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq
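One way to read the trace in the message above is to total the "tries" over a single 32-line period. The sketch below does that with the (buffer, tries) pairs copied from the output. The names CYCLE and summarize_cycle are made up for illustration, and the "clock hand distance" check is an interpretation of the numbers, not something the original hack reported.

```python
# One 32-entry period of the trace, as (buffer, tries) pairs copied from the
# "got buffer N after M tries" output above.
CYCLE = [
    (2, 4), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1), (8, 1), (12, 4),
    (14, 2), (21, 7), (26, 5), (27, 1), (31, 4), (0, 1), (1, 1), (9, 8),
    (10, 1), (11, 1), (13, 2), (15, 2), (16, 1), (17, 1), (18, 1), (19, 1),
    (20, 1), (22, 2), (23, 1), (24, 1), (25, 1), (28, 3), (29, 1), (30, 1),
]


def summarize_cycle(cycle, n_buffers=32):
    """Summarize one period of the trace (hypothetical helper)."""
    total_tries = sum(tries for _, tries in cycle)
    # Assumption: the sweep hand only moves forward, so the number of tries
    # should equal the forward distance from the previous victim.
    prev = cycle[-1][0]
    distances_match = True
    for buf, tries in cycle:
        if (buf - prev) % n_buffers != tries:
            distances_match = False
        prev = buf
    return {
        "victims": len(cycle),
        "distinct_buffers": len({buf for buf, _ in cycle}),
        "total_tries": total_tries,
        "laps": total_tries / n_buffers,
        "distances_match": distances_match,
    }


if __name__ == "__main__":
    print(summarize_cycle(CYCLE))
```

Under that reading, each buffer is chosen exactly once per period and the hand makes two full laps per 32 victims, i.e. about two looks per fetch on average rather than a full pass over the pool.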
From: Josh Berkus on 5 Mar 2007 13:20

Tom,

> Yes, autovacuum is off, and bgwriter shouldn't have anything useful to
> do either, so I'm a bit at a loss what's going on --- but in any case,
> it doesn't look like we are cycling through the entire buffer space
> for each fetch.

I'd be happy to DTrace it, but I'm a little lost as to where to look in the
kernel. I'll see if I can find someone who knows more about memory
management than me (that ought to be easy).

--
Josh Berkus
PostgreSQL @ Sun
San Francisco
From: "Luke Lonergan" on 5 Mar 2007 13:21

Tom,

On 3/5/07 8:53 AM, "Tom Lane" <tgl(a)sss.pgh.pa.us> wrote:

> Hm, that seems to blow the "it's an L2 cache effect" theory out of the
> water. If it were a cache effect then there should be a performance
> cliff at the point where the cache size is exceeded. I see no such
> cliff, in fact the middle part of the curve is darn near a straight
> line on a log scale ...

Here's that cliff you were looking for:

Size of Orders table: 7178MB
Blocksize: 8KB

Shared_buffers  Select Count  Vacuum
(KB)            (s)           (s)
=======================================
248             5.52          2.46
368             4.77          2.40
552             5.82          2.40
824             6.20          2.43
1232            5.60          3.59
1848            6.02          3.14
2768            5.53          4.56

All of these were run three times and the *lowest* time reported. Also, the
behavior of "fast VACUUM after SELECT" begins abruptly at 1232KB of
shared_buffers.

These are Opterons with 2MB of L2 cache shared between two cores.

- Luke
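To relate the shared_buffers settings in the table above to the cache size mentioned in the message, the small sketch below converts each setting to a count of 8KB blocks and to a fraction of the 2MB L2. Treating shared_buffers as exactly size / blocksize pages, with no per-buffer header overhead, is a simplifying assumption, and the function name is hypothetical.

```python
BLOCK_KB = 8          # block size stated in the message
L2_KB = 2 * 1024      # 2MB L2 cache stated in the message
SHARED_BUFFERS_KB = [248, 368, 552, 824, 1232, 1848, 2768]


def describe_settings(sizes_kb, block_kb=BLOCK_KB, l2_kb=L2_KB):
    """Return (size_kb, n_buffers, fraction_of_l2) per setting.

    Assumes each buffer costs exactly one block of memory, which ignores
    buffer headers and other shared-memory overhead.
    """
    return [(kb, kb // block_kb, kb / l2_kb) for kb in sizes_kb]


if __name__ == "__main__":
    for kb, n_buffers, frac in describe_settings(SHARED_BUFFERS_KB):
        print(f"{kb:>6} KB = {n_buffers:>4} buffers = {frac:.2f} of L2")
```

On these numbers, 1232KB is 154 buffers, roughly 60% of the 2MB cache, and only the 2768KB setting is larger than the whole L2.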
From: Tom Lane on 5 Mar 2007 13:24

I wrote:
> "Pavan Deolasee" <pavan(a)enterprisedb.com> writes:
>> Isn't the size of the shared buffer pool itself acting as a performance
>> penalty in this case ? May be StrategyGetBuffer() needs to make multiple
>> passes over the buffers before the usage_count of any buffer is reduced
>> to zero and the buffer is chosen as replacement victim.

> I read that and thought you were onto something, but it's not acting
> quite the way I expect. I made a quick hack in StrategyGetBuffer() to
> count the number of buffers it looks at before finding a victim.
> ...
> Yes, autovacuum is off, and bgwriter shouldn't have anything useful to
> do either, so I'm a bit at a loss what's going on --- but in any case,
> it doesn't look like we are cycling through the entire buffer space
> for each fetch.

Nope, Pavan's nailed it: the problem is that after using a buffer, the
seqscan leaves it with usage_count = 1, which means it has to be passed
over once by the clock sweep before it can be re-used. I was misled in
the 32-buffer case because catalog accesses during startup had left the
buffer state pretty confused, so that there was no long stretch before
hitting something available. With a large number of buffers, the
behavior is that the seqscan fills all of shared memory with buffers
having usage_count 1. Once the clock sweep returns to the first of
these buffers, it will have to pass over all of them, reducing all of
their counts to 0, before it returns to the first one and finds it now
usable. Subsequent tries find a buffer immediately, of course, until we
have again filled shared_buffers with usage_count 1 everywhere. So the
problem is not so much the clock sweep overhead as that it's paid in a
very nonuniform fashion: with N buffers you pay O(N) once every N reads
and O(1) the rest of the time. This is no doubt slowing things down
enough to delay that one read, instead of leaving it nicely I/O bound
all the time. Mark, can you detect "hiccups" in the read rate using
your setup?

I seem to recall that we've previously discussed the idea of letting the
clock sweep decrement the usage_count before testing for 0, so that a
buffer could be reused on the first sweep after it was initially used,
but that we rejected it as being a bad idea. But at least with large
shared_buffers it doesn't sound like such a bad idea.

Another issue nearby to this is whether to avoid selecting buffers that
are dirty --- IIRC someone brought that up again recently. Maybe
predecrement for clean buffers, postdecrement for dirty ones would be a
cute compromise.

regards, tom lane
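The nonuniform cost described in the message above can be reproduced with a toy model of the clock sweep. This is a minimal sketch, not the actual StrategyGetBuffer() code: it ignores pins, the free list, dirty buffers and locking, assumes every read by the seqscan leaves its buffer with usage_count = 1, and starts from an empty pool. The function name and the predecrement flag are made up for illustration; the flag models the "decrement before testing for 0" idea mentioned in the message.

```python
def clock_sweep_tries(n_buffers, n_reads, predecrement=False):
    """Return the number of buffers examined for each of n_reads fetches.

    Simplified model: usage counts start at 0, and each fetched buffer is
    left with usage_count = 1 (the seqscan case described above).
    """
    usage = [0] * n_buffers
    hand = 0
    tries_per_read = []
    for _ in range(n_reads):
        tries = 0
        while True:
            buf = hand
            hand = (hand + 1) % n_buffers
            tries += 1
            if predecrement and usage[buf] > 0:
                usage[buf] -= 1      # decrement first, then test
            if usage[buf] == 0:
                break                # victim found
            if not predecrement:
                usage[buf] -= 1      # test first, then decrement
        usage[buf] = 1               # the scan leaves usage_count = 1
        tries_per_read.append(tries)
    return tries_per_read


if __name__ == "__main__":
    print(clock_sweep_tries(4, 12))
    print(clock_sweep_tries(4, 12, predecrement=True))
```

With 4 buffers the postdecrement model pays one long sweep of N + 1 looks every N reads once the pool is full and a single look otherwise, which is the O(N)-once-every-N-reads pattern; in the predecrement variant every fetch in this model takes a single look. Whether that variant is a good idea for workloads other than a lone seqscan is outside what this toy model can say.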
From: "Luke Lonergan" on 5 Mar 2007 13:44

Here's four more points on the curve - I'd use a "dirac delta function" for
your curve fit ;-)

Shared_buffers  Select Count  Vacuum
(KB)            (s)           (s)
=======================================
248             5.52          2.46
368             4.77          2.40
552             5.82          2.40
824             6.20          2.43
1232            5.60          3.59
1848            6.02          3.14
2768            5.53          4.56
5536            6.05          3.95
8304            5.80          4.37
12456           5.86          4.12
18680           5.83          4.10
28016           6.11          4.46

WRT what you found on the selection algorithm, it might also explain the L2
effects I think.

I'm also still of the opinion that polluting the shared buffer cache for a
seq scan does not make sense.

- Luke

On 3/5/07 10:21 AM, "Luke Lonergan" <llonergan(a)greenplum.com> wrote:

> Tom,
>
> On 3/5/07 8:53 AM, "Tom Lane" <tgl(a)sss.pgh.pa.us> wrote:
>
>> Hm, that seems to blow the "it's an L2 cache effect" theory out of the
>> water. If it were a cache effect then there should be a performance
>> cliff at the point where the cache size is exceeded. I see no such
>> cliff, in fact the middle part of the curve is darn near a straight
>> line on a log scale ...
>
> Here's that cliff you were looking for:
>
> Size of Orders table: 7178MB
> Blocksize: 8KB
>
> Shared_buffers  Select Count  Vacuum
> (KB)            (s)           (s)
> =======================================
> 248             5.52          2.46
> 368             4.77          2.40
> 552             5.82          2.40
> 824             6.20          2.43
> 1232            5.60          3.59
> 1848            6.02          3.14
> 2768            5.53          4.56
>
> All of these were run three times and the *lowest* time reported. Also, the
> behavior of "fast VACUUM after SELECT" begins abruptly at 1232KB of
> shared_buffers.
>
> These are Opterons with 2MB of L2 cache shared between two cores.
>
> - Luke