From: Tom Lane on
"Pavan Deolasee" <pavan(a)enterprisedb.com> writes:
> Isn't the size of the shared buffer pool itself acting as a performance
> penalty in this case ? May be StrategyGetBuffer() needs to make multiple
> passes over the buffers before the usage_count of any buffer is reduced
> to zero and the buffer is chosen as replacement victim.

I read that and thought you were onto something, but it's not acting
quite the way I expect. I made a quick hack in StrategyGetBuffer() to
count the number of buffers it looks at before finding a victim.
Running it with just 32 buffers on a large count(*), the behavior after
the initial startup transient is quite odd:

got buffer 2 after 4 tries
got buffer 3 after 1 tries
got buffer 4 after 1 tries
got buffer 5 after 1 tries
got buffer 6 after 1 tries
got buffer 7 after 1 tries
got buffer 8 after 1 tries
got buffer 12 after 4 tries
got buffer 14 after 2 tries
got buffer 21 after 7 tries
got buffer 26 after 5 tries
got buffer 27 after 1 tries
got buffer 31 after 4 tries
got buffer 0 after 1 tries
got buffer 1 after 1 tries
got buffer 9 after 8 tries
got buffer 10 after 1 tries
got buffer 11 after 1 tries
got buffer 13 after 2 tries
got buffer 15 after 2 tries
got buffer 16 after 1 tries
got buffer 17 after 1 tries
got buffer 18 after 1 tries
got buffer 19 after 1 tries
got buffer 20 after 1 tries
got buffer 22 after 2 tries
got buffer 23 after 1 tries
got buffer 24 after 1 tries
got buffer 25 after 1 tries
got buffer 28 after 3 tries
got buffer 29 after 1 tries
got buffer 30 after 1 tries
got buffer 2 after 4 tries
got buffer 3 after 1 tries
got buffer 4 after 1 tries
got buffer 5 after 1 tries
got buffer 6 after 1 tries
got buffer 7 after 1 tries
got buffer 8 after 1 tries
got buffer 12 after 4 tries
got buffer 14 after 2 tries
got buffer 21 after 7 tries
got buffer 26 after 5 tries
got buffer 27 after 1 tries
got buffer 31 after 4 tries
got buffer 0 after 1 tries
got buffer 1 after 1 tries
got buffer 9 after 8 tries
got buffer 10 after 1 tries
got buffer 11 after 1 tries
got buffer 13 after 2 tries
got buffer 15 after 2 tries
got buffer 16 after 1 tries
got buffer 17 after 1 tries
got buffer 18 after 1 tries
got buffer 19 after 1 tries
got buffer 20 after 1 tries
got buffer 22 after 2 tries
got buffer 23 after 1 tries
got buffer 24 after 1 tries
got buffer 25 after 1 tries
got buffer 28 after 3 tries
got buffer 29 after 1 tries
got buffer 30 after 1 tries
got buffer 2 after 4 tries
got buffer 3 after 1 tries
got buffer 4 after 1 tries
got buffer 5 after 1 tries
got buffer 6 after 1 tries
got buffer 7 after 1 tries
got buffer 8 after 1 tries
got buffer 12 after 4 tries
got buffer 14 after 2 tries
got buffer 21 after 7 tries
got buffer 26 after 5 tries
got buffer 27 after 1 tries
got buffer 31 after 4 tries
got buffer 0 after 1 tries
got buffer 1 after 1 tries
got buffer 9 after 8 tries
got buffer 10 after 1 tries
got buffer 11 after 1 tries
got buffer 13 after 2 tries
got buffer 15 after 2 tries
got buffer 16 after 1 tries
got buffer 17 after 1 tries
got buffer 18 after 1 tries
got buffer 19 after 1 tries
got buffer 20 after 1 tries
got buffer 22 after 2 tries
got buffer 23 after 1 tries
got buffer 24 after 1 tries
got buffer 25 after 1 tries
got buffer 28 after 3 tries
got buffer 29 after 1 tries
got buffer 30 after 1 tries
got buffer 2 after 4 tries
got buffer 3 after 1 tries
got buffer 4 after 1 tries
got buffer 5 after 1 tries
got buffer 6 after 1 tries
got buffer 7 after 1 tries
got buffer 8 after 1 tries
got buffer 12 after 4 tries
got buffer 14 after 2 tries
got buffer 21 after 7 tries
got buffer 26 after 5 tries
got buffer 27 after 1 tries
got buffer 31 after 4 tries
got buffer 0 after 1 tries
got buffer 1 after 1 tries
got buffer 9 after 8 tries
got buffer 10 after 1 tries
got buffer 11 after 1 tries
got buffer 13 after 2 tries
got buffer 15 after 2 tries
got buffer 16 after 1 tries
got buffer 17 after 1 tries
got buffer 18 after 1 tries
got buffer 19 after 1 tries
got buffer 20 after 1 tries
got buffer 22 after 2 tries
got buffer 23 after 1 tries
got buffer 24 after 1 tries
got buffer 25 after 1 tries
got buffer 28 after 3 tries
got buffer 29 after 1 tries
got buffer 30 after 1 tries
got buffer 2 after 4 tries
got buffer 3 after 1 tries
got buffer 4 after 1 tries
got buffer 5 after 1 tries
got buffer 6 after 1 tries
got buffer 7 after 1 tries
got buffer 8 after 1 tries
got buffer 12 after 4 tries
got buffer 14 after 2 tries
got buffer 21 after 7 tries
got buffer 26 after 5 tries
got buffer 27 after 1 tries
got buffer 31 after 4 tries
got buffer 0 after 1 tries
got buffer 1 after 1 tries
got buffer 9 after 8 tries
got buffer 10 after 1 tries
got buffer 11 after 1 tries
got buffer 13 after 2 tries
got buffer 15 after 2 tries
got buffer 16 after 1 tries
got buffer 17 after 1 tries
got buffer 18 after 1 tries
got buffer 19 after 1 tries
got buffer 20 after 1 tries
got buffer 22 after 2 tries
got buffer 23 after 1 tries
got buffer 24 after 1 tries
got buffer 25 after 1 tries
got buffer 28 after 3 tries
got buffer 29 after 1 tries
got buffer 30 after 1 tries
got buffer 2 after 4 tries
got buffer 3 after 1 tries
got buffer 4 after 1 tries
got buffer 5 after 1 tries
got buffer 6 after 1 tries
got buffer 7 after 1 tries
got buffer 8 after 1 tries
got buffer 12 after 4 tries
got buffer 14 after 2 tries
got buffer 21 after 7 tries
got buffer 26 after 5 tries
got buffer 27 after 1 tries
got buffer 31 after 4 tries
got buffer 0 after 1 tries
got buffer 1 after 1 tries
got buffer 9 after 8 tries
got buffer 10 after 1 tries
got buffer 11 after 1 tries
got buffer 13 after 2 tries
got buffer 15 after 2 tries
got buffer 16 after 1 tries
got buffer 17 after 1 tries
got buffer 18 after 1 tries
got buffer 19 after 1 tries
got buffer 20 after 1 tries
got buffer 22 after 2 tries
got buffer 23 after 1 tries
got buffer 24 after 1 tries
got buffer 25 after 1 tries
got buffer 28 after 3 tries
got buffer 29 after 1 tries
got buffer 30 after 1 tries
got buffer 2 after 4 tries
got buffer 3 after 1 tries
got buffer 4 after 1 tries
got buffer 5 after 1 tries
got buffer 6 after 1 tries
got buffer 7 after 1 tries
got buffer 8 after 1 tries
got buffer 12 after 4 tries
got buffer 14 after 2 tries
got buffer 21 after 7 tries
got buffer 26 after 5 tries
got buffer 27 after 1 tries
got buffer 31 after 4 tries
got buffer 0 after 1 tries
got buffer 1 after 1 tries
got buffer 9 after 8 tries
got buffer 10 after 1 tries
got buffer 11 after 1 tries
got buffer 13 after 2 tries
got buffer 15 after 2 tries
got buffer 16 after 1 tries
got buffer 17 after 1 tries
got buffer 18 after 1 tries
got buffer 19 after 1 tries
got buffer 20 after 1 tries
got buffer 22 after 2 tries
got buffer 23 after 1 tries
got buffer 24 after 1 tries
got buffer 25 after 1 tries
got buffer 28 after 3 tries
got buffer 29 after 1 tries
got buffer 30 after 1 tries
got buffer 2 after 4 tries
got buffer 3 after 1 tries
got buffer 4 after 1 tries
got buffer 5 after 1 tries
got buffer 6 after 1 tries
got buffer 7 after 1 tries
got buffer 8 after 1 tries
got buffer 12 after 4 tries
got buffer 14 after 2 tries
got buffer 21 after 7 tries
got buffer 26 after 5 tries
got buffer 27 after 1 tries
got buffer 31 after 4 tries
got buffer 0 after 1 tries
got buffer 1 after 1 tries
got buffer 9 after 8 tries
got buffer 10 after 1 tries
got buffer 11 after 1 tries
got buffer 13 after 2 tries
got buffer 15 after 2 tries
got buffer 16 after 1 tries
got buffer 17 after 1 tries
got buffer 18 after 1 tries
got buffer 19 after 1 tries
got buffer 20 after 1 tries
got buffer 22 after 2 tries
got buffer 23 after 1 tries
got buffer 24 after 1 tries
got buffer 25 after 1 tries
got buffer 28 after 3 tries
got buffer 29 after 1 tries
got buffer 30 after 1 tries
got buffer 2 after 4 tries

Yes, autovacuum is off, and bgwriter shouldn't have anything useful to
do either, so I'm a bit at a loss what's going on --- but in any case,
it doesn't look like we are cycling through the entire buffer space
for each fetch.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

From: Josh Berkus on
Tom,

> Yes, autovacuum is off, and bgwriter shouldn't have anything useful to
> do either, so I'm a bit at a loss what's going on --- but in any case,
> it doesn't look like we are cycling through the entire buffer space
> for each fetch.

I'd be happy to DTrace it, but I'm a little lost as to where to look in the
kernel. I'll see if I can find someone who knows more about memory
management than me (that ought to be easy).

--
Josh Berkus
PostgreSQL @ Sun
San Francisco

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

From: "Luke Lonergan" on
Tom,

On 3/5/07 8:53 AM, "Tom Lane" <tgl(a)sss.pgh.pa.us> wrote:

> Hm, that seems to blow the "it's an L2 cache effect" theory out of the
> water. If it were a cache effect then there should be a performance
> cliff at the point where the cache size is exceeded. I see no such
> cliff, in fact the middle part of the curve is darn near a straight
> line on a log scale ...

Here's that cliff you were looking for:

Size of Orders table: 7178MB
Blocksize: 8KB

Shared_buffers Select Count Vacuum
(KB) (s) (s)
=======================================
248 5.52 2.46
368 4.77 2.40
552 5.82 2.40
824 6.20 2.43
1232 5.60 3.59
1848 6.02 3.14
2768 5.53 4.56

All of these were run three times and the *lowest* time reported. Also, the
behavior of "fast VACUUM after SELECT" begins abruptly at 1232KB of
shared_buffers.

These are Opterons with 2MB of L2 cache shared between two cores.

- Luke




---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

From: Tom Lane on
I wrote:
> "Pavan Deolasee" <pavan(a)enterprisedb.com> writes:
>> Isn't the size of the shared buffer pool itself acting as a performance
>> penalty in this case ? May be StrategyGetBuffer() needs to make multiple
>> passes over the buffers before the usage_count of any buffer is reduced
>> to zero and the buffer is chosen as replacement victim.

> I read that and thought you were onto something, but it's not acting
> quite the way I expect. I made a quick hack in StrategyGetBuffer() to
> count the number of buffers it looks at before finding a victim.
> ...
> Yes, autovacuum is off, and bgwriter shouldn't have anything useful to
> do either, so I'm a bit at a loss what's going on --- but in any case,
> it doesn't look like we are cycling through the entire buffer space
> for each fetch.

Nope, Pavan's nailed it: the problem is that after using a buffer, the
seqscan leaves it with usage_count = 1, which means it has to be passed
over once by the clock sweep before it can be re-used. I was misled in
the 32-buffer case because catalog accesses during startup had left the
buffer state pretty confused, so that there was no long stretch before
hitting something available. With a large number of buffers, the
behavior is that the seqscan fills all of shared memory with buffers
having usage_count 1. Once the clock sweep returns to the first of
these buffers, it will have to pass over all of them, reducing all of
their counts to 0, before it returns to the first one and finds it now
usable. Subsequent tries find a buffer immediately, of course, until we
have again filled shared_buffers with usage_count 1 everywhere. So the
problem is not so much the clock sweep overhead as that it's paid in a
very nonuniform fashion: with N buffers you pay O(N) once every N reads
and O(1) the rest of the time. This is no doubt slowing things down
enough to delay that one read, instead of leaving it nicely I/O bound
all the time. Mark, can you detect "hiccups" in the read rate using
your setup?

I seem to recall that we've previously discussed the idea of letting the
clock sweep decrement the usage_count before testing for 0, so that a
buffer could be reused on the first sweep after it was initially used,
but that we rejected it as being a bad idea. But at least with large
shared_buffers it doesn't sound like such a bad idea.

Another issue nearby to this is whether to avoid selecting buffers that
are dirty --- IIRC someone brought that up again recently. Maybe
predecrement for clean buffers, postdecrement for dirty ones would be a
cute compromise.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

From: "Luke Lonergan" on
Here's four more points on the curve - I'd use a "dirac delta function" for
your curve fit ;-)

Shared_buffers Select Count Vacuum
(KB) (s) (s)
=======================================
248 5.52 2.46
368 4.77 2.40
552 5.82 2.40
824 6.20 2.43
1232 5.60 3.59
1848 6.02 3.14
2768 5.53 4.56
5536 6.05 3.95
8304 5.80 4.37
12456 5.86 4.12
18680 5.83 4.10
28016 6.11 4.46

WRT what you found on the selection algorithm, it might also explain the L2
effects I think.

I'm also still of the opinion that polluting the shared buffer cache for a
seq scan does not make sense.

- Luke

On 3/5/07 10:21 AM, "Luke Lonergan" <llonergan(a)greenplum.com> wrote:

> Tom,
>
> On 3/5/07 8:53 AM, "Tom Lane" <tgl(a)sss.pgh.pa.us> wrote:
>
>> Hm, that seems to blow the "it's an L2 cache effect" theory out of the
>> water. If it were a cache effect then there should be a performance
>> cliff at the point where the cache size is exceeded. I see no such
>> cliff, in fact the middle part of the curve is darn near a straight
>> line on a log scale ...
>
> Here's that cliff you were looking for:
>
> Size of Orders table: 7178MB
> Blocksize: 8KB
>
> Shared_buffers Select Count Vacuum
> (KB) (s) (s)
> =======================================
> 248 5.52 2.46
> 368 4.77 2.40
> 552 5.82 2.40
> 824 6.20 2.43
> 1232 5.60 3.59
> 1848 6.02 3.14
> 2768 5.53 4.56
>
> All of these were run three times and the *lowest* time reported. Also, the
> behavior of "fast VACUUM after SELECT" begins abruptly at 1232KB of
> shared_buffers.
>
> These are Opterons with 2MB of L2 cache shared between two cores.
>
> - Luke
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>



---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo(a)postgresql.org so that your
message can get through to the mailing list cleanly

First  |  Prev  |  Next  |  Last
Pages: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Prev: xlogViewer / xlogdump
Next: CVS corruption/mistagging?