From: Tom Lane
Greg Stark <gsstark(a)mit.edu> writes:
> This is an analyze-only scan? Why does analyze need to issue a
> relcache flush?

Directly: to cause other backends to pick up the updated pg_class row
(with new relpages/reltuples data).

Indirectly: to cause cached plans for the rel to be invalidated,
so that they can get replanned with updated pg_statistic entries.
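
To make that concrete, here is a hand-wavy sketch of the invalidation
step (not the actual vacuum.c code; the function name and arguments are
invented for illustration):

#include "postgres.h"

#include "storage/block.h"
#include "utils/inval.h"
#include "utils/rel.h"

static void
analyze_update_stats_sketch(Relation onerel,
                            BlockNumber relpages, double reltuples)
{
    /* ... store the new relpages/reltuples in the rel's pg_class row ... */

    /*
     * Queue a relcache invalidation for the rel.  When the transaction
     * commits, other backends rebuild their relcache entry (picking up
     * the new pg_class values), and the plan cache drops plans that
     * reference the rel so they get rebuilt against the fresh
     * pg_statistic contents.
     */
    CacheInvalidateRelcache(onerel);
}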

So we can't simply skip the relcache flush here. However, we
might be able to decouple the targblock reset from the rest of it.
In particular, now that there's a distinction between smgr flush
and relcache flush, maybe we could associate targblock reset with
smgr flush (only) and arrange to not flush the smgr level during
ANALYZE --- basically, smgr flush would only be needed when truncating
or reassigning the relfilenode. I think this might work out nicely but
haven't chased the details.
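
In code terms, the split might look roughly like this (purely a sketch
of the idea, using the existing targblock macros; the two functions
here are invented, not proposed names):

#include "postgres.h"

#include "storage/block.h"
#include "utils/rel.h"

static void
rel_smgr_flush_sketch(Relation relation)
{
    /*
     * smgr-level flush: needed only when the physical storage changes
     * (truncation, relfilenode reassignment), so this is where the
     * insertion target block really must be forgotten.
     */
    RelationSetTargetBlock(relation, InvalidBlockNumber);
    RelationCloseSmgr(relation);
}

static void
rel_cache_flush_sketch(Relation relation)
{
    /*
     * relcache-level flush (what ANALYZE triggers): rebuild the
     * pg_class-derived fields, but leave the smgr level, and hence the
     * target block, alone, so bulk loads keep filling the same page.
     */
    /* ... rebuild the relcache entry from the catalogs ... */
}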

regards, tom lane

From: Robert Haas
On Sun, May 30, 2010 at 10:42 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> pretty clear what is going on.  See the logic in
> RelationGetBufferForTuple, and note that at no time do we have any FSM
> data for the bid table:

Is this because, in the absence of updates or deletes, we never vacuum it?

> 4. Now, all the backends again decide to try to insert into the last
> available block.  So everybody jams into the partly-filled block 10,
> until it gets filled.

Would it be (a) feasible and (b) useful to inject some entropy into this step?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From: Greg Stark
On Mon, May 31, 2010 at 3:42 AM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> note that at no time do we have any FSM
> data for the bid table:
>
>
> 3. After awhile, autovacuum notices all the insert activity and kicks
> off an autoanalyze on the bid table.  When committed, this forces a
> relcache flush for each other backend's relcache entry for "bid".
> In particular, the smgr targblock gets reset.

This is an analyze-only scan? Why does analyze need to issue a
relcache flush? Maybe we only need to issue one for an actual vacuum,
which would also populate the FSM?


--
greg

From: Tom Lane
I wrote:
> In particular, now that there's a distinction between smgr flush
> and relcache flush, maybe we could associate targblock reset with
> smgr flush (only) and arrange to not flush the smgr level during
> ANALYZE --- basically, smgr flush would only be needed when truncating
> or reassigning the relfilenode. I think this might work out nicely but
> haven't chased the details.

I looked into that a bit more and decided that it'd be a ticklish
change: the coupling between relcache and smgr cache is pretty tight,
and there just isn't any provision for having an smgr cache entry live
longer than its owning relcache entry. Even if we could fix it to
work reliably, this approach does nothing for the case where a backend
actually exits after filling just part of a new page, as noted by
Takahiro-san.

The next most promising fix is to have RelationGetBufferForTuple tell
the FSM about the new page immediately on creation. I made a draft
patch for that (attached). It fixes Michael's scenario nicely ---
all pages get filled completely --- and a simple test with pgbench
didn't reveal any obvious change in performance. However, there is
clear *potential* for performance loss, due both to the extra FSM
access and to increased contention from multiple backends piling
into the same new page. So it would be good to do
some real performance testing on insert-heavy scenarios before we
consider applying this. Any volunteers?
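
For reference, the gist of the change looks roughly like this (a
simplified sketch, not the attached patch; the real logic lives in
RelationGetBufferForTuple in hio.c):

#include "postgres.h"

#include "storage/bufmgr.h"
#include "storage/bufpage.h"
#include "storage/freespace.h"
#include "utils/rel.h"

static Buffer
get_new_page_and_record_fsm_sketch(Relation relation)
{
    Buffer      buffer;
    Page        page;

    /* Extend the relation by one block and initialize the new page. */
    buffer = ReadBuffer(relation, P_NEW);
    LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
    page = BufferGetPage(buffer);
    PageInit(page, BufferGetPageSize(buffer), 0);

    /*
     * Tell the FSM about the new page immediately, instead of waiting
     * for the next VACUUM, so concurrent inserters can find it through
     * the FSM rather than all piling onto the same targblock.
     */
    RecordPageWithFreeSpace(relation,
                            BufferGetBlockNumber(buffer),
                            PageGetHeapFreeSpace(page));

    return buffer;
}

The extra RecordPageWithFreeSpace call per page extension is the FSM
access overhead mentioned above.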

Note: patch is against HEAD but should work in 8.4, if you reverse out
the use of the rd_targblock access macros.

regards, tom lane