From: Takahiro Itagaki on

Simon Riggs <simon(a)2ndQuadrant.com> wrote:

> I think we
> either need to implement that or document that vacuum will not skip
> all-visible pages when running VACUUM FULL.

All-visible does not always mean "filled enough", no? We will need to
check both visibility and fillfactor when we optimize copying tuples.

> Old VACUUM FULL was substantially faster than this on tables that had
> nothing to remove.

Yeah, that's why traditional VACUUM FULL is still kept as INPLACE vacuum.

> Please can you arrange for the cluster operation to skip rebuilding
> indexes if the table is not reduced in size?

Do you think we should dispose the rewritten table when we find the
VACUUM FULL (or CLUSTER) is useless? We could save the time to reindex,
but we've already consumed time to rewrite tables. It will be still
slower than VACUUM FULL INPLACE because it is read-only.

Instead, I'd suggest to have conditional database-wide maintenance
commands, something like:
VACUUM FULL WHERE <the table is fragmented>

We don't have to support the feature by SQL syntax; it could be done
in client tools. How about pg_maintain or pg_ctl maintain that cleans
up each relation with appropriate command? We could replace vacuumdb,
clusterdb, and reindexdb with it then.


> Part of the reason why these happen is that the code hasn't been
> refactored much at all from its original use for cluster. There are
> almost zero comments to explain the additional use of this code for
> VACUUM, or at least to explain it still all works even when there is no
> index.

Ok, I'll check for additional comments. The flow of code might be still
confusable because vacuum() calls cluster(), but we need major replacement
of codes to refactor it. I'm not sure we need the code rewrites for it.

Also, I think we need additional messages for VACUUM VERBOSE
(and CLUSTER VERBOSE), though it might be adjusted in another patch.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Simon Riggs on
On Tue, 2009-12-22 at 18:11 +0900, Takahiro Itagaki wrote:
>
> > Old VACUUM FULL was substantially faster than this on tables that
> had
> > nothing to remove.

> Yeah, that's why traditional VACUUM FULL is still kept as INPLACE
> vacuum.
>
> > Please can you arrange for the cluster operation to skip rebuilding
> > indexes if the table is not reduced in size?
>
> Do you think we should dispose the rewritten table when we find the
> VACUUM FULL (or CLUSTER) is useless? We could save the time to
> reindex,
> but we've already consumed time to rewrite tables.

The main purpose of doing a new VF is to compact space, so its role has
slightly changed from earlier versions. We need much clearer docs about
this.

On a production system, it seems better to skip the re-indexing, which
could take a long, long time. I don't see any way to skip completely
re-writing the table though, other than scanning the table with a normal
VACUUM first, as old VF does.

I would be inclined towards the idea that if somebody does a VF of a
whole database then we should look out for and optimise for tables with
no changes, but when operating on a single table we just do as
instructed and rebuild everything, including indexes. That seems like we
should do it, but we're running out of time.

For now, I think we can easily and should skip the index build, if
appropriate. That just takes a little reworking of code, which appears
necessary anyway. We should just document that this works a little
differently now and a complete db VF is now not either necessary or a
good thing. And running
VACUUM; REINDEX DATABASE foo;
will now be very pointless.

> It will be still
> slower than VACUUM FULL INPLACE because it is read-only.

Understood, lets document that.

--
Simon Riggs www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Simon Riggs on
On Tue, 2009-12-22 at 18:11 +0900, Takahiro Itagaki wrote:

> Instead, I'd suggest to have conditional database-wide maintenance
> commands, something like:
> VACUUM FULL WHERE <the table is fragmented>
>
> We don't have to support the feature by SQL syntax; it could be done
> in client tools. How about pg_maintain or pg_ctl maintain that cleans
> up each relation with appropriate command? We could replace vacuumdb,
> clusterdb, and reindexdb with it then.

Some broad thoughts...

Our perception of acceptable change is much higher than most users. If
we tell people "use VACUUM FULL" or vacuumdb -f, then that command
should, if possible, continue to work well across many releases.
vacuumdb in most people's minds is the command you run to ease
maintenance and make everything right, rather than a specific set of
features.

We have "It just works" as a principle. I think the corollary of that is
that we should also have "It just continues to work the same way".

--
Simon Riggs www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Takahiro Itagaki on

Simon Riggs <simon(a)2ndQuadrant.com> wrote:

> Our perception of acceptable change is much higher than most users. If
> we tell people "use VACUUM FULL" or vacuumdb -f, then that command
> should, if possible, continue to work well across many releases.
> vacuumdb in most people's minds is the command you run to ease
> maintenance and make everything right, rather than a specific set of
> features.
>
> We have "It just works" as a principle. I think the corollary of that is
> that we should also have "It just continues to work the same way".

I used "VACUUM FULL" because we were discussing to drop VFI completely,
but I won't replace the behavior if hot-standby can support VFI.
We can use any keywords without making it reserved in "VACUUM (...)" syntax.
VACUUM (REWRITE), the first idea, can be used here. We can also take on
entirely-different syntax for it -- ex, "ALTER TABLE REWRITE or SHRINK".

I think the ALTER TABLE idea is not so bad because it does _not_ support
database-wide maintenance. REWRITE is not the best maintenance in normal
cases because a database should contain some rarely-updated tables.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Simon Riggs on
On Tue, 2009-12-22 at 19:45 +0900, Takahiro Itagaki wrote:

> I used "VACUUM FULL" because we were discussing to drop VFI completely,
> but I won't replace the behavior if hot-standby can support VFI.

HS can't support VFI now, by definition. We agreed to spend the time
getting rid of VFI, which working on this with you is part of.

If we can just skip the index rebuild, I think that's all the additional
code changes we need. I'll improve the docs as I review-to-commit.

--
Simon Riggs www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers