From: Robert Haas on
On Tue, Dec 22, 2009 at 7:29 AM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> On Tue, 2009-12-22 at 19:45 +0900, Takahiro Itagaki wrote:
>
>> I used "VACUUM FULL" because we were discussing dropping VFI completely,
>> but I won't replace the behavior if hot-standby can support VFI.
>
> HS can't support VFI now, by definition. We agreed to spend the time
> getting rid of VFI, which working on this with you is part of.
>
> If we can just skip the index rebuild, I think that's all the additional
> code changes we need. I'll improve the docs as I review-to-commit.

So, what is the roadmap for getting this done? It seems like to get
rid of VFI completely, we would need to implement something like what
Tom described here:

http://archives.postgresql.org/pgsql-hackers/2009-09/msg00249.php

I'm not sure whether the current patch is a good intermediate step
towards that ultimate goal, or whether events have overtaken it.

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Takahiro Itagaki on

Robert Haas <robertmhaas(a)gmail.com> wrote:

> So, what is the roadmap for getting this done? It seems like to get
> rid of VFI completely, we would need to implement something like what
> Tom described here:
>
> http://archives.postgresql.org/pgsql-hackers/2009-09/msg00249.php
>
> I'm not sure whether the current patch is a good intermediate step
> towards that ultimate goal, or whether events have overtaken it.

I think the most desirable roadmap is:
1. Enable CLUSTER on non-critical system catalogs.
2. Also enable CLUSTER and REINDEX on critical system catalogs.
3. Remove VFI and re-implement VACUUM FULL with a CLUSTER-based approach.
   It should also be optimized as Simon suggested.

My patch was intended to do 3, but we should not skip 1 and 2. With this
roadmap, we never have two versions of VACUUM FULL (INPLACE and REWRITE) at
the same time.

I think we can do 1 immediately. The comment in cluster says "might work",
and I also think so. CLUSTERable toast tables are obviously useful.
/*
 * Disallow clustering system relations. This will definitely NOT work
 * for shared relations (we have no way to update pg_class rows in other
 * databases), nor for nailed-in-cache relations (the relfilenode values
 * for those are hardwired, see relcache.c). It might work for other
 * system relations, but I ain't gonna risk it.
 */

For 2, we need some kind of "relfilenode mapper" for shared relations
and critical local tables (pg_class, pg_attribute, pg_proc, and pg_type).
I'm thinking that we store only "virtual" relfilenodes for them in pg_class
and keep the actual relfilenodes in shared memory. For example,
smgropen(1248:pg_database) would be redirected to smgropen(mapper[1248]).
Since we cannot touch pg_class in databases we are not connected to, we must
avoid updating pg_class when we assign new relfilenodes to shared relations.
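
To make the mapper idea concrete, here is a minimal standalone sketch of the
lookup and reassignment, not PostgreSQL code. The names RelMapper,
relmap_lookup and relmap_assign, the fixed capacity, and the plain array (in
place of real shared memory and locking) are all assumptions made for
illustration only.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Oid;          /* stand-in for PostgreSQL's Oid type */

#define RELMAP_MAX_ENTRIES 64      /* hypothetical capacity */

typedef struct RelMapEntry
{
    Oid virtualnode;               /* value kept in pg_class.relfilenode */
    Oid actualnode;                /* file currently backing the relation */
} RelMapEntry;

typedef struct RelMapper
{
    int         nentries;
    RelMapEntry entries[RELMAP_MAX_ENTRIES];
} RelMapper;

/*
 * Translate a virtual relfilenode into the actual one.  Relations that
 * have no entry are returned unchanged, so unmapped tables behave as before.
 */
Oid
relmap_lookup(const RelMapper *map, Oid virtualnode)
{
    assert(map != NULL);
    for (int i = 0; i < map->nentries; i++)
    {
        if (map->entries[i].virtualnode == virtualnode)
            return map->entries[i].actualnode;
    }
    return virtualnode;
}

/*
 * Point a virtual relfilenode at a new actual file without touching
 * pg_class.  Returns 1 on success, 0 if the map is full.
 */
int
relmap_assign(RelMapper *map, Oid virtualnode, Oid actualnode)
{
    assert(map != NULL);
    for (int i = 0; i < map->nentries; i++)
    {
        if (map->entries[i].virtualnode == virtualnode)
        {
            map->entries[i].actualnode = actualnode;
            return 1;
        }
    }
    if (map->nentries >= RELMAP_MAX_ENTRIES)
        return 0;
    map->entries[map->nentries].virtualnode = virtualnode;
    map->entries[map->nentries].actualnode = actualnode;
    map->nentries++;
    return 1;
}
```

In this sketch a rewrite of a mapped relation would call
relmap_assign(&map, 1248, newnode), and every later open would go through
relmap_lookup(&map, 1248) to find the new file.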

We also need to store the nodes in an additional flat file. Another
approach might be to store them in the control file for shared relations
(ControlFileData.shared_relfilenode_mapper as Oid[]), or in pg_database
for local tables (pg_database.datclassnode, datprocnode, etc.).
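
As a rough illustration of the flat-file option, the sketch below writes an
array of actual relfilenodes to a stream and reads it back. The function
names, the magic number and the file layout are hypothetical; a real
implementation would also need atomic replacement, fsync and a checksum,
all of which are left out here.

```c
#include <assert.h>
#include <stdio.h>

typedef unsigned int Oid;               /* stand-in for PostgreSQL's Oid type */

#define RELMAPFILE_MAGIC 0x524d4150u    /* hypothetical file marker */

/* Write a header (magic, count) followed by the nodes.  0 on success. */
int
relmapfile_save(FILE *fp, const Oid *nodes, unsigned int n)
{
    unsigned int header[2];

    assert(fp != NULL);
    header[0] = RELMAPFILE_MAGIC;
    header[1] = n;
    if (fwrite(header, sizeof(unsigned int), 2, fp) != 2)
        return -1;
    if (n > 0 && fwrite(nodes, sizeof(Oid), n, fp) != n)
        return -1;
    return (fflush(fp) == 0) ? 0 : -1;
}

/*
 * Read the nodes back into a caller-supplied array of size max.
 * Returns the number of nodes read, or -1 if the file is malformed
 * or holds more entries than the array can take.
 */
int
relmapfile_load(FILE *fp, Oid *nodes, unsigned int max)
{
    unsigned int header[2];

    assert(fp != NULL);
    if (fread(header, sizeof(unsigned int), 2, fp) != 2)
        return -1;
    if (header[0] != RELMAPFILE_MAGIC || header[1] > max)
        return -1;
    if (header[1] > 0 && fread(nodes, sizeof(Oid), header[1], fp) != header[1])
        return -1;
    return (int) header[1];
}
```

The control-file and pg_database alternatives would keep the same array but
change where it lives, and so how it is updated and recovered.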

Which approach would be better?

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



From: Simon Riggs on
Happy New Year,

On Mon, 2010-01-04 at 11:50 +0900, Takahiro Itagaki wrote:
> Robert Haas <robertmhaas(a)gmail.com> wrote:
>
> > So, what is the roadmap for getting this done? It seems like to get
> > rid of VFI completely, we would need to implement something like what
> > Tom described here:
> >
> > http://archives.postgresql.org/pgsql-hackers/2009-09/msg00249.php
> >
> > I'm not sure whether the current patch is a good intermediate step
> > towards that ultimate goal, or whether events have overtaken it.
>
> I think the most desirable roadmap is:
> 1. Enable CLUSTER on non-critical system catalogs.
> 2. Also enable CLUSTER and REINDEX on critical system catalogs.
> 3. Remove VFI and re-implement VACUUM FULL with a CLUSTER-based approach.
>    It should also be optimized as Simon suggested.
>
> My patch was intended to do 3, but we should not skip 1 and 2. With this
> roadmap, we never have two versions of VACUUM FULL (INPLACE and REWRITE) at
> the same time.
>
> I think we can do 1 immediately. The comment in cluster says "might work",
> and I also think so. CLUSTERable toast tables are obviously useful.

You make some good points.

I would prefer this slightly modified version:

1. Commit your patch, as-is (you/me)
2. Work on infrastructure for VFC (VACUUM FULL using CLUSTER) for system
relations (Simon)
3. Enable CLUSTER and REINDEX on critical system catalogs (Itagaki)
4. Optimise VFC, as discussed earlier (Itagaki)

I have put names in brackets, but this is just a suggestion.

This differs from your sequence in only a few ways:
* We implement the basic VFC now, so everybody knows what we have.
* We separate the infrastructure for (2) from the enabling of this
  infrastructure for CLUSTER and REINDEX. There may be additional issues
  to consider for those cases, and we should think them through and test
  them as a separate task.
* We do not remove VFI in this release.

This is a more cautious approach. Completely removing VFI in this
release is a big risk that we need not take; we have little to gain from
doing so and putting it back again will be harder. I am always keen to
push forwards when a new feature is worthwhile, but cleaning up code is
not an important thing this late in the release cycle.

--
Simon Riggs www.2ndQuadrant.com


From: Robert Haas on
On Mon, Jan 4, 2010 at 3:04 AM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> This is a more cautious approach. Completely removing VFI in this
> release is a big risk that we need not take; we have little to gain from
> doing so and putting it back again will be harder. I am always keen to
> push forwards when a new feature is worthwhile, but cleaning up code is
> not an important thing this late in the release cycle.

I don't have a strong opinion one way or the other on whether we
should remove VFI this release cycle, but I thought the reason why
there was pressure to do that was because we will otherwise need to
make changes to Hot Standby to cope with VFI. Or in other words, I
thought that in order to wrap a release we would need to do one of (1)
remove VFI or (2) fix HS to cope with VFI, and maybe there was a
theory that the former was easier than the latter. But it's possible
I may have totally misunderstood the situation. What is your thought
on how to handle this?

....Robert

From: Simon Riggs on
On Mon, 2010-01-04 at 10:31 -0500, Robert Haas wrote:
> On Mon, Jan 4, 2010 at 3:04 AM, Simon Riggs <simon(a)2ndquadrant.com> wrote:
> > This is a more cautious approach. Completely removing VFI in this
> > release is a big risk that we need not take; we have little to gain from
> > doing so and putting it back again will be harder. I am always keen to
> > push forwards when a new feature is worthwhile, but cleaning up code is
> > not an important thing this late in the release cycle.
>
> I don't have a strong opinion one way or the other on whether we
> should remove VFI this release cycle, but I thought the reason why
> there was pressure to do that was because we will otherwise need to
> make changes to Hot Standby to cope with VFI.

What I should have said, in addition: VFI will be kept as a non-default
option, in case it is required. We will document that use of VFI will
not work correctly with HS, and that its use is deprecated and should be
limited to emergencies in any case. I will enjoy removing VFI when that
eventually occurs, but it's not a priority. (And if you ask why we should
keep it, I'll say: how else can we run a VFI? Not by a stored proc,
certainly.)

--
Simon Riggs www.2ndQuadrant.com

