From: Jeff Davis on
On Fri, 2009-11-13 at 10:47 +0900, Itagaki Takahiro wrote:
> Alvaro Herrera <alvherre(a)commandprompt.com> wrote:
>
> > BTW I think the vacstmt.options change, which seems a reasonable idea to
> > me, could be extracted from the patch and applied separately. That'd
> > reduce the size of the patch somewhat.
>
> It's a good idea. I separated the part into the attached patch.
> It replaces VacuumStmt's "vacuum", "full", "analyze", and "verbose"
> fields into one "options" flag field.
>
> The only user-visible change is to support "VACUUM (options)" syntax:
> VACUUM ( FULL, FREEZE, VERBOSE, ANALYZE ) table (columns)
> We don't bother with the order of options in this form :)
>
> There is a debatable issue that we can use "VACUUM (VERBOSE) table (col)"
> in the abobe syntax. Columns are specified but no ANALYZE option there.
> An ANALYZE option is added automatically in the current implementation,
> but we should have thrown an syntax error in such case.

I have just begun review by reading some of the previous threads.

I'd like to try to summarize the goals we have for VACUUM to put these
patches in perspective:

1. Implement a rewrite version of VACUUM FULL, which is closer to
CLUSTER.

2. Use the new options syntax, to make the order of vacuum options
irrelevant.

3. Implement several flatfile maps to allow the rewrite version of
VACUUM FULL to work on catalogs:
http://archives.postgresql.org/message-id/19750.1252094460(a)sss.pgh.pa.us

4. Implement some kind of concurrent tuple mover that can slowly update
tuples and lazily VACUUM in such a way that they migrate to lower heap
pages (assuming no long-running transactions). This would be done with
normal updates (new xmin) and would not remove the old tuple until it
was really dead; otherwise there are concurrency problems. Such a
utility would be useful in cases where a very small fraction of tuples
need to be moved, or you don't have enough space for a rewrite.

5. Remove VACUUM FULL (in it's current form) completely.

Some other ideas also came out of the thread, like:

* Provide a way to truncate the dead space from the end of a relation in
a blocking manner. Right now, lazy vacuum won't shrink the file unless
it can acquire an exclusive lock without waiting, and there's no way to
actually make it wait. This can be a separate command, not necessarily a
part of VACUUM.

* Josh Berkus suggested allowing the user to specify a different
tablespace in which to rebuild the tablespace.

I'll begin looking at the patches themselves now, which implement #1 and
#2.

If we can implement enough of these features (say, #3 also) to remove
the current form of VACUUM FULL, then we can just call the new feature
VACUUM FULL, and save ourselves from syntactical headaches.

Regards,
Jeff Davis


--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on
Jeff Davis <pgsql(a)j-davis.com> writes:
> I'd like to try to summarize the goals we have for VACUUM to put these
> patches in perspective:

Good summary, but ...

> Some other ideas also came out of the thread, like:

> * Provide a way to truncate the dead space from the end of a relation in
> a blocking manner. Right now, lazy vacuum won't shrink the file unless
> it can acquire an exclusive lock without waiting, and there's no way to
> actually make it wait. This can be a separate command, not necessarily a
> part of VACUUM.

What I remembered was actually closer to the opposite: people are
concerned about lazy vac holding the exclusive lock too long once it
does acquire it. In a manual vacuum this leads to unexpected blockage
of concurrent queries, and in an autovac this leads to a forced cancel
preventing autovac from completing the operation (which means no
space removal at all, and no stats update either). The code is designed
on the assumption that it won't spend very long holding the ex lock,
and so I think a reasonable TODO item is to ensure that by having it
limit how many blocks it will scan during the shrinkage attempt.

FWIW, once we provide the extensible option syntax, it doesn't seem
like it'd be hard to have an option to make it wait for the lock,
so I'd recommend that approach over anything with a separate command.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Jeff Davis on
On Fri, 2009-11-13 at 10:47 +0900, Itagaki Takahiro wrote:
> Alvaro Herrera <alvherre(a)commandprompt.com> wrote:
>
> > BTW I think the vacstmt.options change, which seems a reasonable idea to
> > me, could be extracted from the patch and applied separately. That'd
> > reduce the size of the patch somewhat.
>
> It's a good idea. I separated the part into the attached patch.
> It replaces VacuumStmt's "vacuum", "full", "analyze", and "verbose"
> fields into one "options" flag field.

I added a separate commitfest item for this patch to track it separately
from the rewrite version of VACUUM:

https://commitfest.postgresql.org/action/patch_view?id=222

> The only user-visible change is to support "VACUUM (options)" syntax:
> VACUUM ( FULL, FREEZE, VERBOSE, ANALYZE ) table (columns)
> We don't bother with the order of options in this form :)

I looked at this patch. You left INPLACE in the patch, which looks like
an oversight when you were separating the two. Please remove that from
this part.

> There is a debatable issue that we can use "VACUUM (VERBOSE) table (col)"
> in the abobe syntax. Columns are specified but no ANALYZE option there.
> An ANALYZE option is added automatically in the current implementation,
> but we should have thrown an syntax error in such case.

Sounds fine, but worth a mention in the documentation. Just add to the
"columns" part of the VACUUM page something like: "If specified, implies
ANALYZE".

Other than these two minor issues, I don't see any problems with the
patch. Please post an updated version to the new commitfest entry.

Regards,
Jeff Davis


--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Jeff Davis on
On Tue, 2009-10-27 at 13:55 +0900, Itagaki Takahiro wrote:
> Here is a patch to support "rewrite" version of VACUUM FULL.

Can you please provide a patch that applies cleanly on top of the vacuum
options patch and on top of CVS HEAD (there are a couple minor conflicts
with this version). That would make review easier.

Initial comments:

1. Do we want to introduce syntax for INPLACE at all, if we are
eventually going to remove the current mechanism? If not, then we should
probably use REWRITE, because that's a better word than "REPLACE", I
think.

My opinion is that if we really still need the current in-place
mechanism, then VACUUM (FULL) should use the current in-place mechanism;
and VACUUM (FULL REWRITE) should use your new rewrite mechanism. That
gives us a nice migration path toward always using the rewrite
mechanism.

2. Why do all of the following exist: VACOPT_FULL, VACOPT_REPLACE, and
VACOPT_INPLACE? Shouldn't VACOPT_FULL be equivalent to one of the other
two? This is essentially what Simon was getting at, I think.

3. Some options are being set in vacuum() itself. It looks like the
options should already be set in gram.y, so should that be an Assert
instead? I think it's cleaner to set all of the options properly early
on, rather than waiting until vacuum() to interpret the combinations.

I haven't looked at the implementation in detail yet, but at a glance,
it seems to be a good approach.

Regards,
Jeff Davis


--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on
Jeff Davis <pgsql(a)j-davis.com> writes:
> 3. Some options are being set in vacuum() itself. It looks like the
> options should already be set in gram.y, so should that be an Assert
> instead? I think it's cleaner to set all of the options properly early
> on, rather than waiting until vacuum() to interpret the combinations.

As a rule of thumb, I'd recommend keeping as much complexity as possible
*out* of gram.y. It's best if that code just reports the facts, ie what
options the user entered. Deriving conclusions from that (like what
default behaviors should be used) is best done later. One example of
why is that if you want the defaults to depend on GUC settings then
that logic must *not* happen in gram.y, since the parse tree might
outlive the current GUC settings.

I haven't read the patch but it sounds like vacuum() is the right
place for this type of activity.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers