patch - per-tablespace random_page_cost/seq_page

Prev: DTrace compiler warnings
Next: add more frame types in window functions (ROWS)

From: Greg Stark on 14 Nov 2009 15:06

On Sat, Nov 14, 2009 at 7:28 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
> I ended up not reusing the reloptions.c code. It looks like a lot of
> extra complexity for no obvious benefit, considering that there is no
> equivalent of AMs for tablespaces and therefore no need to support
> AM-specific options. I did reuse the reloptions syntax, and I think
> the internal representation could always be redone later, if we find
> that there's a use case for something more complicated.

a) effective_io_concurrency really deserves to be in the list as well.

b) I thought Tom came down pretty stridently against any data model
which hard codes a specific list of supported options. I can't
remember exactly what level of flexibility he wanted but I think
"doesn't require catalog changes to add a new option" might have been
it.

I agree that having everything smashed to text is a bit kludgy though.
I'm not sure we have the tools to do much better though.

--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on 14 Nov 2009 16:02

On Sat, Nov 14, 2009 at 3:06 PM, Greg Stark <gsstark(a)mit.edu> wrote:
> On Sat, Nov 14, 2009 at 7:28 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
>> I ended up not reusing the reloptions.c code. It looks like a lot of
>> extra complexity for no obvious benefit, considering that there is no
>> equivalent of AMs for tablespaces and therefore no need to support
>> AM-specific options. I did reuse the reloptions syntax, and I think
>> the internal representation could always be redone later, if we find
>> that there's a use case for something more complicated.
>
> a) effective_io_concurrency really deserves to be in the list as well.
>
> b) I thought Tom came down pretty stridently against any data model
> which hard codes a specific list of supported options. I can't
> remember exactly what level of flexibility he wanted but I think
> "doesn't require catalog changes to add a new option" might have been
> it.
>
> I agree that having everything smashed to text is a bit kludgy though.
> I'm not sure we have the tools to do much better though.

I'm hoping Tom will reconsider, or at least flesh out his thinking.
What the reloptions code does is create a general framework for
handling options. Everything gets smashed down to text[], and then
when we actually need to use the reloptions we parse them into a C
struct appropriate to the underlying object type. This is really the
only feasible design, because pg_class contains multiple different
types of objects - in particular, tables and indices - and indices in
turn come in multiple types, depending on the AM. So the exact
options that are legal depend on the the type of object, and for
indices the AM, and we populate a *different* C struct depending on
the situation. pg_tablespace, on the other hand, only contains one
type of object: a tablespace. So, if we stored the options as text[],
we'd parse them out into a C struct just as we do for pg_class, but
unlike the pg_class case, it would always be the *same* C struct.

In other words, we CAN'T use dedicated columns for pg_class because we
can't know in advance precisely what columns will be needed - it
depends on what AMs someone chooses to load up. For pg_tablespace, we
know exactly what columns will be needed, and the answer is exactly
those options that we choose to support, because tablespaces are not
extensible.

That's my thinking, anyway... YMMV.

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 14 Nov 2009 16:58

Robert Haas <robertmhaas(a)gmail.com> writes:
> .... pg_tablespace, on the other hand, only contains one
> type of object: a tablespace. So, if we stored the options as text[],
> we'd parse them out into a C struct just as we do for pg_class, but
> unlike the pg_class case, it would always be the *same* C struct.

The same, until it's different. There is no reason at all to suppose
that the set of options people will want to apply to a tablespace will
remain constant over time --- in fact, I don't think there's even a
solid consensus right now on which GUCs people would want to set at the
tablespace level. I don't believe it is wise to hardwire this into the
catalog schema. Yes, it would look marginally nicer from a theoretical
standpoint, but we'd be forever having to revise the schema, plus a lot
of downstream code (pg_dump for example); which is not only significant
work but absolutely prevents making any adjustments except at major
version boundaries. And I don't see any concrete benefit that we get
out of a hardwired schema for these things. It's not like we care about
optimizing searches for tablespaces having a particular option setting,
for example.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on 14 Nov 2009 17:56

On Sat, Nov 14, 2009 at 4:58 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas(a)gmail.com> writes:
>> .... pg_tablespace, on the other hand, only contains one
>> type of object: a tablespace. So, if we stored the options as text[],
>> we'd parse them out into a C struct just as we do for pg_class, but
>> unlike the pg_class case, it would always be the *same* C struct.
>
> The same, until it's different. There is no reason at all to suppose
> that the set of options people will want to apply to a tablespace will
> remain constant over time --- in fact, I don't think there's even a
> solid consensus right now on which GUCs people would want to set at the
> tablespace level. I don't believe it is wise to hardwire this into the
> catalog schema. Yes, it would look marginally nicer from a theoretical
> standpoint, but we'd be forever having to revise the schema, plus a lot
> of downstream code (pg_dump for example); which is not only significant
> work but absolutely prevents making any adjustments except at major
> version boundaries. And I don't see any concrete benefit that we get
> out of a hardwired schema for these things. It's not like we care about
> optimizing searches for tablespaces having a particular option setting,
> for example.

I can tell I've lost this argument, but I still don't get it. Why do
we care if we have to change the schema? It's not a lot of work, and
the number of times we would likely bump catversion for new
pg_tablespace options seems unlikely to be significant in the grand
scheme of things. I don't think there are very many parameters that
make sense to set per-tablespace. As for major version boundaries, it
seems almost unimaginable that we would backpatch code to add a new
tablespace option whether the schema permits it or not. Can you
clarify the nature of your concern here?

What I'm concerned about with text[] is that I *think* it's going to
force us to invent an analog of the relcache for tablespaces. With
hardwired columns, a regular catcache is all we need. But the
reloptions stuff is designed to populate a struct, and once we
populate that struct we have to have someplace to hang it - or I guess
maybe we could reparse it on every call to cost_seqscan(),
cost_index(), genericcostestimate(), etc, but that doesn't seem like a
great idea. So it seems like we'll need another caching layer sitting
over the catcache. If we already had such a beast it would be
reasonable to add this in, but I would assume that we wouldn't want to
add such a thing without a fairly clear use case that I'm not sure we
have. Maybe you see it differently? Or do you have some idea for a
simpler way to set this up?

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 14 Nov 2009 18:36

Robert Haas <robertmhaas(a)gmail.com> writes:
> I can tell I've lost this argument, but I still don't get it. Why do
> we care if we have to change the schema? It's not a lot of work,

Try doing it a few times. Don't forget to keep psql and pg_dump
apprised of which PG versions contain which columns. Not to mention
other tools such as pgAdmin that might like to show these settings.
It gets old pretty fast.

> What I'm concerned about with text[] is that I *think* it's going to
> force us to invent an analog of the relcache for tablespaces.

I'm not really convinced of that, but even if we do, so what? It's not
that much code to have an extra cache watching the syscache traffic.
There's an example in parse_oper.c of a specialized cache that's about
as complicated as this would be. It's about 150 lines including copious
comments. We didn't even bother to split it out into its own source
file.

> With
> hardwired columns, a regular catcache is all we need. But the
> reloptions stuff is designed to populate a struct, and once we
> populate that struct we have to have someplace to hang it - or I guess
> maybe we could reparse it on every call to cost_seqscan(),
> cost_index(), genericcostestimate(), etc, but that doesn't seem like a
> great idea.

Well, no, we would not do it that way. I would imagine instead that
plancat.c would be responsible for attaching appropriate cost values to
each RelOptInfo struct, so it'd be more like one lookup per referenced
table per query. It's possible that a cache would be useful even at
that load level, but I'm not convinced.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

| Next | Last
Pages: 1 2 3 4 5 6
Prev: DTrace compiler warnings
Next: add more frame types in window functions (ROWS)

patch - per-tablespace random_page_cost/seq_page_cost