From: Tom Lane on
Robert Haas <robertmhaas(a)gmail.com> writes:
> Tom remarked in another email that he wasn't too happy with the
> opclass changes. They seem kind of grotty to me, too, but I don't
> immediately have a better idea. My fear is that there may be places
> in the code that rely on opclass operators only ever returning bool,
> and that changing that may break things. It also feels like allowing
> non-bool-returning opclass members only for this one specific case is
> kind of a hack: is this an instance of some more general problem that
> we ought to be solving in some more general way? Not sure.

Yes, that's exactly what I didn't like about it: the proposed changes
create confusion between opclass members that represent
index-optimizable WHERE conditions and those that represent
index-optimizable ORDER BY conditions. You can get away with that to
some extent as long as you assume that the latter type of operator
never yields boolean and so can never appear at the top level of
WHERE. But that assumption sucks. There are plenty of cases where
people ORDER BY boolean values, so who's to argue that we will never
want an operator returning boolean in the second category? And as
soon as you put it in, the planner is going to think that it's also
a potential index-qualification operator, which is something the AM
might or might not be prepared to support.

I think this is really unacceptable and there needs to be some cleaner
way of distinguishing the two types of operators. Possibly a couple of
boolean columns added to pg_amop (you'd need two because there are three
possible states, in case an operator really can serve both purposes in a
particular opclass). Or maybe we should do something else. But
ignoring the issue won't do.

Maybe a more general idea would be to invent "categories" of opclass
members, where the only existing category is "index search qualifier",
and these new knngist thingies are another, and maybe plus and minus for
window function ranges are a third. But I'm not sure what you do if one
operator can be in more than one category.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on
On Fri, Feb 12, 2010 at 7:30 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Maybe a more general idea would be to invent "categories" of opclass
> members, where the only existing category is "index search qualifier",
> and these new knngist thingies are another, and maybe plus and minus for
> window function ranges are a third.  But I'm not sure what you do if one
> operator can be in more than one category.

Well, if you were willing to change pg_amop so that the key was
(amopfamily, amoplefttype, amoprighttype, amopcategory) rather than
just (amopfamily, amoplefttype, amoprighttype), the issue of what to
do if an operator can be in more than one category becomes moot. You
just specify the operator more than once if need be.

If you don't want the amopcategory to be part of the key, then you
just need to define it as a type that can handle multiple values yet
has a fast membership test. A character string of some type would be
flexible - you could use any single character as a category identifier
- but given that we don't expect many categories and we do want good
performance, it seems like an int4 used as a bitmap field would be
more appropriate.

I think the first approach is better, partly because it seems to lend
itself to a cleaner syntax for CREATE OPERATOR CLASS. Something like:

CREATE OPERATOR CLASS blah blah AS
OPERATOR 3 &&,
OPERATOR ORDER 15 <->;

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on
On Fri, Feb 12, 2010 at 9:10 PM, Robert Haas <robertmhaas(a)gmail.com> wrote:
> On Fri, Feb 12, 2010 at 7:30 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>> Maybe a more general idea would be to invent "categories" of opclass
>> members, where the only existing category is "index search qualifier",
>> and these new knngist thingies are another, and maybe plus and minus for
>> window function ranges are a third.  But I'm not sure what you do if one
>> operator can be in more than one category.
>
> Well, if you were willing to change pg_amop so that the key was
> (amopfamily, amoplefttype, amoprighttype, amopcategory) rather than
> just (amopfamily, amoplefttype, amoprighttype), the issue of what to
> do if an operator can be in more than one category becomes moot.  You
> just specify the operator more than once if need be.

Except I'm full of it, because amopstrategy is in there too. Hmm.
And that's unfortunate because the syscache machinery is limited to
four columns as lookup keys.

This is a bit ugly, but one idea that occurs to me is to change
amopstrategy from int16 to int32. Internally, we'll treat the low 16
bits as the strategy number and the high 16 bits as the strategy
category, with strategy category 0 being "index search qualifier".

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on
Robert Haas <robertmhaas(a)gmail.com> writes:
>> Well, if you were willing to change pg_amop so that the key was
>> (amopfamily, amoplefttype, amoprighttype, amopcategory) rather than
>> just (amopfamily, amoplefttype, amoprighttype), the issue of what to
>> do if an operator can be in more than one category becomes moot. �You
>> just specify the operator more than once if need be.

Yeah, that occurred to me too after sending my earlier email.

> Except I'm full of it, because amopstrategy is in there too. Hmm.
> And that's unfortunate because the syscache machinery is limited to
> four columns as lookup keys.

Ugh. Still, we could certainly change the 4-key limit to 5, though it
might be a tad tedious to go round and edit all the SearchSysCache and
related calls. Maybe while we were at it we should change them to
SearchSysCache1, SearchSysCache2, etc to not have the limit hardwired
textually in quite so many places...

> This is a bit ugly, but one idea that occurs to me is to change
> amopstrategy from int16 to int32. Internally, we'll treat the low 16
> bits as the strategy number and the high 16 bits as the strategy
> category, with strategy category 0 being "index search qualifier".

Hm, yeah that would work, but I agree it's ugly.

While thinking about different possible solutions here: one of the
things that was worrying me is that for cases where the same operator
can serve in more than one role, it might have to have either the same
opstrategy or different ones in different roles, depending on how the AM
has assigned strategy numbers. The method with an extra index column
side-steps that nicely since there are two unrelated pg_amop entries.
If there's only one entry then you lose if you need different
strategies. Robert's use-the-high-bits method works too, since there
would still be two separate entries, but some other possible
representations are eliminated by that worry.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on
On Fri, Feb 12, 2010 at 9:45 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas(a)gmail.com> writes:
>>> Well, if you were willing to change pg_amop so that the key was
>>> (amopfamily, amoplefttype, amoprighttype, amopcategory) rather than
>>> just (amopfamily, amoplefttype, amoprighttype), the issue of what to
>>> do if an operator can be in more than one category becomes moot.  You
>>> just specify the operator more than once if need be.
>
> Yeah, that occurred to me too after sending my earlier email.
>
>> Except I'm full of it, because amopstrategy is in there too.  Hmm.
>> And that's unfortunate because the syscache machinery is limited to
>> four columns as lookup keys.
>
> Ugh.  Still, we could certainly change the 4-key limit to 5, though it
> might be a tad tedious to go round and edit all the SearchSysCache and
> related calls.  Maybe while we were at it we should change them to
> SearchSysCache1, SearchSysCache2, etc to not have the limit hardwired
> textually in quite so many places...

Maybe. It sounds sort of awful though; and there's probably a
distributed performance penalty involved

>> This is a bit ugly, but one idea that occurs to me is to change
>> amopstrategy from int16 to int32.  Internally, we'll treat the low 16
>> bits as the strategy number and the high 16 bits as the strategy
>> category, with strategy category 0 being "index search qualifier".
>
> Hm, yeah that would work, but I agree it's ugly.

On further review there's a serious problem with this idea:
pg_amop_opr_fam_index.

> While thinking about different possible solutions here: one of the
> things that was worrying me is that for cases where the same operator
> can serve in more than one role, it might have to have either the same
> opstrategy or different ones in different roles, depending on how the AM
> has assigned strategy numbers.  The method with an extra index column
> side-steps that nicely since there are two unrelated pg_amop entries.
> If there's only one entry then you lose if you need different
> strategies.  Robert's use-the-high-bits method works too, since there
> would still be two separate entries, but some other possible
> representations are eliminated by that worry.

OK, here's another idea. Let's just add a new column to pg_amop
called amoporderstrategy. If an operator can only be used for one
purpose or the other, we'll set the other value to -1.

....Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers