Prev: [HACKERS] knngist patch support
Next: pgsql: Remove old-style VACUUM FULL (which wasknown for a little while
From: Tom Lane on 12 Feb 2010 19:30 Robert Haas <robertmhaas(a)gmail.com> writes: > Tom remarked in another email that he wasn't too happy with the > opclass changes. They seem kind of grotty to me, too, but I don't > immediately have a better idea. My fear is that there may be places > in the code that rely on opclass operators only ever returning bool, > and that changing that may break things. It also feels like allowing > non-bool-returning opclass members only for this one specific case is > kind of a hack: is this an instance of some more general problem that > we ought to be solving in some more general way? Not sure. Yes, that's exactly what I didn't like about it: the proposed changes create confusion between opclass members that represent index-optimizable WHERE conditions and those that represent index-optimizable ORDER BY conditions. You can get away with that to some extent as long as you assume that the latter type of operator never yields boolean and so can never appear at the top level of WHERE. But that assumption sucks. There are plenty of cases where people ORDER BY boolean values, so who's to argue that we will never want an operator returning boolean in the second category? And as soon as you put it in, the planner is going to think that it's also a potential index-qualification operator, which is something the AM might or might not be prepared to support. I think this is really unacceptable and there needs to be some cleaner way of distinguishing the two types of operators. Possibly a couple of boolean columns added to pg_amop (you'd need two because there are three possible states, in case an operator really can serve both purposes in a particular opclass). Or maybe we should do something else. But ignoring the issue won't do. Maybe a more general idea would be to invent "categories" of opclass members, where the only existing category is "index search qualifier", and these new knngist thingies are another, and maybe plus and minus for window function ranges are a third. But I'm not sure what you do if one operator can be in more than one category. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Robert Haas on 12 Feb 2010 21:10 On Fri, Feb 12, 2010 at 7:30 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote: > Maybe a more general idea would be to invent "categories" of opclass > members, where the only existing category is "index search qualifier", > and these new knngist thingies are another, and maybe plus and minus for > window function ranges are a third. But I'm not sure what you do if one > operator can be in more than one category. Well, if you were willing to change pg_amop so that the key was (amopfamily, amoplefttype, amoprighttype, amopcategory) rather than just (amopfamily, amoplefttype, amoprighttype), the issue of what to do if an operator can be in more than one category becomes moot. You just specify the operator more than once if need be. If you don't want the amopcategory to be part of the key, then you just need to define it as a type that can handle multiple values yet has a fast membership test. A character string of some type would be flexible - you could use any single character as a category identifier - but given that we don't expect many categories and we do want good performance, it seems like an int4 used as a bitmap field would be more appropriate. I think the first approach is better, partly because it seems to lend itself to a cleaner syntax for CREATE OPERATOR CLASS. Something like: CREATE OPERATOR CLASS blah blah AS OPERATOR 3 &&, OPERATOR ORDER 15 <->; ....Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Robert Haas on 12 Feb 2010 21:27 On Fri, Feb 12, 2010 at 9:10 PM, Robert Haas <robertmhaas(a)gmail.com> wrote: > On Fri, Feb 12, 2010 at 7:30 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote: >> Maybe a more general idea would be to invent "categories" of opclass >> members, where the only existing category is "index search qualifier", >> and these new knngist thingies are another, and maybe plus and minus for >> window function ranges are a third. But I'm not sure what you do if one >> operator can be in more than one category. > > Well, if you were willing to change pg_amop so that the key was > (amopfamily, amoplefttype, amoprighttype, amopcategory) rather than > just (amopfamily, amoplefttype, amoprighttype), the issue of what to > do if an operator can be in more than one category becomes moot. You > just specify the operator more than once if need be. Except I'm full of it, because amopstrategy is in there too. Hmm. And that's unfortunate because the syscache machinery is limited to four columns as lookup keys. This is a bit ugly, but one idea that occurs to me is to change amopstrategy from int16 to int32. Internally, we'll treat the low 16 bits as the strategy number and the high 16 bits as the strategy category, with strategy category 0 being "index search qualifier". ....Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Tom Lane on 12 Feb 2010 21:45 Robert Haas <robertmhaas(a)gmail.com> writes: >> Well, if you were willing to change pg_amop so that the key was >> (amopfamily, amoplefttype, amoprighttype, amopcategory) rather than >> just (amopfamily, amoplefttype, amoprighttype), the issue of what to >> do if an operator can be in more than one category becomes moot. �You >> just specify the operator more than once if need be. Yeah, that occurred to me too after sending my earlier email. > Except I'm full of it, because amopstrategy is in there too. Hmm. > And that's unfortunate because the syscache machinery is limited to > four columns as lookup keys. Ugh. Still, we could certainly change the 4-key limit to 5, though it might be a tad tedious to go round and edit all the SearchSysCache and related calls. Maybe while we were at it we should change them to SearchSysCache1, SearchSysCache2, etc to not have the limit hardwired textually in quite so many places... > This is a bit ugly, but one idea that occurs to me is to change > amopstrategy from int16 to int32. Internally, we'll treat the low 16 > bits as the strategy number and the high 16 bits as the strategy > category, with strategy category 0 being "index search qualifier". Hm, yeah that would work, but I agree it's ugly. While thinking about different possible solutions here: one of the things that was worrying me is that for cases where the same operator can serve in more than one role, it might have to have either the same opstrategy or different ones in different roles, depending on how the AM has assigned strategy numbers. The method with an extra index column side-steps that nicely since there are two unrelated pg_amop entries. If there's only one entry then you lose if you need different strategies. Robert's use-the-high-bits method works too, since there would still be two separate entries, but some other possible representations are eliminated by that worry. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Robert Haas on 12 Feb 2010 21:57
On Fri, Feb 12, 2010 at 9:45 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas(a)gmail.com> writes: >>> Well, if you were willing to change pg_amop so that the key was >>> (amopfamily, amoplefttype, amoprighttype, amopcategory) rather than >>> just (amopfamily, amoplefttype, amoprighttype), the issue of what to >>> do if an operator can be in more than one category becomes moot. You >>> just specify the operator more than once if need be. > > Yeah, that occurred to me too after sending my earlier email. > >> Except I'm full of it, because amopstrategy is in there too. Hmm. >> And that's unfortunate because the syscache machinery is limited to >> four columns as lookup keys. > > Ugh. Still, we could certainly change the 4-key limit to 5, though it > might be a tad tedious to go round and edit all the SearchSysCache and > related calls. Maybe while we were at it we should change them to > SearchSysCache1, SearchSysCache2, etc to not have the limit hardwired > textually in quite so many places... Maybe. It sounds sort of awful though; and there's probably a distributed performance penalty involved >> This is a bit ugly, but one idea that occurs to me is to change >> amopstrategy from int16 to int32. Internally, we'll treat the low 16 >> bits as the strategy number and the high 16 bits as the strategy >> category, with strategy category 0 being "index search qualifier". > > Hm, yeah that would work, but I agree it's ugly. On further review there's a serious problem with this idea: pg_amop_opr_fam_index. > While thinking about different possible solutions here: one of the > things that was worrying me is that for cases where the same operator > can serve in more than one role, it might have to have either the same > opstrategy or different ones in different roles, depending on how the AM > has assigned strategy numbers. The method with an extra index column > side-steps that nicely since there are two unrelated pg_amop entries. > If there's only one entry then you lose if you need different > strategies. Robert's use-the-high-bits method works too, since there > would still be two separate entries, but some other possible > representations are eliminated by that worry. OK, here's another idea. Let's just add a new column to pg_amop called amoporderstrategy. If an operator can only be used for one purpose or the other, we'll set the other value to -1. ....Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers |