From: Tom Lane on 31 Jul 2010 23:15 Robert Haas <robertmhaas(a)gmail.com> writes: > On Sat, Jul 31, 2010 at 9:16 PM, Stephen Frost <sfrost(a)snowman.net> wrote: >> * Kevin Grittner (Kevin.Grittner(a)wicourts.gov) wrote: >>> Robert Haas �07/31/10 12:33 PM >>> >>>> Tom Lane �wrote: >>>>> Failing to store stats isn't a bug? >>>> Well, it kind of sounds more like you're removing a known >>>> limitation than fixing a bug. >>> It's operating as designed and documented. >> I have to disagree with this, to be honest. �The fact that we've >> documented what is completely unexpected and frustrating behaviour >> doesn't mean we get to say it's not a bug. �Not collecting stats, at >> all, is a pretty bad bug, in my view. I'm a bit bemused by the claim that this behavior is "documented". One comment buried deep in the bowels of the source is not user-visible documentation in my book. > I guess I'd appreciate it if someone could explain in more detail in > what cases we fail to collect stats. Do we have a typanalyze function > here that can't possibly work for anything, ever? Or is it just some > subset of the cases? ANALYZE normally collects stats for any expression that there is an expression index for. However, it will punt and fail to collect stats if the expression index uses an opclass whose opckeytype (ie, storage datatype) is different from the actual expression datatype. A quick look into the system catalogs shows that that applies to these opclasses: amname | opcname | opcintype | opckeytype --------+------------------+-------------------------------+----------------------------- btree | name_ops | name | cstring gist | point_ops | point | box gist | poly_ops | polygon | box gist | circle_ops | circle | box gin | _int4_ops | integer[] | integer gin | _text_ops | text[] | text gin | _abstime_ops | abstime[] | abstime gin | _bit_ops | bit[] | bit gin | _bool_ops | boolean[] | boolean gin | _bpchar_ops | character[] | character gin | _bytea_ops | bytea[] | bytea gin | _char_ops | "char"[] | "char" gin | _cidr_ops | cidr[] | cidr gin | _date_ops | date[] | date gin | _float4_ops | real[] | real gin | _float8_ops | double precision[] | double precision gin | _inet_ops | inet[] | inet gin | _int2_ops | smallint[] | smallint gin | _int8_ops | bigint[] | bigint gin | _interval_ops | interval[] | interval gin | _macaddr_ops | macaddr[] | macaddr gin | _name_ops | name[] | name gin | _numeric_ops | numeric[] | numeric gin | _oid_ops | oid[] | oid gin | _oidvector_ops | oidvector[] | oidvector gin | _time_ops | time without time zone[] | time without time zone gin | _timestamptz_ops | timestamp with time zone[] | timestamp with time zone gin | _timetz_ops | time with time zone[] | time with time zone gin | _varbit_ops | bit varying[] | bit varying gin | _varchar_ops | character varying[] | character varying gin | _timestamp_ops | timestamp without time zone[] | timestamp without time zone gin | _money_ops | money[] | money gin | _reltime_ops | reltime[] | reltime gin | _tinterval_ops | tinterval[] | tinterval gist | tsvector_ops | tsvector | gtsvector gin | tsvector_ops | tsvector | text gist | tsquery_ops | tsquery | bigint (37 rows) Now, of the above the only cases where we'd be likely to be able to do anything very useful with stats on the expression value are the name case, which isn't that exciting in practice, and the tsvector cases. For tsvector it was only with 8.4 that we had non-toy stats code, so while the limitation is ancient it's only recently that it started to be meaningful. I don't think this can be claimed to be a corner case. If you set up an FTS index according to the first alternative offered in http://developer.postgresql.org/pgdocs/postgres/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX you will find that the system fails to collect stats for it and so you get stupid default estimates for your FTS queries. If this were a "documented" limitation I'd expect to see a big red warning there to *not* do it that way. The only way that you actually get usable tsvector stats at the moment is to explicitly store the tsvector as an ordinary column, as in the second approach offered in the above documentation section. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Robert Haas on 31 Jul 2010 21:48 On Sat, Jul 31, 2010 at 9:16 PM, Stephen Frost <sfrost(a)snowman.net> wrote: > * Kevin Grittner (Kevin.Grittner(a)wicourts.gov) wrote: >> Robert Haas �07/31/10 12:33 PM >>> >> > Tom Lane �wrote: >> >> Failing to store stats isn't a bug? >> > >> > Well, it kind of sounds more like you're removing a known >> > limitation than fixing a bug. >> >> It's operating as designed and documented. �There is room for >> enhancement, but the only thing which could possibly justify this as >> 9.0 material is if there was a demonstrated performance regression in >> 9.0 for which this was the safest cure. > > I have to disagree with this, to be honest. �The fact that we've > documented what is completely unexpected and frustrating behaviour > doesn't mean we get to say it's not a bug. �Not collecting stats, at > all, is a pretty bad bug, in my view. I guess I'd appreciate it if someone could explain in more detail in what cases we fail to collect stats. Do we have a typanalyze function here that can't possibly work for anything, ever? Or is it just some subset of the cases? (Apologies if this has been discussed on the original thread; I was unable to find it in the archives.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Robert Haas on 1 Aug 2010 08:54 On Sat, Jul 31, 2010 at 11:15 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote: > Now, of the above the only cases where we'd be likely to be able to do > anything very useful with stats on the expression value are the name > case, which isn't that exciting in practice, and the tsvector cases. > For tsvector it was only with 8.4 that we had non-toy stats code, so > while the limitation is ancient it's only recently that it started to be > meaningful. > > I don't think this can be claimed to be a corner case. �If you set up > an FTS index according to the first alternative offered in > > http://developer.postgresql.org/pgdocs/postgres/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX > > you will find that the system fails to collect stats for it and so you > get stupid default estimates for your FTS queries. �If this were a > "documented" limitation I'd expect to see a big red warning there to > *not* do it that way. �The only way that you actually get usable > tsvector stats at the moment is to explicitly store the tsvector as an > ordinary column, as in the second approach offered in the above > documentation section. Yeah, maybe you're right. But I'd still prefer to see us break the ABI and do this just in 9.0 rather than changing 8.4. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: "Kevin Grittner" on 1 Aug 2010 11:34 Tom Lane wrote: Robert Haas writes: >> I guess I'd appreciate it if someone could explain in more detail >> in what cases we fail to collect stats. > [detailed description] > I don't think this can be claimed to be a corner case. If you set > up an FTS index according to the first alternative offered in > > http://developer.postgresql.org/pgdocs/postgres/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX > > you will find that the system fails to collect stats for it and so > you get stupid default estimates for your FTS queries. Objection to a fix in 9.0 withdrawn. No opinion on backpatching. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: Tom Lane on 1 Aug 2010 11:54 Robert Haas <robertmhaas(a)gmail.com> writes: > On Sat, Jul 31, 2010 at 11:15 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote: >> I don't think this can be claimed to be a corner case. �If you set up >> an FTS index according to the first alternative offered in >> >> http://developer.postgresql.org/pgdocs/postgres/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX >> >> you will find that the system fails to collect stats for it and so you >> get stupid default estimates for your FTS queries. > Yeah, maybe you're right. But I'd still prefer to see us break the > ABI and do this just in 9.0 rather than changing 8.4. OK, I can live with that. I'll take a look at it shortly. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
First
|
Prev
|
Next
|
Last
Pages: 1 2 3 Prev: [HACKERS] More fun with GIN lossy-page pointers Next: Don't let your libido decrease |