From: Andrew Dunstan on


Tom Lane wrote:
> Andrew Dunstan <andrew(a)dunslane.net> writes:
>
>> Another thought: could we add a column to pg_type with a flag that's
>> true if the oids are in sort order? Then the comparison routines could
>> just look that up in the type cache and if it's true (as it often will
>> be) just return the oid comparison.
>>
>
> Well, having to do a cache lookup already makes it a couple orders of
> magnitude more expensive than an OID comparison. However, it's hard to
> say how much that matters in terms of total application performance.
> We really could do with a bit of performance testing here ...
>
>
>

I have done some. The performance hit is fairly horrible. Adding cache
lookups for the enum rows to the comarison routines made a REINDEX on a
1m row table where the index is on an enum column (the enum has 500
randomly ordered labels) jump from around 10s to around 70s. I think
that probably rules out doing anything like this for the existing enum
types. I think the most we can reasonably do there is to allow adding a
label to the end of the enum list. I'm fairly resistant to doing
something which will have a major performance impact, as I know there
are users who are relying on enums for performce reasons. I'm also
fairly resistant to doing things which will require table rewriting.

So the question then is: do we want to allow lots of flexibility for
positioning new labels with significant degradation in comparison
performace for a new enum variant, or have a new variant with some
restrictions which probably won't impact most users but would have
equivalent performance to the current enum family, or do nothing?


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Andrew Dunstan on


Tom Lane wrote:
> Andrew Dunstan <andrew(a)dunslane.net> writes:
>
>> Tom Lane wrote:
>>
>>> Well, having to do a cache lookup already makes it a couple orders of
>>> magnitude more expensive than an OID comparison. However, it's hard to
>>> say how much that matters in terms of total application performance.
>>> We really could do with a bit of performance testing here ...
>>>
>
>
>> I have done some. The performance hit is fairly horrible. Adding cache
>> lookups for the enum rows to the comarison routines made a REINDEX on a
>> 1m row table where the index is on an enum column (the enum has 500
>> regards, tom lane
>>
>> randomly ordered labels) jump from around 10s to around 70s.
>>
>
> Hmmm... that's bad, but I bet it's still less than the cost of comparing
> NUMERICs. Also, did you make any attempt to avoid repetitive cache
> lookups by storing a pointer in fn_extra (cf array comparisons)?
>
>
>

OK, making a bit of progress. Attached is a sort of proof of concept
patch that does that. It stores a bsearchable list of {enum, sort_order}
pairs in fn_extra, along with a flag that indicates if the oids are in
fact ordered. This flag, which would be maintained in and populated from
pg_type, would allow avoidance of any significant performance penalty in
such cases by relying on straight Oid comparison. We'd probably need to
keep a count of labels in pg_type too so we could size the cache
appropriately. This approach just about buys the best of both worlds.
The execution time for the test mentioned above is down from around 70s
to around 20s. I think for a worst case that's not too bad, especially
when it is completely avoided unless we have perturbed the sort order.

If anyone wants to play along, my test set is available at
<http://developer.postgresql.org/~adunstan/enumtest.dmp> It's about 8.5Mb.

cheers

andrew
From: Tom Lane on
Bruce Momjian <bruce(a)momjian.us> writes:
> Sorry to be commenting late, but don't most people want to add to the
> end or beginning of the enum list, rather than in the middle, and can't
> we support that already?

We could allow adding a value, but we couldn't guarantee where it would
appear in the type's sort ordering. Depending on the current OID
counter it would usually show up either at the end or the beginning.
I think the general feeling is that this is too implementation-dependent
to be acceptable.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Bruce Momjian on
Tom Lane wrote:
> Bruce Momjian <bruce(a)momjian.us> writes:
> > Sorry to be commenting late, but don't most people want to add to the
> > end or beginning of the enum list, rather than in the middle, and can't
> > we support that already?
>
> We could allow adding a value, but we couldn't guarantee where it would
> appear in the type's sort ordering. Depending on the current OID
> counter it would usually show up either at the end or the beginning.
> I think the general feeling is that this is too implementation-dependent
> to be acceptable.

Well, we don't need the enum value to map into the entire oid range.
Can't we just add one to the top-most value and see if there is a
conflict?

--
Bruce Momjian <bruce(a)momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on
Bruce Momjian <bruce(a)momjian.us> writes:
> Well, we don't need the enum value to map into the entire oid range.
> Can't we just add one to the top-most value and see if there is a
> conflict?

If you don't use the OID counter to generate the new value, you're going
to have problems with race conditions. There's also that small chance
that there is no free value before 2^32.

The bottom line here is not wanting a feature that "usually" works but
fails once in awhile on the basis of conditions the user can't control.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

First  |  Prev  |  Next  |  Last
Pages: 1 2 3 4 5 6 7 8 9
Prev: [HACKERS] extensible enum types
Next: [HACKERS] About tapes