From: Andrew Dunstan on

A recent discussion with a developer convinced me that enums would be
much more useful if new values could be added to the label list easily
without changing the stored values. Given the current representation of
enums as a globally unique oid, I think this is just about impossible if
the new label is to go somewhere other than on the end of the list
(assuming that's where the next oid would get allocated).

So I have been thinking about a new type family which for the sake of
argument I will call varenum (please hold the bikeshedding till later).
Staying with 4 bytes as the stored value, the idea is that we partition
it into two 16 bit pieces. The high-order piece is the varenum type
identifier. The low-order piece would uniquely identify the label within
the set for that varenum. Initial values would be allocated like this:
calculate the space between values p as 2**16 / (16 + number_of_labels).
Then set the first value at 8 * p, then next at 9* p and so on. This is
designed to allow more space to add labels at the beginning and end of
the list, where this is more likely. Adding a label would be a matter of
finding the labels adjacent to the position where we want to add the new
label, and placing it half way between them, possibly with special rules
for the list ends. If we want to add the label between two labels having
values n and n+1 the addition would fail.

All this would mean a) we can't have more than 65536 varenum types in a
system, and no varenum type could have more than 65536 values (and
possibly less, depending on how they are added). In practice I think
these are quite reasonable restrictions, and 99.99% of users would never
come close to bumping up against them.

Given all this, we could then allow things like:

ALTER TYPE varenumtype ADD 'newlabel' [ BEFORE | AFTER 'existinglabel' ]


I haven't looked at how we'd set this up in the catalog, but I assume
that by analogy with pg_enum it should be fairly straightforward.

We could actually allow something like the above for existing enum types
without the optional clause, which would just add the label wherever the
next oid happened to fall, which would commonly be at the end of the
list. That would handle the common case where the application doesn't
care about the label ordering. That should be a much simpler change and
is probably worth doing first.

There was some discussion of this here:
<http://archives.postgresql.org/message-id/20080425182718.GD5888(a)alvh.no-ip.org>
But I'm fairly reluctant to do things which will require table rewrites.
I'm also less excited about the idea of removing values - that is
something I don't ever recall being asked about, but I have often been
asked about adding labels. For people prepared to rewrite tables, there
is currently a workaround: create a new enum type, alter the table to
use the new enum type, drop the old enum type, rename the new enum type
to the old enum type.

Thoughts?

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers