reducing NUMERIC size for 9.1 [PgSql]

Prev: SHOW TABLES
Next: [HACKERS] Listen/Notify in 9.0

From: Tom Lane on 29 Jul 2010 17:35

Robert Haas <robertmhaas(a)gmail.com> writes:
> Did you look at the
> patch to move the numeric stuff out of the .h file that I attached a
> few emails back? If that looks OK, I can commit it and then redo the
> rest of this along the lines we've discussed.

A couple of thoughts:

* It'd be good to get NUMERIC_HDRSZ out of there too, especially since
it'll have only the shakiest connection to reality after this patch goes in.
It looks like doing that would only require modifying type_maximum_size().
I'd suggest inventing another small function along the lines of
int32 numeric_maximum_size(int32 typemod)
so that the NUMERIC-specific knowledge can be pulled out of format_type.c.

* I'd suggest leaving a comment along the lines of
/* The actual contents of Numeric are private to numeric.c */
with the now-opaque typedef for Numeric.

Otherwise +1.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on 30 Jul 2010 00:36

On Thu, Jul 29, 2010 at 5:35 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas(a)gmail.com> writes:
>> Did you look at the
>> patch to move the numeric stuff out of the .h file that I attached a
>> few emails back? �If that looks OK, I can commit it and then redo the
>> rest of this along the lines we've discussed.
>
> A couple of thoughts:
>
> * It'd be good to get NUMERIC_HDRSZ out of there too, especially since
> it'll have only the shakiest connection to reality after this patch goes in.
> It looks like doing that would only require modifying type_maximum_size().
> I'd suggest inventing another small function along the lines of
> � � � �int32 numeric_maximum_size(int32 typemod)
> so that the NUMERIC-specific knowledge can be pulled out of format_type.c.
>
> * I'd suggest leaving a comment along the lines of
> � � � �/* The actual contents of Numeric are private to numeric.c */
> with the now-opaque typedef for Numeric.
>
> Otherwise +1.

Done, with those changes.

But, looking at it a bit more carefully, isn't the maximum-size logic
for numeric rather bogus?

rhaas=# \d n
Table "public.n"
Column | Type | Modifiers
--------+--------------+-----------
a | numeric(2,1) |

rhaas=# select atttypmod from pg_attribute where attrelid =
'n'::regclass and attname = 'a';
atttypmod
-----------
131077
(1 row)

(gdb) p numeric_maximum_size(131077)
$1 = 9

rhaas=# select a, pg_column_size(a),
pg_column_size(a::text::numeric(2,1)) from n;
a | pg_column_size | pg_column_size
-----+----------------+----------------
1.1 | 9 | 12
(1 row)

i.e. According to the max-size logic, the ostensible maximum size of a
numeric(2,1) is 9 bytes, but in fact the real maximum is 12 bytes = 4
byte varlena header + 2 bytes for sign/dscale + 2 bytes for weight +
(2 NumericDigits * 2 bytes/digit).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 29 Jul 2010 13:20

Robert Haas <robertmhaas(a)gmail.com> writes:
> On Wed, Jul 28, 2010 at 3:00 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>> No, you can do something like this:
>>
>> typedef union numeric
>> {
>> uint16 word1;
>> numeric_short short;
>> numeric_long long;
>> } numeric;

> That doesn't quite work because there's also a varlena header that has
> to be accounted for, so the third member of the union can't be a
> simple uint16.

Yeah, you would need an additional layer of struct to represent the
numeric with a length word in front of it. I think this is not
necessarily bad because it would perhaps open the door to working
directly with short-varlena-header values, which is never going to
be possible with this:

> typedef struct NumericData
> {
> int32 varlen;
> int16 n_header;
> union { ...

OTOH alignment considerations may make that idea hopeless anyway.

> Why n_data as char[1] instead of NumericDigit, you ask?

Yes, we'd have to export NumericDigit if we wanted to declare these
structs "properly" in numeric.h. I wonder if that decision should
be revisited. I'd lean to making the whole struct local to numeric.c
though. Is there anyplace else that really ought to see it?

>> I hadn't actually looked. I think though that it's a mistake to break
>> compatibility on both dscale and weight when you only need to break one.
>> Also, weight is *certainly* uninteresting for NaNs since it's not even
>> meaningful unless there are digits. dscale could conceivably be worth
>> something.

> I don't think I'm breaking compatibility on anything. Can you clarify
> what part of the code you're referring to here? I'm sort of lost.

On-disk is what I'm thinking about. Right now, a NaN's first word is
all dscale except the sign bits. You're proposing to change that
but I think it's unnecessary to do so. If we do it the way I'm
thinking, dscale would still mean the same in a NaN, and we'd simply
be ignoring the weight field (which might or might not be there
physically).

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on 29 Jul 2010 17:25

On Thu, Jul 29, 2010 at 5:03 PM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas(a)gmail.com> writes:
>> OK. �I think you're misinterpreting the point of that comment, which
>> may mean that it needs some clarification. �By "the two byte header
>> format is also used", I think I really meant "the header (and in fact
>> the entire value) is just 2 bytes". �Really, the low order bits have
>> neither the old interpretation nor the new interpretation: they don't
>> have any interpretation at all - they're completely meaningless.
>> That's what the part after the word "but" was intended to clarify.
>> Every routine in numeric.c checks for NUMERIC_IS_NAN() and gives it
>> some special handling before doing anything else, so NUMERIC_WEIGHT()
>> and NUMERIC_DSCALE() are never called in that case.
>
> I would suggest the comment ought to read something like
>
> � � � �NaN values also use a two-byte header (in fact, the
> � � � �whole value is always only two bytes). �The low order bits of
> � � � �the header word are available to store dscale, though dscale
> � � � �is not currently used with NaNs.

I can do something along those lines, though I'm reluctant to mention
dscale specifically since we have no agreement that such a thing makes
sense, or is the only/best use for those bits. Did you look at the
patch to move the numeric stuff out of the .h file that I attached a
few emails back? If that looks OK, I can commit it and then redo the
rest of this along the lines we've discussed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 30 Jul 2010 01:13

Robert Haas <robertmhaas(a)gmail.com> writes:
> But, looking at it a bit more carefully, isn't the maximum-size logic
> for numeric rather bogus?

Perhaps, but I think you're confused on at least one point.
numeric(2,1) has to be able to hold 2 decimal digits, not 2
NumericDigits (which'd actually be 8 decimal digits given
the current code).

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

First | Prev | Next | Last
Pages: 1 2 3 4 5 6
Prev: SHOW TABLES
Next: [HACKERS] Listen/Notify in 9.0