reducing NUMERIC size for 9.1 [PgSql]


From: Brendan Jurd on 15 Jul 2010 12:58

On 10 July 2010 00:58, Robert Haas <robertmhaas(a)gmail.com> wrote:
> EnterpriseDB asked me to develop the attached patch to reduce the
> on-disk size of numeric and to submit it for inclusion in PG 9.1.
> After searching the archives, I found a possible design for this by
> Tom Lane based on an earlier proposal by Simon Riggs.

Hi Robert,

I'm reviewing this patch for the commitfest, and so far everything in
the patch looks good. Compile and regression tests worked fine.

However, I was trying to find a simple way to verify that it really
was reducing the on-disk size of compact numeric values and didn't get
the results I was expecting.

I dropped one thousand numerics with value zero into a table and
checked the on-disk size of the relation with your patch and on a
stock 8.4 instance. In both cases the result was exactly the same.

Shouldn't the table be smaller with your patch? Or is there something
wrong with my test?

CREATE TEMP TABLE numeric_short (a numeric);

INSERT INTO numeric_short (a)
SELECT 0::numeric FROM generate_series(1, 1000) i;

Regards,
BJ

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Robert Haas on 15 Jul 2010 13:47

On Jul 15, 2010, at 11:58 AM, Brendan Jurd <direvus(a)gmail.com> wrote:
> On 10 July 2010 00:58, Robert Haas <robertmhaas(a)gmail.com> wrote:
>> EnterpriseDB asked me to develop the attached patch to reduce the
>> on-disk size of numeric and to submit it for inclusion in PG 9.1.
>> After searching the archives, I found a possible design for this by
>> Tom Lane based on an earlier proposal by Simon Riggs.
>
> Hi Robert,
>
> I'm reviewing this patch for the commitfest, and so far everything in
> the patch looks good. Compile and regression tests worked fine.
>
> However, I was trying to find a simple way to verify that it really
> was reducing the on-disk size of compact numeric values and didn't get
> the results I was expecting.
>
> I dropped one thousand numerics with value zero into a table and
> checked the on-disk size of the relation with your patch and on a
> stock 8.4 instance. In both cases the result was exactly the same.
>
> Shouldn't the table be smaller with your patch? Or is there something
> wrong with my test?
>
> CREATE TEMP TABLE numeric_short (a numeric);
>
> INSERT INTO numeric_short (a)
> SELECT 0::numeric FROM generate_series(1, 1000) i;

Well, on that test, you'll save only 2000 bytes, which is less than a full block, so there's no guarantee the difference would be noticeable at the relation level. Scale it up by a factor of 10 and the difference should be measurable.

You might also look at testing with pg_column_size().
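
The block-size argument above can be checked with a little arithmetic. This is only a rough sketch: the 8192-byte block size is an assumption (the PostgreSQL default, not stated in the thread), and the 2 bytes per value is the figure implied by "2000 bytes" for 1000 rows. It ignores alignment padding, which Richard Huxton raises elsewhere in this thread.

```python
BLOCK_SIZE = 8192          # assumed default block size; not stated in the thread
BYTES_SAVED_PER_VALUE = 2  # implied by "2000 bytes" saved across 1000 rows


def naive_saving(rows):
    """Bytes saved if every row simply shrank by BYTES_SAVED_PER_VALUE."""
    return rows * BYTES_SAVED_PER_VALUE


for rows in (1_000, 10_000):
    saved = naive_saving(rows)
    # Print the saving in bytes and as a fraction of one block.
    print(rows, saved, round(saved / BLOCK_SIZE, 2))
```

At 1000 rows the naive saving is under a quarter of one block; at 10,000 rows it is a little over two blocks, which is why a larger test was suggested.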

....Robert

From: Thom Brown on 16 Jul 2010 09:17

On 16 July 2010 14:14, Brendan Jurd <direvus(a)gmail.com> wrote:
> On 16 July 2010 22:51, Richard Huxton <dev(a)archonet.com> wrote:
>> On 16/07/10 13:44, Brendan Jurd wrote:
>>> At this scale we should be seeing around 2 million bytes saved, but
>>> instead the tables are identical. Is there some kind of disconnect in
>>> how the new short numeric is making it to the disk, or perhaps another
>>> effect interfering with my test?
>>
>> You've probably got rows being aligned to a 4-byte boundary. You're probably
>> not going to see any change unless you have a couple of 1-byte columns that
>> get placed after the numeric. If you went from 10 bytes down to 8, that
>> should be visible.
>
> Ah, thanks for the hint Richard. I didn't see any change with two
> 1-byte columns after the numeric, but with four such columns I did
> finally see a difference.
>
> Test script:
>
> BEGIN;
>
> CREATE TEMP TABLE foo (a numeric, b bool, c bool, d bool, e bool);
>
> INSERT INTO foo (a, b, c, d, e)
> SELECT 0::numeric, false, true, i % 2 = 0, i % 2 = 1
> FROM generate_series(1, 1000000) i;
>
> SELECT pg_total_relation_size('foo'::regclass);
>
> ROLLBACK;
>
> Results:
>
> 8.4: 44326912
> HEAD with patch: 36290560
>
> That settles my concern and I'm happy to pass this along to a committer.
>
> Cheers,
> BJ
>

Joy! :) Nice patch Robert.

Thom


From: Brendan Jurd on 16 Jul 2010 08:44

On 16 July 2010 03:47, Robert Haas <robertmhaas(a)gmail.com> wrote:
> On Jul 15, 2010, at 11:58 AM, Brendan Jurd <direvus(a)gmail.com> wrote:
>> I dropped one thousand numerics with value zero into a table and
>> checked the on-disk size of the relation with your patch and on a
>> stock 8.4 instance. In both cases the result was exactly the same.
>>
>> Shouldn't the table be smaller with your patch? Or is there something
>> wrong with my test?
>
> Well, on that test, you'll save only 2000 bytes, which is less than a full block, so there's no guarantee the difference would be noticeable at the relation level. Scale it up by a factor of 10 and the difference should be measurable.
>
> You might also look at testing with pg_column_size().
>

pg_column_size() did return the results I was expecting.
pg_column_size(0::numeric) is 8 bytes on 8.4 and it's 6 bytes on HEAD
with your patch.

However, even with 1 million rows of 0::numeric in my test table,
there was no difference at all in the on-disk relation size (36290560
with 36249600 in the table and 32768 in the fsm).

At this scale we should be seeing around 2 million bytes saved, but
instead the tables are identical. Is there some kind of disconnect in
how the new short numeric is making it to the disk, or perhaps another
effect interfering with my test?

Cheers,
BJ


From: Richard Huxton on 16 Jul 2010 08:51

On 16/07/10 13:44, Brendan Jurd wrote:
>
> pg_column_size() did return the results I was expecting.
> pg_column_size(0::numeric) is 8 bytes on 8.4 and it's 6 bytes on HEAD
> with your patch.

> At this scale we should be seeing around 2 million bytes saved, but
> instead the tables are identical. Is there some kind of disconnect in
> how the new short numeric is making it to the disk, or perhaps another
> effect interfering with my test?

You've probably got rows being aligned to a 4-byte boundary. You're
probably not going to see any change unless you have a couple of 1-byte
columns that get placed after the numeric. If you went from 10 bytes
down to 8, that should be visible.
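
The alignment effect described above can be modelled with a few lines of arithmetic. This is a hedged sketch, not the server's actual code: the layout constants (8192-byte block, 24-byte page header, 4-byte line pointer, 23-byte tuple header) are assumptions for a typical 64-bit build and do not come from the thread. The boundary is taken as 8 bytes rather than the 4 guessed above, because that is the value that fits the sizes reported in this thread. The stored numeric sizes (5 and 3 bytes) are also an assumption: the reported pg_column_size() values of 8 and 6, with the 4-byte length header swapped for a 1-byte one.

```python
# Layout constants: assumptions for a typical 64-bit build, not from the thread.
BLOCK_SIZE = 8192    # default block size
PAGE_HEADER = 24     # bytes of page header
LINE_POINTER = 4     # per-row item pointer in the page
TUPLE_HEADER = 23    # heap tuple header before padding
MAXALIGN = 8         # assumed alignment boundary


def align(n, boundary=MAXALIGN):
    """Round n up to the next multiple of boundary."""
    return (n + boundary - 1) // boundary * boundary


def heap_bytes(rows, data_bytes):
    """Estimate the heap size for `rows` identical tuples of `data_bytes` user data."""
    tuple_size = align(align(TUPLE_HEADER) + data_bytes)
    rows_per_page = (BLOCK_SIZE - PAGE_HEADER) // (tuple_size + LINE_POINTER)
    pages = -(-rows // rows_per_page)  # ceiling division
    return pages * BLOCK_SIZE


# Assumed stored size of a zero numeric: 5 bytes on 8.4, 3 bytes with the patch.
NUMERIC_84, NUMERIC_PATCHED = 5, 3

for bools in (0, 2, 4):  # number of trailing 1-byte bool columns
    print(bools,
          heap_bytes(1_000_000, NUMERIC_84 + bools),
          heap_bytes(1_000_000, NUMERIC_PATCHED + bools))
```

Under these assumptions the model gives 36249600 bytes for the single-column table both with and without the patch, which matches the table size reported earlier in the thread. It also predicts no change with two bool columns and a drop from 44285952 to 36249600 bytes with four. Adding 40960 bytes for the other relation forks (the reported 32768-byte FSM plus one more 8192-byte page, presumably the visibility map) reproduces the totals of 44326912 and 36290560 reported for the four-bool test.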

--
Richard Huxton
Archonet Ltd

