Functional dependencies and GROUP BY [PgSql]

Prev: [HACKERS] Functional dependencies and GROUP BY
Next: [HACKERS] How to get permission to debug postgres?

From: Alex Hunsaker on 17 Jul 2010 00:29

On Fri, Jun 25, 2010 at 14:06, Peter Eisentraut <peter_e(a)gmx.net> wrote:
> Second version:

Hi!

Ive looked this over. Looks great! I have some nits about the
documentation and comments ( non issues like referencing primary keys
when it really means not null unique indexes :-P ), but on the whole
it works and looks good.

The only corner case I have run into is creating a view with what I
would call an implicit 'not null' constraint. Demonstration below:

create table nn (a int4 not null, b int4, unique (a));
select * from nn group by a; -- should this work? I think not?
a | b
---+---
(0 rows)

create view vv as select a, b from nn group by a;
select * from vv;
a | b
---+---
(0 rows)

=# alter table nn alter column a drop not null;
=# select * from nn group by a; -- ok, broken makes sense
ERROR: column "nn.b" must appear in the GROUP BY clause or be used in
an aggregate function
LINE 1: SELECT * from nn group by a;

=# select * from vv; -- yipes should be broken?
a | b
---+---
(0 rows)

Im thinking we should not allow the "select * from nn group by a;" to
work. Thoughts?

(FYI I do plan on doing some performance testing with large columns
later, any other requests?)

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Peter Eisentraut on 17 Jul 2010 06:15

On fre, 2010-07-16 at 22:29 -0600, Alex Hunsaker wrote:
> The only corner case I have run into is creating a view with what I
> would call an implicit 'not null' constraint. Demonstration below:
>
> create table nn (a int4 not null, b int4, unique (a));
> select * from nn group by a; -- should this work? I think not?

I believe I referred to this upsthread. There is another patch in the
commitfest about explicitly representing NOT NULL constraints in
pg_constraint. Then this case would create a dependency on those
constraints. So we need to get that other patch in first.

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Alex Hunsaker on 17 Jul 2010 13:13

On Sat, Jul 17, 2010 at 04:15, Peter Eisentraut <peter_e(a)gmx.net> wrote:
> On fre, 2010-07-16 at 22:29 -0600, Alex Hunsaker wrote:
>> The only corner case I have run into is creating a view with what I
>> would call an implicit 'not null' constraint. Demonstration below:
>>
>> create table nn (a int4 not null, b int4, unique (a));
>> select * from nn group by a; -- should this work? I think not?
>
> I believe I referred to this upsthread.

Aww, and here I thought I had just been diligent :). In other news
its really no surprise that your test with 1600 columns had little
effect. As it loops over the the indexes, then the index keys and
then the group by items right? So I would expect the more indexes you
had or group by items to slow it down. Not so much the number of
columns. Right?

Anyhow it sounds like I should try it on top of the other patch and
see if it works. I assume it might still need some fixups to work
with that other patch? Or do you expect it to just work?

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Alex Hunsaker on 18 Jul 2010 00:30

On Fri, Jul 16, 2010 at 22:29, Alex Hunsaker <badalex(a)gmail.com> wrote:
> (FYI I do plan on doing some performance testing with large columns
> later, any other requests?)

And here are the results. All tests are with an empty table with 1500
int4 columns. There is a unique non null index on the first column.

(non assert build)
A: select count(1) from (select * from test group by ...1500 columns...) as res;
B: select count(1) from (select * from test group by a_0) as res; --
a_0 has the not null unique index

CVS A: 360ms
PATCH A: 370ms
PATCH B: 60ms

1500 indexes (one per column, on the column):
CVS: A: 670ms
PATCH A: 850ms
PATCH B: 561ms

So it seems for tables with lots of columns the patch is faster, at
least when you omit all the columns from the group by. I suspect for
most "normal" (5-20 columns) usage it should be a wash.

(Stupid question) Does anyone know why HEAD is quite a bit slower when
there are lots off indexes? Do we end up looping and perhaps locking
them or something?

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Alex Hunsaker on 18 Jul 2010 00:40

On Sat, Jul 17, 2010 at 11:13, Alex Hunsaker <badalex(a)gmail.com> wrote:
> On Sat, Jul 17, 2010 at 04:15, Peter Eisentraut <peter_e(a)gmx.net> wrote:
>> On fre, 2010-07-16 at 22:29 -0600, Alex Hunsaker wrote:
>>> The only corner case I have run into is creating a view with what I
>>> would call an implicit 'not null' constraint. Demonstration below:
>>>
>>> create table nn (a int4 not null, b int4, unique (a));
>>> select * from nn group by a; -- should this work? I think not?
>>
>> I believe I referred to this upsthread.

Yeah, I went back and reread the thread and um... yep its right where
you posted patch 2. I think I read it, forgot about it and then it
bubbled up to my subconscious while testing :).

> Anyhow it sounds like I should try it on top of the other patch and
> see if it works. I assume it might still need some fixups to work
> with that other patch? Or do you expect it to just work?
[ referring to the not null pg_constraint patch ]

Probably no surprise to you, I tried it on top of the not null
pg_constraint patch and it did not work 'out of the box'. Mainly I
was curious if there was some magic going on that I did not see. In
any event do you think its worth adding a regression test for this?

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

First | Prev | Next | Last
Pages: 1 2 3 4 5 6
Prev: [HACKERS] Functional dependencies and GROUP BY
Next: [HACKERS] How to get permission to debug postgres?