WIP: generalized index constraints [PgSql]

Prev: corrupted double-linked list
Next: Timestamp to time_t

From: Tom Lane on 19 Sep 2009 14:05

Jeff Davis <pgsql(a)j-davis.com> writes:
> There's an important unresolved question with this patch that I need to
> address, which just came to light: what about functional/expression
> indexes?

What about them? It's not clear why you think this requires anything
special.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Jeff Davis on 19 Sep 2009 14:51

On Sat, 2009-09-19 at 14:05 -0400, Tom Lane wrote:
> What about them? It's not clear why you think this requires anything
> special.

>From a syntax standpoint, I need to represent one operator for every
index column involved in the constraint. So, if there's a functional
index on ((a||b)::circle), I clearly can't have an exclusion constraint
like (a =, b =).

I see two options:

1. (<expr> <op>), where <expr> is an expression over table attributes
that must have the exact signature as the expression for the index.
2. (<index_col> <op>), and then read the expression from the index

and in either case, use that expression for the extra checking that I
need to do: I need to check whether the input heap tuple conflicts with
concurrently inserting heap tuples, and I also need to do a recheck
step.

#1 seems like extra work and complexity, because I need to test for the
correct signature (maybe that's not difficult), and that extra
flexibility is pretty marginal -- I can't think of an obvious case where
you'd want different expressions. Also, it complicates the simple case
of wanting the expressions to match.

#2 is awkward because the expression columns of an index have generated
names, and you would have to write things like (pg_expression_1 &&).
Also, it makes the constraint too tied to the index, which is a valid
complaint Peter had.

Perhaps you can point me in the right direction to see if two
expressions/functions have matching signatures? Or, if that is too much
of a pain, perhaps I should just test for equal expressions?

Regards,
Jeff Davis

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 19 Sep 2009 15:26

Jeff Davis <pgsql(a)j-davis.com> writes:
> On Sat, 2009-09-19 at 14:05 -0400, Tom Lane wrote:
>> What about them? It's not clear why you think this requires anything
>> special.

>> From a syntax standpoint, I need to represent one operator for every
> index column involved in the constraint. So, if there's a functional
> index on ((a||b)::circle), I clearly can't have an exclusion constraint
> like (a =, b =).

> I see two options:

> 1. (<expr> <op>), where <expr> is an expression over table attributes
> that must have the exact signature as the expression for the index.
> 2. (<index_col> <op>), and then read the expression from the index

You need to do (1), I think, because (2) seems to require using the
index column name. We have generally felt that the names assigned
to index columns are implementation artifacts that the user ought not
rely on in SQL commands.

> and in either case, use that expression for the extra checking that I
> need to do: I need to check whether the input heap tuple conflicts with
> concurrently inserting heap tuples, and I also need to do a recheck
> step.

I haven't read the patch, but this whole discussion sounds to me like
it means you're trying to plug things in at the wrong level. Indexes
generally don't care where the values they are storing came from ---
whether it's a simple column or a expression result, it's all the same
to the index. I don't see why that shouldn't be true for exclusion
constraints too.

BTW, further betraying that I've not read the patch: what exactly are
you doing about the information_schema views? If we are treating these
things as SQL constraints, one would expect them to show up in
information_schema; but I don't see how to represent them there in any
adequate fashion, even without the expression-index angle. On the whole
I think we'd be a lot better off to NOT consider them to be constraints,
but just another option for CREATE INDEX.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Jeff Davis on 19 Sep 2009 16:22

On Sat, 2009-09-19 at 15:26 -0400, Tom Lane wrote:
> I haven't read the patch, but this whole discussion sounds to me like
> it means you're trying to plug things in at the wrong level. Indexes
> generally don't care where the values they are storing came from ---
> whether it's a simple column or a expression result, it's all the same
> to the index. I don't see why that shouldn't be true for exclusion
> constraints too.

The design is that one backend needs to be able to see values being
inserted by other backends before commit. There are two ways I can see
to do this:

(a) have all concurrent inserters serialize doing something like:
1. acquire exclusive LWLock
2. search index for conflicts with dirty snapshot and recheck if
necessary
3. insert into index
4. release exclusive LWLock

(b) do what I do now, which is to:
1. acquire exlusive LWLock
2. put self in table of concurrent inserters, along with TID of heap
tuple I'm inserting
3. release exclusive LWLock
4. acquire shared LWLock
5. copy potential conflicts to local memory
6. release shared LWLock
7. test for real conflicts between my heap tuple and the potentially
conflicting heap tuple (which can be found by TID).
8. search index with dirty snapshot for conflicts and recheck if
necessary
9. insert tuple into index
10. acquire exclusive LWLock
11. remove self from table of concurrent inserters
12. release exclusive LWLock

Design (b) offers better concurrency because all conflict testing, index
searching, and index insertion take place without a lock at all. So, I
chose design (b). This has been out there for quite a long time[1][2],
and if it is an unacceptable design I need to know soon in order for
this feature to make it.

However, the consequence of (b) is that ExecInsertIndexTuples needs to
know about the translation from a heap tuple to an index tuple so that
the conflicts can be checked.

> BTW, further betraying that I've not read the patch: what exactly are
> you doing about the information_schema views? If we are treating these
> things as SQL constraints, one would expect them to show up in
> information_schema; but I don't see how to represent them there in any
> adequate fashion, even without the expression-index angle.

Nothing right now. I think they should just be omitted from
information_schema, which can only (almost by definition) represent the
lowest common denominator features.

> On the whole
> I think we'd be a lot better off to NOT consider them to be constraints,
> but just another option for CREATE INDEX.

You suggested allowing an ALTER TABLE representation[3] and that design
has floated around for quite some time as well.

ALTER TABLE also has a major advantage: multiple constraints can use the
same index. For instance, an index on (a, b, c) can be used to enforce
both (a =, b =) and (a =, c =). You can't do that with btree, and it
could be a powerful feature that might cause some people to choose my
mechanism for a regular UNIQUE constraint over btree's existing
uniqueness mechanism.

So, I actually switched over the ALTER TABLE as my primary syntactic
representation, and dropped the CREATE INDEX variant (I think that would
be worthwhile to bring back as an extra option, but I haven't yet). If I
need to drop ALTER TABLE, I need to know soon.

Regards,
Jeff Davis

[1] http://archives.postgresql.org/pgsql-hackers/2008-06/msg00404.php
[2] http://archives.postgresql.org/pgsql-hackers/2009-07/msg00302.php
[3] http://archives.postgresql.org/pgsql-hackers/2009-07/msg00406.php

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 19 Sep 2009 16:43

Jeff Davis <pgsql(a)j-davis.com> writes:
> The design is that one backend needs to be able to see values being
> inserted by other backends before commit.

I don't understand why this isn't handled exactly the way unique
constraints are done now. Frankly, the amount of added complexity you
propose below is enough to make me want to reject the patch forthwith;
given that it's going to be a relatively little-used feature, the bugs
are never going to be out of it completely if we do it like this.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10 11 12
Prev: corrupted double-linked list
Next: Timestamp to time_t