Range types [PgSql]

Prev: Winflex
Next: [HACKERS] Fast or immediate shutdown

From: Tom Lane on 14 Dec 2009 11:25

Scott Bailey <artacus(a)comcast.net> writes:
> Because intervals (mathematical not SQL) can be open or closed at each
> end point we need to know what the next an previous value would be at
> the specified granularity. And while you can do some operations without
> knowing this, there are many you can't. For instance you could not tell
> whether two [] or () ranges were adjacent, or be able to coalesce an
> array of ranges.

This statement seems to me to demonstrate that you don't actually
understand the concept of open and closed ranges. It has nothing
whatsoever to do with assuming that the data type is discrete;
these concepts are perfectly well defined for the reals, for example.
What it is about is whether the inclusion conditions are "< bound"
or "<= bound".

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Nathan Boley on 14 Dec 2009 13:00

>> Because intervals (mathematical not SQL) can be open or closed at each
>> end point we need to know what the next an previous value would be at
>> the specified granularity. And while you can do some operations without
>> knowing this, there are many you can't. For instance you could not tell
>> whether two [] or () ranges were adjacent, or be able to coalesce an
>> array of ranges.
>
> This statement seems to me to demonstrate that you don't actually
> understand the concept of open and closed ranges. It has nothing
> whatsoever to do with assuming that the data type is discrete;
> these concepts are perfectly well defined for the reals, for example.
> What it is about is whether the inclusion conditions are "< bound"
> or "<= bound".

IMHO the first question is whether, for integers, [1,2] UNION [3,5]
should be equal to [1,5]. In math this is certainly true, and defining
'next' seems like a reasonable way to establish this in postgres.

The next question is whether, for floats, [1,3-FLT_EPSILON] UNION
[3,5] should be [1,5].

And the next question is whether, for numeric(6,2), [1,2.99] UNION
[3,5] should be [1,5].

FWIW, I would answer yes, no, yes to those three questions.

-Nathan

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Jeff Davis on 14 Dec 2009 13:02

On Mon, 2009-12-14 at 11:25 -0500, Tom Lane wrote:
> Scott Bailey <artacus(a)comcast.net> writes:
> > Because intervals (mathematical not SQL) can be open or closed at each
> > end point we need to know what the next an previous value would be at
> > the specified granularity. And while you can do some operations without
> > knowing this, there are many you can't. For instance you could not tell
> > whether two [] or () ranges were adjacent, or be able to coalesce an
> > array of ranges.
>
> This statement seems to me to demonstrate that you don't actually
> understand the concept of open and closed ranges. It has nothing
> whatsoever to do with assuming that the data type is discrete;
> these concepts are perfectly well defined for the reals, for example.
> What it is about is whether the inclusion conditions are "< bound"
> or "<= bound".

Of course you can still define the obvious "contains" and "overlaps"
operators for a continuous range. But there are specific differences in
the API between a discrete range and a continuous range, which is what
Scott was talking about.

1. With discrete ranges, you can change the displayed
inclusivity/exclusivity without changing the value. For instance in the
integer domain, [ 5, 10 ] is the same value as [ 5, 11 ). This works on
both input and output. It is not possible to change the display for
continuous ranges.

2. With discrete ranges, you can get the last point before the range,
the first point in the range, the last point in the range, and the first
point after the range. These are more than enough to describe the range
completely. For continuous ranges, those functions will fail depending
on the inclusivity/exclusivity of the range. Practically speaking, you
would want to have a different set of accessors: start_inclusive(),
start_point(), end_point(), and end_inclusive(). However, those
functions can't be usefully defined on a discrete range.

We can't choose the API for continuous ranges as the API for discrete
ranges as well. If we did, then we would think that [5, 10] and [5, 11)
were not equal, but they are. Similarly, we would think that [5, 10] and
[11, 12] were not adjacent, but they are.

So there are certainly some user-visible API differences between the
two, and I don't think those differences can be hidden.

Regards,
Jeff Davis

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Jeff Davis on 14 Dec 2009 13:14

On Mon, 2009-12-14 at 09:58 -0500, Tom Lane wrote:
> In particular, the granularity examples you give seem to assume that
> the underlying datatype is exact not approximate --- which among other
> things will mean that it fails to work for float timestamps. Since
> timestamps are supposedly the main use-case, that's pretty troubling.

Additionally, granularity for timestamps is not particularly useful when
you get to things like "days" and "months" which don't have a clean
algebra.

Is the granule there only to try to support continuous ranges? If so, I
don't think it's necessary if we expose the API differences I outlined
in another email in this thread. Also, that would mean that we don't
need a granule for float, because we can already treat it as discrete*.

Regards,
Jeff Davis

*: nextafter() allows you to increment or decrement a double (loosely
speaking), and according to the man page it's part of C99 and
POSIX.1-2001.

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Tom Lane on 14 Dec 2009 13:21

Nathan Boley <npboley(a)gmail.com> writes:
>> This statement seems to me to demonstrate that you don't actually
>> understand the concept of open and closed ranges.

> IMHO the first question is whether, for integers, [1,2] UNION [3,5]
> should be equal to [1,5]. In math this is certainly true, and defining
> 'next' seems like a reasonable way to establish this in postgres.

Well, that's nice to have (not essential) for data types that actually
are discrete. It's not a sufficient argument for creating a definition
that is flat out broken for non-discrete datatypes.

It might be worth pointing out here that the concept of an open interval
was only invented in the first place for dealing with a continuum.
If you could assume the underlying set is discrete, every open interval
could be replaced with a closed one, using the next or prior value as
the bound instead. There would be no need for two kinds of interval.

If you are intending to support UNION on intervals, you are going to
need some more-general representation anyway. (I trust you don't think
we will accept a datatype for which [1,2] UNION [3,5] works but
[1,2] UNION [4,5] throws an error.) So whether the code can reduce the
result of a union to a single range or has to keep it as two ranges is
only an internal optimization issue anyhow, not something that should
drive an artificial (and infeasible) attempt to force every domain to be
discrete.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8 9 10 11 12 13
Prev: Winflex
Next: [HACKERS] Fast or immediate shutdown