A thought on Index Organized Tables [PgSql]

Prev: [COMMITTERS] pgsql: Oops, don't forget to rewind the directory before scanning it to
Next: Time travel on the buildfarm

From: Gokulakannan Somasundaram on 26 Feb 2010 01:19

>
>
> Proving that a set of comparison operators are consistent just by
> examining their runtime behavior is probably equivalent to solving the
> halting problem. I can't see us doing it, or wanting to accept the
> overhead of checking it even if it could be done.
>

The overhead of checking is very minimal. When we update, we have to just
carry the tuple id of the heaptuple and insert transaction id. We check
whether they are same with the index snapshot. If it is not same, then we
will go ahead and start treating this index as either dropped / as a normal
index ( without snapshot ). Since the overhead of dropping / marking it as
normal index will occur very rarely, we need not be concerned about that
performance impact ( i suppose). The overhead of checking is going to be
there only on suspicious user defined functions. ( We can have a flag for
is_suspicious )

>
> To be a bit more concrete: the typical sort of failure that you could
> get from broken btree operators is failure of transitivity, that is
> the comparators report A < B and B < C for some A, B, C, but do not say
> that A < C when those two values are compared directly. I don't see any
> convenient way to detect that as a byproduct of normal index operations,
> because you wouldn't typically have a reason to make all three
> comparisons in close proximity. Indeed, the searching and sorting
> algorithms do their best to avoid making "redundant" comparisons of that
> kind.
>
> I am not saying that we should do analysis of runtime behavior. I am saying
that, we would provide a set of built-in functions which will be always
stable (with some flag in pg_proc) . We will scan the provided function for
any functions that are not in the stable set provided, when it gets created.
Now if the function has one such function, then it is declared as
suspicious to be broken/volatile.

Thanks for the reply,
Gokul.

From: Gokulakannan Somasundaram on 26 Feb 2010 03:36

>
> To be a bit more concrete: the typical sort of failure that you could
> get from broken btree operators is failure of transitivity, that is
> the comparators report A < B and B < C for some A, B, C, but do not say
> that A < C when those two values are compared directly. I don't see any
> convenient way to detect that as a byproduct of normal index operations,
> because you wouldn't typically have a reason to make all three
> comparisons in close proximity. Indeed, the searching and sorting
> algorithms do their best to avoid making "redundant" comparisons of that
> kind.
>

This is interesting Tom, but i am unable to understand, why it won't affect
the current indexes. While insertion it might get inserted in a block and
offset, and while searching it might either return no results / show a wrong
place. Because ordering is required for searching also right? I definitely
feel, i am missing something here.

Gokul.

From: Karl Schnaitter on 26 Feb 2010 04:27

On Fri, Feb 26, 2010 at 12:36 AM, Gokulakannan Somasundaram <
gokul007(a)gmail.com> wrote:

> To be a bit more concrete: the typical sort of failure that you could
>> get from broken btree operators is failure of transitivity, that is
>> the comparators report A < B and B < C for some A, B, C, but do not say
>> that A < C when those two values are compared directly. I don't see any
>> convenient way to detect that as a byproduct of normal index operations,
>> because you wouldn't typically have a reason to make all three
>> comparisons in close proximity. Indeed, the searching and sorting
>> algorithms do their best to avoid making "redundant" comparisons of that
>> kind.
>>
>
> This is interesting Tom, but i am unable to understand, why it won't affect
> the current indexes. While insertion it might get inserted in a block and
> offset, and while searching it might either return no results / show a wrong
> place. Because ordering is required for searching also right? I definitely
> feel, i am missing something here.
>

It definitely affects current indexes. We can't completely avoid bad user
functions. That is why it is important to put limits on how much damage they
can do. That's the motivation for the idea I mentioned before, of
double-checking visibility data in an IndexTuple before letting it survive a
VACUUM.

From: Greg Stark on 26 Feb 2010 07:54

On Fri, Feb 26, 2010 at 4:47 AM, Tom Lane <tgl(a)sss.pgh.pa.us> wrote:
>> I feel the other one is easy. To store the hint bits inside the ItemId, in
>> the place of size.
>
> No, we're not going there. That breaks the fundamental page content
> manipulation algorithms, and falls down for tuples not yet stored in a
> page (or being examined without a pointer to the page readily at hand),
> and has no redeeming social value anyway compared to doing it in the
> proven fashion.

Well we were already talking about moving the hint bits to someplace
else to enable CRC checking. My favourite place was the line pointer,
but you wanted a separate area -- either of which would have these
problems.

But this is all irrelevant to the particular issue at hand. The bigger
point is that you've chosen a change that requires massive changes to
all different parts of the system and causes problems for all
different situations. You might be able to come up with solutions for
some of them but I bet there are some you realize later are insoluble.
And a lot of the solutions themselves have problems or impose
limitations that we won't be able to live with.

Much better to take on a "simple" project like enabling full
sequential index scans which you claimed you had a solution for and
which is in any case an important sub-problem for IOT.

--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Gokulakannan Somasundaram on 26 Feb 2010 07:57

>
> It definitely affects current indexes. We can't completely avoid bad user
> functions. That is why it is important to put limits on how much damage they
> can do. That's the motivation for the idea I mentioned before, of
> double-checking visibility data in an IndexTuple before letting it survive a
> VACUUM.
>

No i don't say it would affect Vacuum, but i am suspecting that it would
affect Index based select. Since Vacuum uses a sequential scan of tuples, it
doesn't require the ordering operator, but any index based search would
require a ordering operator for binary search and for comparing with the
right most key.

Thanks,
Gokul.

First | Prev | Next | Last
Pages: 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
Prev: [COMMITTERS] pgsql: Oops, don't forget to rewind the directory before scanning it to
Next: Time travel on the buildfarm