From: Bernd Helmle on


--On 27. Januar 2010 15:42:45 -0500 Robert Haas <robertmhaas(a)gmail.com>
wrote:

>
> Bernd (or anyone), feel free to take a look in parallel. More eyes
> would be helpful...

I've planned to look at this tomorrow when i'm back in office.

--
Thanks

Bernd

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: KaiGai Kohei on
(2010/01/28 5:42), Robert Haas wrote:
> On Wed, Jan 27, 2010 at 10:17 AM, KaiGai Kohei<kaigai(a)kaigai.gr.jp> wrote:
>> (2010/01/27 23:29), Robert Haas wrote:
>>>
>>> 2010/1/27 KaiGai Kohei<kaigai(a)ak.jp.nec.com>:
>>>>
>>>> The attached patch is revised one based on the V3 approach.
>>>> The only difference from V3 is that it also applies checks on the
>>>> AT_AlterColumnType option, not only renameatt().
>>>
>>> I think I was clear about what the next step was for this patch in my
>>> previous email, but let me try again.
>>>
>>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg02407.php
>>>
>>> See also Tom's comments here:
>>>
>>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg00110.php
>>>
>>> I don't believe that either Tom or I are prepared to commit a patch
>>> based on this approach, at least not unless someone makes an attempt
>>> to do it the other way and finds an even more serious problem. If
>>> you're not interested in rewriting the patch along the lines Tom
>>> suggested, then we should just mark this as Returned with Feedback and
>>> move on.
>>
>> The V3/V5 patch was the rewritten one based on the Tom's comment, as is.
>> It counts the expected inhcount at the first find_all_inheritors() time
>> at once, and it compares the pg_attribute.attinhcount.
>> (In actually, find_all_inheritors() does not have a capability to count
>> the number of merged from a common origin, so I newly defined the
>> find_all_inheritors_with_inhcount().)
>>
>> Am I missing something?
>
> Err... I'm not sure. I thought I understood what the different
> versions of this patch were doing, but apparently I'm all confused.
> I'll take another look at this.

Just to be safe, I'd like to introduce the differences between revisions.

* The V2 patch
http://archives.postgresql.org/message-id/4B3F6353.3080105(a)kaigai.gr.jp

It just checked whether the pg_attribute.inhcount is larger than 1, or not.
But it was problematic when diamond-inheritance case.

* The V3 patch
http://archives.postgresql.org/message-id/4B41BB04.2070609(a)ak.jp.nec.com

It computed an expected-inhcount for each relations when renameatt()
picks up all the child relations on the top of recursion level.
Its cost to walk on the inheritance tree is similar to existing code
except for a case when we have many diamond-inheritance, because
find_all_inheritors() does not need to walk on the child relations
that was already checked, but find_all_inheritors_with_inhcount()
has to walk on these children to compute multiplicity.

See the following example,

T2
/ \
T1 T4 - T5
\ /
T3

In this case, find_all_inheritors() encounter T4 with T1-T2 path and
T1-T3 path. But it does not need to scan T5 on the second time,
because it already chains OID of the T4 and T5 with the first walks,
and list_concat_unique_oid() does not add duplicated OIDs anymore.

But find_all_inheritors_with_inhcount() needs to walks on the T4-T5 path
to compute the number of times being inherited, even if second walks.
Your testcase emphasized this difference so much. But, in my opinion,
it is quite natural because the existing code does not apply necessary
checks.

* The V4 patch
http://archives.postgresql.org/message-id/4B4EC1F1.30108(a)ak.jp.nec.com

I found out ALTER COLUMN TYPE also has same issue.
But ATPrepAlterColumnType() did recursive scan using ATSimpleRecursion(),
so it needs to modify the logic in this function. At that time, I preferred
to compute an expected inhcount for each recursion level, rather than
modifying the logic in ATPrepAlterColumnType(), although its cost was
larger than find_all_inheritors_with_inhcount().

* The V5 patch
http://archives.postgresql.org/message-id/4B5FFE4B.1060505(a)ak.jp.nec.com

We can find out a performance matter in the V4 patch, so I reverted the
base logic into V3 approach. In addition to the reverting, I also modified
the ATPrepAlterColumnType() to check whether the column to be altered
is inherited from multiple origin, or not.

> Bernd (or anyone), feel free to take a look in parallel. More eyes
> would be helpful...
>
> ...Robert
>
--
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai(a)ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: KaiGai Kohei on
(2010/01/28 6:58), Robert Haas wrote:
> On Wed, Jan 27, 2010 at 3:42 PM, Robert Haas<robertmhaas(a)gmail.com> wrote:
>> On Wed, Jan 27, 2010 at 10:17 AM, KaiGai Kohei<kaigai(a)kaigai.gr.jp> wrote:
>>> (2010/01/27 23:29), Robert Haas wrote:
>>>>
>>>> 2010/1/27 KaiGai Kohei<kaigai(a)ak.jp.nec.com>:
>>>>>
>>>>> The attached patch is revised one based on the V3 approach.
>>>>> The only difference from V3 is that it also applies checks on the
>>>>> AT_AlterColumnType option, not only renameatt().
>>>>
>>>> I think I was clear about what the next step was for this patch in my
>>>> previous email, but let me try again.
>>>>
>>>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg02407.php
>>>>
>>>> See also Tom's comments here:
>>>>
>>>> http://archives.postgresql.org/pgsql-hackers/2010-01/msg00110.php
>>>>
>>>> I don't believe that either Tom or I are prepared to commit a patch
>>>> based on this approach, at least not unless someone makes an attempt
>>>> to do it the other way and finds an even more serious problem. If
>>>> you're not interested in rewriting the patch along the lines Tom
>>>> suggested, then we should just mark this as Returned with Feedback and
>>>> move on.
>>>
>>> The V3/V5 patch was the rewritten one based on the Tom's comment, as is.
>>> It counts the expected inhcount at the first find_all_inheritors() time
>>> at once, and it compares the pg_attribute.attinhcount.
>>> (In actually, find_all_inheritors() does not have a capability to count
>>> the number of merged from a common origin, so I newly defined the
>>> find_all_inheritors_with_inhcount().)
>>>
>>> Am I missing something?
>>
>> Err... I'm not sure. I thought I understood what the different
>> versions of this patch were doing, but apparently I'm all confused.
>> I'll take another look at this.
>
> OK, I took a look at this. It's busted:
>
> test=# create table top (a integer);
> CREATE TABLE
> test=# create table upper1 () inherits (top);
> CREATE TABLE
> test=# create table upper2 () inherits (top);
> CREATE TABLE
> test=# create table lower1 () inherits (upper1, upper2);
> NOTICE: merging multiple inherited definitions of column "a"
> CREATE TABLE
> test=# create table lower2 () inherits (upper1, upper2);
> NOTICE: merging multiple inherited definitions of column "a"
> CREATE TABLE
> test=# create table spoiler (a integer);
> CREATE TABLE
> test=# create table bottom () inherits (lower1, lower2, spoiler);
> NOTICE: merging multiple inherited definitions of column "a"
> NOTICE: merging multiple inherited definitions of column "a"
> CREATE TABLE
> test=# alter table top rename a to b;
> ALTER TABLE
> test=# select * from spoiler;
> ERROR: could not find inherited attribute "a" of relation "bottom"

Hmm, indeed, this logic (V3/V5) is busted.

The idea of V4 patch can also handle this case correctly, although it
is lesser in performance.
I wonder whether it is really unacceptable cost in performance, or not.
Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations,
and I don't think this bugfix will damage to the reputation of PostgreSQL.

Where should we go on the next?

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai(a)ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: KaiGai Kohei on
(2010/01/29 0:46), Robert Haas wrote:
> 2010/1/27 KaiGai Kohei<kaigai(a)ak.jp.nec.com>:
>> Hmm, indeed, this logic (V3/V5) is busted.
>>
>> The idea of V4 patch can also handle this case correctly, although it
>> is lesser in performance.
>> I wonder whether it is really unacceptable cost in performance, or not.
>> Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations,
>> and I don't think this bugfix will damage to the reputation of PostgreSQL.
>>
>> Where should we go on the next?
>
> Isn't the problem here just that the following comment is 100% wrong?
>
> /*
> * Unlike find_all_inheritors(), we need to walk on
> child relations
> * that have diamond inheritance tree, because this
> function has to
> * return correct expected inhecount to the caller.
> */
>
> It seems to me that the right solution here is to just add one more
> argument to find_all_inheritors(), something like List
> **expected_inh_count.
>
> Am I missing something?

The find_all_inheritors() does not walk on child relations more than
two times, even if a child has multiple parents inherited from common
origin, because list_concat_unique_oid() ignores the given OID if it
is already on the list. It means all the child relations under the
relation already walked on does not checked anywhere. (Of course,
this assumption is correct for the purpose of find_all_inheritors()
with minimum cost.)

What we want to do here is to compute the number of times a certain
child relation is inherited from a common origin; it shall be the
expected-inhcount. So, we need an arrangement to the logic.

For example, see the following diagram.

T2
/ \
T1 T4---T5
\ /
T3

If we call find_all_inheritors() with T1. The find_inheritance_children()
returns T2 and T3 for T1.
Then, it calls find_inheritance_children() for T2, and it returns T4.
Then, it calls find_inheritance_children() for T3, and it returns T4, but
it is already in the "rels_list", so list_concat_unique_oid() ignores it.
Then, it calls find_inheritance_children() for T4, and it returns T5.

In this example, we want the expected inhcount for T2 and T3 should be 1,
for T4 and T5 should be 2. However, it walks on T4 and T5 only once, so
they will have 1 incorrectly.
Even if we count up the ignored OID (T4), find_all_inheritors() does not
walk on T5, because it is already walked on obviously when T4 is ignored.

However, my V3/V5 logic is also busted when a certain relation is inherited
from a relation which has multiple parents.

Right now, we have only the V4 logic which works correctly....

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai(a)ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: KaiGai Kohei on
(2010/01/29 9:29), Robert Haas wrote:
> 2010/1/28 KaiGai Kohei<kaigai(a)ak.jp.nec.com>:
>> (2010/01/29 0:46), Robert Haas wrote:
>>> 2010/1/27 KaiGai Kohei<kaigai(a)ak.jp.nec.com>:
>>>> Hmm, indeed, this logic (V3/V5) is busted.
>>>>
>>>> The idea of V4 patch can also handle this case correctly, although it
>>>> is lesser in performance.
>>>> I wonder whether it is really unacceptable cost in performance, or not.
>>>> Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations,
>>>> and I don't think this bugfix will damage to the reputation of PostgreSQL.
>>>>
>>>> Where should we go on the next?
>>>
>>> Isn't the problem here just that the following comment is 100% wrong?
>>>
>>> /*
>>> * Unlike find_all_inheritors(), we need to walk on
>>> child relations
>>> * that have diamond inheritance tree, because this
>>> function has to
>>> * return correct expected inhecount to the caller.
>>> */
>>>
>>> It seems to me that the right solution here is to just add one more
>>> argument to find_all_inheritors(), something like List
>>> **expected_inh_count.
>>>
>>> Am I missing something?
>>
>> The find_all_inheritors() does not walk on child relations more than
>> two times, even if a child has multiple parents inherited from common
>> origin, because list_concat_unique_oid() ignores the given OID if it
>> is already on the list. It means all the child relations under the
>> relation already walked on does not checked anywhere. (Of course,
>> this assumption is correct for the purpose of find_all_inheritors()
>> with minimum cost.)
>>
>> What we want to do here is to compute the number of times a certain
>> child relation is inherited from a common origin; it shall be the
>> expected-inhcount. So, we need an arrangement to the logic.
>>
>> For example, see the following diagram.
>>
>> T2
>> / \
>> T1 T4---T5
>> \ /
>> T3
>>
>> If we call find_all_inheritors() with T1. The find_inheritance_children()
>> returns T2 and T3 for T1.
>> Then, it calls find_inheritance_children() for T2, and it returns T4.
>> Then, it calls find_inheritance_children() for T3, and it returns T4, but
>> it is already in the "rels_list", so list_concat_unique_oid() ignores it.
>> Then, it calls find_inheritance_children() for T4, and it returns T5.
>>
>> In this example, we want the expected inhcount for T2 and T3 should be 1,
>> for T4 and T5 should be 2. However, it walks on T4 and T5 only once, so
>> they will have 1 incorrectly.
>> Even if we count up the ignored OID (T4), find_all_inheritors() does not
>> walk on T5, because it is already walked on obviously when T4 is ignored.
>
> I think the count for T5 should be 1, and I think that the count for
> T4 can easily be made to be 2 by coding the algorithm correctly.

Ahh, it is right. I was confused.

Is it possible to introduce the logic mathematical-strictly?
Now I'm considering whether the find_all_inheritors() logic is suitable
for any cases, or not.

What we want to compute here is:

WITH RECURSIVE r AS (
SELECT 't1'::regclass AS inhrelid
UNION ALL
SELECT c.inhrelid FROM pg_inherits c, r WHERE r.inhrelid = c.inhparent
) -- r is all the child relations inherited from 't1'
SELECT inhrelid::regclass, count(*) FROM pg_inherits
WHERE inhparent IN (SELECT inhrelid FROM r) GROUP BY inhrelid;

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai(a)ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers