Prev: [HACKERS] More frame options in window functions
Next: [HACKERS] psql tab completion for DO blocks
From: KaiGai Kohei on 28 Jan 2010 22:02 (2010/01/29 9:58), KaiGai Kohei wrote: > (2010/01/29 9:29), Robert Haas wrote: >> 2010/1/28 KaiGai Kohei<kaigai(a)ak.jp.nec.com>: >>> (2010/01/29 0:46), Robert Haas wrote: >>>> 2010/1/27 KaiGai Kohei<kaigai(a)ak.jp.nec.com>: >>>>> Hmm, indeed, this logic (V3/V5) is busted. >>>>> >>>>> The idea of V4 patch can also handle this case correctly, although it >>>>> is lesser in performance. >>>>> I wonder whether it is really unacceptable cost in performance, or not. >>>>> Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations, >>>>> and I don't think this bugfix will damage to the reputation of PostgreSQL. >>>>> >>>>> Where should we go on the next? >>>> >>>> Isn't the problem here just that the following comment is 100% wrong? >>>> >>>> /* >>>> * Unlike find_all_inheritors(), we need to walk on >>>> child relations >>>> * that have diamond inheritance tree, because this >>>> function has to >>>> * return correct expected inhecount to the caller. >>>> */ >>>> >>>> It seems to me that the right solution here is to just add one more >>>> argument to find_all_inheritors(), something like List >>>> **expected_inh_count. >>>> >>>> Am I missing something? >>> >>> The find_all_inheritors() does not walk on child relations more than >>> two times, even if a child has multiple parents inherited from common >>> origin, because list_concat_unique_oid() ignores the given OID if it >>> is already on the list. It means all the child relations under the >>> relation already walked on does not checked anywhere. (Of course, >>> this assumption is correct for the purpose of find_all_inheritors() >>> with minimum cost.) >>> >>> What we want to do here is to compute the number of times a certain >>> child relation is inherited from a common origin; it shall be the >>> expected-inhcount. So, we need an arrangement to the logic. >>> >>> For example, see the following diagram. >>> >>> T2 >>> / \ >>> T1 T4---T5 >>> \ / >>> T3 >>> >>> If we call find_all_inheritors() with T1. The find_inheritance_children() >>> returns T2 and T3 for T1. >>> Then, it calls find_inheritance_children() for T2, and it returns T4. >>> Then, it calls find_inheritance_children() for T3, and it returns T4, but >>> it is already in the "rels_list", so list_concat_unique_oid() ignores it. >>> Then, it calls find_inheritance_children() for T4, and it returns T5. >>> >>> In this example, we want the expected inhcount for T2 and T3 should be 1, >>> for T4 and T5 should be 2. However, it walks on T4 and T5 only once, so >>> they will have 1 incorrectly. >>> Even if we count up the ignored OID (T4), find_all_inheritors() does not >>> walk on T5, because it is already walked on obviously when T4 is ignored. >> >> I think the count for T5 should be 1, and I think that the count for >> T4 can easily be made to be 2 by coding the algorithm correctly. > > Ahh, it is right. I was confused. > > Is it possible to introduce the logic mathematical-strictly? > Now I'm considering whether the find_all_inheritors() logic is suitable > for any cases, or not. I modified the logic to compute an expected inhcount of the child relations. What we want to compute here is to compute the number of times a certain relation being inherited directly from any relations delivered from a unique origin. If the column to be renamed is eventually inherited from a common origin, its attinhcount is not larger than the expected inhcount. > WITH RECURSIVE r AS ( > SELECT 't1'::regclass AS inhrelid > UNION ALL > SELECT c.inhrelid FROM pg_inherits c, r WHERE r.inhrelid = c.inhparent > ) -- r is all the child relations inherited from 't1' > SELECT inhrelid::regclass, count(*) FROM pg_inherits > WHERE inhparent IN (SELECT inhrelid FROM r) GROUP BY inhrelid; The modified logic increments the expected inhcount of the relation already walked on. If we found a child relation twice or more, it means the child relation is at the junction of the inheritance tree. In this case, we don't walk on the child relations any more, but it is not necessary, because the first path already checked it. The find_all_inheritors() is called from several points more than renameatt(), so I added find_all_inheritors_with_inhcount() which takes argument of the expected inhcount list. And, find_all_inheritors() was also modified to call find_all_inheritors_with_inhcount() with NULL argument to avoid code duplication. It is the result of Berrnd's test case. - CVS HEAD 0.895s 0.903s 0.884s 0.896s 0.892s - with V6 patch 0.972s 0.941s 0.961s 0.949s 0.946s Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai(a)ak.jp.nec.com>
From: KaiGai Kohei on 31 Jan 2010 18:41 (2010/01/30 3:36), Robert Haas wrote: > 2010/1/28 KaiGai Kohei<kaigai(a)ak.jp.nec.com>: >> (2010/01/29 9:58), KaiGai Kohei wrote: >>> (2010/01/29 9:29), Robert Haas wrote: >>>> 2010/1/28 KaiGai Kohei<kaigai(a)ak.jp.nec.com>: >>>>> (2010/01/29 0:46), Robert Haas wrote: >>>>>> 2010/1/27 KaiGai Kohei<kaigai(a)ak.jp.nec.com>: >>>>>>> Hmm, indeed, this logic (V3/V5) is busted. >>>>>>> >>>>>>> The idea of V4 patch can also handle this case correctly, although it >>>>>>> is lesser in performance. >>>>>>> I wonder whether it is really unacceptable cost in performance, or not. >>>>>>> Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations, >>>>>>> and I don't think this bugfix will damage to the reputation of PostgreSQL. >>>>>>> >>>>>>> Where should we go on the next? >>>>>> >>>>>> Isn't the problem here just that the following comment is 100% wrong? >>>>>> >>>>>> /* >>>>>> * Unlike find_all_inheritors(), we need to walk on >>>>>> child relations >>>>>> * that have diamond inheritance tree, because this >>>>>> function has to >>>>>> * return correct expected inhecount to the caller. >>>>>> */ >>>>>> >>>>>> It seems to me that the right solution here is to just add one more >>>>>> argument to find_all_inheritors(), something like List >>>>>> **expected_inh_count. >>>>>> >>>>>> Am I missing something? >>>>> >>>>> The find_all_inheritors() does not walk on child relations more than >>>>> two times, even if a child has multiple parents inherited from common >>>>> origin, because list_concat_unique_oid() ignores the given OID if it >>>>> is already on the list. It means all the child relations under the >>>>> relation already walked on does not checked anywhere. (Of course, >>>>> this assumption is correct for the purpose of find_all_inheritors() >>>>> with minimum cost.) >>>>> >>>>> What we want to do here is to compute the number of times a certain >>>>> child relation is inherited from a common origin; it shall be the >>>>> expected-inhcount. So, we need an arrangement to the logic. >>>>> >>>>> For example, see the following diagram. >>>>> >>>>> T2 >>>>> / \ >>>>> T1 T4---T5 >>>>> \ / >>>>> T3 >>>>> >>>>> If we call find_all_inheritors() with T1. The find_inheritance_children() >>>>> returns T2 and T3 for T1. >>>>> Then, it calls find_inheritance_children() for T2, and it returns T4. >>>>> Then, it calls find_inheritance_children() for T3, and it returns T4, but >>>>> it is already in the "rels_list", so list_concat_unique_oid() ignores it. >>>>> Then, it calls find_inheritance_children() for T4, and it returns T5. >>>>> >>>>> In this example, we want the expected inhcount for T2 and T3 should be 1, >>>>> for T4 and T5 should be 2. However, it walks on T4 and T5 only once, so >>>>> they will have 1 incorrectly. >>>>> Even if we count up the ignored OID (T4), find_all_inheritors() does not >>>>> walk on T5, because it is already walked on obviously when T4 is ignored. >>>> >>>> I think the count for T5 should be 1, and I think that the count for >>>> T4 can easily be made to be 2 by coding the algorithm correctly. >>> >>> Ahh, it is right. I was confused. >>> >>> Is it possible to introduce the logic mathematical-strictly? >>> Now I'm considering whether the find_all_inheritors() logic is suitable >>> for any cases, or not. >> >> I modified the logic to compute an expected inhcount of the child relations. >> >> What we want to compute here is to compute the number of times a certain >> relation being inherited directly from any relations delivered from a unique >> origin. If the column to be renamed is eventually inherited from a common >> origin, its attinhcount is not larger than the expected inhcount. >> >>> WITH RECURSIVE r AS ( >>> SELECT 't1'::regclass AS inhrelid >>> UNION ALL >>> SELECT c.inhrelid FROM pg_inherits c, r WHERE r.inhrelid = c.inhparent >>> ) -- r is all the child relations inherited from 't1' >>> SELECT inhrelid::regclass, count(*) FROM pg_inherits >>> WHERE inhparent IN (SELECT inhrelid FROM r) GROUP BY inhrelid; >> >> The modified logic increments the expected inhcount of the relation already >> walked on. If we found a child relation twice or more, it means the child >> relation is at the junction of the inheritance tree. >> In this case, we don't walk on the child relations any more, but it is not >> necessary, because the first path already checked it. >> >> The find_all_inheritors() is called from several points more than renameatt(), >> so I added find_all_inheritors_with_inhcount() which takes argument of the >> expected inhcount list. And, find_all_inheritors() was also modified to call >> find_all_inheritors_with_inhcount() with NULL argument to avoid code duplication. >> >> It is the result of Berrnd's test case. >> >> - CVS HEAD >> 0.895s >> 0.903s >> 0.884s >> 0.896s >> 0.892s >> >> - with V6 patch >> 0.972s >> 0.941s >> 0.961s >> 0.949s >> 0.946s > > Well, that seems a lot better. Unfortunately, I can't commit this > code: it's mind-bogglingly ugly. I don't believe that duplicating > find_all_inheritors is the right solution (a point I've mentioned > previously), and it's certainly not right to use typeName->location as > a place to store an unrelated attribute inheritance count. There is > also at least one superfluous variable renaming; and the recursion > handling looks pretty grotty to me, too. > > I wonder if we should just leave this alone for 9.0 and revisit the > issue after doing some of the previously-proposed refactoring for 9.1. > The amount of time we're spending trying to fix this is somewhat out > of proportion to the importance of the problem. At least, I think we can fix the bug on renameatt() case in clean way, although we need an additional recursion handling in ATPrepAlterColumnType() to fix ALTER COLUMN TYPE cases. Should it focus on only the original renameatt() matter in this stage? Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai(a)ak.jp.nec.com> -- Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
From: KaiGai Kohei on 31 Jan 2010 22:04 (2010/02/01 8:41), KaiGai Kohei wrote: > (2010/01/30 3:36), Robert Haas wrote: >> 2010/1/28 KaiGai Kohei<kaigai(a)ak.jp.nec.com>: >>> (2010/01/29 9:58), KaiGai Kohei wrote: >>>> (2010/01/29 9:29), Robert Haas wrote: >>>>> 2010/1/28 KaiGai Kohei<kaigai(a)ak.jp.nec.com>: >>>>>> (2010/01/29 0:46), Robert Haas wrote: >>>>>>> 2010/1/27 KaiGai Kohei<kaigai(a)ak.jp.nec.com>: >>>>>>>> Hmm, indeed, this logic (V3/V5) is busted. >>>>>>>> >>>>>>>> The idea of V4 patch can also handle this case correctly, although it >>>>>>>> is lesser in performance. >>>>>>>> I wonder whether it is really unacceptable cost in performance, or not. >>>>>>>> Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations, >>>>>>>> and I don't think this bugfix will damage to the reputation of PostgreSQL. >>>>>>>> >>>>>>>> Where should we go on the next? >>>>>>> >>>>>>> Isn't the problem here just that the following comment is 100% wrong? >>>>>>> >>>>>>> /* >>>>>>> * Unlike find_all_inheritors(), we need to walk on >>>>>>> child relations >>>>>>> * that have diamond inheritance tree, because this >>>>>>> function has to >>>>>>> * return correct expected inhecount to the caller. >>>>>>> */ >>>>>>> >>>>>>> It seems to me that the right solution here is to just add one more >>>>>>> argument to find_all_inheritors(), something like List >>>>>>> **expected_inh_count. >>>>>>> >>>>>>> Am I missing something? >>>>>> >>>>>> The find_all_inheritors() does not walk on child relations more than >>>>>> two times, even if a child has multiple parents inherited from common >>>>>> origin, because list_concat_unique_oid() ignores the given OID if it >>>>>> is already on the list. It means all the child relations under the >>>>>> relation already walked on does not checked anywhere. (Of course, >>>>>> this assumption is correct for the purpose of find_all_inheritors() >>>>>> with minimum cost.) >>>>>> >>>>>> What we want to do here is to compute the number of times a certain >>>>>> child relation is inherited from a common origin; it shall be the >>>>>> expected-inhcount. So, we need an arrangement to the logic. >>>>>> >>>>>> For example, see the following diagram. >>>>>> >>>>>> T2 >>>>>> / \ >>>>>> T1 T4---T5 >>>>>> \ / >>>>>> T3 >>>>>> >>>>>> If we call find_all_inheritors() with T1. The find_inheritance_children() >>>>>> returns T2 and T3 for T1. >>>>>> Then, it calls find_inheritance_children() for T2, and it returns T4. >>>>>> Then, it calls find_inheritance_children() for T3, and it returns T4, but >>>>>> it is already in the "rels_list", so list_concat_unique_oid() ignores it. >>>>>> Then, it calls find_inheritance_children() for T4, and it returns T5. >>>>>> >>>>>> In this example, we want the expected inhcount for T2 and T3 should be 1, >>>>>> for T4 and T5 should be 2. However, it walks on T4 and T5 only once, so >>>>>> they will have 1 incorrectly. >>>>>> Even if we count up the ignored OID (T4), find_all_inheritors() does not >>>>>> walk on T5, because it is already walked on obviously when T4 is ignored. >>>>> >>>>> I think the count for T5 should be 1, and I think that the count for >>>>> T4 can easily be made to be 2 by coding the algorithm correctly. >>>> >>>> Ahh, it is right. I was confused. >>>> >>>> Is it possible to introduce the logic mathematical-strictly? >>>> Now I'm considering whether the find_all_inheritors() logic is suitable >>>> for any cases, or not. >>> >>> I modified the logic to compute an expected inhcount of the child relations. >>> >>> What we want to compute here is to compute the number of times a certain >>> relation being inherited directly from any relations delivered from a unique >>> origin. If the column to be renamed is eventually inherited from a common >>> origin, its attinhcount is not larger than the expected inhcount. >>> >>>> WITH RECURSIVE r AS ( >>>> SELECT 't1'::regclass AS inhrelid >>>> UNION ALL >>>> SELECT c.inhrelid FROM pg_inherits c, r WHERE r.inhrelid = c.inhparent >>>> ) -- r is all the child relations inherited from 't1' >>>> SELECT inhrelid::regclass, count(*) FROM pg_inherits >>>> WHERE inhparent IN (SELECT inhrelid FROM r) GROUP BY inhrelid; >>> >>> The modified logic increments the expected inhcount of the relation already >>> walked on. If we found a child relation twice or more, it means the child >>> relation is at the junction of the inheritance tree. >>> In this case, we don't walk on the child relations any more, but it is not >>> necessary, because the first path already checked it. >>> >>> The find_all_inheritors() is called from several points more than renameatt(), >>> so I added find_all_inheritors_with_inhcount() which takes argument of the >>> expected inhcount list. And, find_all_inheritors() was also modified to call >>> find_all_inheritors_with_inhcount() with NULL argument to avoid code duplication. >>> >>> It is the result of Berrnd's test case. >>> >>> - CVS HEAD >>> 0.895s >>> 0.903s >>> 0.884s >>> 0.896s >>> 0.892s >>> >>> - with V6 patch >>> 0.972s >>> 0.941s >>> 0.961s >>> 0.949s >>> 0.946s >> >> Well, that seems a lot better. Unfortunately, I can't commit this >> code: it's mind-bogglingly ugly. I don't believe that duplicating >> find_all_inheritors is the right solution (a point I've mentioned >> previously), and it's certainly not right to use typeName->location as >> a place to store an unrelated attribute inheritance count. There is >> also at least one superfluous variable renaming; and the recursion >> handling looks pretty grotty to me, too. >> >> I wonder if we should just leave this alone for 9.0 and revisit the >> issue after doing some of the previously-proposed refactoring for 9.1. >> The amount of time we're spending trying to fix this is somewhat out >> of proportion to the importance of the problem. > > At least, I think we can fix the bug on renameatt() case in clean way, > although we need an additional recursion handling in ATPrepAlterColumnType() > to fix ALTER COLUMN TYPE cases. > > Should it focus on only the original renameatt() matter in this stage? The attached patch modified find_all_inheritors() to return the list of expected inhcount, if List * pointer is given. And, it focuses on only the bugs in renameatt() case. Also, the ALTER COLUMN TYPE case should be also fixed in the 9.1 release (or 9.0.1?). Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai(a)ak.jp.nec.com>
From: KaiGai Kohei on 1 Feb 2010 19:48 (2010/02/02 3:01), Robert Haas wrote: > 2010/1/31 KaiGai Kohei<kaigai(a)ak.jp.nec.com>: >> The attached patch modified find_all_inheritors() to return the list of >> expected inhcount, if List * pointer is given. And, it focuses on only >> the bugs in renameatt() case. > > I have cleaned up and simplified this patch. Attached is the version > I intend to commit. Changes: > > - make find_all_inheritors return the list of OIDs, as previously, > instead of using an out parameter > - undo some useless variable renamings > - undo some useless whitespace changes > - rework comments > - fix regression tests to avoid using the same alias twice in the same query > - add an ORDER BY clause to the regression tests so that they pass on my machine > - improve the names of some of the new variables > - instead of adding an additional argument to renameatt(), just > replace the existing 'bool recursing' with 'int expected_parents'. > This allows merging the two versions of the "cannot rename inherited > column" message together, which seems like a reasonably good idea to > me, though if someone has a better idea that's fine. I didn't think > the one additional word added to the message provided enough clarity > to make it worth creating another translatable string. Thanks for your checks. >> Also, the ALTER COLUMN TYPE case should be also fixed in the 9.1 release >> (or 9.0.1?). > > If the fix is something we could commit for 9.0.1, then we ought to do > it now before 9.0 is released. If you want to submit a follow-on > patch to address ALTER COLUMN TYPE once this is committed, then please > do so. The attached patch also fixes ALTER COLUMN TYPE case. It replaced the 'recursing' argument in ATPrepCmd() by 'expected_parents', and it is delivered to ATPrepAlterColumnType(). The logic to forbid altering the column type is same as renameatt(). ATSimpleRecursion() is also modified to use forboth() to call ATPrepCmd() recursively. One concern is at ATOneLevelRecursion() which is called by ATPrepAddColumn() only, and it also calls ATPrepCmd() for the direct children. Right now, I set 1 on the 'expected_parents'. However, IMO, here is no reason why we cannot rewrite the ATPrepAddColumn() using ATSimpleRecursion(). Eventually, ATExecAddColumn() shall be invoked several times for same column, if the inheritance tree has diamond-inheritance structure. And, it increments pg_attribute.attinhcount except for the first invocation. If we store the 'expected_parents' on the ColumnDef->inhcount, we don't need to call the ATExecAddColumn() more than once in a single ALTER TABLE command. Any comments? And, when should we do it? 9.0? 9.1? Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai(a)ak.jp.nec.com>
From: KaiGai Kohei on 1 Feb 2010 20:47 (2010/02/02 9:48), KaiGai Kohei wrote: >>> Also, the ALTER COLUMN TYPE case should be also fixed in the 9.1 release >>> (or 9.0.1?). >> >> If the fix is something we could commit for 9.0.1, then we ought to do >> it now before 9.0 is released. If you want to submit a follow-on >> patch to address ALTER COLUMN TYPE once this is committed, then please >> do so. > > The attached patch also fixes ALTER COLUMN TYPE case. > > It replaced the 'recursing' argument in ATPrepCmd() by 'expected_parents', > and it is delivered to ATPrepAlterColumnType(). The logic to forbid altering > the column type is same as renameatt(). > ATSimpleRecursion() is also modified to use forboth() to call ATPrepCmd() > recursively. > > One concern is at ATOneLevelRecursion() which is called by ATPrepAddColumn() > only, and it also calls ATPrepCmd() for the direct children. > Right now, I set 1 on the 'expected_parents'. However, IMO, here is no reason > why we cannot rewrite the ATPrepAddColumn() using ATSimpleRecursion(). > > Eventually, ATExecAddColumn() shall be invoked several times for same column, > if the inheritance tree has diamond-inheritance structure. And, it increments > pg_attribute.attinhcount except for the first invocation. > If we store the 'expected_parents' on the ColumnDef->inhcount, we don't need > to call the ATExecAddColumn() more than once in a single ALTER TABLE command. > > Any comments? And, when should we do it? 9.0? 9.1? The attached one also clean up ATPrepAddColumn() and ATExecAddColumn() code, not only ATPrepAlterColumnType(), according to what I mentioned above. There are two regression test fails, because it does not call ATExecAddColumn() twice or more in diamond-inheritance cases, so it does not notice merging definitions of columns. If we should go on right now, I'll add and fix regression tests, and submit a formal patch again. If not, I'll work it later. Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <kaigai(a)ak.jp.nec.com>
First
|
Prev
|
Next
|
Last
Pages: 1 2 3 4 Prev: [HACKERS] More frame options in window functions Next: [HACKERS] psql tab completion for DO blocks |