Prev: Rman backup of BCV using ASM
Next: CLOB
From: Charles Hooper on 28 May 2010 14:38 On May 28, 1:23 pm, joel garry <joel-ga...(a)home.com> wrote: > <mr.frog.to....(a)googlemail.com> wrote: > > Thankyou both gentlemen. I appreciate the feedback you have given me. > > I am not an Oracle expert so I was not sure what the best approach > > was. I will set both up and see which runs faster (I am guessing the > > inline due to the way correlated subqueries run). > > > I thank you both very much. > > > Cheers > > > The Frog > > I'm wondering about how you think correlated subqueries run. Seehttp://download.oracle.com/docs/cd/B19306_01/server.102/b14200/querie...http://jonathanlewis.wordpress.com/2007/03/08/transformation-and-opti... > > If you haven't already, get Jonathan's book about the optimizer. See > this example: http://jonathanlewis.wordpress.com/2006/11/06/filter-subqueries/ > > jg Joel, I wonder if it is not just a terminology problem here. Essentially, what do you call it when a SELECT statement appears in a column position of another query. I think that Walt and The Frog were on the same page regarding what they are talking about. I think that the Oracle definition of a correlated subquery is a SELECT statement that appears in the WHERE clause of another SQL statement, and one or more columns from the parent SQL statement is directly related to one or more columns (or generated values) from the SELECT statement in the WHERE clause. http://download.oracle.com/docs/cd/E11882_01/server.112/e10592/statements_10002.htm#i2066912 "The following examples show the general syntax of a correlated subquery: SELECT select_list FROM table1 t_alias1 WHERE expr operator (SELECT column_list FROM table2 t_alias2 WHERE t_alias1.column operator t_alias2.column);" I think that there is a proper name for the case where a subquery appears in a column position of another query, but I can't think of that name at the moment. Here is possibly interesting test case: CREATE TABLE T1 ( ID NUMBER, DESCRIPTION VARCHAR2(80)); INSERT INTO T1 SELECT CEIL(ABS(SIN(ROWNUM/9.9999)*10000)), 'This is the long description for this number '|| TO_CHAR(CEIL(ABS(SIN(ROWNUM/9.9999)*10000))) FROM (SELECT ROWNUM RN FROM DUAL CONNECT BY LEVEL<=1000), (SELECT ROWNUM RN FROM DUAL CONNECT BY LEVEL<=1000); CREATE INDEX IND_T1 ON T1(ID); CREATE TABLE T2 AS SELECT ROWNUM C1, LPAD('A',100,'A') C2 FROM DUAL CONNECT BY LEVEL<=10000; CREATE TABLE T3 AS SELECT ROWNUM C1, LPAD('A',100,'A') C2 FROM DUAL CONNECT BY LEVEL<=10000; CREATE TABLE T4 AS SELECT ROWNUM C1, LPAD('A',100,'A') C2 FROM DUAL CONNECT BY LEVEL<=10000; CREATE INDEX IND_T4 ON T4(C1); COMMIT; EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE,METHOD_OPT=>'FOR ALL COLUMNS SIZE 1') EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE) EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T3',CASCADE=>TRUE) EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T4',CASCADE=>TRUE) There are just 1,000,000 rows in table T1. Now, let's try a test script using the tables: SET AUTOTRACE TRACEONLY EXPLAIN SELECT T1.ID, T2.C1 T2_C1, T3.C1 T3_C1, T4.C1 T4_C1 FROM T1, T2, T3, T4 WHERE T2.C1 BETWEEN 1 AND 200 AND T2.C1=T3.C1 AND T2.C1=T4.C1 AND T2.C1=T1.ID; SELECT T1.ID, (SELECT T2.C1 FROM T2 WHERE T1.ID=T2.C1) T2_C1, (SELECT T3.C1 FROM T3 WHERE T1.ID=T3.C1) T3_C1, (SELECT T4.C1 FROM T4 WHERE T1.ID=T4.C1) T4_C1 FROM T1 WHERE T1.ID BETWEEN 1 AND 200; The first SQL statement directly joins the tables, while the second places SELECT statements in column positions. The output (11.1.0.7): SQL> SELECT 2 T1.ID, 3 T2.C1 T2_C1, 4 T3.C1 T3_C1, 5 T4.C1 T4_C1 6 FROM 7 T1, 8 T2, 9 T3, 10 T4 11 WHERE 12 T2.C1 BETWEEN 1 AND 200 13 AND T2.C1=T3.C1 14 AND T2.C1=T4.C1 15 AND T2.C1=T1.ID; Execution Plan ---------------------------------------------------------- Plan hash value: 3780653648 ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 457 | 7312 | 10041 (1)| 00:00:41 | |* 1 | HASH JOIN | | 457 | 7312 | 10041 (1)| 00:00:41 | |* 2 | HASH JOIN | | 198 | 2376 | 46 (5)| 00:00:01 | |* 3 | HASH JOIN | | 199 | 1592 | 43 (3)| 00:00:01 | |* 4 | TABLE ACCESS FULL| T2 | 200 | 800 | 21 (0)| 00:00:01 | |* 5 | TABLE ACCESS FULL| T3 | 200 | 800 | 21 (0)| 00:00:01 | |* 6 | INDEX RANGE SCAN | IND_T4 | 200 | 800 | 2 (0)| 00:00:01 | |* 7 | TABLE ACCESS FULL | T1 | 20002 | 80008 | 9994 (1)| 00:00:41 | ------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("T2"."C1"="T1"."ID") 2 - access("T2"."C1"="T4"."C1") 3 - access("T2"."C1"="T3"."C1") 4 - filter("T2"."C1"<=200 AND "T2"."C1">=1) 5 - filter("T3"."C1"<=200 AND "T3"."C1">=1) 6 - access("T4"."C1">=1 AND "T4"."C1"<=200) 7 - filter("T1"."ID"<=200 AND "T1"."ID">=1) SQL> SELECT 2 T1.ID, 3 (SELECT 4 T2.C1 5 FROM 6 T2 7 WHERE 8 T1.ID=T2.C1) T2_C1, 9 (SELECT 10 T3.C1 11 FROM 12 T3 13 WHERE 14 T1.ID=T3.C1) T3_C1, 15 (SELECT 16 T4.C1 17 FROM 18 T4 19 WHERE 20 T1.ID=T4.C1) T4_C1 21 FROM 22 T1 23 WHERE 24 T1.ID BETWEEN 1 AND 200; Execution Plan ---------------------------------------------------------- Plan hash value: 2945978589 ---------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 20002 | 80008 | 9994 (1)| 00:00:41 | |* 1 | TABLE ACCESS FULL| T2 | 1 | 4 | 21 (0)| 00:00:01 | |* 2 | TABLE ACCESS FULL| T3 | 1 | 4 | 21 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | IND_T4 | 1 | 4 | 1 (0)| 00:00:01 | |* 4 | TABLE ACCESS FULL| T1 | 20002 | 80008 | 9994 (1)| 00:00:41 | ---------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("T2"."C1"=:B1) 2 - filter("T3"."C1"=:B1) 3 - access("T4"."C1"=:B1) 4 - filter("T1"."ID"<=200 AND "T1"."ID">=1) If we were to actually run the SQL statements, we might find that the first runs in about 15 seconds and the second in about 16 seconds, both with the same number of physical reads. That is no fun, so let's change the number 200 to 1200 to see what happens. SET TIMING ON SET AUTOTRACE TRACEONLY STATISTICS; ALTER SYSTEM FLUSH BUFFER_CACHE; ALTER SYSTEM FLUSH BUFFER_CACHE; SET ARRAYSIZE 1000 SELECT T1.ID, T2.C1 T2_C1, T3.C1 T3_C1, T4.C1 T4_C1 FROM T1, T2, T3, T4 WHERE T2.C1 BETWEEN 1 AND 1200 AND T2.C1=T3.C1 AND T2.C1=T4.C1 AND T2.C1=T1.ID; ALTER SYSTEM FLUSH BUFFER_CACHE; ALTER SYSTEM FLUSH BUFFER_CACHE; SELECT T1.ID, (SELECT T2.C1 FROM T2 WHERE T1.ID=T2.C1) T2_C1, (SELECT T3.C1 FROM T3 WHERE T1.ID=T3.C1) T3_C1, (SELECT T4.C1 FROM T4 WHERE T1.ID=T4.C1) T4_C1 FROM T1 WHERE T1.ID BETWEEN 1 AND 1200; Here is the output: SQL> SELECT 2 T1.ID, 3 T2.C1 T2_C1, 4 T3.C1 T3_C1, 5 T4.C1 T4_C1 6 FROM 7 T1, 8 T2, 9 T3, 10 T4 11 WHERE 12 T2.C1 BETWEEN 1 AND 1200 13 AND T2.C1=T3.C1 14 AND T2.C1=T4.C1 15 AND T2.C1=T1.ID; 76580 rows selected. Elapsed: 00:00:15.96 Statistics --------------------------------------------------- 1 recursive calls 0 db block gets 83197 consistent gets 83110 physical reads 0 redo size 1288037 bytes sent via SQL*Net to client 1217 bytes received via SQL*Net from client 78 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 76580 rows processed SQL> SELECT 2 T1.ID, 3 (SELECT 4 T2.C1 5 FROM 6 T2 7 WHERE 8 T1.ID=T2.C1) T2_C1, 9 (SELECT 10 T3.C1 11 FROM 12 T3 13 WHERE 14 T1.ID=T3.C1) T3_C1, 15 (SELECT 16 T4.C1 17 FROM 18 T4 19 WHERE 20 T1.ID=T4.C1) T4_C1 21 FROM 22 T1 23 WHERE 24 T1.ID BETWEEN 1 AND 1200; 76580 rows selected. Elapsed: 00:01:40.09 Statistics --------------------------------------------------- 1 recursive calls 0 db block gets 10073639 consistent gets 83110 physical reads 0 redo size 1288037 bytes sent via SQL*Net to client 1217 bytes received via SQL*Net from client 78 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 76580 rows processed The number of consistent gets jumps significantly for the second SQL statement and so did the execution time (and CPU usage), compare with the output for T1.ID between 1 and 200: 12732 rows selected. Elapsed: 00:00:17.54 Statistics --------------------------------------------------- 0 recursive calls 0 db block gets 522390 consistent gets 83108 physical reads 0 redo size 196813 bytes sent via SQL*Net to client 513 bytes received via SQL*Net from client 14 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 12732 rows processed --- I think that The Frog is on the right track to test the performance of both approaches. Charles Hooper Co-author of "Expert Oracle Practices: Oracle Database Administration from the Oak Table" http://hoopercharles.wordpress.com/ IT Manager/Oracle DBA K&M Machine-Fabricating, Inc.
From: joel garry on 28 May 2010 19:58 On May 28, 11:38 am, Charles Hooper <hooperc2...(a)yahoo.com> wrote: > On May 28, 1:23 pm, joel garry <joel-ga...(a)home.com> wrote: > > > <mr.frog.to....(a)googlemail.com> wrote: > > > Thankyou both gentlemen. I appreciate the feedback you have given me. > > > I am not an Oracle expert so I was not sure what the best approach > > > was. I will set both up and see which runs faster (I am guessing the > > > inline due to the way correlated subqueries run). > > > > I thank you both very much. > > > > Cheers > > > > The Frog > > > I'm wondering about how you think correlated subqueries run. Seehttp://download.oracle.com/docs/cd/B19306_01/server.102/b14200/querie...... > > > If you haven't already, get Jonathan's book about the optimizer. See > > this example: http://jonathanlewis.wordpress.com/2006/11/06/filter-subqueries/ > > > jg > > Joel, > > I wonder if it is not just a terminology problem here. Essentially, > what do you call it when a SELECT statement appears in a column > position of another query. I think that Walt and The Frog were on the > same page regarding what they are talking about. > > I think that the Oracle definition of a correlated subquery is a > SELECT statement that appears in the WHERE clause of another SQL > statement, and one or more columns from the parent SQL statement is > directly related to one or more columns (or generated values) from the That's right. Sql Language Reference manual: "Using Subqueries A subquery answers multiple-part questions. For example, to determine who works in Taylor's department, you can first use a subquery to determine the department in which Taylor works. You can then answer the original question with the parent SELECT statement. A subquery in the FROM clause of a SELECT statement is also called an inline view. you can nest any number of subqueries in an inline view. A subquery in the WHERE clause of a SELECT statement is also called a nested subquery...Oracle performs a correlated subquery when a nested subquery references a column from a table referred to a parent statement any number of levels above the subquery." > SELECT statement in the WHERE clause.http://download.oracle.com/docs/cd/E11882_01/server.112/e10592/statem... > "The following examples show the general syntax of a correlated > subquery: > SELECT select_list > FROM table1 t_alias1 > WHERE expr operator > (SELECT column_list > FROM table2 t_alias2 > WHERE t_alias1.column > operator t_alias2.column);" > > I think that there is a proper name for the case where a subquery > appears in a column position of another query, but I can't think of > that name at the moment. "Scalar subquery," perhaps? I think the O docs are weak on this, though people seem to have picked up on it when it was introduced in 9i. It's an ANSI definition. Tom has interesting observations: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1594885400346999596 > > Here is possibly interesting test case: > --- > > I think that The Frog is on the right track to test the performance of > both approaches. > Thanks once again for demonstrating to us all how to think. jg -- @home.com is bogus. http://www.gennick.com/madness.html
From: Charles Hooper on 29 May 2010 09:42 On May 28, 7:58 pm, joel garry <joel-ga...(a)home.com> wrote: (snip) > > I think that there is a proper name for the case where a subquery > > appears in a column position of another query, but I can't think of > > that name at the moment. > > "Scalar subquery," perhaps? I think the O docs are weak on this, > though people seem to have picked up on it when it was introduced in > 9i. It's an ANSI definition. > > Tom has interesting observations:http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1... > > > > > Here is possibly interesting test case: > > --- > > > I think that The Frog is on the right track to test the performance of > > both approaches. > > Thanks once again for demonstrating to us all how to think. > > jg Joel, Thanks for finding the definition of that term - I have seen it countless times, but have trouble remembering the term when needed. http://download.oracle.com/docs/cd/B28359_01/server.111/b28286/expressions013.htm "A scalar subquery expression is a subquery that returns exactly one column value from one row. The value of the scalar subquery expression is the value of the select list item of the subquery. If the subquery returns 0 rows, then the value of the scalar subquery expression is NULL. If the subquery returns more than one row, then Oracle returns an error." With the definition from Oracle's documentation, a scalar subquery can also appear in a WHERE clause, if the subquery returns a single row and a single column: SELECT * FROM T1 WHERE C1=( SELECT MAX(C1) FROM T2 WHERE STATE='CONFUSED'); This might be what was leading me to forget the term when the subquery appears in a column position. The AskTom article does a nice job of explaining why someone might or might not want to use a *scalar* query in a column position in a SQL statement. Incidentally, I have a copy of Tom Kyte's new book on pre-order and a copy of "Oracle Tuning: The Definitive Reference Second Edition" on pre-order (the first of that author's books I have ordered - I was curious after bumping into the author several times on the OTN forums). I will probably post detailed reviews of both books. I hope that the second edition of the second book does not show me a "Top 5 Timed Events" report with a value indicating the number of waits for the CPU, or other odd problems like average single block read times of 2.75 seconds, while the average multi-block read time is 0.127 seconds like can be observed in a Google Books view of page 16 of the first edition. Last year I mentioned to the author of this book some of my concerns (on the OTN forums) about several pages of the first edition, viewable through Google books, and I am curious to see if corrections were made for the second edition. Charles Hooper Co-author of "Expert Oracle Practices: Oracle Database Administration from the Oak Table" http://hoopercharles.wordpress.com/ IT Manager/Oracle DBA K&M Machine-Fabricating, Inc.
From: John Hurley on 29 May 2010 17:28 Charles: > Incidentally, I have a copy of Tom Kyte's new book on pre-order and a > copy of "Oracle Tuning: The Definitive Reference Second Edition" on > pre-order (the first of that author's books I have ordered - I was > curious after bumping into the author several times on the OTN > forums). I will probably post detailed reviews of both books. I hope > that the second edition of the second book does not show me a "Top 5 > Timed Events" report with a value indicating the number of waits for > the CPU, or other odd problems like average single block read times of > 2.75 seconds, while the average multi-block read time is 0.127 seconds > like can be observed in a Google Books view of page 16 of the first > edition. Last year I mentioned to the author of this book some of my > concerns (on the OTN forums) about several pages of the first edition, > viewable through Google books, and I am curious to see if corrections > were made for the second edition. You are not bringing in the name of the leader of the guide ponys are you? Not the New York subway bleeder? Not sure why you are going in this direction here buddy ... it could get ugly.
From: Charles Hooper on 29 May 2010 19:06
On May 29, 5:28 pm, John Hurley <hurleyjo...(a)yahoo.com> wrote: > Charles: > > > Incidentally, I have a copy of Tom Kyte's new book on pre-order and a > > copy of "Oracle Tuning: The Definitive Reference Second Edition" on > > pre-order (the first of that author's books I have ordered - I was > > curious after bumping into the author several times on the OTN > > forums). I will probably post detailed reviews of both books. I hope > > that the second edition of the second book does not show me a "Top 5 > > Timed Events" report with a value indicating the number of waits for > > the CPU, or other odd problems like average single block read times of > > 2.75 seconds, while the average multi-block read time is 0.127 seconds > > like can be observed in a Google Books view of page 16 of the first > > edition. Last year I mentioned to the author of this book some of my > > concerns (on the OTN forums) about several pages of the first edition, > > viewable through Google books, and I am curious to see if corrections > > were made for the second edition. > > You are not bringing in the name of the leader of the guide ponys are > you? Not the New York subway bleeder? > > Not sure why you are going in this direction here buddy ... it could > get ugly. I was curious, so I spent good money on the book. For most of the Oracle books that I have read in the last two years I have posted detailed reviews of the books on Amazon. The reviews list what I thought was good about the book, what I thought was great about the book, errors/errata found in the book's chapters, as well as misc. points about the book as it is read. Some of my reviews identified a dozen or more errors in a book, which is a significant problem with some of the Oracle books on the market. The error must make it difficult for beginners to learn about Oracle Database. How long will the review of the "Oracle Tuning: The Definitive Reference Second Edition" book stretch? I am not sure, but it appears that I could write about a 4 page review on the first 27 pages of the first edition of this book - I sincerely hope that some/most of those issues were addressed in the second edition. I want to make certain that I obtain a good value from the $44 that I spent on the book. I do not think that it too much to ask for accuracy from a published book, especially when the book cost more than $40. Some of the pages in the chapters that I co-authored in the "Expert Oracle Practices" required more than 8 hours of research, typing, verification, testing, proofing, re-testing, and formatting per page. I believe that Raldolf spent roughly the same amount of time on the pages that he wrote for the chapters. I suspect that many of the very good books that are on the market for Oracle also required about the same amount of time per page. Certainly, it can't be too much to ask that other book authors put effort into testing and verification of what they publish? The author's website states about the the "Oracle Tuning: The Definitive Reference Second Edition" book: "This is not a book for beginners. Targeted at the senior Oracle DBA, this comprehensive book gives you all of the knowledge you need to be successful in tuning even the most complex Oracle database." So, I will be looking for the portions of the book's contents that are comprehensive and targeted at the senior Oracle DBA. Charles Hooper Co-author of "Expert Oracle Practices: Oracle Database Administration from the Oak Table" http://hoopercharles.wordpress.com/ IT Manager/Oracle DBA K&M Machine-Fabricating, Inc. |