Ramming two SQL queries together..... [Oracle]

Prev: Rman backup of BCV using ASM
Next: CLOB

From: Charles Hooper on 28 May 2010 14:38

On May 28, 1:23 pm, joel garry <joel-ga...(a)home.com> wrote:
> <mr.frog.to....(a)googlemail.com> wrote:
> > Thankyou both gentlemen. I appreciate the feedback you have given me.
> > I am not an Oracle expert so I was not sure what the best approach
> > was. I will set both up and see which runs faster (I am guessing the
> > inline due to the way correlated subqueries run).
>
> > I thank you both very much.
>
> > Cheers
>
> > The Frog
>
> I'm wondering about how you think correlated subqueries run. Seehttp://download.oracle.com/docs/cd/B19306_01/server.102/b14200/querie...http://jonathanlewis.wordpress.com/2007/03/08/transformation-and-opti...
>
> If you haven't already, get Jonathan's book about the optimizer. See
> this example: http://jonathanlewis.wordpress.com/2006/11/06/filter-subqueries/
>
> jg

Joel,

I wonder if it is not just a terminology problem here. Essentially,
what do you call it when a SELECT statement appears in a column
position of another query. I think that Walt and The Frog were on the
same page regarding what they are talking about.

I think that the Oracle definition of a correlated subquery is a
SELECT statement that appears in the WHERE clause of another SQL
statement, and one or more columns from the parent SQL statement is
directly related to one or more columns (or generated values) from the
SELECT statement in the WHERE clause.
http://download.oracle.com/docs/cd/E11882_01/server.112/e10592/statements_10002.htm#i2066912
"The following examples show the general syntax of a correlated
subquery:
SELECT select_list
FROM table1 t_alias1
WHERE expr operator
(SELECT column_list
FROM table2 t_alias2
WHERE t_alias1.column
operator t_alias2.column);"

I think that there is a proper name for the case where a subquery
appears in a column position of another query, but I can't think of
that name at the moment.

Here is possibly interesting test case:
CREATE TABLE T1 (
ID NUMBER,
DESCRIPTION VARCHAR2(80));

INSERT INTO T1
SELECT
CEIL(ABS(SIN(ROWNUM/9.9999)*10000)),
'This is the long description for this number '||
TO_CHAR(CEIL(ABS(SIN(ROWNUM/9.9999)*10000)))
FROM
(SELECT
ROWNUM RN
FROM
DUAL
CONNECT BY
LEVEL<=1000),
(SELECT
ROWNUM RN
FROM
DUAL
CONNECT BY
LEVEL<=1000);

CREATE INDEX IND_T1 ON T1(ID);

CREATE TABLE T2 AS
SELECT
ROWNUM C1,
LPAD('A',100,'A') C2
FROM
DUAL
CONNECT BY
LEVEL<=10000;

CREATE TABLE T3 AS
SELECT
ROWNUM C1,
LPAD('A',100,'A') C2
FROM
DUAL
CONNECT BY
LEVEL<=10000;

CREATE TABLE T4 AS
SELECT
ROWNUM C1,
LPAD('A',100,'A') C2
FROM
DUAL
CONNECT BY
LEVEL<=10000;

CREATE INDEX IND_T4 ON T4(C1);

COMMIT;

EXEC
DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T1',CASCADE=>TRUE,METHOD_OPT=>'FOR
ALL COLUMNS SIZE 1')
EXEC
DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T2',CASCADE=>TRUE)
EXEC
DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T3',CASCADE=>TRUE)
EXEC
DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>USER,TABNAME=>'T4',CASCADE=>TRUE)

There are just 1,000,000 rows in table T1.

Now, let's try a test script using the tables:
SET AUTOTRACE TRACEONLY EXPLAIN

SELECT
T1.ID,
T2.C1 T2_C1,
T3.C1 T3_C1,
T4.C1 T4_C1
FROM
T1,
T2,
T3,
T4
WHERE
T2.C1 BETWEEN 1 AND 200
AND T2.C1=T3.C1
AND T2.C1=T4.C1
AND T2.C1=T1.ID;

SELECT
T1.ID,
(SELECT
T2.C1
FROM
T2
WHERE
T1.ID=T2.C1) T2_C1,
(SELECT
T3.C1
FROM
T3
WHERE
T1.ID=T3.C1) T3_C1,
(SELECT
T4.C1
FROM
T4
WHERE
T1.ID=T4.C1) T4_C1
FROM
T1
WHERE
T1.ID BETWEEN 1 AND 200;

The first SQL statement directly joins the tables, while the second
places SELECT statements in column positions. The output (11.1.0.7):
SQL> SELECT
2 T1.ID,
3 T2.C1 T2_C1,
4 T3.C1 T3_C1,
5 T4.C1 T4_C1
6 FROM
7 T1,
8 T2,
9 T3,
10 T4
11 WHERE
12 T2.C1 BETWEEN 1 AND 200
13 AND T2.C1=T3.C1
14 AND T2.C1=T4.C1
15 AND T2.C1=T1.ID;

Execution Plan
----------------------------------------------------------
Plan hash value: 3780653648

-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 457 | 7312 | 10041 (1)|
00:00:41 |
|* 1 | HASH JOIN | | 457 | 7312 | 10041 (1)|
00:00:41 |
|* 2 | HASH JOIN | | 198 | 2376 | 46 (5)|
00:00:01 |
|* 3 | HASH JOIN | | 199 | 1592 | 43 (3)|
00:00:01 |
|* 4 | TABLE ACCESS FULL| T2 | 200 | 800 | 21 (0)|
00:00:01 |
|* 5 | TABLE ACCESS FULL| T3 | 200 | 800 | 21 (0)|
00:00:01 |
|* 6 | INDEX RANGE SCAN | IND_T4 | 200 | 800 | 2 (0)|
00:00:01 |
|* 7 | TABLE ACCESS FULL | T1 | 20002 | 80008 | 9994 (1)|
00:00:41 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T2"."C1"="T1"."ID")
2 - access("T2"."C1"="T4"."C1")
3 - access("T2"."C1"="T3"."C1")
4 - filter("T2"."C1"<=200 AND "T2"."C1">=1)
5 - filter("T3"."C1"<=200 AND "T3"."C1">=1)
6 - access("T4"."C1">=1 AND "T4"."C1"<=200)
7 - filter("T1"."ID"<=200 AND "T1"."ID">=1)

SQL> SELECT
2 T1.ID,
3 (SELECT
4 T2.C1
5 FROM
6 T2
7 WHERE
8 T1.ID=T2.C1) T2_C1,
9 (SELECT
10 T3.C1
11 FROM
12 T3
13 WHERE
14 T1.ID=T3.C1) T3_C1,
15 (SELECT
16 T4.C1
17 FROM
18 T4
19 WHERE
20 T1.ID=T4.C1) T4_C1
21 FROM
22 T1
23 WHERE
24 T1.ID BETWEEN 1 AND 200;

Execution Plan
----------------------------------------------------------
Plan hash value: 2945978589

----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20002 | 80008 | 9994 (1)|
00:00:41 |
|* 1 | TABLE ACCESS FULL| T2 | 1 | 4 | 21 (0)|
00:00:01 |
|* 2 | TABLE ACCESS FULL| T3 | 1 | 4 | 21 (0)|
00:00:01 |
|* 3 | INDEX RANGE SCAN | IND_T4 | 1 | 4 | 1 (0)|
00:00:01 |
|* 4 | TABLE ACCESS FULL| T1 | 20002 | 80008 | 9994 (1)|
00:00:41 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("T2"."C1"=:B1)
2 - filter("T3"."C1"=:B1)
3 - access("T4"."C1"=:B1)
4 - filter("T1"."ID"<=200 AND "T1"."ID">=1)

If we were to actually run the SQL statements, we might find that the
first runs in about 15 seconds and the second in about 16 seconds,
both with the same number of physical reads. That is no fun, so let's
change the number 200 to 1200 to see what happens.

SET TIMING ON
SET AUTOTRACE TRACEONLY STATISTICS;
ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;
SET ARRAYSIZE 1000

SELECT
T1.ID,
T2.C1 T2_C1,
T3.C1 T3_C1,
T4.C1 T4_C1
FROM
T1,
T2,
T3,
T4
WHERE
T2.C1 BETWEEN 1 AND 1200
AND T2.C1=T3.C1
AND T2.C1=T4.C1
AND T2.C1=T1.ID;

ALTER SYSTEM FLUSH BUFFER_CACHE;
ALTER SYSTEM FLUSH BUFFER_CACHE;

SELECT
T1.ID,
(SELECT
T2.C1
FROM
T2
WHERE
T1.ID=T2.C1) T2_C1,
(SELECT
T3.C1
FROM
T3
WHERE
T1.ID=T3.C1) T3_C1,
(SELECT
T4.C1
FROM
T4
WHERE
T1.ID=T4.C1) T4_C1
FROM
T1
WHERE
T1.ID BETWEEN 1 AND 1200;

Here is the output:
SQL> SELECT
2 T1.ID,
3 T2.C1 T2_C1,
4 T3.C1 T3_C1,
5 T4.C1 T4_C1
6 FROM
7 T1,
8 T2,
9 T3,
10 T4
11 WHERE
12 T2.C1 BETWEEN 1 AND 1200
13 AND T2.C1=T3.C1
14 AND T2.C1=T4.C1
15 AND T2.C1=T1.ID;

76580 rows selected.

Elapsed: 00:00:15.96

Statistics
---------------------------------------------------
1 recursive calls
0 db block gets
83197 consistent gets
83110 physical reads
0 redo size
1288037 bytes sent via SQL*Net to client
1217 bytes received via SQL*Net from client
78 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
76580 rows processed

SQL> SELECT
2 T1.ID,
3 (SELECT
4 T2.C1
5 FROM
6 T2
7 WHERE
8 T1.ID=T2.C1) T2_C1,
9 (SELECT
10 T3.C1
11 FROM
12 T3
13 WHERE
14 T1.ID=T3.C1) T3_C1,
15 (SELECT
16 T4.C1
17 FROM
18 T4
19 WHERE
20 T1.ID=T4.C1) T4_C1
21 FROM
22 T1
23 WHERE
24 T1.ID BETWEEN 1 AND 1200;

76580 rows selected.

Elapsed: 00:01:40.09

Statistics
---------------------------------------------------
1 recursive calls
0 db block gets
10073639 consistent gets
83110 physical reads
0 redo size
1288037 bytes sent via SQL*Net to client
1217 bytes received via SQL*Net from client
78 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
76580 rows processed

The number of consistent gets jumps significantly for the second SQL
statement and so did the execution time (and CPU usage), compare with
the output for T1.ID between 1 and 200:
12732 rows selected.

Elapsed: 00:00:17.54

Statistics
---------------------------------------------------
0 recursive calls
0 db block gets
522390 consistent gets
83108 physical reads
0 redo size
196813 bytes sent via SQL*Net to client
513 bytes received via SQL*Net from client
14 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
12732 rows processed

---

I think that The Frog is on the right track to test the performance of
both approaches.

Charles Hooper
Co-author of "Expert Oracle Practices: Oracle Database Administration
from the Oak Table"
http://hoopercharles.wordpress.com/
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.

From: joel garry on 28 May 2010 19:58

On May 28, 11:38 am, Charles Hooper <hooperc2...(a)yahoo.com> wrote:
> On May 28, 1:23 pm, joel garry <joel-ga...(a)home.com> wrote:
>
> > <mr.frog.to....(a)googlemail.com> wrote:
> > > Thankyou both gentlemen. I appreciate the feedback you have given me.
> > > I am not an Oracle expert so I was not sure what the best approach
> > > was. I will set both up and see which runs faster (I am guessing the
> > > inline due to the way correlated subqueries run).
>
> > > I thank you both very much.
>
> > > Cheers
>
> > > The Frog
>
> > I'm wondering about how you think correlated subqueries run. Seehttp://download.oracle.com/docs/cd/B19306_01/server.102/b14200/querie......
>
> > If you haven't already, get Jonathan's book about the optimizer. See
> > this example: http://jonathanlewis.wordpress.com/2006/11/06/filter-subqueries/
>
> > jg
>
> Joel,
>
> I wonder if it is not just a terminology problem here. Essentially,
> what do you call it when a SELECT statement appears in a column
> position of another query. I think that Walt and The Frog were on the
> same page regarding what they are talking about.
>
> I think that the Oracle definition of a correlated subquery is a
> SELECT statement that appears in the WHERE clause of another SQL
> statement, and one or more columns from the parent SQL statement is
> directly related to one or more columns (or generated values) from the

That's right.
Sql Language Reference manual:
"Using Subqueries

A subquery answers multiple-part questions. For example, to determine
who works in Taylor's department, you can first use a subquery to
determine the department in which Taylor works. You can then answer
the original question with the parent SELECT statement. A subquery in
the FROM clause of a SELECT statement is also called an inline view.
you can nest any number of subqueries in an inline view. A subquery in
the WHERE clause of a SELECT statement is also called a nested
subquery...Oracle performs a correlated subquery when a nested
subquery references a column from a table referred to a parent
statement any number of levels above the subquery."

> SELECT statement in the WHERE clause.http://download.oracle.com/docs/cd/E11882_01/server.112/e10592/statem...
> "The following examples show the general syntax of a correlated
> subquery:
> SELECT select_list
> FROM table1 t_alias1
> WHERE expr operator
> (SELECT column_list
> FROM table2 t_alias2
> WHERE t_alias1.column
> operator t_alias2.column);"
>
> I think that there is a proper name for the case where a subquery
> appears in a column position of another query, but I can't think of
> that name at the moment.

"Scalar subquery," perhaps? I think the O docs are weak on this,
though people seem to have picked up on it when it was introduced in
9i. It's an ANSI definition.

Tom has interesting observations:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1594885400346999596

>
> Here is possibly interesting test case:

> ---
>
> I think that The Frog is on the right track to test the performance of
> both approaches.
>

Thanks once again for demonstrating to us all how to think.

jg
--
@home.com is bogus.
http://www.gennick.com/madness.html

From: Charles Hooper on 29 May 2010 09:42

On May 28, 7:58 pm, joel garry <joel-ga...(a)home.com> wrote:
(snip)
> > I think that there is a proper name for the case where a subquery
> > appears in a column position of another query, but I can't think of
> > that name at the moment.
>
> "Scalar subquery," perhaps? I think the O docs are weak on this,
> though people seem to have picked up on it when it was introduced in
> 9i. It's an ANSI definition.
>
> Tom has interesting observations:http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1...
>
>
>
> > Here is possibly interesting test case:
> > ---
>
> > I think that The Frog is on the right track to test the performance of
> > both approaches.
>
> Thanks once again for demonstrating to us all how to think.
>
> jg

Joel,

Thanks for finding the definition of that term - I have seen it
countless times, but have trouble remembering the term when needed.

http://download.oracle.com/docs/cd/B28359_01/server.111/b28286/expressions013.htm
"A scalar subquery expression is a subquery that returns exactly one
column value from one row. The value of the scalar subquery expression
is the value of the select list item of the subquery. If the subquery
returns 0 rows, then the value of the scalar subquery expression is
NULL. If the subquery returns more than one row, then Oracle returns
an error."

With the definition from Oracle's documentation, a scalar subquery can
also appear in a WHERE clause, if the subquery returns a single row
and a single column:
SELECT
*
FROM
T1
WHERE
C1=(
SELECT
MAX(C1)
FROM
T2
WHERE
STATE='CONFUSED');

This might be what was leading me to forget the term when the subquery
appears in a column position.

The AskTom article does a nice job of explaining why someone might or
might not want to use a *scalar* query in a column position in a SQL
statement.

Incidentally, I have a copy of Tom Kyte's new book on pre-order and a
copy of "Oracle Tuning: The Definitive Reference Second Edition" on
pre-order (the first of that author's books I have ordered - I was
curious after bumping into the author several times on the OTN
forums). I will probably post detailed reviews of both books. I hope
that the second edition of the second book does not show me a "Top 5
Timed Events" report with a value indicating the number of waits for
the CPU, or other odd problems like average single block read times of
2.75 seconds, while the average multi-block read time is 0.127 seconds
like can be observed in a Google Books view of page 16 of the first
edition. Last year I mentioned to the author of this book some of my
concerns (on the OTN forums) about several pages of the first edition,
viewable through Google books, and I am curious to see if corrections
were made for the second edition.

Charles Hooper
Co-author of "Expert Oracle Practices: Oracle Database Administration
from the Oak Table"
http://hoopercharles.wordpress.com/
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.

From: John Hurley on 29 May 2010 17:28

Charles:

> Incidentally, I have a copy of Tom Kyte's new book on pre-order and a
> copy of "Oracle Tuning: The Definitive Reference Second Edition" on
> pre-order (the first of that author's books I have ordered - I was
> curious after bumping into the author several times on the OTN
> forums). I will probably post detailed reviews of both books. I hope
> that the second edition of the second book does not show me a "Top 5
> Timed Events" report with a value indicating the number of waits for
> the CPU, or other odd problems like average single block read times of
> 2.75 seconds, while the average multi-block read time is 0.127 seconds
> like can be observed in a Google Books view of page 16 of the first
> edition. Last year I mentioned to the author of this book some of my
> concerns (on the OTN forums) about several pages of the first edition,
> viewable through Google books, and I am curious to see if corrections
> were made for the second edition.

You are not bringing in the name of the leader of the guide ponys are
you? Not the New York subway bleeder?

Not sure why you are going in this direction here buddy ... it could
get ugly.

From: Charles Hooper on 29 May 2010 19:06

On May 29, 5:28 pm, John Hurley <hurleyjo...(a)yahoo.com> wrote:
> Charles:
>
> > Incidentally, I have a copy of Tom Kyte's new book on pre-order and a
> > copy of "Oracle Tuning: The Definitive Reference Second Edition" on
> > pre-order (the first of that author's books I have ordered - I was
> > curious after bumping into the author several times on the OTN
> > forums). I will probably post detailed reviews of both books. I hope
> > that the second edition of the second book does not show me a "Top 5
> > Timed Events" report with a value indicating the number of waits for
> > the CPU, or other odd problems like average single block read times of
> > 2.75 seconds, while the average multi-block read time is 0.127 seconds
> > like can be observed in a Google Books view of page 16 of the first
> > edition. Last year I mentioned to the author of this book some of my
> > concerns (on the OTN forums) about several pages of the first edition,
> > viewable through Google books, and I am curious to see if corrections
> > were made for the second edition.
>
> You are not bringing in the name of the leader of the guide ponys are
> you? Not the New York subway bleeder?
>
> Not sure why you are going in this direction here buddy ... it could
> get ugly.

I was curious, so I spent good money on the book. For most of the
Oracle books that I have read in the last two years I have posted
detailed reviews of the books on Amazon. The reviews list what I
thought was good about the book, what I thought was great about the
book, errors/errata found in the book's chapters, as well as misc.
points about the book as it is read. Some of my reviews identified a
dozen or more errors in a book, which is a significant problem with
some of the Oracle books on the market. The error must make it
difficult for beginners to learn about Oracle Database.

How long will the review of the "Oracle Tuning: The Definitive
Reference Second Edition" book stretch? I am not sure, but it appears
that I could write about a 4 page review on the first 27 pages of the
first edition of this book - I sincerely hope that some/most of those
issues were addressed in the second edition. I want to make certain
that I obtain a good value from the $44 that I spent on the book. I
do not think that it too much to ask for accuracy from a published
book, especially when the book cost more than $40.

Some of the pages in the chapters that I co-authored in the "Expert
Oracle Practices" required more than 8 hours of research, typing,
verification, testing, proofing, re-testing, and formatting per page.
I believe that Raldolf spent roughly the same amount of time on the
pages that he wrote for the chapters. I suspect that many of the very
good books that are on the market for Oracle also required about the
same amount of time per page. Certainly, it can't be too much to ask
that other book authors put effort into testing and verification of
what they publish?

The author's website states about the the "Oracle Tuning: The
Definitive Reference Second Edition" book: "This is not a book for
beginners. Targeted at the senior Oracle DBA, this comprehensive book
gives you all of the knowledge you need to be successful in tuning
even the most complex Oracle database." So, I will be looking for the
portions of the book's contents that are comprehensive and targeted at
the senior Oracle DBA.

Charles Hooper
Co-author of "Expert Oracle Practices: Oracle Database Administration
from the Oak Table"
http://hoopercharles.wordpress.com/
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.

First | Prev | Next | Last
Pages: 1 2 3
Prev: Rman backup of BCV using ASM
Next: CLOB