From: UXDBA on
All,


at Oracle 9.2.0.8

we are aiming to delete a huge number of rows (old data) from a table
T1.

Since we cannot afford an outage, CTAS and other options are ruled
out.

The method being adopted is (in the Test env first):

a) Mark the rows to be deleted, i.e. update one column that exists
exclusively for archiving/deletion
b) Delete the marked rows in batches
c) Use online redefinition to reclaim the storage space and lower the
HWM.

Further details:

1) Table T1 has no CLOBs (columns are varchar, number, and date types)
2) T1 has an "objid" column as primary key, NUMBER datatype
3) The "dev" column is of type NUMBER
4) The "dev" column has a normal B*Tree index
5) Based on the business rule for "unused data" we set the "dev"
column to 1. The default value of the "dev" column is NULL. (A rough
DDL sketch follows.)
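
Roughly, in DDL terms (column list abbreviated; names other than
objid, dev, and start_time are illustrative):

CREATE TABLE T1 (
   objid        NUMBER PRIMARY KEY,
   dev          NUMBER,       -- NULL by default; 1 = marked for deletion
   start_time   DATE
   -- ... plus the remaining varchar/number/date columns
);

CREATE INDEX t1_dev_idx ON T1 (dev);   -- normal B*Tree index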

We use the following function to set the "dev" column to 1

f1) ...FN_Mark_for_Del
UPDATE T1
   SET dev = 1
 WHERE start_time < (SYSDATE - no_days)
   AND some_business_rule..........;

Then we use the following function to delete the rows.

f2) ...FN_del_T1
FUNCTION fn_delete_T1 (batch_size IN NUMBER, max_rows IN NUMBER)
   RETURN BOOLEAN
AS
   cnt         NUMBER;
   row_count   NUMBER;
BEGIN
   SAVEPOINT checkhere;

   -- how many rows are currently marked for deletion
   SELECT COUNT (objid)
     INTO row_count
     FROM T1
    WHERE dev = 1;

   cnt := 0;
   DBMS_OUTPUT.put_line (   'Total number of rows to be deleted from T1 ['
                         || LEAST (row_count, max_rows)
                         || '] in batches of ['
                         || batch_size
                         || '] rows'
                        );
   DBMS_OUTPUT.put_line (   'Starting deletion at ['
                         || TO_CHAR (SYSDATE, 'dd-MM-yyyy hh24:mi:ss')
                         || ']'
                        );

   WHILE (cnt < row_count AND (cnt + batch_size <= max_rows))
   LOOP
      DELETE FROM T1
            WHERE dev = 1 AND ROWNUM <= batch_size;

      cnt := cnt + SQL%ROWCOUNT;
      -- note: COMMIT erases the savepoint, so after the first batch the
      -- ROLLBACK TO in the exception handler would raise ORA-01086
      COMMIT;
      DBMS_OUTPUT.put_line ('Deleted [' || cnt || '] rows from T1');
   END LOOP;

   DBMS_OUTPUT.put_line (   'End of deletion at ['
                         || TO_CHAR (SYSDATE, 'dd-MM-yyyy hh24:mi:ss')
                         || ']'
                        );
   DBMS_OUTPUT.put_line ('Deleted [' || cnt || '] rows from T1');
   RETURN TRUE;
EXCEPTION
   WHEN OTHERS
   THEN
      ROLLBACK TO checkhere;
      DBMS_OUTPUT.put_line (   'Error processing. Sql error code is ['
                            || SQLCODE
                            || ']'
                           );
      RETURN FALSE;
END fn_delete_T1;
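
For reference, the driver block we run from SQL*Plus looks roughly
like this (the function actually lives in a package, as the Statspack
captures below show):

SET SERVEROUTPUT ON

DECLARE
   retval   BOOLEAN;
BEGIN
   -- delete up to 1,000,000 marked rows in batches of 500
   retval := pkg_archive_del.fn_delete_T1 (500, 1000000);
   COMMIT;
END;
/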




We used f1 to mark the data for deletion:

Table      Total Count      Marked for Del
T1             6469204             6177615


Attempted to delete the rows in a controlled manner in the following
phases:

phase-1 : batch size  500, records deleted 1000000 --> time taken
00:05:49.09, Undo generated 1 GB, Redo 1.5 GB
phase-2 : batch size 1000, records deleted 1000000 --> time taken
00:07:23.8, Undo generated 1 GB, Redo 1.5 GB
phase-3 : batch size  250, records deleted 1000000 --> time taken
00:29:59.9, Undo generated 1 GB, Redo 1.5 GB
phase-4 : batch size  500, records deleted 1000000 --> time taken
00:22:23.31, Undo generated 1 GB, Redo 1.5 GB

The above was done just to benchmark the batch_size and no_of_rows
deleted, so that we can use those figures in live.

We can see a huge difference in time between phase-1 and phase-4
(same batch size, same number_of_rows deleted).

Excerpts from the Statspack report (phase-1):

...
Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                    ---------------       ---------------
              Redo size:               5,465,377.22          1,039,493.31
          Logical reads:                  66,886.79             12,721.60
          Block changes:                  44,316.35              8,428.80
         Physical reads:                   3,898.02                741.39
        Physical writes:                   1,498.68                285.04
             User calls:                      61.01                 11.60
                 Parses:                      31.51                  5.99
            Hard parses:                       0.01                  0.00
                  Sorts:                       0.32                  0.06
                 Logons:                       0.00                  0.00
               Executes:                      37.43                  7.12
           Transactions:                       5.26
...
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 100.00
Buffer Hit %: 94.17 In-memory Sort %: 100.00
Library Hit %: 99.99 Soft Parse %: 99.98
Execute to Parse %: 15.82 Latch Hit %: 99.98
Parse CPU to Parse Elapsd %: 120.00 % Non-Parse CPU: 99.94

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 32.99 33.00
% SQL with executions>1: 91.76 91.72
% Memory for SQL w/exec>1: 84.31 84.28

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                % Total
Event                                          Waits    Time (s)  Ela Time
-------------------------------------------- ------------ -------- --------
CPU time                                                       205    39.03
log file parallel write                             4,494      130    24.71
db file sequential read                         1,511,549      124    23.65
global cache busy                                      60       45     8.62
write complete waits                                   47        7     1.29

-------------------------------------------------------------

....
                                                     CPU      Elapsd
  Buffer Gets   Executions  Gets per Exec  %Total Time (s)  Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
     25,865,837            1   25,865,837.0   99.7   196.90    340.57 2165703550
Module: SQL*Plus
DECLARE RetVal BOOLEAN; BATCH_SIZE NUMBER; MAX_ROWS NUMBER;
BEGIN BATCH_SIZE := NULL; MAX_ROWS := NULL;
RetVal := USERNAME.PKG_ARCHIVE_DEL.fn_delete_T1 ( 500, 1000000 );
COMMIT; END;

     25,842,514        2,000       12,921.3   99.6   193.00    336.26 3890394561
Module: SQL*Plus
DELETE FROM T1 WHERE DEV = 1 AND ROWNUM <= :B1





Excerpts from the Statspack report (phase-4):

....
Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                    ---------------       ---------------
              Redo size:               1,438,322.45            992,682.66
          Logical reads:                  31,594.86             21,805.73
          Block changes:                  11,635.91              8,030.72
         Physical reads:                  16,059.68             11,083.86
        Physical writes:                     818.52                564.92
             User calls:                      61.15                 42.21
                 Parses:                      32.80                 22.64
            Hard parses:                       0.00                  0.00
                  Sorts:                       0.37                  0.25
                 Logons:                       0.06                  0.04
               Executes:                      34.67                 23.93
           Transactions:                       1.45
...
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 99.99 Redo NoWait %: 100.00
Buffer Hit %: 49.17 In-memory Sort %: 100.00
Library Hit %: 100.00 Soft Parse %: 100.00
Execute to Parse %: 5.41 Latch Hit %: 99.68
Parse CPU to Parse Elapsd %: 54.84 % Non-Parse CPU: 99.97

Shared Pool Statistics Begin End
------ ------
Memory Usage %: 33.05 33.06
% SQL with executions>1: 91.44 91.42
% Memory for SQL w/exec>1: 84.07 84.05

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                                % Total
Event                                          Waits    Time (s)  Ela Time
-------------------------------------------- ------------ -------- --------
CPU time                                                     1,271    66.76
db file sequential read                        23,891,385      472    24.78
log file parallel write                             4,272      121     6.37
latch free                                          5,188       14      .71
DFS lock handle                                     2,646        9      .46

-------------------------------------------------------------
...
                                                     CPU      Elapsd
  Buffer Gets   Executions  Gets per Exec  %Total Time (s)  Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
     46,644,584            1   46,644,584.0   99.2  1244.23   1311.65  733080457
Module: SQL*Plus
DECLARE RetVal BOOLEAN; BATCH_SIZE NUMBER; MAX_ROWS NUMBER;
BEGIN BATCH_SIZE := NULL; MAX_ROWS := NULL;
RetVal := USERNAME.PKG_ARCHIVE_DEL.fn_delete_T1 (500, 1000000);
COMMIT; END;

     46,621,261        2,000       23,310.6   99.2  1240.47   1307.92 3890394561
Module: SQL*Plus
DELETE FROM T1 WHERE DEV = 1 AND ROWNUM <= :B1



Questions

q1) What is phase-4 waiting for? It took 4x longer than phase-1 for
the same number of rows, gets per execution in phase-4 are almost
double those of phase-1, and the buffer hit ratio is poor.
q2) Is "delayed block cleanout" playing some role here?
q3) Are the functions used, (f1) and (f2), efficient? Can something
be suggested here to improve their efficiency?
q4) The functions use neither ROWID nor an explicit cursor. Would
using these help?
q5) Would indexing the "dev" column with a bitmap index help? (In the
production env this table T1 is subject to heavy insert/update
activity.)



Regards

From: joel garry on
On Jan 28, 5:16 am, UXDBA <unixdb...(a)googlemail.com> wrote:

>
> q1) What is phase-4 waiting for? It took 4x longer than phase-1 for
> the same number of rows, gets per execution in phase-4 are almost
> double those of phase-1, and the buffer hit ratio is poor.

Well, the sequential read waits probably mean you are using an index
to figure out which blocks to update. Of course the buffer hit ratio
would be poor: you are jumping all over the place to grab random
blocks. How big is the table (size and rows)? How long does it take
to do a full scan of the table? Sometimes it is faster to use full
scans because multiblock reads can help, but we have no clue here of
your total sizes. Hit ratio tuning is silly anyway, but here it may
be a clue that you are doing it wrong. But you may have no choice.
Don't be afraid of large undo, in any case.

> q2) Is "delayed block cleanout" playing some role here?

Before and in between phases, select some field that is not indexed to
force a full table scan and cleanup of the blocks. If that makes the
times quicker and more stable, it was likely delayed block cleanout.
If I'm not totally confused.
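
A minimal sketch of such a probe (the column name is a placeholder
for any unindexed column of T1):

SELECT /*+ FULL(t) */ COUNT(some_unindexed_column)
  FROM T1 t;

Counting an unindexed column keeps the optimizer off the indexes, so
every table block gets visited and any pending cleanout gets done.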

jg
--
@home.com is bogus. "Pick any combination of four numbers plus a
percentage point, use it as your story's stardate. For example, 1313.5
is twelve o'clock noon of one day and 1314.5 would be noon the next
day. Each percentage point is roughly equivalent to one-tenth of a
day. The progression of stardates in your script should remain
constant but don't worry about whether or not there is a progression
from other scripts. Stardates are a mathematical formula which varies
depending on location in the galaxy, velocity of travel, and other
factors, can vary widely from episode to episode." - Star Trek
writer's bible
From: UXDBA on
On Jan 28, 1:16 pm, UXDBA <unixdb...(a)googlemail.com> wrote:
> [snip]

Thanks Joel

T1 count(*) - 6469204

and size is just 850 MB.

Regards

From: joel garry on
On Jan 28, 9:56 am, UXDBA <unixdb...(a)googlemail.com> wrote:


> Thanks Joel
>
> T1 count(*) - 6469204

Sorry I missed that in the OP.

> and size is just 850 MB.

Is there some reason you can't just do the setting of the mark for
delete flag and do it all at once, commit at the end? That will
probably force it to do a full table scan, switching your sequential
reads to scattered reads, and perhaps moving the bottleneck to redo
writing, where it perhaps should be.
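
A sketch of what that could look like, reusing the predicates from
your f1 (bind/placeholder names assumed):

UPDATE T1
   SET dev = 1
 WHERE start_time < (SYSDATE - :no_days)
   AND some_business_rule;   -- your existing rule, unchanged

COMMIT;                      -- single commit at the end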

Is your business logic too complex to just do the delete as a single,
perhaps complicated, SQL statement? That would also reduce PL context-
switching, as well as doing the full-table stuff once instead of
twice.
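
If the logic does fit in one statement, the whole mark-then-batch
cycle could collapse to something like this (hint optional, just to
steer it away from the dev index; predicates assumed from your f1):

DELETE /*+ FULL(t) */
  FROM T1 t
 WHERE start_time < (SYSDATE - :no_days)
   AND some_business_rule;

COMMIT;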

There is no rule of thumb for what percentage of deletes will work
better with a full table scan, but some people have shown situations
where even a small percentage of the data being deleted benefits from
just going through a full-scan once. Sometimes this is due to the
order of magnitude performance increase possible with multiblock
reads.

There is a rule of thumb about when to use sql versus pl/sql: Oh, I
see you posted this question to asktom too, see the answer to Tony
Fernandez two questions above your question.
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:2345591157689

jg
--
@home.com is bogus.
http://acme.com/licensemaker/licensemaker.cgi?state=California&text=oracle&plate=1993&r=1987184252
From: UXDBA on
On Jan 28, 10:00 pm, joel garry <joel-ga...(a)home.com> wrote:
> [snip]

Thanks Joel,

> Is your business logic too complex to just do the delete as a
> single, perhaps complicated, SQL statement?

No, it is simple: we can mark the rows to be deleted in small chunks
and then do a simple

delete from ... where ...

Yes, I have seen the posting on AskTom; I will attempt the same.

Regards