From: UXDBA on 28 Jan 2010 08:16

All,

On Oracle 9.2.0.8 we are aiming to delete a huge number of rows (old
data) from a table T1. Since we cannot afford an outage, CTAS and
similar options are ruled out.

The method being tried (in a test environment first) is:

a) Mark the rows to be deleted, i.e. update one column that exists
   exclusively for archiving/deletion.
b) Delete the marked rows in batches.
c) Use online redefinition to reclaim the storage space and lower the
   HWM.

Further details:

1) Table T1 has no CLOBs (columns are VARCHAR, NUMBER and DATE types).
2) T1 has an "objid" column as primary key, NUMBER datatype.
3) The "dev" column is of type NUMBER.
4) The "dev" column has a normal B*Tree index.
5) Based on the business rule for "unused data" we set the "dev"
   column to 1. The default value of "dev" is NULL.

We use the following function to set the "dev" column to 1:

f1) ...FN_Mark_for_Del

   UPDATE T1
      SET dev = 1
    WHERE start_time < (SYSDATE - no_days)
      AND some_business_rule..........;

Then we use the following function to delete the rows:

f2) ...FN_del_T1

   FUNCTION fn_delete_T1 (batch_size IN NUMBER, max_rows IN NUMBER)
      RETURN BOOLEAN
   AS
      cnt         NUMBER;
      row_count   NUMBER;
   BEGIN
      -- Note: each COMMIT in the loop below erases this savepoint, so
      -- the ROLLBACK TO in the exception handler raises ORA-01086 once
      -- the first batch has committed.
      SAVEPOINT checkhere;

      SELECT COUNT (objid)
        INTO row_count
        FROM T1
       WHERE dev = 1;

      cnt := 0;
      DBMS_OUTPUT.put_line (   'Total number of rows to be deleted from T1 ['
                            || LEAST (row_count, max_rows)
                            || '] in batches of ['
                            || batch_size
                            || '] rows');
      DBMS_OUTPUT.put_line (   'Starting deletion at ['
                            || TO_CHAR (SYSDATE, 'dd-MM-yyyy hh24:mi:ss')
                            || ']');

      WHILE (cnt < row_count AND (cnt + batch_size <= max_rows))
      LOOP
         DELETE FROM T1
          WHERE dev = 1 AND ROWNUM <= batch_size;

         cnt := cnt + SQL%ROWCOUNT;
         COMMIT;
         DBMS_OUTPUT.put_line ('Deleted [' || cnt || '] rows from T1');
      END LOOP;

      DBMS_OUTPUT.put_line (   'End of deletion at ['
                            || TO_CHAR (SYSDATE, 'dd-MM-yyyy hh24:mi:ss')
                            || ']');
      DBMS_OUTPUT.put_line ('Deleted [' || cnt || '] rows from T1');
      RETURN TRUE;
   EXCEPTION
      WHEN OTHERS
      THEN
         ROLLBACK TO checkhere;
         DBMS_OUTPUT.put_line (   'Error processing. Sql error code is ['
                               || SQLCODE
                               || ']');
         RETURN FALSE;
   END fn_delete_T1;

We used f1 to mark the data for deletion:

   Table   Total Count   Marked for Del
   T1      6469204       6177615

We attempted to delete the rows in a controlled manner in the
following phases:

   phase-1: batch size 500,  records deleted 1000000 --> time taken
            00:05:49.09, undo generated 1 GB, redo 1.5 GB
   phase-2: batch size 1000, records deleted 1000000 --> time taken
            00:07:23.8,  undo generated 1 GB, redo 1.5 GB
   phase-3: batch size 250,  records deleted 1000000 --> time taken
            00:29:59.9,  undo generated 1 GB, redo 1.5 GB
   phase-4: batch size 500,  records deleted 1000000 --> time taken
            00:22:23.31, undo generated 1 GB, redo 1.5 GB

The above was done to benchmark batch_size and number of rows deleted,
so that we can use those figures in live. We can see a huge difference
in time between phase-1 and phase-4 (same batch size, same number of
rows deleted).

Excerpts from the Statspack report (phase-1):

...
Load Profile
~~~~~~~~~~~~                            Per Second      Per Transaction
                                   ---------------      ---------------
                  Redo size:          5,465,377.22         1,039,493.31
              Logical reads:             66,886.79            12,721.60
              Block changes:             44,316.35             8,428.80
             Physical reads:              3,898.02               741.39
            Physical writes:              1,498.68               285.04
                 User calls:                 61.01                11.60
                     Parses:                 31.51                 5.99
                Hard parses:                  0.01                 0.00
                      Sorts:                  0.32                 0.06
                     Logons:                  0.00                 0.00
                   Executes:                 37.43                 7.12
               Transactions:                  5.26
...
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:  100.00       Redo NoWait %:  100.00
            Buffer  Hit   %:   94.17    In-memory Sort %:  100.00
            Library Hit   %:   99.99        Soft Parse %:   99.98
         Execute to Parse %:   15.82         Latch Hit %:   99.98
Parse CPU to Parse Elapsd %:  120.00     % Non-Parse CPU:   99.94

 Shared Pool Statistics        Begin   End
                               ------  ------
             Memory Usage %:   32.99   33.00
    % SQL with executions>1:   91.76   91.72
  % Memory for SQL w/exec>1:   84.31   84.28

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                            % Total
Event                                        Waits    Time (s) Ela Time
------------------------------------- ------------ ----------- --------
CPU time                                                    205    39.03
log file parallel write                       4,494         130    24.71
db file sequential read                   1,511,549         124    23.65
global cache busy                                60          45     8.62
write complete waits                             47           7     1.29
-------------------------------------------------------------
...
                                                     CPU      Elapsd
  Buffer Gets   Executions  Gets per Exec  %Total Time (s)  Time (s) Hash Value
--------------- ----------- -------------- ------ -------- --------- ----------
     25,865,837           1   25,865,837.0   99.7   196.90    340.57 2165703550
Module: SQL*Plus
DECLARE RetVal BOOLEAN; BATCH_SIZE NUMBER; MAX_ROWS NUMBER;
BEGIN BATCH_SIZE := NULL; MAX_ROWS := NULL; RetVal :=
USERNAME.PKG_ARCHIVE_DEL.fn_delete_T1 ( 500, 1000000 ); COMMIT;
END;

     25,842,514       2,000       12,921.3   99.6   193.00    336.26 3890394561
Module: SQL*Plus
DELETE FROM T1 WHERE DEV = 1 AND ROWNUM <= :B1

Excerpts from the Statspack report (phase-4):

...
Load Profile
~~~~~~~~~~~~                            Per Second      Per Transaction
                                   ---------------      ---------------
                  Redo size:          1,438,322.45           992,682.66
              Logical reads:             31,594.86            21,805.73
              Block changes:             11,635.91             8,030.72
             Physical reads:             16,059.68            11,083.86
            Physical writes:                818.52               564.92
                 User calls:                 61.15                42.21
                     Parses:                 32.80                22.64
                Hard parses:                  0.00                 0.00
                      Sorts:                  0.37                 0.25
                     Logons:                  0.06                 0.04
                   Executes:                 34.67                23.93
               Transactions:                  1.45
...
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:   99.99       Redo NoWait %:  100.00
            Buffer  Hit   %:   49.17    In-memory Sort %:  100.00
            Library Hit   %:  100.00        Soft Parse %:  100.00
         Execute to Parse %:    5.41         Latch Hit %:   99.68
Parse CPU to Parse Elapsd %:   54.84     % Non-Parse CPU:   99.97

 Shared Pool Statistics        Begin   End
                               ------  ------
             Memory Usage %:   33.05   33.06
    % SQL with executions>1:   91.44   91.42
  % Memory for SQL w/exec>1:   84.07   84.05

Top 5 Timed Events
~~~~~~~~~~~~~~~~~~                                            % Total
Event                                        Waits    Time (s) Ela Time
------------------------------------- ------------ ----------- --------
CPU time                                                  1,271    66.76
db file sequential read                  23,891,385         472    24.78
log file parallel write                       4,272         121     6.37
latch free                                    5,188          14      .71
DFS lock handle                               2,646           9      .46
-------------------------------------------------------------
...
                                                     CPU      Elapsd
  Buffer Gets   Executions  Gets per Exec  %Total Time (s)  Time (s) Hash Value
--------------- ----------- -------------- ------ -------- --------- ----------
     46,644,584           1   46,644,584.0   99.2  1244.23   1311.65  733080457
Module: SQL*Plus
DECLARE RetVal BOOLEAN; BATCH_SIZE NUMBER; MAX_ROWS NUMBER;
BEGIN BATCH_SIZE := NULL; MAX_ROWS := NULL; RetVal :=
USERNAME.PKG_ARCHIVE_DEL.fn_delete_T1 (500, 1000000 ); COMMIT;
END;

     46,621,261       2,000       23,310.6   99.2  1240.47   1307.92 3890394561
Module: SQL*Plus
DELETE FROM T1 WHERE DEV = 1 AND ROWNUM <= :B1

Questions

q1) What is phase-4 waiting for? [It took 4x longer than phase-1 for
    the same number of rows, gets per execution are almost double
    those of phase-1, and the buffer hit ratio is poor.]
q2) Is "delayed block cleanout" playing some role here?
q3) Are the functions used (f1 and f2) efficient? Can anything be
    suggested to improve their efficiency?
q4) The functions use neither ROWID nor an explicit cursor; would
    using these help?
q5) Would a bitmap index on the "dev" column help? (In the production
    environment this table T1 is subject to heavy insert/update
    activity.)
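For reference, the step (c) reclaim is planned roughly along these
lines on 9.2 (a minimal sketch only: the schema name APPUSER and the
interim table T1_INTERIM are placeholders, and on 9.2 indexes,
constraints, triggers and grants must be recreated manually on the
interim table before the finish step):

   -- 1) Verify T1 can be redefined (PK-based by default):
   EXEC DBMS_REDEFINITION.CAN_REDEF_TABLE ('APPUSER', 'T1');

   -- 2) Create an empty interim table with the same shape:
   CREATE TABLE t1_interim AS SELECT * FROM t1 WHERE 1 = 0;

   -- 3) Start the redefinition (straight column mapping):
   EXEC DBMS_REDEFINITION.START_REDEF_TABLE ('APPUSER', 'T1', 'T1_INTERIM');

   -- 4) Recreate indexes/constraints/grants on T1_INTERIM by hand,
   --    resyncing as needed so the final swap stays short:
   EXEC DBMS_REDEFINITION.SYNC_INTERIM_TABLE ('APPUSER', 'T1', 'T1_INTERIM');

   -- 5) Swap the tables (only a brief exclusive lock):
   EXEC DBMS_REDEFINITION.FINISH_REDEF_TABLE ('APPUSER', 'T1', 'T1_INTERIM');

Regards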
From: joel garry on 28 Jan 2010 12:22

On Jan 28, 5:16 am, UXDBA <unixdb...(a)googlemail.com> wrote:
>
> q1) What is phase-4 waiting for? [It took 4x longer than phase-1 for
>     the same number of rows, gets per execution are almost double
>     those of phase-1, and the buffer hit ratio is poor.]

Well, the sequential read waits probably mean you are using an index
to figure out which blocks to update. Of course the buffer hit ratio
would be poor, you are jumping all over the place to grab random
blocks.

How big is the table (size and rows)? How long does it take to do a
full scan of the table? Sometimes it is faster to use full scans
because multiblock reads can help, but we have no clue here of your
total sizes. Hit ratio tuning is silly anyway, but here it may be a
clue that you may be doing it wrong. But you may have no choice.
Don't be afraid of large undo, in any case.

> q2) Is "delayed block cleanout" playing some role here?

Before and in between phases, select some field that is not indexed to
force a full table scan and cleanout of the blocks. If that makes the
times quicker and more stable, it was likely delayed block cleanout.
If I'm not totally confused.
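For example, something like this (a sketch only; "some_unindexed_col"
stands for any column of T1 with no index on it, so the count has to
visit the table blocks):

   SELECT /*+ FULL(t) */ COUNT (some_unindexed_col)
     FROM t1 t;

jg
--
@home.com is bogus.
"Pick any combination of four numbers plus a percentage point, use it
as your story's stardate. For example, 1313.5 is twelve o'clock noon
of one day and 1314.5 would be noon the next day. Each percentage
point is roughly equivalent to one-tenth of a day. The progression of
stardates in your script should remain constant but don't worry about
whether or not there is a progression from other scripts. Stardates
are a mathematical formula which varies depending on location in the
galaxy, velocity of travel, and other factors, can vary widely from
episode to episode." - Star Trek writer's bible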
From: UXDBA on 28 Jan 2010 12:56

On Jan 28, 1:16 pm, UXDBA <unixdb...(a)googlemail.com> wrote:
> On Oracle 9.2.0.8 we are aiming to delete a huge number of rows (old
> data) from a table T1. Since we cannot afford an outage, CTAS and
> similar options are ruled out.
[snip - rest of the original post trimmed]

Thanks Joel

T1 count(*) - 6469204

and size is just 850 MB.
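(Size from dba_segments, along these lines - a sketch, with the owner
name APPUSER as a placeholder:)

   SELECT ROUND (SUM (bytes) / 1024 / 1024) AS size_mb
     FROM dba_segments
    WHERE owner = 'APPUSER'
      AND segment_name = 'T1';

Regards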
From: joel garry on 28 Jan 2010 17:00

On Jan 28, 9:56 am, UXDBA <unixdb...(a)googlemail.com> wrote:
>
> Thanks Joel
>
> T1 count(*) - 6469204

Sorry I missed that in the OP.

> and size is just 850 MB.

Is there some reason you can't just set the mark-for-delete flag all
at once and commit at the end? That will probably force it to do a
full table scan, switching your sequential reads to scattered reads,
and perhaps moving the bottleneck to redo writing, where it perhaps
should be.

Is your business logic too complex to just do the delete as a single,
perhaps complicated, SQL statement? That would also reduce PL/SQL
context switching, as well as doing the full-table work once instead
of twice.

There is no rule of thumb for what percentage of deletes will work
better with a full table scan, but some people have shown situations
where even a small percentage of the data being deleted benefits from
just going through a full scan once. Sometimes this is due to the
order-of-magnitude performance increase possible with multiblock
reads.

There is a rule of thumb about when to use SQL versus PL/SQL: oh, I
see you posted this question to asktom too; see the answer to Tony
Fernandez two questions above yours.
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:2345591157689
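In other words, something along these lines may be worth timing (a
sketch only; the start_time predicate and the commented placeholder
stand in for the real business rule, and the FULL hint just makes the
intended access path explicit):

   DELETE /*+ FULL(t) */
     FROM t1 t
    WHERE start_time < SYSDATE - :no_days
      AND 1 = 1 /* ...some_business_rule... */;

   COMMIT;

jg
--
@home.com is bogus.
http://acme.com/licensemaker/licensemaker.cgi?state=California&text=oracle&plate=1993&r=1987184252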
From: UXDBA on 29 Jan 2010 05:05
On Jan 28, 10:00 pm, joel garry <joel-ga...(a)home.com> wrote:
[snip]
> Is your business logic too complex to just do the delete as a single,
> perhaps complicated, SQL statement?
[snip]

Thanks Joel,

No, it is simple: we can mark the rows to be deleted in small chunks
and then do a simple DELETE FROM ... WHERE ...

Yes, I have seen the posting on asktom. Will attempt the same, along
these lines:
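(A sketch only; the chunk size of 10000 and the start_time predicate
are placeholders for the real business rule:)

   -- Mark in modest chunks to cap undo per transaction; "dev IS NULL"
   -- keeps each pass from revisiting rows already marked.
   BEGIN
      LOOP
         UPDATE t1
            SET dev = 1
          WHERE dev IS NULL
            AND start_time < SYSDATE - :no_days
            AND ROWNUM <= 10000;

         EXIT WHEN SQL%ROWCOUNT = 0;
         COMMIT;
      END LOOP;

      COMMIT;
   END;
   /

   -- Then one simple delete of everything marked:
   DELETE FROM t1 WHERE dev = 1;
   COMMIT;

Regards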