From: shorti on 19 Jul 2010 18:16 DB2 V9.5 on AIX, We received an error on one of our machines that the DB2 active log filesystem was full. I am assuming it was due to some uncommitted transaction so I am trying to hunt down who it was. To do this I was trying to determine what active log file was hung out in order to get a time period of how long the transaction has been open so I can find it among the thousands of transactions. I did confirm we filled up the active log directory and not the archive log directory. I see some potential lock holds in the application snapshot but even with that it will be difficult to determine which one was held for a long period of time since I dont see anything in the snapshot that tells me when the first uncommitted transaction started. Do any of these entries in the diag log or the entries I received from the database snapshot give any information about the longest uncommitted transaction or the active log that could not archive? I have researched the logs stated in some of the entries below and I dont see any indication that they truly are the log that had the uncommitted transaction: 2010-07-15-04.24.58.172043+000 I942782A418 LEVEL: Error PID : 2203654 TID : 9254 PROC : db2sysc 0 INSTANCE: afuser NODE : 000 DB : ALPHADB EDUID : 9254 EDUNAME: db2loggr (ALPHADB) 0 FUNCTION: DB2 UDB, data protection services, sqlpgadf, probe:1970 RETCODE : ZRC=0x850F000C=-2062614516=SQLO_DISK "Disk full." DIA8312C Disk was full. 2010-07-15-04.24.58.172374+000 I943201A390 LEVEL: Info PID : 2203654 TID : 9254 PROC : db2sysc 0 INSTANCE: afuser NODE : 000 DB : ALPHADB EDUID : 9254 EDUNAME: db2loggr (ALPHADB) 0 FUNCTION: DB2 UDB, data protection services, sqlpgadf, probe:1970 DATA #1 : <preformatted> Last log file not required: S0016020.LOG. 2010-07-15-04.24.58.172554+000 I943592A397 LEVEL: Info PID : 2203654 TID : 9254 PROC : db2sysc 0 INSTANCE: afuser NODE : 000 DB : ALPHADB EDUID : 9254 EDUNAME: db2loggr (ALPHADB) 0 FUNCTION: DB2 UDB, data protection services, sqlpgasn, probe:190 DATA #1 : <preformatted> Error -2062614516 from add active log is ignored. --------------------------------------------------------------------------- From Snapshot: Node number = 0 File number of first active log = 16034 File number of last active log = 16048 File number of current active log = 16035 File number of log being archived = 15838 First active log file = S0016034.LOG ----------------------------------------------------------------------------------------------------------- The db2diag.log shows S0015718.LOG was the last file archived before hitting the first "Disk was full" error and an sql -968. But I see that we continued to archive until the machine was rebooted about 30 minutes later. I also see the archive logging before hand seems steady and reasonable as well so nothing to indicate a log held open. I want to also mention a online database table/index reorg was going on and had started about 20 minutes prior to seeing the first error. I know the online reorg will accumulate a lot more logging. I dont suspect the reorg being a problem but it would have escalated the number of logs being accumulated so probably irritated the problem of an open transaction. We monitor the archive logging space during online reorg to ensure we do not run out of space...that is why I know we didnt run out of archive space also. Any help on tracking this down with the logging I have would be helpful. Thanks!
From: Lennart Jonsson on 19 Jul 2010 20:12 On 2010-07-20 00:16, shorti wrote: > DB2 V9.5 on AIX, > > We received an error on one of our machines that the DB2 active log > filesystem was full. I am assuming it was due to some uncommitted > transaction so I am trying to hunt down who it was. To do this I was > trying to determine what active log file was hung out in order to get > a time period of how long the transaction has been open so I can find > it among the thousands of transactions. I did confirm we filled up > the active log directory and not the archive log directory. > Perhaps something like: select x.agent_id, substr(z.corr_TOKEN,1,12) as ip, cast(y.stmt_text as varchar(800)) as stmt, x.locks_held, x.APPL_IDLE_TIME, x.UOW_START_TIME from sysibmadm.SNAPAPPL x join sysibmadm.snapstmt y on x.agent_id = y.agent_id join sysibmadm.SNAPAPPL_INFO z on x.agent_id = z.agent_id where UOW_STOP_TIME is null and TIMESTAMPDIFF(8,CHAR(current_timestamp - UOW_START_TIME)) > 8 order by UOW_START_TIME; You might be interested in other attributes, and it is assumed that the right monitors are on. /Lennart [...]
|
Pages: 1 Prev: Locked by agent id 0 huh!! Next: Q about DB2 Express-C and Admin Tools |