From: Dusan Bolek on 2 Nov 2005 02:44

We're encountering many "Global Enqueue Services Deadlock detected" messages in the alert log of one of our database instances (a two-node RAC database), together with quite high "global cache cr request" waits in the STATSPACK report. The problem is that the only other available information is a trace file that looks like:

*** 2005-09-19 10:37:01.153
Global Wait-For-Graph(WFG) at ddTS[0.0] :
BLOCKED 70000016f956fd8 5 [0x2b90011][0x5f20],[TX] [131321,1285] 1
BLOCKER 70000016b5ff288 5 [0x2b90011][0x5f20],[TX] [131317,1159] 1
BLOCKED 70000016b1c8440 5 [0x3fe001d][0x1a1d],[TX] [131317,1159] 1
BLOCKER 700000199574ea0 5 [0x3fe001d][0x1a1d],[TX] [131321,1285] 1

and sometimes also like:

----------enqueue------------------------
lock version     : 10665
Owner node       : 0
grant_level      : KJUSEREX
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : 0
resp             : 7000001b13711d8
procp            : 700000160d71fa0
pid              : 5365832
proc version     : 39
oprocp           : 0
opid             : 0
gid              : 618
xid              : 67532 66652
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : GRANTED
Open Options     : KJUSERDEADLOCK
Convert options  : KJUSERNOQUEUE
History          : 0x9a514395
Msg_Seq          : 0x0
res_seq          : 0

The problem is that it seems no one, including local Oracle support, knows what the values in this trace file mean. We are also pursuing this with Oracle Worldwide Support, but that can take a lot of time. Has anyone investigated this kind of trace file before? A search on Metalink turned up only a couple of similar, but never answered, questions.

Thank you

Dusan Bolek
From: Dusan Bolek on 2 Nov 2005 02:47

Database version is 9.2.0.6, OS is AIX 5.2 (64-bit). Forgot to mention that. Sorry.

Dusan
From: Alexey Sergeyev on 2 Nov 2005 05:36

Hi Dusan!

Yes, we've experienced a problem like that. Due to some issues with bitmap indexes we were periodically getting ORA-00060, with very similar traces (I got those by setting the '60 trace name ERRORSTACK level 1' event). The traces weren't very informative, and I opened a TAR asking for more accurate diagnostics. The Oracle WWS suggestion was: "... systemstate level 10 is what you need, though the trace information is really **only to be read by Oracle** (there are no docs on how to read it)...". So I'm afraid that, in the same way, your traces are useful to Oracle Support only...

Alexey Sergeyev
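For reference, the event Alexey mentions is typically set along these lines; this is a minimal sketch from 9i-era syntax, not taken from the thread, and you would normally only set it under Oracle Support's guidance:

-- Session level: dump an error stack whenever ORA-00060 is raised in this session
ALTER SESSION SET EVENTS '60 trace name errorstack level 1';

-- Instance level, if an spfile is in use (takes effect at the next startup)
ALTER SYSTEM SET EVENT = '60 trace name errorstack level 1' SCOPE = SPFILE;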
From: yong321 on 2 Nov 2005 13:53

Dusan Bolek wrote:
> *** 2005-09-19 10:37:01.153
> Global Wait-For-Graph(WFG) at ddTS[0.0] :
> BLOCKED 70000016f956fd8 5 [0x2b90011][0x5f20],[TX] [131321,1285] 1
> BLOCKER 70000016b5ff288 5 [0x2b90011][0x5f20],[TX] [131317,1159] 1
> BLOCKED 70000016b1c8440 5 [0x3fe001d][0x1a1d],[TX] [131317,1159] 1
> BLOCKER 700000199574ea0 5 [0x3fe001d][0x1a1d],[TX] [131321,1285] 1

Dusan,

Metalink Note 262226.1 offers some interpretation. Your process 131321 on node 1 is waiting for a TX lock in mode 5 (the two hex numbers, 0x2b90011 and 0x5f20, may be id1 and id2 in v$lock, respectively). Process 131317 on the same node is holding it. Unlike a single-instance deadlock graph, a Global Wait-For-Graph does not include the current SQL. The good news for you is that your database is capable of producing this deadlock within a single instance. So set cluster_database to false and try to reproduce the deadlock to get a better trace file.

Note 181489.1 has an interesting section about "global cache cr request", but I think it's unrelated to the above deadlock.

> ----------enqueue------------------------
> lock version : 10665
> Owner node : 0
> ...
> pid : 5365832

That looks like lkdebug output and is about a different instance and a different process. But confirm with Oracle support.

Yong Huang
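If the two hex numbers really are id1 and id2 (an assumption based on Yong's reading above), a query along these lines against GV$LOCK would show the holder and waiter while the contention is live; the view and column names are standard 9i, but treat this as a sketch rather than a verified procedure:

-- TX enqueue lookup; 0x2B90011 and 0x5F20 are the values from the WFG above
SELECT inst_id, sid, type, id1, id2, lmode, request, block
FROM   gv$lock
WHERE  type = 'TX'
AND    id1  = TO_NUMBER('2B90011', 'XXXXXXXX')   -- 45678609 decimal
AND    id2  = TO_NUMBER('5F20', 'XXXXXXXX');     -- 24352 decimal

-- Yong's single-instance suggestion (assumes an spfile; requires a restart and
-- mounting only one instance afterwards)
ALTER SYSTEM SET cluster_database = FALSE SCOPE = SPFILE SID = '*';

Once the deadlock reproduces in single-instance mode, the ordinary ORA-00060 trace should contain the deadlock graph together with the current SQL, which is what the global WFG is missing.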
From: Dusan Bolek on 3 Nov 2005 06:50
Thank you, that looks like something. I will try to catch a deadlock message in the alert log as quickly as possible, then find the responsible user (via the pid) and, from V$SQL, the statement that could be causing this trouble. It is not a 100% reliable way, but it could be useful.

Dusan
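A rough sketch of the lookup Dusan describes, assuming the pid in the enqueue dump is the OS process id (SPID) of the server process; the GV$ views and columns are the standard 9i ones, but there is no guarantee the statement is still in the shared pool by the time you look:

-- Map the OS pid from the trace to its session and the SQL it is/was running
SELECT s.inst_id, s.sid, s.serial#, s.username, q.sql_text
FROM   gv$process p,
       gv$session s,
       gv$sql     q
WHERE  p.spid       = '5365832'          -- pid taken from the enqueue dump
AND    s.inst_id    = p.inst_id
AND    s.paddr      = p.addr
AND    q.inst_id    = s.inst_id
AND    q.address    = s.sql_address      -- current statement; try prev_sql_addr as well
AND    q.hash_value = s.sql_hash_value;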