From: oracle_man on
All,

Let me know your thoughts on the below. My client blew away his OCR.
I restored the OCR, and now the database won't start. Here are the
command output and the tail of the CRS log:

db001i$srvctl start database -d KGDB
PRKP-1001 : Error starting instance KGDB1 on node urma-atgdb001
CRS-0215: Could not start resource 'ora.KGDB.KGDB1.inst'.
PRKP-1001 : Error starting instance KGDB2 on node urma-atgdb002
CRS-0215: Could not start resource 'ora.KGDB.KGDB2.inst'.

db001i$tail -f /usr/app/oracle/product/10.2.0/crs/log/urma-atgdb001/crsd/crsd.log
2007-09-11 17:09:34.571: [ CRSD][1484945760]0SM: rE2Ec: 4
2007-09-11 17:09:34.604: [ CRSRES][1512245600]0startRunnable: setting CLI values
2007-09-11 17:09:37.125: [ CRSAPP][1512245600]0StartResource error for ora.KGDB.KGDB1.inst error code = 1
2007-09-11 17:09:38.768: [ CRSD][1512245600]0SM:dE2Ec: all E2E cmds done. 0
2007-09-11 17:51:13.393: [ CRSRES][1493350752]0startRunnable: setting CLI values
2007-09-11 17:51:13.638: [ CRSRES][1539529056]0Attempting to start `ora.KGDB.KGDB2.inst` on member `urma-atgdb002`
2007-09-11 17:51:13.672: [ CRSRES][1493350752]0Attempting to start `ora.KGDB.KGDB1.inst` on member `urma-atgdb001`
2007-09-11 17:51:16.690: [ CRSAPP][1493350752]0StartResource error for ora.KGDB.KGDB1.inst error code = 1
2007-09-11 17:51:17.833: [ CRSRES][1539529056]0Start of `ora.KGDB.KGDB2.inst` on member `urma-atgdb002` failed.
2007-09-11 17:51:18.354: [ CRSRES][1493350752]0Start of `ora.KGDB.KGDB1.inst` on member `urma-atgdb001` failed.

From: Steve Howard on

Hi,

What is in the alert log for both instances?

Regards,

Steve

From: ivl5 on

Check:
- crs_stat -f ora.KGDB.KGDB1.inst
- crs_stat -f ora.KGDB.KGDB2.inst
- alert logs for both instances as suggested by Steve.
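
The two `crs_stat -f` checks above could be scripted roughly as follows. This is only a sketch: it assumes `crs_stat` from the CRS home is on the PATH of the user running it, and the helper names `crs_stat_commands` and `run_checks` are made up for illustration. It only collects the output; interpreting it is still up to the reader.

```python
import shutil
import subprocess

# Resource names taken from the error messages in the original post.
RESOURCES = ["ora.KGDB.KGDB1.inst", "ora.KGDB.KGDB2.inst"]


def crs_stat_commands(resources):
    """Build one `crs_stat -f <resource>` argument list per resource."""
    return [["crs_stat", "-f", name] for name in resources]


def run_checks(resources):
    """Run each command and return {resource: combined output}.

    Returns an empty dict when crs_stat is not on PATH (assumption:
    the CRS home bin directory has been added to PATH beforehand).
    """
    if shutil.which("crs_stat") is None:
        return {}
    results = {}
    for name, cmd in zip(resources, crs_stat_commands(resources)):
        done = subprocess.run(cmd, capture_output=True, text=True)
        results[name] = done.stdout + done.stderr
    return results


if __name__ == "__main__":
    for name, output in run_checks(RESOURCES).items():
        print(f"== {name} ==")
        print(output)
```

The alert logs for both instances still have to be read separately, since their location depends on the database's dump destination settings, which the thread does not give.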

From: oracle_man on
On Sep 11, 6:06 pm, Steve Howard <stevedhow...(a)gmail.com> wrote:
> Hi,
>
> What is in the alert log for both instances?
>
> Regards,
>
> Steve

Here is what's in the crs log:

2007-09-12 16:33:52.603
[crsd(6479)]CRS-1201:CRSD started on node urma-atgdb001.

This is the client log:

cat clsc.log
Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.
2007-09-12 15:12:25.985: [ CSSCLNT][2538401568]clsssInitNative: connect failed, rc 9

2007-09-12 15:12:28.667: [ CSSCLNT][2538401568]clsssInitNative: connect failed, rc 9

This is the crsd.log:

Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.
2007-09-12 15:12:24.457: [ default][2541403712][ENTER]0
Oracle Database 10g CRS Release 10.2.0.3.0 Production Copyright 1996, 2004, Oracle. All rights reserved
2007-09-12 15:12:24.457: [ default][2541403712]0CRS Daemon Starting
2007-09-12 15:12:24.458: [ CRSMAIN][2541403712]0Checking the OCR device
2007-09-12 15:12:24.467: [ CRSMAIN][2541403712]0Connecting to the CSS Daemon
2007-09-12 15:12:24.880: [ COMMCRS][1084229984]clsc_connect: (0xb870b0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_urma-atgdb001_crs))

2007-09-12 15:12:24.880: [ CSSCLNT][2541403712]clsssInitNative: connect failed, rc 9

2007-09-12 15:12:24.880: [ CRSRTI][2541403712]0CSS is not ready. Received status 3 from CSS. Waiting for good status ..

2007-09-12 15:12:26.292: [ COMMCRS][1084229984]clsc_connect: (0xb80600) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_urma-atgdb001_crs))

2007-09-12 15:12:26.292: [ CSSCLNT][2541403712]clsssInitNative: connect failed, rc 9

2007-09-12 15:12:26.292: [ CRSRTI][2541403712]0CSS is not ready. Received status 3 from CSS. Waiting for good status ..

2007-09-12 15:12:27.705: [ COMMCRS][1084229984]clsc_connect: (0xb80600) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_urma-atgdb001_crs))

2007-09-12 15:12:27.705: [ CSSCLNT][2541403712]clsssInitNative: connect failed, rc 9

2007-09-12 15:12:27.705: [ CRSRTI][2541403712]0CSS is not ready. Received status 3 from CSS. Waiting for good status ..

2007-09-12 15:12:29.733: [ CRSD][2541403712]0Daemon Version: 10.2.0.3.0 Active Version: 10.2.0.3.0
2007-09-12 15:12:29.733: [ CRSD][2541403712]0Active Version and Software Version are same
2007-09-12 15:12:29.733: [ CRSMAIN][2541403712]0Initializing OCR
2007-09-12 15:12:29.745: [ OCRRAW][2541403712]proprioo: for disk 0 (/dev/raw/raw2), id match (1), my id set (188263131,1028247821) total id sets (1), 1st set (188263131,1028247821), 2nd set (0,0) my votes (2), total votes (2)
2007-09-12 15:12:29.975: [ CRSD][2541403712]0ENV Logging level for Module: allcomp 0
2007-09-12 15:12:30.380: [ CRSD][2541403712]0ENV Logging level for Module: default 0
2007-09-12 15:12:37.590: [ CRSD][2541403712]0ENV Logging level for Module: COMMCRS 0
2007-09-12 15:12:37.595: [ CRSD][2541403712]0ENV Logging level for Module: COMMNS 0
2007-09-12 15:12:37.598: [ CRSD][2541403712]0ENV Logging level for Module: CRSUI 0
2007-09-12 15:12:50.465: [ CRSD][2541403712]0ENV Logging level for Module: CRSCOMM 0
2007-09-12 15:12:50.870: [ CRSD][2541403712]0ENV Logging level for Module: CRSRTI 0
2007-09-12 15:12:51.075: [ CRSD][2541403712]0ENV Logging level for Module: CRSMAIN 0
2007-09-12 15:12:51.076: [ CRSD][2541403712]0ENV Logging level for Module: CRSPLACE 0
2007-09-12 15:12:51.078: [ CRSD][2541403712]0ENV Logging level for Module: CRSAPP 0
2007-09-12 15:12:51.281: [ CRSD][2541403712]0ENV Logging level for Module: CRSRES 0
2007-09-12 15:12:51.283: [ CRSD][2541403712]0ENV Logging level for Module: CRSOCR 0
2007-09-12 15:12:51.888: [ CRSD][2541403712]0ENV Logging level for Module: CRSTIMER 0
2007-09-12 15:16:16.886: [ CRSD][2541403712]0ENV Logging level for Module: CRSEVT 0
2007-09-12 15:16:16.892: [ CRSD][2541403712]0ENV Logging level for Module: CRSD 0
2007-09-12 15:16:17.095: [ CRSD][2541403712]0ENV Logging level for Module: CLUCLS 0
2007-09-12 15:16:17.703: [ CRSD][2541403712]0ENV Logging level for Module: OCRRAW 0
2007-09-12 15:16:17.905: [ CRSD][2541403712]0ENV Logging level for Module: OCROSD 0
2007-09-12 15:16:18.716: [ CRSD][2541403712]0ENV Logging level for Module: CSSCLNT 0
2007-09-12 15:16:18.720: [ CRSD][2541403712]0ENV Logging level for Module: OCRAPI 0
2007-09-12 15:16:18.923: [ CRSD][2541403712]0ENV Logging level for Module: OCRUTL 0
2007-09-12 15:16:18.928: [ CRSD][2541403712]0ENV Logging level for Module: OCRMSG 0
2007-09-12 15:16:18.929: [ CRSD][2541403712]0ENV Logging level for Module: OCRCLI 0
2007-09-12 15:16:19.534: [ CRSD][2541403712]0ENV Logging level for Module: OCRCAC 0
2007-09-12 15:16:20.341: [ CRSD][2541403712]0ENV Logging level for Module: OCRSRV 0
2007-09-12 15:16:31.115: [ CRSD][2541403712]0ENV Logging level for Module: OCRMAS 0
2007-09-12 15:16:31.116: [ CRSMAIN][2541403712]0Filename is /usr/app/oracle/product/10.2.0/crs/crs/init/urma-atgdb001.pid
[ clsdmt][1398925664]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=urma-atgdb001DBG_CRSD))
2007-09-12 15:24:35.357: [ CRSMAIN][2541403712]0Using Authorizer location: /usr/app/oracle/product/10.2.0/crs/crs/auth/
2007-09-12 15:27:12.726: [ CRSMAIN][2541403712]0Initializing RTI
2007-09-12 15:36:35.373: [CRSTIMER][1419905376]0Timer Thread Starting.
2007-09-12 15:36:35.374: [ CRSRES][2541403712]0Parameter SECURITY = 1, running in USER Mode
2007-09-12 15:36:35.374: [ CRSMAIN][2541403712]0Initializing EVMMgr
2007-09-12 15:54:50.464: [ CRSMAIN][2541403712]0CRSD locked during state recovery, please wait.
2007-09-12 16:33:46.318: [ CRSMAIN][2541403712]0CRSD recovered, unlocked.
2007-09-12 16:33:46.319: [ CRSMAIN][2541403712]0QS socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))
2007-09-12 16:33:46.569: [ CRSMAIN][2541403712]0CRSD UI socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=CRSD_UI_SOCKET))
2007-09-12 16:33:52.603: [ CRSMAIN][2541403712]0E2E socket on: (ADDRESS=(PROTOCOL=tcp)(HOST=urma-atgdb001-priv)(PORT=49896))
2007-09-12 16:33:52.603: [ CRSMAIN][2541403712]0Starting Threads
2007-09-12 16:33:52.603: [ CRSMAIN][2541403712]0CRS Daemon Started.
2007-09-12 16:33:52.604: [ CRSMAIN][1474455904]0Starting runCommandServer for (UI = 1, E2E = 0). 0
2007-09-12 16:33:52.604: [ CRSMAIN][1476557152]0Starting runCommandServer for (UI = 1, E2E = 0). 1
2007-09-12 18:03:34.882: [ CRSRES][1482860896]0startRunnable: setting CLI values
2007-09-12 18:17:34.839: [ COMMCRS][1497553248]clscsendx: (0xcd3dd0) Physical connection (0xcd4290) not active

2007-09-12 18:18:26.148: [ CRSRES][1487063392]0startRunnable: setting CLI values
2007-09-12 18:18:29.209: [ CRSRES][1484962144]0startRunnable: setting CLI values
2007-09-12 18:18:38.859: [ CRSRES][1482860896]0Attempting to start `ora.urma-atgdb001.vip` on member `urma-atgdb001`
2007-09-12 18:18:39.065: [ CRSRES][1487063392]0Attempting to start `ora.urma-atgdb002.vip` on member `urma-atgdb001`
2007-09-12 18:19:43.003: [ CRSRES][1484962144]0Attempting to start `ora.urma-atgdb001.ASM1.asm` on member `urma-atgdb001`
2007-09-12 19:01:21.127: [ COMMCRS][1346476384]clscsendx: (0x2a98246470) Connection not active
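
The timestamps in the crsd.log above show long pauses during startup: "CRS Daemon Starting" is logged at 15:12:24 and "CRS Daemon Started." only at 16:33:52. A minimal Python sketch for spotting such pauses is below. The sample lines are copied from that log with the mail wrapping undone; the function name `find_gaps` and the 5-minute threshold are illustrative choices, not part of any Oracle tool.

```python
import re
from datetime import datetime, timedelta

# Matches the "YYYY-MM-DD HH:MM:SS.mmm: message" prefix used in crsd.log.
STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): (.*)$")

SAMPLE = """\
2007-09-12 15:16:31.116: [ CRSMAIN][2541403712]0Filename is /usr/app/oracle/product/10.2.0/crs/crs/init/urma-atgdb001.pid
2007-09-12 15:24:35.357: [ CRSMAIN][2541403712]0Using Authorizer location: /usr/app/oracle/product/10.2.0/crs/crs/auth/
2007-09-12 15:27:12.726: [ CRSMAIN][2541403712]0Initializing RTI
2007-09-12 15:36:35.373: [CRSTIMER][1419905376]0Timer Thread Starting.
2007-09-12 15:36:35.374: [ CRSMAIN][2541403712]0Initializing EVMMgr
2007-09-12 15:54:50.464: [ CRSMAIN][2541403712]0CRSD locked during state recovery, please wait.
2007-09-12 16:33:46.318: [ CRSMAIN][2541403712]0CRSD recovered, unlocked.
2007-09-12 16:33:52.603: [ CRSMAIN][2541403712]0CRS Daemon Started.
"""


def find_gaps(text, threshold=timedelta(minutes=5)):
    """Return (gap, message_before, message_after) for each pair of
    consecutive timestamped lines that are more than `threshold` apart."""
    entries = []
    for line in text.splitlines():
        match = STAMP.match(line)
        if match:
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            entries.append((when, match.group(2)))
    gaps = []
    for (t0, msg0), (t1, msg1) in zip(entries, entries[1:]):
        if t1 - t0 > threshold:
            gaps.append((t1 - t0, msg0, msg1))
    return gaps


for gap, before, after in find_gaps(SAMPLE):
    print(f"{gap} between:\n  {before}\n  {after}")
```

On this sample the longest pause sits between "CRSD locked during state recovery" and "CRSD recovered, unlocked." What CRSD was waiting on during that time is not visible in the log itself.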


This is the ocssd.log:

Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.
[ CSSD]2007-09-12 15:12:26.888 >USER: Oracle Database 10g CSS Release 10.2.0.3.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
[ CSSD]2007-09-12 15:12:26.888 >USER: CSS daemon log for node urma-atgdb001, number 1, in cluster crs
[ CSSD]2007-09-12 15:12:26.901 [2538401568] >TRACE: clssscmain: local-only set to false
[ clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=urma-atgdb001DBG_CSSD))
[ CSSD]2007-09-12 15:12:26.925 [2538401568] >TRACE: clssnmReadNodeInfo: added node 1 (urma-atgdb001) to cluster
[ CSSD]2007-09-12 15:12:26.932 [2538401568] >TRACE: clssnmReadNodeInfo: added node 2 (urma-atgdb002) to cluster
[ CSSD]2007-09-12 15:12:26.939 [1115699552] >TRACE: clssnm_skgxnmon: skgxn init failed
[ CSSD]2007-09-12 15:12:26.939 [2538401568] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor
[ CSSD]2007-09-12 15:12:26.942 [2538401568] >TRACE: clssnmNMInitialize: misscount set to (60), impending reconfig threshold set to (56000)
[ CSSD]2007-09-12 15:12:26.944 [2538401568] >TRACE: clssnmNMInitialize: diskShortTimeout set to (57000)ms
[ CSSD]2007-09-12 15:12:26.945 [2538401568] >TRACE: clssnmNMInitialize: diskLongTimeout set to (200000)ms
[ CSSD]2007-09-12 15:12:26.948 [2538401568] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/raw/raw3)
[ CSSD]2007-09-12 15:12:26.949 [1115699552] >TRACE: clssnmvDPT: spawned for disk 0 (/dev/raw/raw3)
[ CSSD]2007-09-12 15:12:28.959 [1115699552] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/raw/raw3)
[ CSSD]2007-09-12 15:12:29.000 [1126189408] >TRACE: clssnmvKillBlockThread: spawned for disk 0 (/dev/raw/raw3) initial sleep interval (1000)ms
[ CSSD]2007-09-12 15:12:29.010 [1115699552] >TRACE: clssnmReadDskHeartbeat: node(2) is down. rcfg(1) wrtcnt(13) LATS(4294098390) Disk lastSeqNo(13)
[ CSSD]2007-09-12 15:12:29.026 [2538401568] >TRACE: clssnmFatalInit: fatal mode enabled
[ CSSD]2007-09-12 15:12:29.026 [1147169120] >TRACE: clssnmconnect: connecting to node 1, flags 0x0001, connector 1
[ CSSD]2007-09-12 15:12:29.028 [1147169120] >TRACE: clssnmClusterListener: Listening on (ADDRESS=(PROTOCOL=tcp)(HOST=urma-atgdb001-priv)(PORT=49895))

[ CSSD]2007-09-12 15:12:29.028 [1147169120] >TRACE: clssnmconnect: connecting to node 0, flags 0x0000, connector 1
[ CSSD]2007-09-12 15:12:29.028 [1147169120] >TRACE: clssnmClusterListener: Probing node 2, con (0x725d40)
[ CSSD]2007-09-12 15:12:29.030 [1147169120] >TRACE: clssnmConnComplete: connected to node 2 (con 0x725d60), state 3 birth 0, unique 1189635134/1189635134 prevConuni(0)
[ CSSD]2007-09-12 15:12:29.034 [1157658976] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))
[ CSSD]2007-09-12 15:12:29.034 [1157658976] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_urma-atgdb001_crs))
[ CSSD]2007-09-12 15:12:29.036 [1189128544] >TRACE: clssgmPeerListener: Listening on (ADDRESS=(PROTOCOL=tcp)(DEV=19)(HOST=192.168.43.101)(PORT=32784))
[ CSSD]2007-09-12 15:12:29.036 [1199618400] >TRACE: clssnmPollingThread: Connection complete
[ CSSD]2007-09-12 15:12:29.036 [1210108256] >TRACE: clssnmSendingThread: Connection complete
[ CSSD]2007-09-12 15:12:29.036 [1220598112] >TRACE: clssnmRcfgMgrThread: Connection complete
[ CSSD]2007-09-12 15:12:29.235 [1147169120] >TRACE: clssnmHandleSync: Acknowledging sync: src[2] srcName[urma-atgdb002] seq[5] sync[2]
[ CSSD]2007-09-12 15:12:29.235 [1147169120] >TRACE: clssnmHandleSync: diskTimeout set to (57000)ms
[ CSSD]2007-09-12 15:12:29.276 [1147169120] >TRACE: clssnmSendVoteInfo: node(2) syncSeqNo(2)
[ CSSD]2007-09-12 15:12:29.277 [1147169120] >TRACE: clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
[ CSSD]2007-09-12 15:12:29.277 [1147169120] >TRACE: clssnmDeactivateNode: node 0 () left cluster

[ CSSD]2007-09-12 15:12:29.277 [1147169120] >TRACE: clssnmUpdateNodeState: node 1, state (1/2) unique (1189635146/1189635146) prevConuni(0) birth (0/2) (old/new)
[ CSSD]2007-09-12 15:12:29.277 [1147169120] >TRACE: clssnmUpdateNodeState: node 2, state (4/3) unique (1189635134/1189635134) prevConuni(0) birth (0/1) (old/new)
[ CSSD]2007-09-12 15:12:29.278 [1147169120] >USER: clssnmHandleUpdate: SYNC(2) from node(2) completed
[ CSSD]2007-09-12 15:12:29.278 [1147169120] >USER: clssnmHandleUpdate: NODE 1 (urma-atgdb001) IS ACTIVE MEMBER OF CLUSTER
[ CSSD]2007-09-12 15:12:29.278 [1147169120] >USER: clssnmHandleUpdate: NODE 2 (urma-atgdb002) IS ACTIVE MEMBER OF CLUSTER
[ CSSD]2007-09-12 15:12:29.278 [1147169120] >TRACE: clssnmHandleUpdate: diskTimeout set to (200000)ms
[ CSSD]2007-09-12 15:12:29.342 [2538401568] >USER: NMEVENT_SUSPEND [00][00][00][00]
[ CSSD]2007-09-12 15:12:29.342 [1231087968] >TRACE: clssgmReconfigThread: started for reconfig (2)
[ CSSD]2007-09-12 15:12:29.342 [1231087968] >USER: NMEVENT_RECONFIG [00][00][00][06]
[ CSSD]2007-09-12 15:12:29.342 [1231087968] >TRACE: clssgmEstablishConnections: 2 nodes in cluster incarn 2
[ CSSD]2007-09-12 15:12:29.343 [1189128544] >TRACE: clssgmInitialRecv: (0x783ba0) accepted a new connection from node 2 born at 1 active (2, 2), vers (10,3,1,2)
[ CSSD]2007-09-12 15:12:29.343 [1189128544] >TRACE: clssgmInitialRecv: conns done (2/2)
[ CSSD]2007-09-12 15:12:29.343 [1231087968] >TRACE: clssgmEstablishMasterNode: MASTER for 2 is node(2) birth(1)
[ CSSD]2007-09-12 15:12:29.343 [1231087968] >TRACE: clssgmChangeMasterNode: requeued 0 RPCs
[ CSSD]2007-09-12 15:12:29.346 [1189128544] >TRACE: clssgmHandleDBDone(): src/dest (2/65535) size(72) incarn 2
[ CSSD]CLSS-3000: reconfiguration successful, incarnation 2 with 2 nodes

[ CSSD]CLSS-3001: local node number 1, master node number 2

[ CSSD]2007-09-12 15:12:29.346 [1231087968] >TRACE: clssgmReconfigThread: completed for reconfig(2), with status(1)
[ CSSD]2007-09-12 15:12:29.516 [1157658976] >TRACE: clssgmClientConnectMsg: Connect from con(0x78ff30) proc(0x78a6c0) pid() proto(10:2:1:1)
[ CSSD]2007-09-12 15:12:29.686 [1157658976] >TRACE: clssgmClientConnectMsg: Connect from con(0x78b210) proc(0x78d650) pid() proto(10:2:1:1)
[ CSSD]2007-09-12 15:12:29.687 [1189128544] >TRACE: clssgmCommonAddMember: clsomon joined (1/0x1000000/#CSS_CLSSOMON)
[ CSSD]2007-09-12 15:12:31.955 [1157658976] >TRACE: clssgmClientConnectMsg: Connect from con(0x792240) proc(0x794680) pid() proto(10:2:1:1)
[ CSSD]2007-09-12 15:15:14.196 [1157658976] >TRACE: clssgmClientConnectMsg: Connect from con(0x78e980) proc(0x799d90) pid() proto(10:2:1:1)
[ CSSD]2007-09-12 18:57:02.583 [1157658976] >TRACE: clssgmClientConnectMsg: Connect from con(0x79aa70) proc(0x79a6b0) pid() proto(10:2:1:1)

This is the evmd.log:

Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.
2007-09-12 15:12:23.946: [ EVMD][2541405344]0EVMD Starting
2007-09-12 15:12:23.946: [ EVMD][2541405344]0
Oracle Database 10g CRS Release 10.2.0.3.0 Production Copyright 1996, 2006, Oracle. All rights reserved
2007-09-12 15:12:23.946: [ EVMD][2541405344]0Initializing OCR
2007-09-12 15:12:24.571: [ COMMCRS][1084229984]clsc_connect: (0x6e7b00) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_urma-atgdb001_crs))

2007-09-12 15:12:24.571: [ CSSCLNT][2541405344]clsssInitNative: connect failed, rc 9

2007-09-12 15:12:24.572: [ EVMD][2541405344]0EVMD waiting for CSS to be ready err = 3
2007-09-12 15:12:25.985: [ COMMCRS][1084229984]clsc_connect: (0x6e7a90) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_urma-atgdb001_crs))

2007-09-12 15:12:25.985: [ CSSCLNT][2541405344]clsssInitNative: connect failed, rc 9

2007-09-12 15:12:25.986: [ EVMD][2541405344]0EVMD waiting for CSS to be ready err = 3
2007-09-12 15:12:27.399: [ COMMCRS][1084229984]clsc_connect: (0x6e7a90) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_urma-atgdb001_crs))

2007-09-12 15:12:27.399: [ CSSCLNT][2541405344]clsssInitNative: connect failed, rc 9

2007-09-12 15:12:27.399: [ EVMD][2541405344]0EVMD waiting for CSS to be ready err = 3
2007-09-12 15:12:28.812: [ COMMCRS][1084229984]clsc_connect: (0x6e7a90) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_urma-atgdb001_crs))

2007-09-12 15:12:28.812: [ CSSCLNT][2541405344]clsssInitNative: connect failed, rc 9

2007-09-12 15:12:28.812: [ EVMD][2541405344]0EVMD waiting for CSS to be ready err = 3
2007-09-12 15:12:50.268: [ EVMD][2541405344]0Daemon Version: 10.2.0.3.0 Active Version: 10.2.0.3.0
2007-09-12 15:12:50.268: [ EVMD][2541405344]0Active Version and Software Version are same
2007-09-12 15:12:50.268: [ EVMD][2541405344]0Initializing Diagnostics Settings
2007-09-12 15:12:50.271: [ EVMD][2541405344]0ENV Logging level for Module: allcomp 0
2007-09-12 15:12:50.272: [ EVMD][2541405344]0ENV Logging level for Module: default 0
2007-09-12 15:12:50.274: [ EVMD][2541405344]0ENV Logging level for Module: COMMCRS 0
2007-09-12 15:12:50.275: [ EVMD][2541405344]0ENV Logging level for Module: COMMNS 0
2007-09-12 15:12:50.277: [ EVMD][2541405344]0ENV Logging level for Module: EVMD 0
2007-09-12 15:12:50.279: [ EVMD][2541405344]0ENV Logging level for Module: EVMDMAIN 0
2007-09-12 15:12:50.280: [ EVMD][2541405344]0ENV Logging level for Module: EVMCOMM 0
2007-09-12 15:12:50.282: [ EVMD][2541405344]0ENV Logging level for Module: EVMEVT 0
2007-09-12 15:12:50.283: [ EVMD][2541405344]0ENV Logging level for Module: EVMAPP 0
2007-09-12 15:12:50.285: [ EVMD][2541405344]0ENV Logging level for Module: EVMAGENT 0
2007-09-12 15:12:50.286: [ EVMD][2541405344]0ENV Logging level for Module: CRSOCR 0
2007-09-12 15:12:50.288: [ EVMD][2541405344]0ENV Logging level for Module: CLUCLS 0
2007-09-12 15:12:50.289: [ EVMD][2541405344]0ENV Logging level for Module: OCRRAW 0
2007-09-12 15:12:50.291: [ EVMD][2541405344]0ENV Logging level for Module: OCROSD 0
2007-09-12 15:12:50.292: [ EVMD][2541405344]0ENV Logging level for Module: OCRAPI 0
2007-09-12 15:12:50.294: [ EVMD][2541405344]0ENV Logging level for Module: OCRUTL 0
2007-09-12 15:12:50.295: [ EVMD][2541405344]0ENV Logging level for Module: OCRMSG 0
2007-09-12 15:12:50.297: [ EVMD][2541405344]0ENV Logging level for Module: OCRCLI 0
2007-09-12 15:12:50.299: [ EVMD][2541405344]0ENV Logging level for Module: CSSCLNT 0
2007-09-12 15:12:50.299: [ EVMD][2541405344]0Creating pidfile /usr/app/oracle/product/10.2.0/crs/evm/init/urma-atgdb001.pid
[ clsdmt][1105209696]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=urma-atgdb001DBG_EVMD))
2007-09-12 15:12:51.169: [ EVMD][2541405344]0Authorization database built successfully.
2007-09-12 15:16:16.897: [ EVMEVT][2541405344][ENTER]0EVM Listening on: 47904074
2007-09-12 15:16:17.704: [ EVMAPP][2541405344]0EVMD Started
2007-09-12 15:16:17.714: [ EVMD][2541405344]0Authorization database built successfully.
2007-09-12 15:16:17.724: [ COMMCRS][1136679264]clsc_auth_send: (0x78b1e0) Connection not active

2007-09-12 15:16:17.724: [ COMMCRS][1136679264]Authorization failed, network error

2007-09-12 15:16:18.715: [ EVMEVT][1189128544]0Listening at (ADDRESS=(PROTOCOL=tcp)(HOST=urma-atgdb001-priv)(PORT=49898)) for P2P evmd connections requests
2007-09-12 15:16:18.718: [ EVMD][2541405344]0Authorization database built successfully.
2007-09-12 15:16:18.762: [ EVMEVT][1220598112][ENTER]0Establishing P2P connection with node: urma-atgdb002
2007-09-12 15:16:43.983: [ EVMEVT][1231087968]0Private Member Update event for urma-atgdb001 received by clssgsgrpstat
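
One possible reading of the repeated "no listener" and "connect failed, rc 9" messages in crsd.log and evmd.log is startup ordering: both daemons try to reach CSS before ocssd has opened its `OCSSD_LL_urma-atgdb001_crs` listener. The Python sketch below checks that against the timestamps, which are copied by hand from the three logs above; the helper name `errors_after_listener` is made up for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# "no listener at ... KEY=OCSSD_LL_urma-atgdb001_crs" timestamps per daemon log.
NO_LISTENER = {
    "crsd": [
        "2007-09-12 15:12:24.880",
        "2007-09-12 15:12:26.292",
        "2007-09-12 15:12:27.705",
    ],
    "evmd": [
        "2007-09-12 15:12:24.571",
        "2007-09-12 15:12:25.985",
        "2007-09-12 15:12:27.399",
        "2007-09-12 15:12:28.812",
    ],
}

# ocssd.log: clssgmclientlsnr starts listening on KEY=OCSSD_LL_urma-atgdb001_crs.
CSS_LISTENING = "2007-09-12 15:12:29.034"


def errors_after_listener(errors, listening):
    """Return, per daemon, the error timestamps that fall after the
    moment the CSS listener came up."""
    up = datetime.strptime(listening, FMT)
    return {
        daemon: [s for s in stamps if datetime.strptime(s, FMT) > up]
        for daemon, stamps in errors.items()
    }


late = errors_after_listener(NO_LISTENER, CSS_LISTENING)
for daemon, stamps in late.items():
    print(daemon, "errors after CSS listener start:", stamps)
```

If no error is later than the listener start, these messages look like ordinary startup retries rather than the cause of the failed instance starts. That is only an interpretation of the timestamps, and it says nothing about why the instances themselves fail.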

In short, things sometimes work on one node; then, after a reboot, they
stop working on that node and work on the other node instead.

Any insight is appreciated.

Rich