ASM install fails on one node

I have been trying to install 10g RAC on a two-node virtual cluster. I installed the clusterware and it was successful. Before I started the ASM install:
[oracle@rac1 bin]$ ./crs_stat -t
Name           Type         Target   State    Host
ora.rac1.gsd   application  ONLINE   ONLINE   rac1
ora.rac1.ons   application  ONLINE   ONLINE   rac1
ora.rac1.vip   application  ONLINE   ONLINE   rac1
ora.rac2.gsd   application  ONLINE   ONLINE   rac2
ora.rac2.ons   application  ONLINE   ONLINE   rac2
ora.rac2.vip   application  ONLINE   ONLINE   rac2
[oracle@rac1 logs]$ /u01/crs/oracle/product/10.2.0/crs/bin/crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
[oracle@rac1 logs]$ ps -ef|grep d.bin
root 3795 1 0 13:57 ? 00:00:35 /u01/crs/oracle/product/10.2.0/crs/bin/crsd.bin reboot
oracle 4966 3793 0 13:59 ? 00:00:06 /u01/crs/oracle/product/10.2.0/crs/bin/evmd.bin
oracle 5082 5059 0 13:59 ? 00:01:06 /u01/crs/oracle/product/10.2.0/crs/bin/ocssd.bin
oracle 30520 4813 0 16:23 pts/3 00:00:00 grep d.bin
During the ASM install, here is what I got:
WARNING: Error while copying directory /u01/app/oracle/product/10.2.0/db_1 with exclude file list 'null' to nodes 'rac2'. [PRKC-1073 : Failed to transfer directory "/u01/app/oracle/product/10.2.0/db_1" to any of the given nodes "rac2 ".
Error on node rac2:Read from remote host rac2: Connection reset by peer]
Refer to '/u01/app/oracle/oraInventory/logs/installActions2009-04-18_01-30-26PM.log' for details. You may fix the errors on the required remote nodes. Refer to the install guide for error recovery. Click 'Yes' if you want to proceed. Click 'No' to exit the install. Do you want to continue?
INFO: User Selected: Yes/OK
It appears to me as though the installer was not able to copy the "/u01/app/oracle/product/10.2.0/db_1" directory over to the rac2 node. I do not see any reason for that: I have set up SSH user equivalence for both the oracle and root users, and ssh and scp seem to work both ways. Permissions should not be an issue on one node and not the other, as I replicated the permissions.
I continued the installation, and ASM is working fine on the rac1 node but not on the second node. I tried using DBCA to set up ASM on the second node, and it errors out with a "CRS-0223 resource placement error". Here is what I did next:
[oracle@rac1 bin]$ ./srvctl status asm -n rac1
ASM instance +ASM1 is running on node rac1.
[oracle@rac1 bin]$ ./srvctl status asm -n rac2
ASM instance +ASM2 is not running on node rac2.
[oracle@rac1 bin]$ ./crs_sta
crs_start crs_start.bin crs_stat crs_stat.bin
[oracle@rac1 bin]$ ./crs_stat -t
Name            Type         Target   State     Host
ora....SM1.asm  application  ONLINE   ONLINE    rac1
ora....C1.lsnr  application  ONLINE   ONLINE    rac1
ora.rac1.gsd    application  ONLINE   ONLINE    rac1
ora.rac1.ons    application  ONLINE   ONLINE    rac1
ora.rac1.vip    application  ONLINE   ONLINE    rac1
ora....SM2.asm  application  ONLINE   UNKNOWN   rac2
ora....C2.lsnr  application  ONLINE   UNKNOWN   rac2
ora.rac2.gsd    application  ONLINE   ONLINE    rac2
ora.rac2.ons    application  ONLINE   ONLINE    rac2
ora.rac2.vip    application  ONLINE   ONLINE    rac2
[oracle@rac1 bin]$ ./crs_start ora.rac2.ASM2.asm
CRS-1028: Dependency analysis failed because of:
'Resource in UNKNOWN state: ora.rac2.ASM2.asm'
CRS-0223: Resource 'ora.rac2.ASM2.asm' has placement error.
I would like to get the ASM instance extended to the second node (rac2) and, of course, continue with the database instance creation. How can I accomplish this?
Thanks!

Hi orafun,
this message:
WARNING: Error while copying directory /u01/app/oracle/product/10.2.0/db_1 with exclude file list 'null' to nodes 'rac2'. [PRKC-1073 : Failed to transfer directory "/u01/app/oracle/product/10.2.0/db_1" to any of the given nodes "rac2 ".
Error on node rac2:Read from remote host rac2: Connection reset by peer]
Refer to '/u01/app/oracle/oraInventory/logs/installActions2009-04-18_01-30-26PM.log' for details. You may fix the errors on the required remote nodes. Refer to the install guide for error recovery. Click 'Yes' if you want to proceed. Click 'No' to exit the install. Do you want to continue?
INFO: User Selected: Yes/OK
Tells you that the Oracle Home could not be copied onto the remote node. The logs mentioned might tell you more, but this is the reason why ASM cannot be started on the other node - there is no software that could be used to start an ASM instance. Now you said:
"+It appears to me as though the installer was not able to copy over the "/u01/app/oracle/product/10.2.0/db_1" directory to the rac2 node. I do not see any reason for that, I have setup ssh user equivalence for both oracle and root users, ssh and scp seem to work both ways. Permissions should not be an issue on one node and not the other as I replicated the permissions+."
My question would be: what are you trying to achieve? IF it is your only interest to "get it done and over with", then you can TAR up the Oracle Database home from which you want to run ASM and un-TAR it on the remote node. Given that the paths are all correct and the registration already took place, you can then try starting the ASM instance on node 2. IF you want to know the reason for the issue, further investigation and more information would be required.
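For illustration, a minimal sketch of that TAR approach, reusing the paths and node names from the output above (verify that ownership and permissions survive the copy; the srvctl call assumes the +ASM2 registration shown earlier):
[oracle@rac1 ~]$ cd /u01/app/oracle/product/10.2.0
[oracle@rac1 10.2.0]$ tar cf - db_1 | ssh rac2 'cd /u01/app/oracle/product/10.2.0 && tar xf -'
[oracle@rac1 10.2.0]$ ssh rac2 ls /u01/app/oracle/product/10.2.0/db_1/bin/oracle
[oracle@rac1 10.2.0]$ $ORACLE_HOME/bin/srvctl start asm -n rac2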
Hope that helps. Thanks,
Markus

Similar Messages

  • Ora.asm -init failed on second node root.sh

    Hi All,
    Installing Grid Infrastructure for an 11gR2 cluster on two nodes: Oracle Linux 5 + VMware vSphere v4, with the shared disks on the same host machine. When running root.sh, the first node succeeded but the second node got the following error message (actually the first node was cloned from the second):
    CRS-2672: Attempting to start 'ora.ctssd' on 'wandrac2'
    Start action for octssd aborted
    CRS-2676: Start of 'ora.ctssd' on 'wandrac2' succeeded
    CRS-2672: Attempting to start 'ora.drivers.acfs' on 'wandrac2'
    CRS-2672: Attempting to start 'ora.asm' on 'wandrac2'
    CRS-2676: Start of 'ora.drivers.acfs' on 'wandrac2' succeeded
    CRS-2676: Start of 'ora.asm' on 'wandrac2' succeeded
    CRS-2664: Resource 'ora.ctssd' is already running on 'wandrac2'
    CRS-4000: Command Start failed, or completed with errors.
    Command return code of 1 (256) from command: /orapp/racsl/11.2.0/bin/crsctl start resource ora.asm -init
    Start of resource "ora.asm -init" failed
    Failed to start ASM
    Failed to start Oracle Clusterware stack
    Thanks in advance for any information and help,

    Hi,
    I came across this error and I am about to start a fresh installation of the grid (the earlier one failed because it was unable to read the memory on rac2).
    Is there anything specific I can change before I start my installation?
    PS - I didn't get what exactly is going on with the hosts file.
    My files are as follows :
    RAC1 - /etc/hosts
    [oracle@falcen6a ~]$ cat /etc/hosts
    # Do not remove the following line, or various programs
    # that require network functionality will fail.
    127.0.0.1 localhost.localdomain localhost
    ::1 localhost6.localdomain6 localhost6
    # Public
    192.168.100.218 falcen6a.a.pri falcen6a
    192.168.100.219 falcen6b.a.pri falcen6b
    # Private
    192.168.210.101 falcen6a-priv.a.pri falcen6a-priv
    192.168.210.102 falcen6b-priv.a.pri falcen6b-priv
    # Virtual
    192.168.100.212 falcen6a-vip.a.pri falcen6a-vip
    192.168.100.213 falcen6b-vip.a.pri falcen6b-vip
    # SCAN
    #192.168.100.208 falcen6-scan.a.pri falcen6-scan
    #192.168.100.209 falcen6-scan.a.pri falcen6-scan
    #192.168.100.210 falcen6-scan.a.pri falcen6-scan
    RAC2 - /etc/hosts
    [oracle@falcen6b ~]$ cat /etc/hosts
    # Do not remove the following line, or various programs
    # that require network functionality will fail.
    127.0.0.1 localhost.localdomain localhost
    ::1 localhost6.localdomain6 localhost6
    #Public
    192.168.100.218 falcen6a.a.pri falcen6a
    192.168.100.219 falcen6b.a.pri falcen6b
    # Private
    192.168.210.101 falcen6a-priv.a.pri falcen6a-priv
    192.168.210.102 falcen6b-priv.a.pri falcen6b-priv
    # Virtual
    192.168.100.212 falcen6a-vip.a.pri falcen6a-vip
    192.168.100.213 falcen6b-vip.a.pri falcen6b-vip
    # SCAN
    #192.168.100.208 falcen6-scan.a.pri falcen6-scan
    #192.168.100.209 falcen6-scan.a.pri falcen6-scan
    #192.168.100.210 falcen6-scan.a.pri falcen6-scan
    Can someone please confirm this?
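    No answer follows in the thread; as a hedged sanity check before the fresh install (an 11gR2 cluvfy invocation I believe is standard; hostnames taken from the listing above), name resolution and node connectivity can be verified first:
    [oracle@falcen6a ~]$ nslookup falcen6-scan.a.pri    # the SCAN should resolve, ideally to three addresses via DNS rather than /etc/hosts
    [oracle@falcen6a ~]$ ping -c 2 falcen6b-priv.a.pri  # private interconnect reachable from node 1
    [oracle@falcen6a ~]$ ./runcluvfy.sh stage -pre crsinst -n falcen6a,falcen6b -verbose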

  • Root.sh failed in one node - CLSMON and UDLM

    Hi experts.
    My environment is:
    2-node SunCluster Update3
    Oracle RAC 10.2.0.1 > planning to upgrade to 10.2.0.4
    The problem is: I installed the CRS services on both nodes - OK.
    After that, running root.sh fails on one node:
    /u01/app/product/10/CRS/root.sh
    WARNING: directory '/u01/app/product/10' is not owned by root
    WARNING: directory '/u01/app/product' is not owned by root
    WARNING: directory '/u01/app' is not owned by root
    WARNING: directory '/u01' is not owned by root
    Checking to see if Oracle CRS stack is already configured
    Checking to see if any 9i GSD is up
    Setting the permissions on OCR backup directory
    Setting up NS directories
    Oracle Cluster Registry configuration upgraded successfully
    WARNING: directory '/u01/app/product/10' is not owned by root
    WARNING: directory '/u01/app/product' is not owned by root
    WARNING: directory '/u01/app' is not owned by root
    WARNING: directory '/u01' is not owned by root
    clscfg: EXISTING configuration version 3 detected.
    clscfg: version 3 is 10G Release 2.
    Successfully accumulated necessary OCR keys.
    Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
    node <nodenumber>: <nodename> <private interconnect name> <hostname>
    node 0: spodhcsvr10 clusternode1-priv spodhcsvr10
    node 1: spodhcsvr12 clusternode2-priv spodhcsvr12
    clscfg: Arguments check out successfully.
    NO KEYS WERE WRITTEN. Supply -force parameter to override.
    -force is destructive and will destroy any previous cluster
    configuration.
    Oracle Cluster Registry for cluster has already been initialized
    Sep 22 13:34:17 spodhcsvr10 root: Oracle Cluster Ready Services starting by user request.
    Startup will be queued to init within 30 seconds.
    Sep 22 13:34:20 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Adding daemons to inittab
    Expecting the CRS daemons to be up within 600 seconds.
    Sep 22 13:34:34 spodhcsvr10 last message repeated 3 times
    Sep 22 13:34:34 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:34:40 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:35:43 spodhcsvr10 last message repeated 9 times
    Sep 22 13:36:07 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:36:07 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:36:14 spodhcsvr10 su: libsldap: Status: 85 Mesg: openConnection: simple bind failed - Timed out
    Sep 22 13:36:19 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:37:35 spodhcsvr10 last message repeated 11 times
    Sep 22 13:37:40 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:37:40 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:37:42 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:38:03 spodhcsvr10 last message repeated 3 times
    Sep 22 13:38:10 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:39:12 spodhcsvr10 last message repeated 9 times
    Sep 22 13:39:13 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:39:13 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:39:19 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:40:42 spodhcsvr10 last message repeated 12 times
    Sep 22 13:40:46 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:40:46 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:40:49 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:42:05 spodhcsvr10 last message repeated 11 times
    Sep 22 13:42:11 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:42:12 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:42:19 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:42:19 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:42:19 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:43:49 spodhcsvr10 last message repeated 13 times
    Sep 22 13:43:51 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:43:51 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:43:56 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Failure at final check of Oracle CRS stack.
    I traced the ocssd.log and found some information:
    [    CSSD]2010-09-22 14:04:14.739 [6] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/vx/rdsk/racdg/ora_vote1)
    [    CSSD]2010-09-22 14:04:14.742 [6] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2478) LATS(0) Disk lastSeqNo(2478)
    [    CSSD]2010-09-22 14:04:14.742 [7] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (1//dev/vx/rdsk/racdg/ora_vote2)
    [    CSSD]2010-09-22 14:04:14.744 [7] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2478) LATS(0) Disk lastSeqNo(2478)
    [    CSSD]2010-09-22 14:04:14.745 [8] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (2//dev/vx/rdsk/racdg/ora_vote3)
    [    CSSD]2010-09-22 14:04:14.746 [8] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2478) LATS(0) Disk lastSeqNo(2478)
    [    CSSD]2010-09-22 14:04:14.785 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:14.785 [10] >TRACE: clssnmFatalThread: spawned
    [    CSSD]2010-09-22 14:04:14.785 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:14.786 [11] >TRACE: clssnmconnect: connecting to node 0, flags 0x0001, connector 1
    [    CSSD]2010-09-22 14:04:23.075 >USER: Oracle Database 10g CSS Release 10.2.0.1.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
    [    CSSD]2010-09-22 14:04:23.075 >USER: CSS daemon log for node spodhcsvr10, number 0, in cluster NET_RAC
    [  clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=spodhcsvr10DBG_CSSD))
    [    CSSD]2010-09-22 14:04:23.082 [1] >TRACE: clssscmain: local-only set to false
    [    CSSD]2010-09-22 14:04:23.096 [1] >TRACE: clssnmReadNodeInfo: added node 0 (spodhcsvr10) to cluster
    [    CSSD]2010-09-22 14:04:23.106 [1] >TRACE: clssnmReadNodeInfo: added node 1 (spodhcsvr12) to cluster
    [    CSSD]2010-09-22 14:04:23.129 [5] >TRACE: [0]Node monitor: dlm attach failed error LK_STAT_NOTCREATED
    [    CSSD]CLSS-0001: skgxn not active
    [    CSSD]2010-09-22 14:04:23.129 [5] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 30
    [    CSSD]2010-09-22 14:04:23.132 [1] >TRACE: clssnmInitNMInfo: misscount set to 600
    [    CSSD]2010-09-22 14:04:23.136 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/vx/rdsk/racdg/ora_vote1)
    [    CSSD]2010-09-22 14:04:23.139 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (1//dev/vx/rdsk/racdg/ora_vote2)
    [    CSSD]2010-09-22 14:04:23.143 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (2//dev/vx/rdsk/racdg/ora_vote3)
    [    CSSD]2010-09-22 14:04:25.139 [6] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/vx/rdsk/racdg/ora_vote1)
    [    CSSD]2010-09-22 14:04:25.142 [6] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2488) LATS(0) Disk lastSeqNo(2488)
    [    CSSD]2010-09-22 14:04:25.143 [7] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (1//dev/vx/rdsk/racdg/ora_vote2)
    [    CSSD]2010-09-22 14:04:25.144 [7] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2488) LATS(0) Disk lastSeqNo(2488)
    [    CSSD]2010-09-22 14:04:25.145 [8] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (2//dev/vx/rdsk/racdg/ora_vote3)
    [    CSSD]2010-09-22 14:04:25.148 [8] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2489) LATS(0) Disk lastSeqNo(2489)
    [    CSSD]2010-09-22 14:04:25.186 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:25.186 [10] >TRACE: clssnmFatalThread: spawned
    [    CSSD]2010-09-22 14:04:25.186 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:25.187 [11] >TRACE: clssnmconnect: connecting to node 0, flags 0x0001, connector 1
    [    CSSD]2010-09-22 14:04:33.449 >USER: Oracle Database 10g CSS Release 10.2.0.1.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
    [    CSSD]2010-09-22 14:04:33.449 >USER: CSS daemon log for node spodhcsvr10, number 0, in cluster NET_RAC
    [  clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=spodhcsvr10DBG_CSSD))
    [    CSSD]2010-09-22 14:04:33.457 [1] >TRACE: clssscmain: local-only set to false
    [    CSSD]2010-09-22 14:04:33.470 [1] >TRACE: clssnmReadNodeInfo: added node 0 (spodhcsvr10) to cluster
    [    CSSD]2010-09-22 14:04:33.480 [1] >TRACE: clssnmReadNodeInfo: added node 1 (spodhcsvr12) to cluster
    [    CSSD]2010-09-22 14:04:33.498 [5] >TRACE: [0]Node monitor: dlm attach failed error LK_STAT_NOTCREATED
    [    CSSD]CLSS-0001: skgxn not active
    [    CSSD]2010-09-22 14:04:33.498 [5] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 30
    [    CSSD]2010-09-22 14:04:33.500 [1] >TRACE: clssnmInitNMInfo: misscount set to 600
    [    CSSD]2010-09-22 14:04:33.505 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/vx/rdsk/racdg/ora_vote1)
    [    CSSD]2010-09-22 14:04:33.508 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (1//dev/vx/rdsk/racdg/ora_vote2)
    [    CSSD]2010-09-22 14:04:33.510 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (2//dev/vx/rdsk/racdg/ora_vote3)
    [    CSSD]2010-09-22 14:04:35.508 [6] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/vx/rdsk/racdg/ora_vote1)
    [    CSSD]2010-09-22 14:04:35.510 [6] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2499) LATS(0) Disk lastSeqNo(2499)
    [    CSSD]2010-09-22 14:04:35.510 [7] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (1//dev/vx/rdsk/racdg/ora_vote2)
    [    CSSD]2010-09-22 14:04:35.512 [7] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2499) LATS(0) Disk lastSeqNo(2499)
    [    CSSD]2010-09-22 14:04:35.513 [8] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (2//dev/vx/rdsk/racdg/ora_vote3)
    [    CSSD]2010-09-22 14:04:35.514 [8] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2499) LATS(0) Disk lastSeqNo(2499)
    [    CSSD]2010-09-22 14:04:35.553 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:35.553 [10] >TRACE: clssnmFatalThread: spawned
    [    CSSD]2010-09-22 14:04:35.553 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:35.553 [11] >TRACE: clssnmconnect: connecting to node 0, flags 0x0001, connector 1
    I believe the main error is:
    [    CSSD]2010-09-22 14:04:33.498 [5] >TRACE: [0]Node monitor: dlm attach failed error LK_STAT_NOTCREATED
    [    CSSD]CLSS-0001: skgxn not active
    and the communication between UDLM and CLSMON. But I don't know how to resolve this.
    My UDLM version is 3.3.4.9.
    Does somebody have any ideas about this?
    Tks!
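    No fix is posted in this message; a hedged starting point, given that on Sun Cluster the skgxn layer is provided by the Oracle UDLM package (the package name and commands below are my assumption of the usual Solaris RAC setup):
    # verify the UDLM package is installed and the version matches on both nodes
    pkginfo -l ORCLudlm
    # confirm Sun Cluster itself reports both nodes as members before starting CRS
    scstat -n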

    Now I finally installed CRS and ran root.sh without errors (I think the problem was some old file left over from earlier installation attempts...).
    But now I have another problem: when installing the DB software, at the step that copies the installation to the remote node, that node has a CLSMON/CSSD daemon failure and panics:
    Sep 23 16:10:51 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 138. Respawning
    Sep 23 16:10:52 spodhcsvr10 root: Oracle CSSD failure. Rebooting for cluster integrity.
    Sep 23 16:10:52 spodhcsvr10 root: [ID 702911 user.alert] Oracle CSSD failure. Rebooting for cluster integrity.
    Sep 23 16:10:51 spodhcsvr10 root: [ID 702911 user.error] Oracle CLSMON terminated with unexpected status 138. Respawning
    Sep 23 16:10:52 spodhcsvr10 root: [ID 702911 user.alert] Oracle CSSD failure. Rebooting for cluster integrity.
    Sep 23 16:10:56 spodhcsvr10 Cluster.OPS.UCMMD: fatal: received signal 15
    Sep 23 16:10:56 spodhcsvr10 Cluster.OPS.UCMMD: [ID 770355 daemon.error] fatal: received signal 15
    Sep 23 16:10:59 spodhcsvr10 root: Oracle Cluster Ready Services waiting for SunCluster and UDLM to start.
    Sep 23 16:10:59 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 23 16:10:59 spodhcsvr10 root: [ID 702911 user.error] Oracle Cluster Ready Services waiting for SunCluster and UDLM to start.
    Sep 23 16:10:59 spodhcsvr10 root: [ID 702911 user.error] Cluster Ready Services completed waiting on dependencies.
    Notifying cluster that this node is panicking
    The installation on the first node continues and reports an error copying to the second node.
    Any ideas? Tks!

  • RAC node configuration when disk array fails on one node

    Hi,
    We recently had all the filesystems of node 1 of a RAC cluster turn into read-only mode. Upon further investigation it was revealed that this was due to a disk array failure on node 1. The database instance on node 2 is up and running fine. The OS team is rebuilding node 1 from scratch and will restore the Oracle installables from backup.
    My question is, once all files are restored:
    Do we need to add the node to the RAC configuration?
    Do we need to relink the Oracle binary files?
    Can the node be brought up directly once all the Oracle installables are restored properly, or will the Oracle team need to perform additional steps to bring the node into the RAC configuration?
    Thanks,
    Sachin K

    Hi,
    If the restore fails in some way, we will need to first remove node 1 from the cluster and then add it back, right? Kindly confirm the steps below.
    In case of such a situation, below are the steps we plan to follow:
    Version: 10.2.0.5
    Affected node: prd_node1
    Affected instance: PRDB1
    Surviving node: prd_node2
    Surviving instance: PRDB2
    DB listener on prd_node1: LISTENER_PRD01
    ASM listener on prd_node1: LISTENER_PRDASM01
    DB listener on prd_node2: LISTENER_PRD02
    ASM listener on prd_node2: LISTENER_PRDASM02
    Log in to the surviving node; in our case it is prd_node2.
    Step 1 - Remove ONS information:
    Execute the following command as root to find out the remote port number to be used:
    $cat $CRS_HOME/opmn/conf/ons.config
    and remove the information pertaining to the node to be deleted using:
    #$CRS_HOME/bin/racgons remove_config prd_node1:6200
    Step 2 - Remove resources:
    In this step, the resources that were defined on this node have to be removed. These resources include (a) the database, (b) the instance, and (c) ASM. A list of these can be acquired by running the crs_stat -t command from any node.
    The srvctl remove listener command used below is only applicable in 10.2.0.4 and higher releases, including 11.1.0.6; it will report an error if the clusterware version is lower than 10.2.0.4. If the clusterware version is lower than 10.2.0.4, use netca to remove the listener.
    srvctl remove listener -n prd_node1 -l LISTENER_PRD01
    srvctl remove listener -n prd_node1 -l LISTENER_PRDASM01
    srvctl remove instance -d PRDB -i PRDB1
    srvctl remove asm -n prd_node1 -i +ASM1
    Step 3 - Execute rootdeletenode.sh:
    From a node that you are not deleting, execute as root the following command, which will help find out the node number of the node that you want to delete:
    #$CRS_HOME/bin/olsnodes -n
    This number can be passed to the rootdeletenode.sh command, which is to be executed as root from any node that is going to remain in the cluster:
    #$CRS_HOME/install/rootdeletenode.sh prd_node1,1
    Step 4 - Update the Inventory:
    From a node which is going to remain in the cluster, run the following command as the owner of the CRS_HOME. The argument passed to CLUSTER_NODES is a comma-separated list of the node names that are going to remain in the cluster. This step needs to be performed once per home (Clusterware, ASM and RDBMS homes).
    ## Example of running runInstaller to update inventory in Clusterware home
    $CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORA_CRS_HOME "CLUSTER_NODES=prd_node2" CRS=TRUE
    ## Optionally enclose the host names with {}
    ## Example of running runInstaller to update inventory in ASM home
    $CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ASM_HOME "CLUSTER_NODES=prd_node2"
    ## Optionally enclose the host names with {}
    ## Example of running runInstaller to update inventory in RDBMS home
    $CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES=prd_node2"
    ## Optionally enclose the host names with {}
    We need steps to add the node back into the cluster. Can anyone please help us with this?
    Thanks,
    Sachin K
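    The add-node question goes unanswered in the thread. A hedged outline of the usual 10.2 procedure, reusing the node names above (details vary by patch level, so treat this as a sketch rather than a runbook):
    ## 1. From prd_node2, extend the Clusterware home in OUI add-node mode and run the root scripts it prompts for:
    $CRS_HOME/oui/bin/addNode.sh
    ## 2. Repeat addNode.sh from the ASM home and the RDBMS home.
    ## 3. Re-register ONS, ASM, and the instance for the restored node:
    #$CRS_HOME/bin/racgons add_config prd_node1:6200
    srvctl add asm -n prd_node1 -i +ASM1 -o $ASM_HOME
    srvctl add instance -d PRDB -i PRDB1 -n prd_node1
    ## 4. Recreate the listeners with netca (or srvctl add listener where supported).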

  • RAC data file accessing issue from one node

    Dear All,
    We have a two-node RAC (10.2.0.3) running on HP-UX. Since yesterday, accessing data in one specific data file from one instance shows the error below, whereas accessing the same data file from the other node works properly.
    Errors in file /oracle/product/admin/tap3plus/bdump/tap3plus4_dbw0_24950.trc:
    ORA-01157: cannot identify/lock data file 75 - see DBWR trace file
    ORA-01110: data file 75: '/dev/vg_rac/rraw_tap3plus_temp_live05'
    ORA-27041: unable to open file
    HPUX-ia64 Error: 19: No such device
    Additional information: 2
    Tue Jan 31 08:52:09 2012
    Errors in file /oracle/product/admin/tap3plus/bdump/tap3plus4_dbw0_24950.trc:
    ORA-01186: file 75 failed verification tests
    ORA-01157: cannot identify/lock data file 75 - see DBWR trace file
    ORA-01110: data file 75: '/dev/vg_rac/rraw_tap3plus_temp_live05'
    Tue Jan 31 08:52:09 2012
    File 75 not verified due to error ORA-01157
    Tue Jan 31 08:52:09 2012
    Thanks in Advance

    user585870 wrote:
    We have a two-node RAC (10.2.0.3) running on HP-UX. Since yesterday, accessing data in one specific data file from one instance shows the error below, whereas accessing the same data file from the other node works properly.
    That would be due to some kind of failure in the shared storage layer.
    RAC needs the very same storage layer to be visible and available on each RAC node - thus this needs to be some form of shared cluster storage.
    Should a piece of it fail on one node, that node would not be able to access the RAC database files on that shared storage layer - and will throw the type of errors you are seeing.
    So what does this shared storage layer look like? Fibre Channel HBAs connected to a Fibre Channel switch and SAN, making SAN LUNs available as shared storage devices?
    Typically a shared storage failure would throw errors in the kernel log. This is because the error is not an Oracle error but a kernel error, as it is in your case. The bottom error on the error stack points to the root cause:
    ORA-01157: cannot identify/lock data file 75 - see DBWR trace file
    ORA-01110: data file 75: '/dev/vg_rac/rraw_tap3plus_temp_live05'
    ORA-27041: unable to open file
    HPUX-ia64 Error: 19: No such device
    So HP-UX on that node is not seeing a specific shared storage device.
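    A hedged first check on the affected HP-UX node, using the device path from the error stack (vgdisplay/ioscan are standard HP-UX commands, but verify against your release):
    # does the raw device node exist, and is the volume group active on this node?
    ls -l /dev/vg_rac/rraw_tap3plus_temp_live05
    vgdisplay vg_rac
    # scan the hardware paths for the underlying disks and check the kernel log for errors
    ioscan -fnC disk
    dmesg | tail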

  • Rac One Node on Rac Servers

    Hi Xperts
    We have this environment:
    2 RAC nodes, 11.2.0.3 Enterprise, on Oracle Linux 5.9.
    We have one production database on this RAC, and the users ask us to create two single-instance databases, one on each node, something like this:
    Node1 -> RAC Prod1, Single Test
    Node2 -> RAC Prod2, Single Dev
    I want to create RAC One Node for those databases (Dev, Test) and create new disk groups for each database.
    Can I install a RAC One Node database on those servers with DBCA?
    Do I need to install new database software?
    Will the already-installed RAC be affected in any way?
    I just want to be sure about this procedure before doing anything.
    Thank you
    J.A.

    Hi J.A.
    Yes you can! However, you need to install Grid Infrastructure (GI) in cluster mode on both nodes, then install the database software. Either during software installation or after it, DBCA will allow you to create 1) a single-instance database, 2) a RAC database, or 3) a RAC One Node database. Keep in mind that RAC One Node is an option (additional license) to the Enterprise Edition of Oracle Database.
    I've talked on that topic at the Bulgarian Oracle Users Group in 2011; here is the link to the presentation, which you may find useful. I might upload the videos as well if you need something like a proof of concept or just for your own use:
    http://sve.to/download/1112/
    Also, I would go with one disk group for both databases; as long as they share the same physical disks, I don't see the point of separating them. Having one disk group would allow you to better utilize your disk/space resources.
    The procedure would be:
    1. Install GI in cluster.
    2. Install the database software.
    3. Patch up to 11.2.0.4.
    4. Create RAC One Node database using either command line or using Custom Template of DBCA.
    At the end of the day, if you have a Standard Edition license you can still install GI in cluster mode and create single-instance databases on each server. The downside of doing that is that you need to manually fail over the database to the remaining node in case of disaster.
    Regards,
    Sve
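    As a hedged sketch of the command-line route mentioned in step 4 (the database name, disk group, and node list are invented for the example, and the exact DBCA/srvctl flags vary across 11.2 patch levels):
    ## create an admin-managed RAC database on one node with DBCA in silent mode...
    dbca -silent -createDatabase -templateName General_Purpose.dbc \
         -gdbName TESTDB -sid TESTDB -nodelist node1 \
         -storageType ASM -diskGroupName DATA
    ## ...then convert it to RAC One Node (11.2 srvctl supports this conversion):
    srvctl convert database -d TESTDB -c RACONENODE -i TESTDB -w 30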

  • FRA on NFS Oracle RAC One Node

    Hi all,
    we installed Oracle RAC One Node on Oracle Linux. Everything seems to work fine except one little thing: we are trying to change the database to archivelog mode, but when we try to relocate the database, we get ORA-19816 "WARNING: Files may exist in ... that are not known to database." and "Linux-x86_64 Error: 37: No locks available".
    The FRA is mounted as an NFS share with the following options: "rw,bg,hard,nointr,rsize=32768,wsize=32768,proto=tcp,noac,vers=3,suid"
    I searched a lot on the Internet but couldn't find any hint. Can anybody point me to the right installation guide?
    Thanks in advance

    Hi,
    user10191672 wrote:
    Hi all,
    we installed Oracle RAC One Node on Oracle Linux. Everything seems to work fine except one little thing: we are trying to change the database to archivelog mode, but when we try to relocate the database, we get ORA-19816 "WARNING: Files may exist in ... that are not known to database." and "Linux-x86_64 Error: 37: No locks available".
    The FRA is mounted as an NFS share with the following options: "rw,bg,hard,nointr,rsize=32768,wsize=32768,proto=tcp,noac,vers=3,suid"
    I searched a lot on the Internet but couldn't find any hint. Can anybody point me to the right installation guide?
    Check if the NFSLOCK service is running... if not, start it:
    # service nfslock status
    Mount Options for Oracle files when used with NAS devices [ID 359515.1]:
    Mount options for Oracle Datafiles:
    rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,vers=3,timeo=600
    For RMAN backup sets, image copies, and Data Pump dump files, the "NOAC" mount option should not be specified - that is because RMAN and Data Pump do not check this option and specifying it can adversely affect performance.
    The following NFS options must be specified for an 11.2.0.2 RMAN disk backup directory:
    opts="-fstype=nfs,rsize=65536,wsize=65536,hard,actime=0,intr,nodev,nosuid"
    Hope this helps,
    Levi Pereira
    Edited by: Levi Pereira on Aug 18, 2011 1:20 PM
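    For illustration, a hedged /etc/fstab line applying the datafile mount options from the note (the filer name, export, and mount point are placeholders, not from the thread):
    # example only - substitute your NFS server, export, and mount point
    nfsfiler:/vol/fra  /u02/fra  nfs  rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,vers=3,timeo=600  0 0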

  • One node is up but when we start the crs on node 2 then ASM instance killed.

    Hi Friends
    I am facing the strange problem below:
    One node is up, but when we start CRS on node 2 the ASM instance gets killed. I found that one IP, 169.254.98.19, is bound to the private interconnect rather than the assigned IP.
    Please help me

    Hi Pradeep,
    The Oracle software version is 11.2.0.3. Actually, we start CRS and the DB on node 1, but when we try to start them on node 2, we get the error below.
    [gpnpd(18284748)]CRS-2328:GPNPD started on node ggnetp04.
    2013-09-03 19:48:29.162
    [cssd(17694820)]CRS-1713:CSSD daemon is started in clustered mode
    2013-09-03 19:48:30.557
    [ohasd(13697160)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
    2013-09-03 19:48:41.013
    [cssd(17694820)]CRS-1707:Lease acquisition for node ggnetp04 number 2 completed
    2013-09-03 19:48:42.490
    [cssd(17694820)]CRS-1605:CSSD voting file is online: /dev/rhdiskpower0; details in /opt/app/11.2.0.3/grid/log/ggnetp04/cssd/ocssd.log.
    2013-09-03 19:48:45.483
    [cssd(17694820)]CRS-1601:CSSD Reconfiguration complete. Active nodes are ggnetp03 ggnetp04 .
    2013-09-03 19:48:47.257
    [ctssd(18022612)]CRS-2403:The Cluster Time Synchronization Service on host ggnetp04 is in observer mode.
    2013-09-03 19:48:47.632
    [ctssd(18022612)]CRS-2407:The new Cluster Time Synchronization Service reference node is host ggnetp03.
    2013-09-03 19:48:47.633
    [ctssd(18022612)]CRS-2401:The Cluster Time Synchronization Service started on host ggnetp04.
    [client(16056466)]CRS-10001:03-Sep-13 19:48 ACFS-9391: Checking for existing ADVM/ACFS installation.
    [client(16056468)]CRS-10001:03-Sep-13 19:48 ACFS-9392: Validating ADVM/ACFS installation files for operating system.
    [client(16056470)]CRS-10001:03-Sep-13 19:48 ACFS-9393: Verifying ASM Administrator setup.
    [client(16056472)]CRS-10001:03-Sep-13 19:48 ACFS-9308: Loading installed ADVM/ACFS drivers.
    [client(16056478)]CRS-10001:03-Sep-13 19:48 ACFS-9154: Loading 'oracleadvm.ext' driver.
    [client(16056486)]CRS-10001:03-Sep-13 19:48 ACFS-9154: Loading 'oracleacfs.ext' driver.
    [client(16056494)]CRS-10001:03-Sep-13 19:48 ACFS-9327: Verifying ADVM/ACFS devices.
    [client(16056498)]CRS-10001:03-Sep-13 19:48 ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
    [client(16056504)]CRS-10001:03-Sep-13 19:48 ACFS-9156: Detecting control device '/dev/ofsctl'.
    [client(16056508)]CRS-10001:03-Sep-13 19:48 ACFS-9322: completed
    2013-09-03 19:48:59.667
    [ctssd(18022612)]CRS-2409:The clock on host ggnetp04 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
    2013-09-03 19:50:53.936
    [cssd(17694820)]CRS-1662:Member kill requested by node ggnetp03 for member number 1, group DB+ASM
    2013-09-03 19:50:57.493
    [/opt/app/11.2.0.3/grid/bin/oraagent.bin(16777322)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/opt/app/11.2.0.3/grid/log/ggnetp04/agent/ohasd/oraagent_grid/oraagent_grid.log".
    2013-09-03 19:50:57.494
    [/opt/app/11.2.0.3/grid/bin/oraagent.bin(16777322)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/opt/app/11.2.0.3/grid/log/ggnetp04/agent/ohasd/oraagent_grid/oraagent_grid.log"
    2013-09-03 19:50:58.828
    [/opt/app/11.2.0.3/grid/bin/oraagent.bin(16777322)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/opt/app/11.2.0.3/grid/log/ggnetp04/agent/ohasd/oraagent_grid/oraagent_grid.log".
    2013-09-03 19:50:58.943
    [/opt/app/11.2.0.3/grid/bin/oraagent.bin(16777322)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/opt/app/11.2.0.3/grid/log/ggnetp04/agent/ohasd/oraagent_grid/oraagent_grid.log".
    2013-09-03 19:50:59.235
    [/opt/app/11.2.0.3/grid/bin/oraagent.bin(16777322)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/opt/app/11.2.0.3/grid/log/ggnetp04/agent/ohasd/oraagent_grid/oraagent_grid.log".
    2013-09-03 19:53:00.252
    [cssd(17694820)]CRS-1662:Member kill requested by node ggnetp03 for member number 1, group DB+ASM
    2013-09-03 19:53:03.101
    [/opt/app/11.2.0.3/grid/bin/oraagent.bin(16777322)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/opt/app/11.2.0.3/grid/log/ggnetp04/agent/ohasd/oraagent_grid/oraagent_grid.log".
    2013-09-03 19:53:03.102
    [/opt/app/11.2.0.3/grid/bin/oraagent.bin(16777322)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/opt/app/11.2.0.3/grid/log/ggnetp04/agent/ohasd/oraagent_grid/oraagent_grid.log"
    2013-09-03 19:53:05.430
    [/opt/app/11.2.0.3/grid/bin/oraagent.bin(16777322)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/opt/app/11.2.0.3/grid/log/ggnetp04/agent/ohasd/oraagent_grid/oraagent_grid.log".
    2013-09-03 19:53:05.539
    [/opt/app/11.2.0.3/grid/bin/oraagent.bin(16777322)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/opt/app/11.2.0.3/grid/log/ggnetp04/agent/ohasd/oraagent_grid/oraagent_grid.log".
    2013-09-03 19:53:05.815
    [/opt/app/11.2.0.3/grid/bin/oraagent.bin(16777322)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/opt/app/11.2.0.3/grid/log/ggnetp04/agent/ohasd/oraagent_grid/oraagent_grid.log".
    2013-09-03 19:55:07.436
    [cssd(17694820)]CRS-1662:Member kill requested by node ggnetp03 for member number 1, group DB+ASM
    2013-09-03 19:55:09.673
    [/opt/app/11.2.0.3/grid/bin/oraagent.bin(16777322)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/opt/app/11.2.0.3/grid/log/ggnetp04/agent/ohasd/oraagent_grid/oraagent_grid.log".
    2013-09-03 19:55:09.674
    [/opt/app/11.2.0.3/grid/bin/oraagent.bin(16777322)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/opt/app/11.2.0.3/grid/log/ggnetp04/agent/ohasd/oraagent_grid/oraagent_grid.log"
    2013-09-03 19:55:12.007
    [/opt/app/11.2.0.3/grid/bin/oraagent.bin(16777322)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/opt/app/11.2.0.3/grid/log/ggnetp04/agent/ohasd/oraagent_grid/oraagent_grid.log".
    2013-09-03 19:55:12.115
    [/opt/app/11.2.0.3/grid/bin/oraagent.bin(16777322)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/opt/app/11.2.0.3/grid/log/ggnetp04/agent/ohasd/oraagent_grid/oraagent_grid.log".
    2013-09-03 19:55:12.389
    [/opt/app/11.2.0.3/grid/bin/oraagent.bin(16777322)]CRS-5019:All OCR locations are on ASM disk groups [DATA], and none of these disk groups are mounted. Details are at "(:CLSN00100:)" in "/opt/app/11.2.0.3/grid/log/ggnetp04/agent/ohasd/oraagent_grid/oraagent_grid.log".
    2013-09-03 19:55:12.405
    [ohasd(13697160)]CRS-2807:Resource 'ora.asm' failed to start automatically.
    2013-09-03 19:55:12.405
    [ohasd(13697160)]CRS-2807:Resource 'ora.crsd' failed to start automatically.
    2013-09-03 20:23:32.040
    [ctssd(18022612)]CRS-2409:The clock on host ggnetp04 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
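    One piece of context the thread never states (an 11.2 fact, not from the log): from 11.2.0.2 onward, Grid Infrastructure plumbs 169.254.x.x HAIP addresses on the private interconnect by design, so such an address there is not by itself the fault. Hedged commands to inspect it:
    $ oifcfg getif                                          # interface roles (public/cluster_interconnect) known to the clusterware
    $ crsctl stat res ora.cluster_interconnect.haip -init   # state of the HAIP resource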

  • ASM install panics installing node

    Particulars:
    2-node cluster, AIX 6.1 TL01 SP01. 11.1.0.7 CRS is installed; now installing ASM, starting with 11.1.0.6 (the base release). The OUI gets about 50% through, then finally dumps with the following error:
    JVMDUMP006I
    On the second node, it appears the installer copies a number of files with filenames containing special characters. Then, when the Java dump occurs, the installing node kernel panics and reboots. I have not found much in the way of log files; the OUI logs don't show much. Tomorrow I will likely be running an OUI trace. I did find a note on Metalink with similar symptoms, but for a CRS install and the OUI; it basically said to increase the memory size for Java in oraparam.ini. I did that and it still dumps.
    Any ideas?
    We did open a TAR but not seeing much progress in resolving the issue.
    TIA,
    Pete's


  • Root.sh failed on second node while installing CRS 10g on centos 5.5

    Hi all,
    I was able to install the Oracle 10g RAC clusterware on the first node of the cluster. However, when I run the root.sh script as the root user on the second node of the cluster, it fails with the following error message:
    NO KEYS WERE WRITTEN. Supply -force parameter to override.
    -force is destructive and will destroy any previous cluster
    configuration.
    Oracle Cluster Registry for cluster has already been initialized
    Startup will be queued to init within 90 seconds.
    Adding daemons to inittab
    Expecting the CRS daemons to be up within 600 seconds.
    Failure at final check of Oracle CRS stack.
    10
    When I run cluvfy stage -post hwos -n all -verbose, it shows this message:
    ERROR:
    Could not find a suitable set of interfaces for VIPs.
    Result: Node connectivity check failed.
    Checking shared storage accessibility...
    Disk Sharing Nodes (2 in count)
    /dev/sda db2 db1
    When I run cluvfy stage -pre crsinst -n all -verbose, it shows this message:
    ERROR:
    Could not find a suitable set of interfaces for VIPs.
    Result: Node connectivity check failed.
    Checking system requirements for 'crs'...
    No checks registered for this product.
    When I run cluvfy stage -post crsinst -n all -verbose, it shows this message:
    Result: Node reachability check passed from node "DB2".
    Result: User equivalence check passed for user "oracle".
    Node Name  CRS daemon  CSS daemon  EVM daemon
    db2        no          no          no
    db1        yes         yes         yes
    Check: Health of CRS
    Node Name  CRS OK?
    db1        unknown
    Result: CRS health check failed.
    Checking crsd.log shows this message:
    clsc_connect: (0x143ca610) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_db2_crs))
    clsssInitNative: connect failed, rc 9
    Any help would be greatly appreciated.
    Edited by: 868121 on 2011-6-24 12:31 AM

    Hello, it took a little searching, but I found this in a note in the Grid installation guide for Linux/UNIX:
    Public IP addresses and virtual IP addresses must be in the same subnet.
    In your case, you are using two different subnets for the VIPs.
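    To illustrate the rule, a hedged hosts-file layout with the public IPs and VIPs sharing one subnet (all addresses invented):
    # public and VIP in the same subnet; private interconnect on its own subnet
    192.168.1.101  db1        # public
    192.168.1.102  db2        # public
    192.168.1.111  db1-vip    # VIP - same 192.168.1.x subnet as the public IPs
    192.168.1.112  db2-vip
    10.0.0.1       db1-priv   # private
    10.0.0.2       db2-priv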

  • Hard Drive Failed, New one Installed, Problems of course

    Hellooo everyone. This is sort of a last-ditch effort to see if I can fix this myself, or if I am going to have to shell out 500 dollars to get my laptop back in working order. I have an HP dv6-6140 laptop. A few weeks ago the hard drive failed, so I purchased and installed a new one and put Windows 7 on it. When I try to hook up an ethernet cable to download and install the drivers for the laptop, the connection is not recognized; it says no internet connection, blah blah blah. So I downloaded the drivers to a flash drive and installed some of them onto my laptop - no change. I was thinking of purchasing the recovery discs, which I'm assuming include the drivers for the notebook, for 35 dollars, which is paltry compared to what I would pay at a computer repair shop - but how do I know that would work? I don't. I am looking for ideas or anything that I may be missing. I am not computer illiterate, but I am not an expert. Any advice would be greatly appreciated. Thanks!
    Kelly

    Hi, Kelly:
    On your model normally it is very important to install the chipset driver first and reboot before attempting to install the ethernet (wired network) driver.
    The chipset drivers install some motherboard driver which I think enables the communication to the ethernet adapter.
    The problem is, HP does not include the necessary chipset drivers for your model (which I assume is a HP Pavilion dv6-6140us Entertainment Notebook PC).
    So...Download and install the AMD Chipset Drivers and reboot.  You want the first driver listed on the webpage.
    http://support.amd.com/en-us/download/chipset?os=Windows%207%20-%2064
    Then see if the ethernet adapter works again or reinstall the ethernet driver after you reboot the PC.
    The parts list indicates your model comes with a Broadcom wireless/bluetooth card.
    Here are the links to the drivers you need for that...
    http://h10025.www1.hp.com/ewfrf/wc/softwareDownloadIndex?softwareitem=ob-107849-1&cc=us&dlc=en&lc=en
    http://h10025.www1.hp.com/ewfrf/wc/softwareDownloadIndex?softwareitem=ob-131427-1&cc=us&dlc=en&lc=en...

  • One node is failing

    One node is periodically failing. When users connect, they get the error:
    ERROR: ORACLE connection error: ORA-01034: ORACLE not available
    ORA-27101: shared memory realm does not exist
    Linux-x86_64 Error: 2: No such file or directory
    And then the command srvctl status database -d te reports that one of the instances is running but the other is not.
    Please suggest what log files to check.

    user10890074 wrote:
    One node is periodically failing. When users connect, they get the error:
    ERROR: ORACLE connection error: ORA-01034: ORACLE not available
    ORA-27101: shared memory realm does not exist
    Linux-x86_64 Error: 2: No such file or directory
    And then the command srvctl status database -d te reports that one of the instances is running but the other is not.
    Please suggest what log files to check.
    How about your problem? It seems your node hung.
    1. You should check the alert log file.
    2. You should check the log files under CRS_HOME/log/<nodename>/.
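    A hedged sketch of where those logs usually live in a 10g RAC layout (the database name te comes from the post; the bdump path and instance name are the common defaults and may differ on your system):
    # instance alert log on the failing node
    tail -100 $ORACLE_BASE/admin/te/bdump/alert_te2.log
    # clusterware alert and daemon logs, per the layout mentioned above
    tail -100 $CRS_HOME/log/`hostname`/alert`hostname`.log
    ls $CRS_HOME/log/`hostname`/crsd $CRS_HOME/log/`hostname`/cssd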

  • Removing one node, re-install and join cluster 3.2/RAC/QFS

    Hi all,
    I have one cluster system with 2 nodes (Cluster 3.2, Oracle RAC, QFS). Now one node has failed and cannot be recovered, so I have to reinstall the faulty node.
    How can I remove the faulty node from the cluster using the active node?
    Can I reinstall and rejoin the new node to the cluster?
    Thanks
    Nguyen

    The instructions to orderly remove a node from the cluster (http://docs.sun.com/app/docs/doc/820-4679/gcfso?l=en&a=view) do assume that the cluster node itself is still healthy.
    If you lost a node due to failure/disaster, then you would need to rebuild the same hardware and restore it from your backup. This is described at
    http://docs.sun.com/app/docs/doc/820-4679/cfhdbgbc?l=en&a=view
    Regards
    Thorsten

  • !! ORACM always FAILS on one node - Oracle 9i RAC - SLES9 - 9.2.0.8 ORACM

    Hi,
    I really need help with this. I applied all the patches possible. I tried sharing the quorum.dbf as an NFS device, a raw device, an iSCSI LUN... I patched the ORACM to 9.2.0.5, 9.2.0.6, and now 9.2.0.8. The setup is two HP DL360s with SLES9 SP2, x86_64, and Oracle 9.2 RAC.
    The problem is that the cluster manager starts on one node, but when I run ./ocmstart.sh on the other node, it always fails. The CM.LOG file is pasted below. I get the same errors at all the patch levels. The quorum.dbf is set up as an iSCSI LUN on a NetApp filer, which is then bound to a raw device on the host. Whichever node I start the Oracle Cluster Manager on first works, and the other node always fails with the errors shown below.
    It also keeps complaining about "InitializeCM: query_module() failed" for the hangcheck timer. The hangcheck-timer module is already loaded and I can see it in the /sbin/lsmod output.
    I would really appreciate help on this. This is my master's project at school and I can't graduate if this doesn't work. Please provide some guidance.
    thanks
    vishal
    CM.LOG
    tweedledum:/u01/app/oracle/product/920/oracm/log # cat cm.log
    oracm, version[ 9.2.0.8.0.01 ] started {Tue Feb 13 00:56:16 2007 }
    KernelModuleName is hangcheck-timer {Tue Feb 13 00:56:16 2007 }
    OemNodeConfig(): Network Address of node0: 1.1.1.3 (port 9998)
    {Tue Feb 13 00:56:16 2007 }
    OemNodeConfig(): Network Address of node1: 1.1.1.4 (port 9998)
    {Tue Feb 13 00:56:16 2007 }
    WARNING: OemInit2: Opened file(/oradata/quorum.dbf 6), tid = main:182900764192 file = oem.c, line = 503 {Tue Feb 13 00:56:16 2007 }InitializeCM: ModuleName = hangcheck-timer {Tue Feb 13 00:56:16 2007 }
    ClusterListener: Spawned with tid 0x4080e960 pid: 19662 {Tue Feb 13 00:56:16 2007 }
    ERROR: InitializeCM: query_module() failed, tid = main:182900764192 file = cmstartup.c, line = 341 {Tue Feb 13 00:56:16 2007 }Debug Hang : ClusterListener (PID=19662) Registered with ORACM. {Tue Feb 13 00:56:16 2007 }
    Debug Hang :StartNMMon (PID=19662) Registered with ORACM. {Tue Feb 13 00:56:16 2007 }
    Debug Hang : CmConnectListener (PID=19662):Registered with ORACM. {Tue Feb 13 00:56:16 2007 }
    CreateLocalEndpoint(): Network Address: 1.1.1.4
    {Tue Feb 13 00:56:16 2007 }
    PollingThread: Spawned with tid 0x40c10960. pid: 19662 {Tue Feb 13 00:56:16 2007 }
    Debug Hang :PollingThread (PID=19662): Registered with ORACM. {Tue Feb 13 00:56:16 2007 }
    SendingThread: Spawned with tid 0x41012960, 0x41012960. pid: 19662 {Tue Feb 13 00:56:16 2007 }
    DiskPingThread: Spawned with tid 0x40e11960. pid: 19662 {Tue Feb 13 00:56:16 2007 }
    Debug Hang : DiskPingThread (PID=19662): Registered with ORACM. {Tue Feb 13 00:56:16 2007 }
    Debug Hang :SendingThread (PID=19662): Registered with ORACM. {Tue Feb 13 00:56:16 2007 }
    UpdateNodeState(): node(1) added udpated {Tue Feb 13 00:56:19 2007 }
    HandleUpdate(): SYNC(1) from node(0) completed {Tue Feb 13 00:56:19 2007 }
    HandleUpdate(): NODE(0) IS ACTIVE MEMBER OF CLUSTER, INCARNATION(1) {Tue Feb 13 00:56:19 2007 }
    HandleUpdate(): NODE(1) IS ACTIVE MEMBER OF CLUSTER, INCARNATION(2) {Tue Feb 13 00:56:19 2007 }
    --- DUMP GROUP STATE DB ---
    --- END OF GROUP STATE DUMP ---
    --- Begin Dump ---
    oracm, version[ 9.2.0.8.0.01 ] started {Tue Feb 13 00:56:16 2007 }
    TRACE: LogListener: Spawned with tid 0x4060d960., tid = LogListener:1080088928 file = logging.c, line = 116 {Tue Feb 13 00:56:16 2007 }
    TRACE: Can't read registry value for HeartBeat, tid = main:182900764192 file = unixinc.c, line = 1080 {Tue Feb 13 00:56:16 2007 }
    TRACE: Can't read registry value for PollInterval, tid = main:182900764192 file = unixinc.c, line = 1080 {Tue Feb 13 00:56:16 2007 }
    TRACE: Can't read registry value for WatchdogTimerMargin, tid = main:182900764192 file = unixinc.c, line = 1080 {Tue Feb 13 00:56:16 2007 }
    TRACE: Can't read registry value for WatchdogSafetyMargin, tid = main:182900764192 file = unixinc.c, line = 1080 {Tue Feb 13 00:56:16 2007 }KernelModuleName is hangcheck-timer {Tue Feb 13 00:56:16 2007 }
    TRACE: Can't read registry value for ClientTimeout, tid = main:182900764192 file = unixinc.c, line = 1080 {Tue Feb 13 00:56:16 2007 }
    TRACE: InitNMInfo: setting clientTimeout to 140s based on MissCount 210 and PollInterval 1000ms, tid = main:182900764192 file = nmconfig.c, line = 138 {Tue Feb 13 00:56:16 2007 }
    TRACE: InitClusterDb(): getservbyname on CMSrvr failed - 0 : assigning 9998, tid = main:182900764192 file = nmconfig.c, line = 208 {Tue Feb 13 00:56:16 2007 }OemNodeConfig(): Network Address of node0: 1.1.1.3 (port 9998)
    {Tue Feb 13 00:56:16 2007 }
    OemNodeConfig(): Network Address of node1: 1.1.1.4 (port 9998)
    {Tue Feb 13 00:56:16 2007 }
    TRACE: OemCreateListenPort: bound at 9998, tid = main:182900764192 file = oem.c, line = 907 {Tue Feb 13 00:56:16 2007 }
    TRACE: InitClusterDb(): found my node info at 1 name tweedledum, priv int-dum, port 3623, tid = main:182900764192 file = nmconfig.c, line = 261 {Tue Feb 13 00:56:16 2007 }
    TRACE: InitClusterDb(): Local Node(1) NodeName[int-dum], tid = main:182900764192 file = nmconfig.c, line = 279 {Tue Feb 13 00:56:16 2007 }
    TRACE: InitClusterDb(): Cluster(Oracle) with (2) Defined Nodes, tid = main:182900764192 file = nmconfig.c, line = 282 {Tue Feb 13 00:56:16 2007 }
    TRACE: OEMInits(): CM Disk File (/oradata/quorum.dbf), tid = main:182900764192 file = oem.c, line = 248 {Tue Feb 13 00:56:16 2007 }
    WARNING: OemInit2: Opened file(/oradata/quorum.dbf 6), tid = main:182900764192 file = oem.c, line = 503 {Tue Feb 13 00:56:16 2007 }
    TRACE: ReadOthersDskInfo(): node(0) rcfg(1) wrtcnt(1171356979) lastcnt(0) alive(1171356979), tid = main:182900764192 file = oem.c, line = 1491 {Tue Feb 13 00:56:16 2007 }
    TRACE: ReadOthersDskInfo(): node(1) rcfg(1) wrtcnt(180) lastcnt(0) alive(1), tid = main:182900764192 file = oem.c, line = 1491 {Tue Feb 13 00:56:16 2007 }
    TRACE: ReadOthersDskInfo(): node(2) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:182900764192 file = oem.c, line = 1491 {Tue Feb 13 00:56:16 2007 }
    [... the same ReadOthersDskInfo() line repeats for node(3) through node(63), all rcfg(0) wrtcnt(0) lastcnt(0) alive(0); trimmed for readability ...]
    InitializeCM: ModuleName = hangcheck-timer {Tue Feb 13 00:56:16 2007 }
    ClusterListener: Spawned with tid 0x4080e960 pid: 19662 {Tue Feb 13 00:56:16 2007 }
    ERROR: InitializeCM: query_module() failed, tid = main:182900764192 file = cmstartup.c, line = 341 {Tue Feb 13 00:56:16 2007 }
    Debug Hang : ClusterListener (PID=19662) Registered with ORACM. {Tue Feb 13 00:56:16 2007 }
    TRACE: ClusterListener (pid=19662, tid=1082190176): Registered with watchdog daemon., tid = ClusterListener:1082190176 file = nmlistener.c, line = 76 {Tue Feb 13 00:56:16 2007 }
    TRACE: CmConnectListener: Spawned with tid 0x40a0f960., tid = CMConnectListerner:1084291424 file = cmclient.c, line = 216 {Tue Feb 13 00:56:16 2007 }
    Debug Hang :StartNMMon (PID=19662) Registered with ORACM. {Tue Feb 13 00:56:16 2007 }
    TRACE: StartNMMon (pid=19662, tid=-1782829536): Registered with watchdog daemon., tid = main:182900764192 file = cmnodemon.c, line = 254 {Tue Feb 13 00:56:16 2007 }
    Debug Hang : CmConnectListener (PID=19662):Registered with ORACM. {Tue Feb 13 00:56:16 2007 }
    TRACE: CmConnectListener (pid=19662, tid=1084291424): Registered with watchdog daemon., tid = CMConnectListerner:1084291424 file = cmclient.c, line = 247 {Tue Feb 13 00:56:16 2007 }
    CreateLocalEndpoint(): Network Address: 1.1.1.4 {Tue Feb 13 00:56:16 2007 }
    TRACE: StartClusterJoin(): clusterState(0) nodeState(0), tid = main:182900764192 file = nmmember.c, line = 282 {Tue Feb 13 00:56:16 2007 }
    PollingThread: Spawned with tid 0x40c10960. pid: 19662 {Tue Feb 13 00:56:16 2007 }
    Debug Hang :PollingThread (PID=19662): Registered with ORACM. {Tue Feb 13 00:56:16 2007 }
    TRACE: PollingThread (pid=19662, tid=1086392672): Registered with watchdog daemon., tid = PollingThread:1086392672 file = nmmember.c, line = 765 {Tue Feb 13 00:56:16 2007 }
    SendingThread: Spawned with tid 0x41012960, 0x41012960. pid: 19662 {Tue Feb 13 00:56:16 2007 }
    DiskPingThread: Spawned with tid 0x40e11960. pid: 19662 {Tue Feb 13 00:56:16 2007 }
    Debug Hang : DiskPingThread (PID=19662): Registered with ORACM. {Tue Feb 13 00:56:16 2007 }
    TRACE: DiskPingThread (pid=19662, tid=1088493920): Registered with watchdog daemon., tid = DiskPingThread:1088493920 file = nmmember.c, line = 1083 {Tue Feb 13 00:56:16 2007 }
    Debug Hang :SendingThread (PID=19662): Registered with ORACM. {Tue Feb 13 00:56:16 2007 }
    TRACE: SendingThread (pid=19662, tid=1090595168): Registered with watchdog daemon., tid = SendingThread:1090595168 file = nmmember.c, line = 581 {Tue Feb 13 00:56:16 2007 }
    TRACE: HandleJoin(): src[1] dest[1] dom[0] seq[1] sync[0], tid = ClusterListener:1082190176 file = nmlisten.c, line = 346 {Tue Feb 13 00:56:16 2007 }
    TRACE: HandleJoin(): JOIN from node(1)->(1), tid = ClusterListener:1082190176 file = nmlisten.c, line = 362 {Tue Feb 13 00:56:16 2007 }
    TRACE: HandleStatus(): node(0) UNKNOWN, tid = ClusterListener:1082190176 file = nmlisten.c, line = 404 {Tue Feb 13 00:56:17 2007 }
    TRACE: HandleStatus(): src[0] dest[1] dom[0] seq[6] sync[1], tid = ClusterListener:1082190176 file = nmlisten.c, line = 415 {Tue Feb 13 00:56:17 2007 }
    TRACE: HandleSync(): src[0] dest[1] dom[0] seq[7] sync[1], tid = ClusterListener:1082190176 file = nmlisten.c, line = 506 {Tue Feb 13 00:56:17 2007 }
    TRACE: SendAck(): node(0) domain(0) syncSeqNo(1) type(11), tid = ClusterListener:1082190176 file = nmmember.c, line = 1922 {Tue Feb 13 00:56:17 2007 }
    TRACE: HandleVote(): src[0] dest[1] dom[0] seq[8] sync[1], tid = ClusterListener:1082190176 file = nmlisten.c, line = 643 {Tue Feb 13 00:56:18 2007 }
    TRACE: SendVoteInfo(): node(0) domain(0) syncSeqNo(1), tid = ClusterListener:1082190176 file = nmmember.c, line = 1736 {Tue Feb 13 00:56:18 2007 }
    TRACE: HandleUpdate(): src[0] dest[1] dom[0] seq[9] sync[1], tid = ClusterListener:1082190176 file = nmlisten.c, line = 849 {Tue Feb 13 00:56:19 2007 }
    TRACE: UpdateNodeState(): nodeNum 0, newState 2, tid = ClusterListener:1082190176 file = nmlisten.c, line = 1153 {Tue Feb 13 00:56:19 2007 }
    TRACE: UpdateNodeState(): nodeNum 1, newState 2, tid = ClusterListener:1082190176 file = nmlisten.c, line = 1153 {Tue Feb 13 00:56:19 2007 }
    UpdateNodeState(): node(1) added udpated {Tue Feb 13 00:56:19 2007 }
    TRACE: SendAck(): node(0) domain(0) syncSeqNo(1) type(15), tid = ClusterListener:1082190176 file = nmmember.c, line = 1922 {Tue Feb 13 00:56:19 2007 }
    TRACE: HandleUpdate(): about to QueueClientEvent 0, 1, tid = ClusterListener:1082190176 file = nmlisten.c, line = 960 {Tue Feb 13 00:56:19 2007 }
    TRACE: QueueClientEvent(): Sending Event(1) , tid = ClusterListener:1082190176 file = nmlisten.c, line = 1386 {Tue Feb 13 00:56:19 2007 }
    TRACE: QueueClientEvent: Node[0] state = 2, tid = ClusterListener:1082190176 file = nmlisten.c, line = 1390 {Tue Feb 13 00:56:19 2007 }
    TRACE: QueueClientEvent: Node[1] state = 2, tid = ClusterListener:1082190176 file = nmlisten.c, line = 1390 {Tue Feb 13 00:56:19 2007 }
    HandleUpdate(): SYNC(1) from node(0) completed {Tue Feb 13 00:56:19 2007 }
    TRACE: HandleUpdate: saving incarnation value as 2, tid = ClusterListener:1082190176 file = nmlisten.c, line = 983 {Tue Feb 13 00:56:19 2007 }
    HandleUpdate(): NODE(0) IS ACTIVE MEMBER OF CLUSTER, INCARNATION(1) {Tue Feb 13 00:56:19 2007 }
    HandleUpdate(): NODE(1) IS ACTIVE MEMBER OF CLUSTER, INCARNATION(2) {Tue Feb 13 00:56:19 2007 }
    TRACE: HandleStatus(): src[1] dest[1] dom[0] seq[2] sync[2], tid = ClusterListener:1082190176 file = nmlisten.c, line = 415 {Tue Feb 13 00:56:19 2007 }
    TRACE: StartNMMon(): attached as node 1, tid = main:182900764192 file = cmnodemon.c, line = 288 {Tue Feb 13 00:56:19 2007 }
    TRACE: StartNMMon: starting reconfig(2), tid = main:182900764192 file = cmnodemon.c, line = 395 {Tue Feb 13 00:56:19 2007 }
    TRACE: UpdateEventValue: *(bfffe1f0) = (1, 1), tid = main:182900764192 file = unixinc.c, line = 336 {Tue Feb 13 00:56:19 2007 }
    TRACE: UpdateEventValue: *(401bbeb0) = (3, 1), tid = main:182900764192 file = unixinc.c, line = 336 {Tue Feb 13 00:56:19 2007 }
    TRACE: ReconfigThread: started for reconfig (2), tid = Reconfig Thread:1092696416 file = cmnodemon.c, line = 180 {Tue Feb 13 00:56:19 2007 }
    NMEVENT_RECONFIG [00][00][00][00][00][00][00][03] {Tue Feb 13 00:56:19 2007 }
    TRACE: CleanupNodeContexts(): cleaning up nodes, rcfg(2), tid = Reconfig Thread:1092696416 file = cmnodemon.c, line = 671 {Tue Feb 13 00:56:19 2007 }
    TRACE: DisconnectNode(): about to disconnect 0, tid = Reconfig Thread:1092696416 file = cmipc.c, line = 851 {Tue Feb 13 00:56:19 2007 }
    TRACE: DisconnectNode(): waiting for 0 listeners to terminate, tid = Reconfig Thread:1092696416 file = cmipc.c, line = 874 {Tue Feb 13 00:56:19 2007 }
    TRACE: UpdateEventValue: *(401be778) = (0, 1), tid = Reconfig Thread:1092696416 file = unixinc.c, line = 336 {Tue Feb 13 00:56:19 2007 }
    TRACE: CleanupNodeContexts(): successful cleanup of nodes rcfg(2), tid = Reconfig Thread:1092696416 file = cmnodemon.c, line = 690 {Tue Feb 13 00:56:19 2007 }
    TRACE: EstablishMasterNode(): MASTER is node(0) reconfigs(2), tid = Reconfig Thread:1092696416 file = cmnodemon.c, line = 832 {Tue Feb 13 00:56:19 2007 }
    TRACE: IncrementEventValue: *(401b97c0) = (1, 1), tid = Reconfig Thread:1092696416 file = unixinc.c, line = 365 {Tue Feb 13 00:56:19 2007 }
    TRACE: PrepareForConnectsX: still waiting at (0), tid = PrepareForConnectsX:1094797664 file = cmipc.c, line = 279 {Tue Feb 13 00:56:19 2007 }
    TRACE: IncrementEventValue: *(401b97c0) = (2, 2), tid = PrepareForConnectsX:1094797664 file = unixinc.c, line = 365 {Tue Feb 13 00:56:19 2007 }
    --- End Dump ---

    The query_module() failure above is what you typically see when oracm runs on a 2.6-based kernel. Set LD_ASSUME_KERNEL (which makes the loader pick the older LinuxThreads glibc that oracm expects) before starting the cluster manager, clear the stale log and timestamp files, and restart it:
    export LD_ASSUME_KERNEL=2.4.19
    export ORACLE_HOME=/oracle/app/oracle/product/9.2.0
    rm -f /oracle/app/oracle/product/9.2.0/oracm/log/cm.log
    rm -f /oracle/app/oracle/product/9.2.0/oracm/log/ocmstart.ts
    $ORACLE_HOME/oracm/bin/ocmstart.sh
    tail -f /oracle/app/oracle/product/9.2.0/oracm/log/cm.log
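
    As a quick sanity check before and after the restart (a rough sketch; the cm.log path assumes the same ORACLE_HOME as above), you can confirm the kernel release, make sure the hangcheck-timer module is actually loaded, and look for the failure signature in the log:
    # query_module(2) no longer exists on 2.6 kernels, which is one reason
    # oracm's probe for the hangcheck-timer module can fail; check the kernel first
    uname -r
    # hangcheck-timer must be loaded for oracm
    /sbin/lsmod | grep hangcheck
    # the failure signature from the dump above
    grep 'query_module() failed' /oracle/app/oracle/product/9.2.0/oracm/log/cm.log
    If the grep comes back empty after the restart, the workaround took effect.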

  • HPUX RAC ASM database crashed after pulling all FC cables on one node.

    On the node whose FC cables were pulled, syslog shows "rebooting host for integrity". On the surviving node, the alert log shows:
    Wed Mar 31 13:42:47 2010
    Errors in file /opt/oracle/admin/asmdb/bdump/asmdb2_p000_7175.trc:
    ORA-00600: internal error code, arguments: [kclchkrcv_2], [0], [6796576], [], []
    Wed Mar 31 13:42:49 2010
    Errors in file /opt/oracle/admin/asmdb/bdump/asmdb2_p000_7175.trc:
    ORA-01578: ORACLE data block corrupted (file # 2, block # 89)
    ORA-01110: data file 2: '+MYDG1/asmdb/datafile/undotbs1.262.714408035'
    ORA-10564: tablespace UNDOTBS1
    ORA-01110: data file 2: '+MYDG1/asmdb/datafile/undotbs1.262.714408035'
    ORA-10560: block type 'KTU SMU HEADER BLOCK'
    ORA-00600: internal error code, arguments: [kclchkrcv_2], [0], [6796576], [], []
    Wed Mar 31 13:42:49 2010
    Errors in file /opt/oracle/admin/asmdb/bdump/asmdb2_dbw0_5108.trc:
    ORA-00600: internal error code, arguments: [kclrdone_4], [0], [6796576], [], [],
    Wed Mar 31 13:42:51 2010
    Errors in file /opt/oracle/admin/asmdb/bdump/asmdb2_dbw0_5108.trc:
    ORA-00600: internal error code, arguments: [kclrdone_4], [0], [6796576], [], [],
    Wed Mar 31 13:42:51 2010
    DBW0: terminating instance due to error 471
    Wed Mar 31 13:42:51 2010
    Errors in file /opt/oracle/admin/asmdb/bdump/asmdb2_lms0_5097.trc:
    ORA-00471: DBWR process terminated with error
    Wed Mar 31 13:42:51 2010
    Errors in file /opt/oracle/admin/asmdb/bdump/asmdb2_lms1_5099.trc:
    ORA-00471: DBWR process terminated with error
    Wed Mar 31 13:42:51 2010
    Errors in file /opt/oracle/admin/asmdb/bdump/asmdb2_lmon_5093.trc:
    ORA-00471: DBWR process terminated with error
    Instance terminated by DBW0, pid = 5108
    Wed Mar 31 13:43:15 2010

    The loss of one node from the cluster should not impact another node. Pulling the FC cables on one node should have no bearing, and no impact at all, on the other nodes in that RAC.
    Unless that action somehow sends the FC gateway/switch pear-shaped, and it starts pushing gunk, rather than the actual data, back from the storage system.
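
    If you suspect the switch really did feed garbage to the surviving instance, one way to verify once the database is back up (a sketch, reusing the datafile number from the ORA-01578 above; run as the oracle user against a surviving instance) is to have RMAN logically validate the file and then check what it flagged:
    # validate the undo datafile reported corrupt (file # 2); BACKUP VALIDATE
    # only reads and checks the blocks, it does not write a backup
    echo "BACKUP VALIDATE CHECK LOGICAL DATAFILE 2;" | rman target /
    # any blocks RMAN flagged are listed in this view
    echo "SELECT * FROM v\$database_block_corruption;" | sqlplus -s "/ as sysdba"
    Either way, the ORA-00600 errors are internal errors and worth raising with Oracle Support along with the trace files named in the alert log.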
