Is RAC node reconfiguration required when a disk array fails on one node?

Hi,
We recently had all the filesystems on node 1 of our RAC cluster turn read-only. Further investigation showed that this was due to a disk array failure on node 1. The database instance on node 2 is up and running fine. The OS team is rebuilding node 1 from scratch and will restore the Oracle installation from backup.
My questions, once all files are restored:
Do we need to add the node to the RAC configuration again?
Do we need to relink the Oracle binaries?
Can the node be brought up directly once the Oracle installation is restored properly, or will the Oracle team need to perform additional steps to bring the node back into the RAC configuration?
Thanks,
Sachin K

Hi,
If the restore fails in some way, we will need to first remove node 1 from the cluster and then add it back, right? Please confirm the steps below.
In that situation, these are the steps we plan to follow:
Version: 10.2.0.5
Affected node: prd_node1
Affected instance: PRDB1
Surviving node: prd_node2
Surviving instance: PRDB2
DB listener on prd_node1: LISTENER_PRD01
ASM listener on prd_node1: LISTENER_PRDASM01
DB listener on prd_node2: LISTENER_PRD02
ASM listener on prd_node2: LISTENER_PRDASM02
Log in to the surviving node, in our case prd_node2.
Step 1 - Remove ONS information:
Execute the following command as root to find out the remote port number to use:
$ cat $CRS_HOME/opmn/conf/ons.config
Then remove the information pertaining to the node being deleted:
# $CRS_HOME/bin/racgons remove_config prd_node1:6200
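The port lookup in Step 1 can be scripted. This is only a minimal sketch: it assumes ons.config contains a line of the form remoteport=<number>, and the function name ons_remote_port is made up for illustration. Check the file on your own system before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: print the remoteport value found in an ons.config file.
# Assumption: the file holds a line such as "remoteport=6200".
ons_remote_port() {
    sed -n 's/^[[:space:]]*remoteport[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Possible usage on the surviving node (prints the command instead of running it):
# port=$(ons_remote_port "$CRS_HOME/opmn/conf/ons.config")
# echo "$CRS_HOME/bin/racgons remove_config prd_node1:$port"
```

Printing the racgons command first, rather than executing it, leaves room to confirm the port before anything is changed.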
Step 2 - Remove resources:
In this step, the resources that were defined on this node have to be removed. These resources include (a) database, (b) instance, and (c) ASM. A list of them can be obtained by running crs_stat -t from any node.
The srvctl remove listener command used below is only applicable in 10.2.0.4 and higher releases, including 11.1.0.6. The command reports an error if the clusterware version is lower than 10.2.0.4; in that case, use netca to remove the listener.
srvctl remove listener -n prd_node1 -l LISTENER_PRD01
srvctl remove listener -n prd_node1 -l LISTENER_PRDASM01
srvctl remove instance -d PRDB -i PRDB1
srvctl remove asm -n prd_node1 -i +ASM1
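The release condition in Step 2 (srvctl remove listener only from 10.2.0.4 onwards, otherwise netca) can be checked with a small version comparison. This is a sketch under assumptions: version_ge is a hypothetical helper, and the version string is typed in by hand from this thread rather than queried from the cluster.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if dotted version $1 is >= dotted version $2.
# Fields are compared numerically from left to right; missing fields count as 0.
version_ge() {
    a=$1 b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}; y=${b%%.*}
        [ "$a" = "$x" ] && a= || a=${a#*.}
        [ "$b" = "$y" ] && b= || b=${b#*.}
        x=${x:-0}; y=${y:-0}
        [ "$x" -gt "$y" ] && return 0
        [ "$x" -lt "$y" ] && return 1
    done
    return 0
}

clusterware_version=10.2.0.5   # the version stated in this thread
if version_ge "$clusterware_version" 10.2.0.4; then
    echo "use srvctl remove listener"
else
    echo "use netca"
fi
```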
Step 3 - Execute rootdeletenode.sh:
From a node that you are not deleting, execute the following command as root to find the node number of the node that you want to delete:
# $CRS_HOME/bin/olsnodes -n
This number is passed to rootdeletenode.sh, which is executed as root from any node that is going to remain in the cluster:
# $CRS_HOME/install/rootdeletenode.sh prd_node1,1
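To avoid typing the node number by hand, the name,number argument for Step 3 can be built from the olsnodes output. This is a sketch, assuming olsnodes -n prints one node name and its number per line separated by whitespace; node_arg is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read "olsnodes -n" output on stdin and print
# "name,number" for the node given as $1. Exits non-zero if the node is absent.
node_arg() {
    awk -v node="$1" '$1 == node { print $1 "," $2; found = 1 } END { exit !found }'
}

# Possible usage as root on the surviving node (prints the command only):
# arg=$("$CRS_HOME/bin/olsnodes" -n | node_arg prd_node1) &&
#     echo "$CRS_HOME/install/rootdeletenode.sh $arg"
```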
Step 5 - Update the inventory:
From a node that is going to remain in the cluster, run the following command as the owner of CRS_HOME. The argument passed to CLUSTER_NODES is a comma-separated list of the node names that are going to remain in the cluster. This step needs to be performed once per home (Clusterware, ASM, and RDBMS homes).
## Example of running runInstaller to update inventory in Clusterware home
$CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORA_CRS_HOME "CLUSTER_NODES=prd_node2" CRS=TRUE
## Optionally enclose the host names with {}
## Example of running runInstaller to update inventory in ASM home
$CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ASM_HOME "CLUSTER_NODES=prd_node2"
## Optionally enclose the host names with {}
## Example of running runInstaller to update inventory in RDBMS home
$CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES=prd_node2"
## Optionally enclose the host names with {}
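With more than two nodes, the CLUSTER_NODES value can be derived instead of typed. This is a sketch only: remaining_nodes is a hypothetical helper that takes the node to drop followed by the full node list and prints the comma-separated list of the nodes that stay.

```shell
#!/bin/sh
# Hypothetical helper: $1 is the node being removed, the remaining arguments
# are all cluster nodes. Prints the nodes that stay, joined with commas.
remaining_nodes() {
    drop=$1; shift
    out=
    for n in "$@"; do
        [ "$n" = "$drop" ] && continue
        out=${out:+$out,}$n
    done
    printf '%s\n' "$out"
}

# For the two-node cluster in this thread:
nodes=$(remaining_nodes prd_node1 prd_node1 prd_node2)
echo "CLUSTER_NODES=$nodes"
```

The resulting value is what goes inside the quoted "CLUSTER_NODES=..." argument of each runInstaller call above.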
We also need the steps to add the node back into the cluster. Can anyone please help us with this?
Thanks,
Sachin K

Similar Messages

  • SC 3.2 nodes reboot when I reboot the first one

    I have created a cluster with two nodes and a quorum device (shared file system) between the two nodes, but when I try to reboot one node, the second one reboots as well. I have Solaris 10 and Sun Cluster 3.2. The error on the console is:
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000031c498ca848 (ssd25):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000038e49941b19 (ssd26):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000039449941c7d (ssd27):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000039149941bd1 (ssd29):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000039749941cab (ssd30):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005ab794000005ff4a564a48 (ssd47):
    offline or reservation conflict
    Update_drv failed to re-read did.conf file for did driver. Will retry once again.
    Update_drv failed to re-read did.conf file for did driver after 1 retry. Will try devfsadm.
    Devfsadm successfully configured did devices.
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000031c498ca848 (ssd25):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000038e49941b19 (ssd26):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000039449941c7d (ssd27):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000039149941bd1 (ssd29):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005a82cf0000039749941cab (ssd30):
    offline or reservation conflict
    WARNING: /scsi_vhci/ssd@g600a0b80005ab794000005ff4a564a48 (ssd47):
    offline or reservation conflict
    Update_drv failed to re-read did.conf file for did driver. Will retry once again.
    Update_drv failed to re-read did.conf file for did driver after 1 retry. Will try devfsadm.
    Devfsadm successfully configured did devices.
    Mohyi

    A couple more questions:
    - Does clq status show that the quorum vote is counted correctly?
    - What kind of storage are you using?
    - Are these newly created LUNs, or is it possible that they have been used before by other hosts or clusters?
    - Are there any interesting error messages in the log files (/var/adm/messages)?
    - What is the panic string of the other node that reboots?
    I do not think that the DID-related message is relevant in this context.

  • Root.sh failed on one node - CLSMON and UDLM

    Hi experts.
    My environment is:
    2-node SunCluster Update3
    Oracle RAC 10.2.0.1 (planning to upgrade to 10.2.0.4)
    The problem: I installed the CRS services on both nodes without issues.
    After that, running root.sh fails on one node:
    /u01/app/product/10/CRS/root.sh
    WARNING: directory '/u01/app/product/10' is not owned by root
    WARNING: directory '/u01/app/product' is not owned by root
    WARNING: directory '/u01/app' is not owned by root
    WARNING: directory '/u01' is not owned by root
    Checking to see if Oracle CRS stack is already configured
    Checking to see if any 9i GSD is up
    Setting the permissions on OCR backup directory
    Setting up NS directories
    Oracle Cluster Registry configuration upgraded successfully
    WARNING: directory '/u01/app/product/10' is not owned by root
    WARNING: directory '/u01/app/product' is not owned by root
    WARNING: directory '/u01/app' is not owned by root
    WARNING: directory '/u01' is not owned by root
    clscfg: EXISTING configuration version 3 detected.
    clscfg: version 3 is 10G Release 2.
    Successfully accumulated necessary OCR keys.
    Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
    node <nodenumber>: <nodename> <private interconnect name> <hostname>
    node 0: spodhcsvr10 clusternode1-priv spodhcsvr10
    node 1: spodhcsvr12 clusternode2-priv spodhcsvr12
    clscfg: Arguments check out successfully.
    NO KEYS WERE WRITTEN. Supply -force parameter to override.
    -force is destructive and will destroy any previous cluster
    configuration.
    Oracle Cluster Registry for cluster has already been initialized
    Sep 22 13:34:17 spodhcsvr10 root: Oracle Cluster Ready Services starting by user request.
    Startup will be queued to init within 30 seconds.
    Sep 22 13:34:20 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Adding daemons to inittab
    Expecting the CRS daemons to be up within 600 seconds.
    Sep 22 13:34:34 spodhcsvr10 last message repeated 3 times
    Sep 22 13:34:34 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:34:40 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:35:43 spodhcsvr10 last message repeated 9 times
    Sep 22 13:36:07 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:36:07 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:36:14 spodhcsvr10 su: libsldap: Status: 85 Mesg: openConnection: simple bind failed - Timed out
    Sep 22 13:36:19 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:37:35 spodhcsvr10 last message repeated 11 times
    Sep 22 13:37:40 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:37:40 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:37:42 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:38:03 spodhcsvr10 last message repeated 3 times
    Sep 22 13:38:10 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:39:12 spodhcsvr10 last message repeated 9 times
    Sep 22 13:39:13 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:39:13 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:39:19 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:40:42 spodhcsvr10 last message repeated 12 times
    Sep 22 13:40:46 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:40:46 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:40:49 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:42:05 spodhcsvr10 last message repeated 11 times
    Sep 22 13:42:11 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:42:12 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:42:19 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:42:19 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:42:19 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Sep 22 13:43:49 spodhcsvr10 last message repeated 13 times
    Sep 22 13:43:51 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 22 13:43:51 spodhcsvr10 root: Running CRSD with TZ = Brazil/East
    Sep 22 13:43:56 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 10. Respawning
    Failure at final check of Oracle CRS stack.
    I went through ocssd.log and found the following:
    [    CSSD]2010-09-22 14:04:14.739 [6] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/vx/rdsk/racdg/ora_vote1)
    [    CSSD]2010-09-22 14:04:14.742 [6] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2478) LATS(0) Disk lastSeqNo(2478)
    [    CSSD]2010-09-22 14:04:14.742 [7] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (1//dev/vx/rdsk/racdg/ora_vote2)
    [    CSSD]2010-09-22 14:04:14.744 [7] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2478) LATS(0) Disk lastSeqNo(2478)
    [    CSSD]2010-09-22 14:04:14.745 [8] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (2//dev/vx/rdsk/racdg/ora_vote3)
    [    CSSD]2010-09-22 14:04:14.746 [8] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2478) LATS(0) Disk lastSeqNo(2478)
    [    CSSD]2010-09-22 14:04:14.785 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:14.785 [10] >TRACE: clssnmFatalThread: spawned
    [    CSSD]2010-09-22 14:04:14.785 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:14.786 [11] >TRACE: clssnmconnect: connecting to node 0, flags 0x0001, connector 1
    [    CSSD]2010-09-22 14:04:23.075 >USER: Oracle Database 10g CSS Release 10.2.0.1.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
    [    CSSD]2010-09-22 14:04:23.075 >USER: CSS daemon log for node spodhcsvr10, number 0, in cluster NET_RAC
    [  clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=spodhcsvr10DBG_CSSD))
    [    CSSD]2010-09-22 14:04:23.082 [1] >TRACE: clssscmain: local-only set to false
    [    CSSD]2010-09-22 14:04:23.096 [1] >TRACE: clssnmReadNodeInfo: added node 0 (spodhcsvr10) to cluster
    [    CSSD]2010-09-22 14:04:23.106 [1] >TRACE: clssnmReadNodeInfo: added node 1 (spodhcsvr12) to cluster
    [    CSSD]2010-09-22 14:04:23.129 [5] >TRACE: [0]Node monitor: dlm attach failed error LK_STAT_NOTCREATED
    [    CSSD]CLSS-0001: skgxn not active
    [    CSSD]2010-09-22 14:04:23.129 [5] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 30
    [    CSSD]2010-09-22 14:04:23.132 [1] >TRACE: clssnmInitNMInfo: misscount set to 600
    [    CSSD]2010-09-22 14:04:23.136 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/vx/rdsk/racdg/ora_vote1)
    [    CSSD]2010-09-22 14:04:23.139 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (1//dev/vx/rdsk/racdg/ora_vote2)
    [    CSSD]2010-09-22 14:04:23.143 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (2//dev/vx/rdsk/racdg/ora_vote3)
    [    CSSD]2010-09-22 14:04:25.139 [6] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/vx/rdsk/racdg/ora_vote1)
    [    CSSD]2010-09-22 14:04:25.142 [6] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2488) LATS(0) Disk lastSeqNo(2488)
    [    CSSD]2010-09-22 14:04:25.143 [7] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (1//dev/vx/rdsk/racdg/ora_vote2)
    [    CSSD]2010-09-22 14:04:25.144 [7] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2488) LATS(0) Disk lastSeqNo(2488)
    [    CSSD]2010-09-22 14:04:25.145 [8] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (2//dev/vx/rdsk/racdg/ora_vote3)
    [    CSSD]2010-09-22 14:04:25.148 [8] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2489) LATS(0) Disk lastSeqNo(2489)
    [    CSSD]2010-09-22 14:04:25.186 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:25.186 [10] >TRACE: clssnmFatalThread: spawned
    [    CSSD]2010-09-22 14:04:25.186 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:25.187 [11] >TRACE: clssnmconnect: connecting to node 0, flags 0x0001, connector 1
    [    CSSD]2010-09-22 14:04:33.449 >USER: Oracle Database 10g CSS Release 10.2.0.1.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
    [    CSSD]2010-09-22 14:04:33.449 >USER: CSS daemon log for node spodhcsvr10, number 0, in cluster NET_RAC
    [  clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=spodhcsvr10DBG_CSSD))
    [    CSSD]2010-09-22 14:04:33.457 [1] >TRACE: clssscmain: local-only set to false
    [    CSSD]2010-09-22 14:04:33.470 [1] >TRACE: clssnmReadNodeInfo: added node 0 (spodhcsvr10) to cluster
    [    CSSD]2010-09-22 14:04:33.480 [1] >TRACE: clssnmReadNodeInfo: added node 1 (spodhcsvr12) to cluster
    [    CSSD]2010-09-22 14:04:33.498 [5] >TRACE: [0]Node monitor: dlm attach failed error LK_STAT_NOTCREATED
    [    CSSD]CLSS-0001: skgxn not active
    [    CSSD]2010-09-22 14:04:33.498 [5] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 30
    [    CSSD]2010-09-22 14:04:33.500 [1] >TRACE: clssnmInitNMInfo: misscount set to 600
    [    CSSD]2010-09-22 14:04:33.505 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/vx/rdsk/racdg/ora_vote1)
    [    CSSD]2010-09-22 14:04:33.508 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (1//dev/vx/rdsk/racdg/ora_vote2)
    [    CSSD]2010-09-22 14:04:33.510 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (2//dev/vx/rdsk/racdg/ora_vote3)
    [    CSSD]2010-09-22 14:04:35.508 [6] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/vx/rdsk/racdg/ora_vote1)
    [    CSSD]2010-09-22 14:04:35.510 [6] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2499) LATS(0) Disk lastSeqNo(2499)
    [    CSSD]2010-09-22 14:04:35.510 [7] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (1//dev/vx/rdsk/racdg/ora_vote2)
    [    CSSD]2010-09-22 14:04:35.512 [7] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2499) LATS(0) Disk lastSeqNo(2499)
    [    CSSD]2010-09-22 14:04:35.513 [8] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (2//dev/vx/rdsk/racdg/ora_vote3)
    [    CSSD]2010-09-22 14:04:35.514 [8] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(2499) LATS(0) Disk lastSeqNo(2499)
    [    CSSD]2010-09-22 14:04:35.553 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:35.553 [10] >TRACE: clssnmFatalThread: spawned
    [    CSSD]2010-09-22 14:04:35.553 [1] >TRACE: clssscSclsFatal: read value of disable
    [    CSSD]2010-09-22 14:04:35.553 [11] >TRACE: clssnmconnect: connecting to node 0, flags 0x0001, connector 1
    I believe the main error is:
    [    CSSD]2010-09-22 14:04:33.498 [5] >TRACE: [0]Node monitor: dlm attach failed error LK_STAT_NOTCREATED
    [    CSSD]CLSS-0001: skgxn not active
    The problem seems to involve the communication between UDLM and CLSMON, but I don't know how to resolve it.
    My UDLM version is 3.3.4.9.
    Does anybody have any ideas about this?
    Thanks!

    I have now finally installed CRS and run root.sh without errors (I think the problem was some old file left over from previous installation attempts).
    But now I have another problem: when installing the DB software, at the step that copies the installation to the remote node, that node hits a failure in the CLSMON/CSSD daemon and panics:
    Sep 23 16:10:51 spodhcsvr10 root: Oracle CLSMON terminated with unexpected status 138. Respawning
    Sep 23 16:10:52 spodhcsvr10 root: Oracle CSSD failure. Rebooting for cluster integrity.
    Sep 23 16:10:52 spodhcsvr10 root: [ID 702911 user.alert] Oracle CSSD failure. Rebooting for cluster integrity.
    Sep 23 16:10:51 spodhcsvr10 root: [ID 702911 user.error] Oracle CLSMON terminated with unexpected status 138. Respawning
    Sep 23 16:10:52 spodhcsvr10 root: [ID 702911 user.alert] Oracle CSSD failure. Rebooting for cluster integrity.
    Sep 23 16:10:56 spodhcsvr10 Cluster.OPS.UCMMD: fatal: received signal 15
    Sep 23 16:10:56 spodhcsvr10 Cluster.OPS.UCMMD: [ID 770355 daemon.error] fatal: received signal 15
    Sep 23 16:10:59 spodhcsvr10 root: Oracle Cluster Ready Services waiting for SunCluster and UDLM to start.
    Sep 23 16:10:59 spodhcsvr10 root: Cluster Ready Services completed waiting on dependencies.
    Sep 23 16:10:59 spodhcsvr10 root: [ID 702911 user.error] Oracle Cluster Ready Services waiting for SunCluster and UDLM to start.
    Sep 23 16:10:59 spodhcsvr10 root: [ID 702911 user.error] Cluster Ready Services completed waiting on dependencies.
    Notifying cluster that this node is panicking
    The installation on the first node continues and reports an error in the copy to the second node.
    Any ideas? Thanks!

  • ASM install fails on one node

    I have been trying to install 10g RAC on a two-node virtual cluster. I installed Clusterware and it was successful. Before I started the ASM install:
    [oracle@rac1 bin]$ ./crs_stat -t
    Name Type Target State Host
    ora.rac1.gsd application ONLINE ONLINE rac1
    ora.rac1.ons application ONLINE ONLINE rac1
    ora.rac1.vip application ONLINE ONLINE rac1
    ora.rac2.gsd application ONLINE ONLINE rac2
    ora.rac2.ons application ONLINE ONLINE rac2
    ora.rac2.vip application ONLINE ONLINE rac2
    [oracle@rac1 logs]$ /u01/crs/oracle/product/10.2.0/crs/bin/crsctl check crs
    CSS appears healthy
    CRS appears healthy
    EVM appears healthy
    [oracle@rac1 logs]$ ps -ef|grep d.bin
    root 3795 1 0 13:57 ? 00:00:35 /u01/crs/oracle/product/10.2.0/crs/bin/crsd.bin reboot
    oracle 4966 3793 0 13:59 ? 00:00:06 /u01/crs/oracle/product/10.2.0/crs/bin/evmd.bin
    oracle 5082 5059 0 13:59 ? 00:01:06 /u01/crs/oracle/product/10.2.0/crs/bin/ocssd.bin
    oracle 30520 4813 0 16:23 pts/3 00:00:00 grep d.bin
    During the ASM install, here is what I got:
    WARNING: Error while copying directory /u01/app/oracle/product/10.2.0/db_1 with exclude file list 'null' to nodes 'rac2'. [PRKC-1073 : Failed to transfer directory "/u01/app/oracle/product/10.2.0/db_1" to any of the given nodes "rac2 ".
    Error on node rac2:Read from remote host rac2: Connection reset by peer]
    Refer to '/u01/app/oracle/oraInventory/logs/installActions2009-04-18_01-30-26PM.log' for details. You may fix the errors on the required remote nodes. Refer to the install guide for error recovery. Click 'Yes' if you want to proceed. Click 'No' to exit the install. Do you want to continue?
    INFO: User Selected: Yes/OK
    It appears to me as though the installer was not able to copy over the "/u01/app/oracle/product/10.2.0/db_1" directory to the rac2 node. I do not see any reason for that, I have setup ssh user equivalence for both oracle and root users, ssh and scp seem to work both ways. Permissions should not be an issue on one node and not the other as I replicated the permissions.
    I continued the installation, and ASM is working fine on the rac1 node but not on the second node. I tried using dbca to set up ASM on the second node, and it errors out with a "CRS-0223 resource placement error". Here is what I did next:
    [oracle@rac1 bin]$ ./srvctl status asm -n rac1
    ASM instance +ASM1 is running on node rac1.
    [oracle@rac1 bin]$ ./srvctl status asm -n rac2
    ASM instance +ASM2 is not running on node rac2.
    [oracle@rac1 bin]$ ./crs_sta
    crs_start crs_start.bin crs_stat crs_stat.bin
    [oracle@rac1 bin]$ ./crs_stat -t
    Name Type Target State Host
    ora....SM1.asm application ONLINE ONLINE rac1
    ora....C1.lsnr application ONLINE ONLINE rac1
    ora.rac1.gsd application ONLINE ONLINE rac1
    ora.rac1.ons application ONLINE ONLINE rac1
    ora.rac1.vip application ONLINE ONLINE rac1
    ora....SM2.asm application ONLINE UNKNOWN rac2
    ora....C2.lsnr application ONLINE UNKNOWN rac2
    ora.rac2.gsd application ONLINE ONLINE rac2
    ora.rac2.ons application ONLINE ONLINE rac2
    ora.rac2.vip application ONLINE ONLINE rac2
    [oracle@rac1 bin]$ ./crs_start ora.rac2.ASM2.asm
    CRS-1028: Dependency analysis failed because of:
    'Resource in UNKNOWN state: ora.rac2.ASM2.asm'
    CRS-0223: Resource 'ora.rac2.ASM2.asm' has placement error.
    I would like to get the ASM instance extended to the second node (rac2) and, of course, continue with the database instance creation. How can I accomplish this?
    Thanks!
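
    Resources in a bad state, such as the two UNKNOWN entries above, can be picked out of crs_stat -t output with a short filter. This is a sketch, assuming the five-column layout shown in this post (name, type, target, state, host); not_online is a hypothetical helper name.

    ```shell
    #!/bin/sh
    # Hypothetical helper: read "crs_stat -t" output on stdin and print the name
    # and state of every resource whose State column is not ONLINE.
    # Resource rows are recognised by a Target column of ONLINE or OFFLINE,
    # which skips the header line.
    not_online() {
        awk '($3 == "ONLINE" || $3 == "OFFLINE") && $4 != "ONLINE" { print $1, $4 }'
    }

    # Possible usage: ./crs_stat -t | not_online
    ```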

    Hi orafun,
    this message:
    WARNING: Error while copying directory /u01/app/oracle/product/10.2.0/db_1 with exclude file list 'null' to nodes 'rac2'. [PRKC-1073 : Failed to transfer directory "/u01/app/oracle/product/10.2.0/db_1" to any of the given nodes "rac2 ".
    Error on node rac2:Read from remote host rac2: Connection reset by peer]
    Refer to '/u01/app/oracle/oraInventory/logs/installActions2009-04-18_01-30-26PM.log' for details. You may fix the errors on the required remote nodes. Refer to the install guide for error recovery. Click 'Yes' if you want to proceed. Click 'No' to exit the install. Do you want to continue?
    INFO: User Selected: Yes/OK
    tells you that the Oracle home could not be copied onto the remote node. The logs mentioned might tell you more, but this is the reason why ASM cannot be started on the other node: there is no software there that could be used to start an ASM instance. Now you said:
    "It appears to me as though the installer was not able to copy over the "/u01/app/oracle/product/10.2.0/db_1" directory to the rac2 node. I do not see any reason for that, I have setup ssh user equivalence for both oracle and root users, ssh and scp seem to work both ways. Permissions should not be an issue on one node and not the other as I replicated the permissions."
    My question would be: what are you trying to achieve? If your only interest is to "get it done and over with", you can tar up the Oracle Database home from which you want to run ASM and untar it on the remote node. Provided the paths are all correct, the registration has already taken place, so you can try starting the ASM instance on node 2. If you want to know the reason for the issue, further investigation and more information would be required.
    Hope that helps. Thanks,
    Markus
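
    The tar approach described above can be sketched as a tar pipe. The function below copies a directory tree locally so it can be tried safely; clone_dir is a hypothetical name, and the remote variant in the trailing comment assumes ssh user equivalence for the oracle user and identical paths on both nodes.

    ```shell
    #!/bin/sh
    # Hypothetical helper: copy directory $1 (with its contents and permissions)
    # into the existing parent directory $2 by streaming a tar archive.
    clone_dir() {
        src=$1 dest_parent=$2
        tar -C "$(dirname "$src")" -cf - "$(basename "$src")" | tar -C "$dest_parent" -xpf -
    }

    # Remote variant (assumptions: ssh equivalence works, same path on rac2):
    # tar -C /u01/app/oracle/product/10.2.0 -cf - db_1 |
    #     ssh rac2 'tar -C /u01/app/oracle/product/10.2.0 -xpf -'
    ```

    Streaming through a pipe avoids needing space for an intermediate archive file on either node.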

  • How to configure hw disk array for ORACLE VM SERVER 2.1.1?

    ORACLE VM SERVER 2.1.1 ON A SATA DISK ARRAY (R0). AFTER INSTALLATION, ORACLE VM SERVER DOESN'T LOAD.

    No reason to shout, buddy... Turn CAPS LOCK off, please.
    Some more information might come in handy also. Like what kind of hardware you're using.
    Otherwise it's going to be a shot in the dark.
    In the meantime: try some GRUB-parameters to boot from the right device. That's all I can think of right now.

  • What to do when disk utilities fail?

    Hi. I have an iBook that has been switching back and forth between OS X 10.3.5 and OS 9.2.2 pretty smoothly (recently replaced the DC-in board).
    One of the OS 9 programs (Pro Tools) is crashing and giving me an error message saying I need to defragment. Running Norton Speed Disk optimizer in OS 9 normally takes care of this problem. But now, at around the 3/4 complete mark, I get an error message saying the disk is damaged and needs repair. When I run Norton Disk Doctor, it eventually returns an error message saying it can't repair the disk -- too bad.
    I have already run the OS 9 Apple Disk Utility and, on the OS X side, repaired the disk permissions. What now?
    Howie B.

    Okay -- so DiskWarrior ties up your computer for 2 weeks while it does its thing?
    If your hard drive directory is disastrously scrambled, DW can take days to sort things out, but in 95%-99% of all cases it takes only a few minutes. Norton Disk Doctor is perfectly capable of scrambling it disastrously, but if your Mac still starts up and runs, things probably aren't that bad yet.
    A damaged directory makes it likely that there will be some loss of data if you simply copy the entire drive to a backup disk now. The bad directory means the OS can't properly locate every file that it knows is supposed to be on your hard drive, so some of them aren't going to be copied properly. It might be just one insignificant file, or dozens or hundreds of important ones, that are damaged in part or lost altogether in backing up. The ideal time to back up your drive is when you know it's all in good order.
    Erasing your hard drive will also erase the bad directory, so it's an alternative solution to the problem if your data is already backed up. Copying the contents of a backup that you make now, with a damaged directory, back onto your internal drive after erasing it won't fix any files that have already been corrupted, but it will at least copy everything — damaged or not — back into a new, undamaged directory.

  • Oracle 10gR2 RAC EE 10.2.0.5 problem with one node.

    Hi, I have Oracle RAC 10gR2 10.2.0.5 EE on SLES 10 on IBM PPC. My problem is that CRS does not show the database as started on one instance. However, the database is started when I use srvctl or sqlplus.
    crsctl query crs softwareversion
    CRS active version on the cluster is [10.2.0.5.0]
    crsctl query crs activeversion
    CRS active version on the cluster is [10.2.0.5.0]
    oracle@agripa:/u01/oracle/app/product/102_64/crs/bin> ./srvctl config database
    ELIO3
    oracle@agripa:/u01/oracle/app/product/102_64/crs/bin> ./srvctl status database -d ELIO3
    Instance ELIO32 is running on node julia
    Instance ELIO31 is not running on node agripa
    agripa:/u01/oracle/app/product/102_64/crs/bin # ./crs_stat -t
    Name           Type           Target    State     Host
    ora....31.inst application    ONLINE    OFFLINE
    ora....32.inst application    ONLINE    ONLINE    julia
    ora.ELIO3.db   application    ONLINE    ONLINE    julia
    ora....PA.lsnr application    ONLINE    OFFLINE
    ora....PA.lsnr application    ONLINE    OFFLINE
    ora.agripa.gsd application    ONLINE    OFFLINE
    ora.agripa.ons application    ONLINE    OFFLINE
    ora.agripa.vip application    ONLINE    ONLINE    agripa
    ora....IA.lsnr application    ONLINE    ONLINE    julia
    ora.julia.gsd  application    ONLINE    ONLINE    julia
    ora.julia.ons  application    ONLINE    ONLINE    julia
    ora.julia.vip  application    ONLINE    ONLINE    julia
    oracle@agripa:/u01/oracle/app/product/102_64/app/bin> ./srvctl start instance -d ELIO3 -i ELIO31
    PRKP-1001 : Error starting instance ELIO31 on node agripa
    This command shows an error; however, the database is up.
    oracle@agripa:/u01/oracle/app/product/102_64/app/bin> sqlplus / as sysdba
    SQL*Plus: Release 10.2.0.5.0 - Production on Wed Sep 26 09:32:25 2012
    Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
    Connected to:
    Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - Production
    With the Partitioning, Real Application Clusters, OLAP, Data Mining
    and Real Application Testing options
    SQL> select * from v$active_instances;
    INST_NUMBER INST_NAME
              2 julia:ELIO32
              4 agripa:ELIO31
    SQL>
    But CRS doesn't show it.
    agripa:/u01/oracle/app/product/102_64/crs/bin # ./crs_stat -t
    Name           Type           Target    State     Host
    ora....31.inst application    ONLINE    OFFLINE
    ora....32.inst application    ONLINE    ONLINE    julia
    ora.ELIO3.db   application    ONLINE    ONLINE    julia
    ora....PA.lsnr application    ONLINE    OFFLINE
    ora....PA.lsnr application    ONLINE    OFFLINE
    ora.agripa.gsd application    ONLINE    OFFLINE
    ora.agripa.ons application    ONLINE    OFFLINE
    ora.agripa.vip application    ONLINE    ONLINE    agripa
    ora....IA.lsnr application    ONLINE    ONLINE    julia
    ora.julia.gsd  application    ONLINE    ONLINE    julia
    ora.julia.ons  application    ONLINE    ONLINE    julia
    ora.julia.vip  application    ONLINE    ONLINE    julia
    Some logs...
    oracle@agripa:/u01/oracle/app/product/102_64/app/log/agripa/racg> cat ora.ELIO3.ELIO31.inst.log
    2012-09-26 09:35:45.108: [ COMMCRS][53473856]clsc_connect: (0x40002a0a6c0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))
    2012-09-26 09:36:15.150: [    RACG][173648] [2346][173648][ora.ELIO3.ELIO31.inst]: end for resource = ora.ELIO3.ELIO31.inst, action = rundetach, status = -1, time = 300.600s
    agripa:/u01/oracle/app/product/102_64/crs/bin # ls -ltrach /var/tmp/.oracle/
    total 512
    srwxrwxrwx 1 oracle oinstall   0 2012-09-24 08:28 sagripaDBG_CSSD
    srwxrwxrwx 1 oracle oinstall   0 2012-09-24 08:28 sOracle_CSS_LclLstnr_crs_3
    srwxrwxrwx 1 oracle oinstall   0 2012-09-24 08:28 sOCSSD_LL_agripa_crs
    srwxrwxrwx 1 oracle oinstall   0 2012-09-24 08:28 sagripaDBG_EVMD
    srwxrwxrwx 1 oracle oinstall   0 2012-09-24 08:28 sCagripa_crs_evm
    srwxrwxrwx 1 oracle oinstall   0 2012-09-24 08:28 sAagripa_crs_evm
    srwxrwxrwx 1 oracle oinstall   0 2012-09-24 09:46 s#3633.2
    srwxrwxrwx 1 oracle oinstall   0 2012-09-24 09:46 s#3633.1
    srwxrwxrwx 1 oracle oinstall   0 2012-09-24 10:00 s#11308.1
    srwxrwxrwx 1 oracle oinstall   0 2012-09-24 10:03 s#11308.2
    drwxrwxrwt 8 root   root     232 2012-09-26 09:24 ..
    srwxrwxrwx 1 oracle oinstall   0 2012-09-26 09:30 sora_racg_ELIO3_agripa
    srwxrwxrwx 1 oracle oinstall   0 2012-09-26 09:35 sSYSTEM.evm.acceptor.auth
    srwxrwxrwx 1 root   root       0 2012-09-26 09:35 sagripaDBG_CRSD
    srwxrwxrwx 1 root   root       0 2012-09-26 09:40 sCRSD_UI_SOCKET
    srwxrwxrwx 1 root   root       0 2012-09-26 09:45 sora_crsqs
    srwxrwxrwx 1 root   root       0 2012-09-26 09:45 sprocr_local_conn_0_PROC
    srwxrwxrwx 1 oracle oinstall   0 2012-09-26 09:45 sOCSSD_LL_agripa_
    drwxrwxrwt 2 root   root     640 2012-09-26 09:45 .
    oracle@agripa:/u01/oracle/app/product/102_64/app/log/agripa/racg> cat imon_ELIO3.log
    2012-09-26 09:30:41.047: [    RACG][173648] [1100][173648][ora.ELIO3.ELIO31.inst]: racgimon started
    2012-09-26 09:31:06.174: [    RACG][70660672] [1100][70660672][ora.ELIO3.ELIO31.inst]:
    SQL*Plus: Release 10.2.0.5.0 - Production on Wed Sep 26 09:30:42 2012
    Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
    Enter user-name: Connected to an idle instance.
    SQL> ORACLE instance started.
    Total System Global Area 8589934592 bytes
    2012-09-26 09:31:06.174: [    RACG][70660672] [1100][70660672][ora.ELIO3.ELIO31.inst]: Fixed Size                   2109992 bytes
    Variable Size            1358958040 bytes
    Database Buffers         7214202880 bytes
    Redo Buffers               14663680 bytes
    Base de datos montada.
    Base de datos abierta.
    2012-09-26 09:31:06.174: [    RACG][70660672] [1100][70660672][ora.ELIO3.ELIO31.inst]: SQL> Desconectado de Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - Production
    With the Partitioning, Real Application Clusters, OLAP, Data Mining
    and Real Application Testing options
    2012-09-26 09:31:06.404: [    RACG][70660672] [1100][70660672][ora.ELIO3.ELIO31.inst]: clsrcexecut: env _USR_ORA_PFILE=/u01/oracle/app/product/102_64/crs/racg/tmp/ora.ELIO3.ELIO31.inst.ora
    2012-09-26 09:31:06.404: [    RACG][70660672] [1100][70660672][ora.ELIO3.ELIO31.inst]: clsrcexecut: cmd = /u01/oracle/app/product/102_64/app/bin/racgeut -e _USR_ORA_DEBUG=0 -e ORACLE_SID=ELIO31 540 /u01/oracle/app/product/102_64/app/bin/racgmdb -q
    2012-09-26 09:31:06.404: [    RACG][70660672] [1100][70660672][ora.ELIO3.ELIO31.inst]: clsrcexecut: rc = 3, time = 0.230s
    2012-09-26 09:31:07.196: [  OCRMSG][70660672]prom_rpc: CLSC send failure..ret code 11
    2012-09-26 09:31:07.196: [  OCRMSG][70660672]prom_rpc: possible OCR retry scenario
    2012-09-26 09:31:07.254: [  OCRAPI][70660672]procr_open: Node Failure. Attempting retry #0
    2012-09-26 09:31:07.528: [  OCRCLI][70660672]oac_reconnect_server: Could not connect to server. clsc ret 9
    2012-09-26 09:31:07.584: [  OCRAPI][70660672]procr_open: Node Failure. Attempting retry #1
    2012-09-26 09:31:07.847: [  OCRCLI][70660672]oac_reconnect_server: Could not connect to server. clsc ret 9
    2012-09-26 09:31:07.904: [  OCRAPI][70660672]procr_open: Node Failure. Attempting retry #2
    2012-09-26 09:31:08.178: [  OCRCLI][70660672]oac_reconnect_server: Could not connect to server. clsc ret 9
    2012-09-26 09:31:08.235: [  OCRAPI][70660672]procr_open: Node Failure. Attempting retry #3
    2012-09-26 09:31:08.488: [  OCRCLI][70660672]oac_reconnect_server: Could not connect to server. clsc ret 9
    2012-09-26 09:31:08.544: [  OCRAPI][70660672]procr_open: Node Failure. Attempting retry #4
    2012-09-26 09:31:08.808: [  OCRCLI][70660672]oac_reconnect_server: Could not connect to server. clsc ret 9
    2012-09-26 09:31:08.865: [  OCRAPI][70660672]procr_open: Node Failure. Attempting retry #5
    2012-09-26 09:31:08.971: [    RACG][70660672] [1100][70660672][ora.ELIO3.ELIO31.inst]: clsrcsetenvvar: set env var 'ORACLE_CONFIG_HOME' to be '/u01/oracle/app/product/102_64/crs/'
    2012-09-26 09:31:09.767: [    RACG][70660672] [1100][70660672][ora.ELIO3.ELIO31.inst]: clsrcqryapi: crs_qstat error
    2012-09-26 09:36:14.709: [  OCRMSG][148804160]prom_rpc: CLSC send failure..ret code 11
    2012-09-26 09:36:14.709: [  OCRMSG][148804160]prom_rpc: possible OCR retry scenario
    2012-09-26 09:36:14.768: [  OCRAPI][148804160]procr_open: Node Failure. Attempting retry #0
    agripa:/u01/oracle/app/product/102_64/crs/bin # ./crs_getperm ora.ELIO3.ELIO32.inst
    Nombre: ora.ELIO3.ELIO32.inst
    owner:oracle:rwx,pgrp:oinstall:rwx,other::r--,
    agripa:/u01/oracle/app/product/102_64/crs/bin # ./crs_getperm ora.ELIO3.ELIO31.inst
    Nombre: ora.ELIO3.ELIO31.inst
    owner:oracle:rwx,pgrp:oinstall:rwx,other::r--,
    I tried to stop the instance with srvctl, but the instance does not stop.
    oracle@agripa:/u01/oracle/app/product/102_64/app/bin> ./srvctl stop instance -d ELIO3 -i ELIO31
    oracle@agripa:/u01/oracle/app/product/102_64/app/bin> sqlplus / as sysdba
    SQL*Plus: Release 10.2.0.5.0 - Production on Mié Sep 26 09:42:24 2012
    Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
    Conectado a:
    Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - Production
    With the Partitioning, Real Application Clusters, OLAP, Data Mining
    and Real Application Testing options
    SQL>
    How can I fix this?
    How can I get srvctl to work properly and have CRS show that the database is started?
    Thank you very much!

    My database on the other node (julia) was started with sqlplus:
    sqlplus / as sysdba
    startup open;
    The crsd.log shows the following:
    2012-09-24 11:36:56.798: [ CRSMAIN][174064]0CRS Daemon Started.
    2012-09-24 11:36:56.798: [ CRSMAIN][363782720]0Starting runCommandServer for (UI = 1, E2E = 0). 0
    2012-09-24 11:36:56.798: [ CRSMAIN][365879872]0Starting runCommandServer for (UI = 1, E2E = 0). 1
    2012-09-26 09:30:40.776: [  CRSRES][380703296]0startRunnable: setting CLI values
    2012-09-26 09:30:40.814: [  CRSRES][380703296]0Attempting to start `ora.ELIO3.ELIO31.inst` on member `agripa`
    2012-09-26 09:30:54.931: [ default][174064][ENTER]0
    Oracle Database 10g CRS Release 10.2.0.5.0 Production Copyright 1996, 2004, Oracle.  All rights reserved
    2012-09-26 09:30:54.931: [ default][174064]0CRS Daemon Starting
    2012-09-26 09:30:54.932: [ CRSMAIN][174064]0Checking the OCR device
    2012-09-26 09:30:54.935: [ CRSMAIN][174064]0Connecting to the CSS Daemon
    2012-09-26 09:30:55.238: [  CLSVER][174064]0Active Version from OCR:10.2.0.5.0
    2012-09-26 09:30:55.238: [  CLSVER][174064]0Active Version and Software Version are same
    2012-09-26 09:30:55.238: [ CRSMAIN][174064]0Initializing OCR
    2012-09-26 09:30:55.254: [  OCRRAW][174064]proprioo: for disk 0 (/u01/oracle/1020_64/voting1/ocr), id match (1), my id set (1616921495,1352295226) total id sets (1), 1st set (1616921495,1352295226), 2nd set (0,0) my votes (1), total votes (2)
    2012-09-26 09:30:55.254: [  OCRRAW][174064]proprioo: for disk 1 (/u01/oracle/1020_64/voting2/ocr), id match (1), my id set (1616921495,1352295226) total id sets (1), 1st set (1616921495,1352295226), 2nd set (0,0) my votes (1), total votes (2)
    2012-09-26 09:30:55.286: [    CRSD][174064]0ENV Logging level for Module: allcomp  0
    2012-09-26 09:30:55.289: [    CRSD][174064]0ENV Logging level for Module: default  0
    2012-09-26 09:30:55.292: [    CRSD][174064]0ENV Logging level for Module: COMMCRS  0
    2012-09-26 09:30:55.294: [    CRSD][174064]0ENV Logging level for Module: COMMNS  0
    2012-09-26 09:30:55.297: [    CRSD][174064]0ENV Logging level for Module: CRSUI  0
    2012-09-26 09:30:55.300: [    CRSD][174064]0ENV Logging level for Module: CRSCOMM  0
    2012-09-26 09:30:55.303: [    CRSD][174064]0ENV Logging level for Module: CRSRTI  0
    2012-09-26 09:30:55.306: [    CRSD][174064]0ENV Logging level for Module: CRSMAIN  0
    2012-09-26 09:30:55.309: [    CRSD][174064]0ENV Logging level for Module: CRSPLACE  0
    2012-09-26 09:30:55.312: [    CRSD][174064]0ENV Logging level for Module: CRSAPP  0
    2012-09-26 09:30:55.315: [    CRSD][174064]0ENV Logging level for Module: CRSRES  0
    2012-09-26 09:30:55.317: [    CRSD][174064]0ENV Logging level for Module: CRSOCR  0
    2012-09-26 09:30:55.320: [    CRSD][174064]0ENV Logging level for Module: CRSTIMER  0
    2012-09-26 09:30:55.323: [    CRSD][174064]0ENV Logging level for Module: CRSEVT  0
    2012-09-26 09:30:55.326: [    CRSD][174064]0ENV Logging level for Module: CRSD  0
    2012-09-26 09:30:55.329: [    CRSD][174064]0ENV Logging level for Module: CLUCLS  0
    2012-09-26 09:30:55.332: [    CRSD][174064]0ENV Logging level for Module: CLSVER  0
    2012-09-26 09:30:55.335: [    CRSD][174064]0ENV Logging level for Module: OCRRAW  0
    2012-09-26 09:30:55.338: [    CRSD][174064]0ENV Logging level for Module: OCROSD  0
    2012-09-26 09:30:55.341: [    CRSD][174064]0ENV Logging level for Module: CSSCLNT  0
    2012-09-26 09:30:55.343: [    CRSD][174064]0ENV Logging level for Module: OCRAPI  0
    2012-09-26 09:30:55.346: [    CRSD][174064]0ENV Logging level for Module: OCRUTL  0
    2012-09-26 09:30:55.349: [    CRSD][174064]0ENV Logging level for Module: OCRMSG  0
    2012-09-26 09:30:55.352: [    CRSD][174064]0ENV Logging level for Module: OCRCLI  0
    2012-09-26 09:30:55.355: [    CRSD][174064]0ENV Logging level for Module: OCRCAC  0
    2012-09-26 09:30:55.358: [    CRSD][174064]0ENV Logging level for Module: OCRSRV  0
    2012-09-26 09:30:55.360: [    CRSD][174064]0ENV Logging level for Module: OCRMAS  0
    2012-09-26 09:30:55.360: [ CRSMAIN][174064]0Filename is /u01/oracle/app/product/102_64/crs/crs/init/agripa.pid
    [  clsdmt][299713088]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=agripaDBG_CRSD))
    2012-09-26 09:30:55.383: [ CRSMAIN][174064]0Using Authorizer location: /u01/oracle/app/product/102_64/crs/crs/auth/
    2012-09-26 09:30:55.401: [ CRSMAIN][174064]0Initializing RTI
    2012-09-26 09:30:55.436: [CRSTIMER][316899904]0Timer Thread Starting.
    2012-09-26 09:30:55.438: [  CRSRES][174064]0Parameter SECURITY = 1, running in USER Mode
    2012-09-26 09:30:55.438: [ CRSMAIN][174064]0Initializing EVMMgr
    2012-09-26 09:30:55.526: [ CRSMAIN][174064]0CRSD locked during state recovery, please wait.
    2012-09-26 09:30:56.003: [  CRSRES][174064]0ora.agripa.vip check shows ONLINE
    2012-09-26 09:30:56.121: [ CRSMAIN][174064]0CRSD recovered, unlocked.
    2012-09-26 09:30:56.122: [ CRSMAIN][174064]0QS socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))
    2012-09-26 09:30:56.122: [ CRSMAIN][174064]0QS socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))
    2012-09-26 09:30:56.130: [ CRSMAIN][174064]0CRSD UI socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=CRSD_UI_SOCKET))
    2012-09-26 09:30:56.133: [ CRSMAIN][174064]0E2E socket on: (ADDRESS=(PROTOCOL=tcp)(HOST=agripa-priv)(PORT=49896))
    2012-09-26 09:30:56.133: [ CRSMAIN][174064]0Starting Threads
    2012-09-26 09:30:56.134: [ CRSMAIN][174064]0CRS Daemon Started.
    2012-09-26 09:30:56.134: [ CRSMAIN][362721856]0Starting runCommandServer for (UI = 1, E2E = 0). 0
    2012-09-26 09:30:56.134: [ CRSMAIN][364819008]0Starting runCommandServer for (UI = 1, E2E = 0). 1
    2012-09-26 09:31:08.582: [ default][174064][ENTER]0
    Oracle Database 10g CRS Release 10.2.0.5.0 Production Copyright 1996, 2004, Oracle.  All rights reserved
    2012-09-26 09:31:08.582: [ default][174064]0CRS Daemon Starting
    2012-09-26 09:31:08.583: [ CRSMAIN][174064]0Checking the OCR device
    2012-09-26 09:31:08.585: [ CRSMAIN][174064]0Connecting to the CSS Daemon
    2012-09-26 09:31:08.885: [  CLSVER][174064]0Active Version from OCR:10.2.0.5.0
    2012-09-26 09:31:08.885: [  CLSVER][174064]0Active Version and Software Version are same
    2012-09-26 09:31:08.885: [ CRSMAIN][174064]0Initializing OCR
    2012-09-26 09:31:08.891: [  OCRRAW][174064]proprioo: for disk 0 (/u01/oracle/1020_64/voting1/ocr), id match (1), my id set (1616921495,1352295226) total id sets (1), 1st set (1616921495,1352295226), 2nd set (0,0) my votes (1), total votes (2)
    2012-09-26 09:31:08.891: [  OCRRAW][174064]proprioo: for disk 1 (/u01/oracle/1020_64/voting2/ocr), id match (1), my id set (1616921495,1352295226) total id sets (1), 1st set (1616921495,1352295226), 2nd set (0,0) my votes (1), total votes (2)
    2012-09-26 09:31:08.934: [    CRSD][174064]0ENV Logging level for Module: allcomp  0
    2012-09-26 09:31:08.937: [    CRSD][174064]0ENV Logging level for Module: default  0
    2012-09-26 09:31:08.940: [    CRSD][174064]0ENV Logging level for Module: COMMCRS  0
    2012-09-26 09:31:08.942: [    CRSD][174064]0ENV Logging level for Module: COMMNS  0
    2012-09-26 09:31:08.945: [    CRSD][174064]0ENV Logging level for Module: CRSUI  0
    2012-09-26 09:31:08.948: [    CRSD][174064]0ENV Logging level for Module: CRSCOMM  0
    2012-09-26 09:31:08.964: [    CRSD][174064]0ENV Logging level for Module: CRSRTI  0
    2012-09-26 09:31:08.970: [    CRSD][174064]0ENV Logging level for Module: CRSMAIN  0
    2012-09-26 09:31:08.973: [    CRSD][174064]0ENV Logging level for Module: CRSPLACE  0
    2012-09-26 09:31:08.976: [    CRSD][174064]0ENV Logging level for Module: CRSAPP  0
    2012-09-26 09:31:08.979: [    CRSD][174064]0ENV Logging level for Module: CRSRES  0
    2012-09-26 09:31:08.982: [    CRSD][174064]0ENV Logging level for Module: CRSOCR  0
    2012-09-26 09:31:08.985: [    CRSD][174064]0ENV Logging level for Module: CRSTIMER  0
    2012-09-26 09:31:08.988: [    CRSD][174064]0ENV Logging level for Module: CRSEVT  0
    2012-09-26 09:31:08.991: [    CRSD][174064]0ENV Logging level for Module: CRSD  0
    2012-09-26 09:31:08.994: [    CRSD][174064]0ENV Logging level for Module: CLUCLS  0
    2012-09-26 09:31:08.997: [    CRSD][174064]0ENV Logging level for Module: CLSVER  0
    2012-09-26 09:31:08.999: [    CRSD][174064]0ENV Logging level for Module: OCRRAW  0
    2012-09-26 09:31:09.002: [    CRSD][174064]0ENV Logging level for Module: OCROSD  0
    2012-09-26 09:31:09.005: [    CRSD][174064]0ENV Logging level for Module: CSSCLNT  0
    2012-09-26 09:31:09.008: [    CRSD][174064]0ENV Logging level for Module: OCRAPI  0
    2012-09-26 09:31:09.011: [    CRSD][174064]0ENV Logging level for Module: OCRUTL  0
    2012-09-26 09:31:09.014: [    CRSD][174064]0ENV Logging level for Module: OCRMSG  0
    2012-09-26 09:31:09.017: [    CRSD][174064]0ENV Logging level for Module: OCRCLI  0
    2012-09-26 09:31:09.020: [    CRSD][174064]0ENV Logging level for Module: OCRCAC  0
    2012-09-26 09:31:09.023: [    CRSD][174064]0ENV Logging level for Module: OCRSRV  0
    2012-09-26 09:31:09.026: [    CRSD][174064]0ENV Logging level for Module: OCRMAS  0
    2012-09-26 09:31:09.026: [ CRSMAIN][174064]0Filename is /u01/oracle/app/product/102_64/crs/crs/init/agripa.pid
    [  clsdmt][303034944]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=agripaDBG_CRSD))
    2012-09-26 09:31:09.057: [ CRSMAIN][174064]0Using Authorizer location: /u01/oracle/app/product/102_64/crs/crs/auth/
    2012-09-26 09:31:09.072: [ CRSMAIN][174064]0Initializing RTI
    2012-09-26 09:31:09.094: [CRSTIMER][319812160]0Timer Thread Starting.
    2012-09-26 09:31:09.096: [  CRSRES][174064]0Parameter SECURITY = 1, running in USER Mode
    2012-09-26 09:31:09.096: [ CRSMAIN][174064]0Initializing EVMMgr
    2012-09-26 09:31:09.198: [ CRSMAIN][174064]0CRSD locked during state recovery, please wait.
    2012-09-26 09:31:09.728: [  CRSRES][174064]0ora.agripa.vip check shows ONLINE
    2012-09-26 09:31:09.846: [ CRSMAIN][174064]0CRSD recovered, unlocked.
    2012-09-26 09:31:09.847: [ CRSMAIN][174064]0QS socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))
    2012-09-26 09:31:09.847: [ CRSMAIN][174064]0QS socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))
    2012-09-26 09:31:09.855: [ CRSMAIN][174064]0CRSD UI socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=CRSD_UI_SOCKET))
    2012-09-26 09:31:09.858: [ CRSMAIN][174064]0E2E socket on: (ADDRESS=(PROTOCOL=tcp)(HOST=agripa-priv)(PORT=49896))
    2012-09-26 09:31:09.859: [ CRSMAIN][174064]0Starting Threads
    2012-09-26 09:31:09.859: [ CRSMAIN][174064]0CRS Daemon Started.
    2012-09-26 09:31:09.859: [ CRSMAIN][364917312]0Starting runCommandServer for (UI = 1, E2E = 0). 0
    2012-09-26 09:31:09.859: [ CRSMAIN][367014464]0Starting runCommandServer for (UI = 1, E2E = 0). 1
    2012-09-26 09:35:44.874: [ default][174064][ENTER]0
    Oracle Database 10g CRS Release 10.2.0.5.0 Production Copyright 1996, 2004, Oracle.  All rights reserved
    2012-09-26 09:35:44.874: [ default][174064]0CRS Daemon Starting
    2012-09-26 09:35:44.875: [ CRSMAIN][174064]0Checking the OCR device
    2012-09-26 09:35:44.877: [ CRSMAIN][174064]0Connecting to the CSS Daemon
    2012-09-26 09:35:45.186: [  CLSVER][174064]0Active Version from OCR:10.2.0.5.0
    2012-09-26 09:35:45.186: [  CLSVER][174064]0Active Version and Software Version are same
    2012-09-26 09:35:45.186: [ CRSMAIN][174064]0Initializing OCR
    2012-09-26 09:35:45.201: [  OCRRAW][174064]proprioo: for disk 0 (/u01/oracle/1020_64/voting1/ocr), id match (1), my id set (1616921495,1352295226) total id sets (1), 1st set (1616921495,1352295226), 2nd set (0,0) my votes (1), total votes (2)
    2012-09-26 09:35:45.202: [  OCRRAW][174064]proprioo: for disk 1 (/u01/oracle/1020_64/voting2/ocr), id match (1), my id set (1616921495,1352295226) total id sets (1), 1st set (1616921495,1352295226), 2nd set (0,0) my votes (1), total votes (2)
    2012-09-26 09:35:45.236: [    CRSD][174064]0ENV Logging level for Module: allcomp  0
    2012-09-26 09:35:45.242: [    CRSD][174064]0ENV Logging level for Module: default  0
    2012-09-26 09:35:45.245: [    CRSD][174064]0ENV Logging level for Module: COMMCRS  0
    2012-09-26 09:35:45.247: [    CRSD][174064]0ENV Logging level for Module: COMMNS  0
    2012-09-26 09:35:45.250: [    CRSD][174064]0ENV Logging level for Module: CRSUI  0
    2012-09-26 09:35:45.253: [    CRSD][174064]0ENV Logging level for Module: CRSCOMM  0
    2012-09-26 09:35:45.256: [    CRSD][174064]0ENV Logging level for Module: CRSRTI  0
    2012-09-26 09:35:45.259: [    CRSD][174064]0ENV Logging level for Module: CRSMAIN  0
    2012-09-26 09:35:45.262: [    CRSD][174064]0ENV Logging level for Module: CRSPLACE  0
    2012-09-26 09:35:45.265: [    CRSD][174064]0ENV Logging level for Module: CRSAPP  0
    2012-09-26 09:35:45.269: [    CRSD][174064]0ENV Logging level for Module: CRSRES  0
    2012-09-26 09:35:45.273: [    CRSD][174064]0ENV Logging level for Module: CRSOCR  0
    2012-09-26 09:35:45.276: [    CRSD][174064]0ENV Logging level for Module: CRSTIMER  0
    2012-09-26 09:35:45.279: [    CRSD][174064]0ENV Logging level for Module: CRSEVT  0
    2012-09-26 09:35:45.282: [    CRSD][174064]0ENV Logging level for Module: CRSD  0
    2012-09-26 09:35:45.285: [    CRSD][174064]0ENV Logging level for Module: CLUCLS  0
    2012-09-26 09:35:45.288: [    CRSD][174064]0ENV Logging level for Module: CLSVER  0
    2012-09-26 09:35:45.291: [    CRSD][174064]0ENV Logging level for Module: OCRRAW  0
    2012-09-26 09:35:45.293: [    CRSD][174064]0ENV Logging level for Module: OCROSD  0
    2012-09-26 09:35:45.297: [    CRSD][174064]0ENV Logging level for Module: CSSCLNT  0
    2012-09-26 09:35:45.300: [    CRSD][174064]0ENV Logging level for Module: OCRAPI  0
    2012-09-26 09:35:45.302: [    CRSD][174064]0ENV Logging level for Module: OCRUTL  0
    2012-09-26 09:35:45.305: [    CRSD][174064]0ENV Logging level for Module: OCRMSG  0
    2012-09-26 09:35:45.308: [    CRSD][174064]0ENV Logging level for Module: OCRCLI  0
    2012-09-26 09:35:45.311: [    CRSD][174064]0ENV Logging level for Module: OCRCAC  0
    2012-09-26 09:35:45.314: [    CRSD][174064]0ENV Logging level for Module: OCRSRV  0
    2012-09-26 09:35:45.318: [    CRSD][174064]0ENV Logging level for Module: OCRMAS  0
    2012-09-26 09:35:45.318: [ CRSMAIN][174064]0Filename is /u01/oracle/app/product/102_64/crs/crs/init/agripa.pid
    [  clsdmt][303034944]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=agripaDBG_CRSD))
    2012-09-26 09:35:45.341: [ CRSMAIN][174064]0Using Authorizer location: /u01/oracle/app/product/102_64/crs/crs/auth/
    2012-09-26 09:35:45.361: [ CRSMAIN][174064]0Initializing RTI
    2012-09-26 09:35:45.391: [CRSTIMER][319812160]0Timer Thread Starting.
    2012-09-26 09:35:45.393: [  CRSRES][174064]0Parameter SECURITY = 1, running in USER Mode
    2012-09-26 09:35:45.393: [ CRSMAIN][174064]0Initializing EVMMgr
    2012-09-26 09:35:45.481: [ CRSMAIN][174064]0CRSD locked during state recovery, please wait.
    2012-09-26 09:35:45.956: [  CRSRES][174064]0ora.agripa.vip check shows ONLINE
    2012-09-26 09:35:46.097: [ CRSMAIN][174064]0CRSD recovered, unlocked.
    2012-09-26 09:35:46.098: [ CRSMAIN][174064]0QS socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))
    2012-09-26 09:35:46.098: [ CRSMAIN][174064]0QS socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))
    2012-09-26 09:35:46.105: [ CRSMAIN][174064]0CRSD UI socket on: (ADDRESS=(PROTOCOL=ipc)(KEY=CRSD_UI_SOCKET))
    2012-09-26 09:35:46.109: [ CRSMAIN][174064]0E2E socket on: (ADDRESS=(PROTOCOL=tcp)(HOST=agripa-priv)(PORT=49896))
    2012-09-26 09:35:46.109: [ CRSMAIN][174064]0Starting Threads
    2012-09-26 09:35:46.109: [ CRSMAIN][174064]0CRS Daemon Started.
    2012-09-26 09:35:46.109: [ CRSMAIN][364896832]0Starting runCommandServer for (UI = 1, E2E = 0). 0
    2012-09-26 09:35:46.109: [ CRSMAIN][366993984]0Starting runCommandServer for (UI = 1, E2E = 0). 1
    2012-09-26 09:36:09.889: [  CRSRES][381674048]0CRS-1002: Resource 'ora.ELIO3.db' is already running on member 'julia'
    2012-09-26 09:47:41.288: [  CRSRES][379576896]0ora.ELIO3.ELIO31.inst target set to OFFLINE before stop action
    2012-09-26 09:47:41.288: [  CRSRES][379576896]0StopResource: setting CLI values
    2012-09-26 09:47:41.355: [  CRSRES][379576896]0Target set to OFFLINE for `ora.ELIO3.ELIO31.inst`
    2012-09-26 09:47:42.266: [  CRSRES][379576896]0`ora.ELIO3.ELIO31.inst` is already OFFLINE.
    2012-09-26 09:47:43.278: [  CRSRES][379576896]0`ora.ELIO3.ELIO31.inst` is already OFFLINE.
    2012-09-26 09:47:44.286: [  CRSRES][379576896]0`ora.ELIO3.ELIO31.inst` is already OFFLINE.
    2012-09-26 09:47:45.298: [  CRSRES][379576896]0`ora.ELIO3.ELIO31.inst` is already OFFLINE.
    PS: Excuse me for the long crsd.log text, but I don't know how to attach a file.
    Thank you very much!
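
The crsd.log excerpt above contains three separate "CRS Daemon Starting" banners within about five minutes (09:30:54, 09:31:08 and 09:35:44), which suggests crsd was being restarted repeatedly while the instance resource was starting. As a rough aid for spotting this in a long log, here is a minimal Python sketch that pulls out those timestamps. The function name `crsd_start_times` and the `sample` list are illustrative assumptions, not part of any Oracle tool; the sample lines are copied from the log above.

```python
import re

# Timestamp at the start of a crsd.log line, e.g. "2012-09-26 09:30:54.931:"
TS_RE = re.compile(r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+:")

def crsd_start_times(lines):
    """Return the timestamp of every 'CRS Daemon Starting' line."""
    times = []
    for line in lines:
        if "CRS Daemon Starting" not in line:
            continue  # "CRS Daemon Started." lines are skipped as well
        match = TS_RE.match(line)
        if match:
            times.append(match.group(1))
    return times

# Lines copied from the crsd.log excerpt in this post.
sample = [
    "2012-09-26 09:30:54.931: [ default][174064]0CRS Daemon Starting",
    "2012-09-26 09:30:56.134: [ CRSMAIN][174064]0CRS Daemon Started.",
    "2012-09-26 09:31:08.582: [ default][174064]0CRS Daemon Starting",
    "2012-09-26 09:35:44.874: [ default][174064]0CRS Daemon Starting",
]

print(crsd_start_times(sample))
# ['2012-09-26 09:30:54', '2012-09-26 09:31:08', '2012-09-26 09:35:44']
```

To run it against a real file, pass an open file object instead of `sample`, since the function only iterates over lines.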

  • ASM on one node crashes when we start the other two nodes ASM

    We completed the database build in Aug 2010.
    We completed PSU patching at the end of January.
    On Feb 4th the database crashed.
    We cannot start ASM on node1.
    ASM starts fine on node2 and node3, but node1 cannot join.
    If ASM is down on node2 and node3, then we can start ASM on node1.
    Reconfiguration started (old inc 0, new inc 6)
    ASM instance
    List of nodes:
    0 1 2
    Global Resource Directory frozen
    * allocate domain 0, invalid = TRUE
    Communication channels reestablished
    * allocate domain 1, invalid = TRUE
    * allocate domain 2, invalid = TRUE
    Mon Mar 01 16:53:00 2010
    Trace dumping is performing id=[cdmp_20100301165301]
    Mon Mar 01 16:53:55 2010
    ERROR: LMD0 (ospid: 274638) detects an idle connection to instance 2
    Mon Mar 01 16:54:44 2010
    Errors in file /oradb/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_lmon_860280.trc (incident=116865):
    ORA-29740: evicted by member 1, group incarnation 8
    Incident details in: /oradb/oracle/diag/asm/+asm/+ASM1/incident/incdir_116865/+ASM1_lmon_860280_i116865.trc
    Errors in file /oradb/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_lmon_860280.trc:
    ORA-29740: evicted by member 1, group incarnation 8
    LMON (ospid: 860280): terminating the instance due to error 29740
    Mon Mar 01 16:54:46 2010
    System state dump is made for local instance
    Errors in file /oradb/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_diag_614488.trc (incident=116833):
    ORA-29740: evicted by member , group incarnation
    Incident details in: /oradb/oracle/diag/asm/+asm/+ASM1/incident/incdir_116833/+ASM1_diag_614488_i116833.trc
    Mon Mar 01 16:54:46 2010
    ORA-1092 : opitsk aborting process
    Errors in file /oradb/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_diag_614488.trc:
    ORA-29740: evicted by member , group incarnation
    Trace dumping is performing id=[cdmp_20100301165446]
    Instance terminated by LMON, pid = 860280
    Another thing we found: when we start ASM on node1, the cluster interconnect hangs when we try to ping it.
    We modified the cluster_interconnects parameter to try using the public interface, but the issue remained the same and we were not able to ping the public interface either.
    CRS is fine:
    $ crs_stat -t
    Name Type Target State Host
    ora....p1.inst application ONLINE OFFLINE
    ora....p2.inst application ONLINE ONLINE noden2
    ora....p3.inst application ONLINE ONLINE noden3
    ora....1p2.srv application ONLINE ONLINE noden2
    ora....1p3.srv application ONLINE ONLINE noden3
    ora.....net.cs application ONLINE ONLINE noden1
    ora.appl.db application ONLINE ONLINE noden1
    ora....SM1.asm application ONLINE OFFLINE
    ora....N1.lsnr application ONLINE ONLINE noden1
    ora....8n1.gsd application ONLINE ONLINE noden1
    ora....8n1.ons application ONLINE ONLINE noden1
    ora....8n1.vip application ONLINE ONLINE noden1
    ora....SM2.asm application ONLINE ONLINE noden2
    ora....N2.lsnr application ONLINE ONLINE noden2
    ora....8n2.gsd application ONLINE ONLINE noden2
    ora....8n2.ons application ONLINE ONLINE noden2
    ora....8n2.vip application ONLINE ONLINE noden2
    ora....SM3.asm application ONLINE ONLINE noden3
    ora....N3.lsnr application ONLINE ONLINE noden3
    ora....8n3.gsd application ONLINE ONLINE noden3
    ora....8n3.ons application ONLINE ONLINE noden3
    ora....8n3.vip application ONLINE ONLINE noden3
    Any input would help.
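
In the `crs_stat -t` listing above, the rows that matter are those whose Target is ONLINE while State is OFFLINE (here the node1 instance and the +ASM1 resource). A minimal Python sketch for filtering such rows out of a longer listing follows. The function name `offline_resources` and the shortened `sample` text are illustrative assumptions; the sketch only assumes the whitespace-separated column order Name, Type, Target, State, Host shown above.

```python
def offline_resources(crs_stat_output):
    """Return names of resources with Target ONLINE but State OFFLINE."""
    names = []
    for line in crs_stat_output.splitlines():
        fields = line.split()
        # Data rows: Name Type Target State [Host]; the header row does not
        # match because its third field is the word "Target".
        if len(fields) >= 4 and fields[2] == "ONLINE" and fields[3] == "OFFLINE":
            names.append(fields[0])
    return names

# A few rows copied from the crs_stat -t output in this post.
sample = """\
Name Type Target State Host
ora....p1.inst application ONLINE OFFLINE
ora....p2.inst application ONLINE ONLINE noden2
ora....SM1.asm application ONLINE OFFLINE
ora....SM2.asm application ONLINE ONLINE noden2
"""

print(offline_resources(sample))
# ['ora....p1.inst', 'ora....SM1.asm']
```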

    Env
    3-node RAC
    oracle version 11.1.0.7
    Latest PSU Jan applied
    OS is AIX, version 6100-02
    ==========
    LMON trace files
    ==========
    Trace file /oradb/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_lmon_860280.trc
    Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
    With the Partitioning, Real Application Clusters, OLAP, Data Mining
    and Real Application Testing options
    ORACLE_HOME = /oradb/oracle/product/11.1/asm_1
    System name:     AIX
    Node name:     host-node1
    Release:     1
    Version:     6
    Machine:     00C39EA44C00
    Instance name: +ASM1
    Redo thread mounted by this instance: 0 <none>
    Oracle process number: 8
    Unix process pid: 860280, image: oracle@host-node1 (LMON)
    *** 2010-03-01 16:50:23.023
    *** SESSION ID:(218.1) 2010-03-01 16:50:23.023
    *** CLIENT ID:() 2010-03-01 16:50:23.023
    *** SERVICE NAME:() 2010-03-01 16:50:23.023
    *** MODULE NAME:() 2010-03-01 16:50:23.023
    *** ACTION NAME:() 2010-03-01 16:50:23.023
    GES resources 5596 pool 6
    GES enqueues 7959
    GES IPC: Receivers 2 Senders 2
    GES IPC: Buffers Receive 1000 Send (i:1150 b:482) Reserve 402
    GES IPC: Msg Size Regular 416 Batch 8192
    Batching factor: enqueue replay 201, ack 224
    Batching factor: cache replay 126 size per lock 64
    kjxggin: CGS tickets = 1000
    kjxgrdmpcpu: CPU Total 6 Core 3 Socket -1 OCPU 6
    kjxgrdmpcpu: High load threshold 21504
    *** 2010-03-01 16:50:23.362
    kjxgmrcfg: Reconfiguration started, type 1
    kjxgmcs: Setting state to 0 0.
    *** 2010-03-01 16:50:23.363
    Name Service frozen
    kjxgmcs: Setting state to 0 1.
    kjxgrdecidever: No old version members in the cluster
    kjxgrssvote: reconfig bitmap chksum 0x88477268 cnt 3 master 0 ret 0
    ksirValidateModuleInfo: action = 10 startup = 0
    Name Service Mode: multi (0x21)
    kjfcpiora: published my fusion master weight 5322
    kjfcpiora: publish my flogb 9
    kjfcpiora: publish my cluster_database_instances parameter=3
    kjxggpoll: change poll time to 50 ms
    kjxgrpropmsg: SSMEMI: inst 1 - no disk vote
    kjxgrpropmsg: SSMEMI: inst 1 - no disk vote
    kjxgrpropmsg: SSMEMI: inst 2 - no disk vote
    SSVOTE: Master indicates no Disk Voting
    kjxgmps: proposing substate 2
    kjxgmcs: Setting state to 6 2.
    kjfmuin: bitmap 0 1 2
    kjfmmhi: received msg from 0 (inc 6)
    kjfmmhi: received msg from 1 (inc 2)
    kjfmmhi: received msg from 2 (inc 4)
    Performed the unique instance identification check
    kjxgmps: proposing substate 3
    kjxgmcs: Setting state to 6 3.
    Name Service recovery started
    Deleted all dead-instance name entries
    kjxgmps: proposing substate 4
    kjxgmcs: Setting state to 6 4.
    Multicasted all local name entries for publish
    Replayed all pending requests
    kjxgmps: proposing substate 5
    kjxgmcs: Setting state to 6 5.
    Name Service normal
    Name Service recovery done
    *** 2010-03-01 16:50:23.889
    *** 2010-03-01 16:50:23.958
    kjxgmps: proposing substate 6
    kjxgmcs: Setting state to 6 6.
    kjxggpoll: change poll time to 600 ms
    2010-03-01 16:50:23.980620 :
    ********* kjfcrfg() called, BEGIN LMON RCFG *********
    kjfcrfg: DRM window size = 0->128 (min lognb = 9)
    2010-03-01 16:50:23.980811 :
    Reconfiguration started (old inc 0, new inc 6)
    ASM instance
    Send timeout: 300 secs
    Defer Queue timeout: 360 secs
    Synchronization timeout: 420 sec
    List of nodes:
    0 1 2
    *** 2010-03-01 16:50:24.023
    2010-03-01 16:50:24.034432 : Global Resource Directory frozen
    node 0
    release 11 1 0 7
    node 1
    release 11 1 0 7
    node 2
    release 11 1 0 7
    number of mastership buckets = 128
    2010-03-01 16:50:24.034959 :
    domain attach called for domid 0
    * kjbdomalc: domain 0 invalid = TRUE
    * kjbdomatt: first attach for domain 0
    asby init, 0/0/x1
    asby returns, 0/0/x1/false
    * Domain maps before reconfiguration:
    * DOMAIN 0 (valid 1): 0
    * End of domain mappings
    * Domain maps after recomputation:
    * DOMAIN 0 (valid 1): 0 1 2
    * End of domain mappings
    Dead inst
    Join inst 0 1 2
    Exist inst
    Active Sendback Threshold = 50 %
    Communication channels reestablished
    2010-03-01 16:50:24.152688 :
    received all domreplay (6.6)
    2010-03-01 16:50:24.152732 :
    sent master 1 (6.6)
    *** 2010-03-01 16:53:00.494
    kjfmReceiverHealthCB_Check: Reciever [0] is healthy.
    2010-03-01 16:52:56.921800 : Received comm error info from 2 (cnt 1)
    kjxgrvalid: valid - 0.1 : (6 6) from 2
    kjxgrrcfgchk: Initiating reconfig, reason=3
    kjxgrrcfgchk: COMM rcfg - Disk Vote Required
    2010-03-01 16:52:57.077877 : kjxgrnetchk: start 0x53001440, end 0x53019ae0
    2010-03-01 16:52:57.077906 : kjxgrnetchk: Sending comm check req to 1
    2010-03-01 16:52:57.078140 : kjxgrnetchk: Sending comm check req to 2
    kjxgrrcfgchk: prev pstate 5 mapsz 512
    kjxgrrcfgchk: new bmp: 0 1 2
    kjxgrrcfgchk: work bmp: 0 1 2
    kjxgrrcfgchk: rr bmp: 0 1 2
    *** 2010-03-01 16:53:00.792
    kjxgmrcfg: Reconfiguration started, type 3
    kjxgmcs: Setting state to 6 0.
    *** 2010-03-01 16:53:00.792
    Name Service frozen
    kjxgmcs: Setting state to 6 1.
    kjxgrdecidever: No old version members in the cluster
    kjxgrmsghndlr: Queue msg (0x110a21e50->0x110f09b90) type 7 for later
    *** 2010-03-01 16:54:43.233
    kjxgrssvote: reconfig bitmap chksum 0x88477268 cnt 3 master 2 ret 0
    kjxgrrcfgchk: disable CGS timeout
    kjxggpoll: change poll time to 50 ms
    * kjfcchknested: CGS rcfg detected in step 7.0.0
    SSVOTE: Master indicates Disk Voting required
    2010-03-01 16:54:37.535518 : kjxgrmsghndlr: evict req from 1 for 0, seq (8, 8) vers 2193970751
    2010-03-01 16:54:37.535587 : kjxgrdtrt: Evicted by 1, seq (8, 8)
    IMR state information
    Member 0, thread -1, state 0x2:c, flags 0x2c48
    RR seq commit 6 cur 8
    Propstate 3 prv 2 pending 0
    rcfg rsn 3, rcfg time 1392514113, mem ct 3
    master 2, master rcfg time 1392479783
    evicted memcnt 0, starttm 0 chkcnt 0
    system load 241 (normal)
    Member information:
    Member 0, incarn 6, version 0x82c5563f, thrd -1
    prev thrd -1, status 0x1203 (JR..), err 0x0000
    Member 1, incarn 6, version 0x82c1073b, thrd 2
    prev thrd -1, status 0x1007 (JRM.), err 0x0002
    Member 2, incarn 6, version 0x82c114ee, thrd 3
    prev thrd -1, status 0x0007 (JRM.), err 0x0000
    =====================================================
    Group name: +ASM
    Member id: 0
    Cached KGXGN event: 0
    Group State:
    State: 6 1
    Reconfig started start-tm 0x4b8c373c tmout period 0xffffffff state 0x2
    Reconfig INPG type 3 inc 6 rsn 0 data 0x0
    Reconfig COMP type 1 inc 6 rsn 0 data 0x0
    Commited Map: 0 1 2
    New Map: 0 1 2
    KGXGN Map: 0 1 2
    KGXGN Map2: 0 1 2
    Master node: 0
    Memcnt 3 Rcvcnt 0
    Substate Proposal: false
    Inc Proposal:
    incarn 0 memcnt 0 master 0
    proposal false matched false
    map:
    Master Inc State:
    incarn 0 memcnt 0 agrees 0 flag 0x1
    wmap:
    nmap:
    ubmap:
    Substate Handler Execution State
    substate 0 status done
    substate 1 status done
    substate 2 status done
    substate 3 status done
    substate 4 status done
    substate 5 status done
    substate 6 status done
    IMR hist: 20[0x0a00:0x53019b0e] 4[0x0007:0x53019b0e] 3[0x0006:0x53019b0e]
    IMR hist: 20[0x0902:0x53019b0e] 20[0x0702:0x53019b0b] 20[0x0702:0x53019b0a]
    IMR hist: 20[0x0702:0x53019b0a] 1[0x0006:0x53019b0a] 20[0x0702:0x53019aff]
    IMR hist: 10[0x0006:0x52fdbdb1] 20[0x0b00:0x52fdbdb1] 9[0x0006:0x52fdbdaf]
    IMR hist: 20[0x0a02:0x52fdbdaf] 20[0x0a01:0x52fdbce1] 20[0x0a00:0x52fdbc8a]
    IMR hist: 4[0x0005:0x52fdbc86] 3[0x0004:0x52fdbc4c] 20[0x0900:0x52fdbc4c]
    IMR hist: 20[0x0802:0x52fdbc08] 20[0x0801:0x52fdbc08] 20[0x0801:0x52fdbc08]
    IMR hist: 20[0x0602:0x52fdbc08] 20[0x0601:0x52fdbc08] 20[0x0601:0x52fdbc08]
    IMR hist: 20[0x0800:0x52fdbc08] 20[0x0700:0x52fdbc08] 20[0x0602:0x52fdbc07]
    IMR hist: 20[0x0800:0x52fdbc07] 20[0x0700:0x52fdbc07] 1[0x0000:0x52fdbbb8]
    IMR hist: 0[0x0000:0x00000000] 0[0x0000:0x00000000]
    KJM HIST LMD0:
    7:0 6:0 5:7:0 12:97697 7:0 6:0 5:7:0 12:97696 7:0 6:0
    5:7:0 12:97703 7:0 6:0 5:7:0 2:0 1:0 12:97713 7:0 6:0
    5:7:0 12:97766 7:0 6:0 5:7:0 12:97782 7:0 6:0 5:7:0 12:97778
    7:0 6:0 5:7:0 12:97799 7:0 6:0 5:7:0 12:97771 7:0 6:0
    5:7:0 12:97784 7:0 6:0 5:7:0 12:97805 7:0 6:0 5:7:0 12:97785
    7:0 6:0 5:7:0 12:97757 7:0 6:0 5:7:0 12:97770 7:0 6:0
    5:7:0 12:97784 7:0 6:0
    KJM HIST LMS0:
    7:0 6:0 5:7:0 10:0 12:97697 7:0 6:0 5:7:0 10:0 12:97696
    7:0 6:0 5:7:0 10:0 12:97703 7:0 6:0 5:7:0 10:0 12:97713
    7:0 6:0 5:7:0 10:0 2:0 12:97766 7:0 6:0 5:7:0 10:0
    12:97782 7:0 6:0 5:7:0 10:0 12:97778 7:0 6:0 5:7:0 10:0
    12:97799 7:0 6:0 5:7:0 10:0 12:97771 7:0 6:0 5:7:0 10:0
    12:97784 7:0 6:0 5:7:0 10:0 12:97805 7:0 6:0 5:7:0 10:0
    12:97785 7:0 6:0 5:7:0
    DUMP state for lmd0 (ospid 274638)
    DUMP IPC context for lmd0 (ospid 274638)
    Dumping process 9.274638 info:
    *** 2010-03-01 16:54:43.664
    Process diagnostic dump for oracle@host-node1 (LMD0), OS id=274638,
    pid: 9, proc_ser: 1, sid: 217, sess_ser: 1
    loadavg : 1.72 1.07 0.90
    swap info: free_mem = 28642.09M rsv = 16.00M
    alloc = 21.13M avail = 4096.00M swap_free = 4074.87M
    F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
    240001 A oracle 274638 1 0 60 20 12ca9f590 156060 16:50:23 - 0:00 asm_lmd0_+ASM1
    Short stack dump:
    <-ksedsts()+0254<-ksdxfstk()+0028<-ksdxcb()+05d8<-sspuser()+0074<-4750<-poll()+000c<-sskgxp_select()+00e4<-skgxpiwait()+08a4<-skgxpwait()+06fc<-ksxpwait()+081c<-ksliwat()+0a58<-kslwaitctx()+0150<-kslwait()+006c<-ksxprcvimd()+0368<-kjctr_rksxp()+013c<-kjctrcv()+0160<-kjcsrmg()+005c<-kjmdm()+2454<-ksbrdp()+075c<-opirip()+0444<-opidrv()+0414<-sou2o()+0090<-opimai_real()+0148<-main()+0090<-__start()+0070
    Process diagnostic dump actual duration=0.161000 sec
    (max dump time=30.000000 sec)
    *** 2010-03-01 16:54:43.825
    SO: 0x70000001ff913a0, type: 2, owner: 0x0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
    proc=0x70000001ff913a0, name=process, file=ksu.h LINE:10706 ID:, pg=0
    (process) Oracle pid:9, ser:1, calls cur/top: 0x70000001f733140/0x70000001f733140
    flags : (0x6) SYSTEM
    flags2: (0x100), flags3: (0x0)
    int error: 0, call error: 0, sess error: 0, txn error 0
    ksudlp FALSE at location: 0
    (post info) last post received: 0 0 83
    last post received-location: kji.h LINE:2369 ID:kjga: clear wait for lmon
    last process to post me: 70000001ff903b0 1 6
    last post sent: 0 0 25
    last post sent-location: ksa2.h LINE:282 ID:ksasnd
    last process posted by me: 70000001ff903b0 1 6
    (latch info) wait_event=68 bits=0
    Process Group: DEFAULT, pseudo proc: 0x70000001f4851d0
    O/S info: user: oracle, term: UNKNOWN, ospid: 274638
    OSD pid info: Unix process pid: 274638, image: oracle@host-node1 (LMD0)
    Dump of memory from 0x070000001FF70038 to 0x070000001FF70240
    70000001FF70030 00000000 00000000 [........]
    70000001FF70040 00000000 00000000 00000000 00000000 [................]
    Repeat 31 times
    SO: 0x70000001f6de4a0, type: 4, owner: 0x70000001ff913a0, flag: INIT/-/-/0x00 if: 0x3 c: 0x3
    proc=0x70000001ff913a0, name=session, file=ksu.h LINE:10719 ID:, pg=0
    (session) sid: 217 ser: 1 trans: 0x0, creator: 0x70000001ff913a0
    flags: (0x51) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
    flags2: (0x408) -/-
    DID: , short-term DID:
    txn branch: 0x0
    oct: 0, prv: 0, sql: 0x0, psql: 0x0, user: 0/SYS
    ksuxds FALSE at location: 0
    service name: SYS$BACKGROUND
    Current Wait Stack:
    0: waiting for 'ges remote message'
    waittime=40, loop=0, p3=44
    wait_id=2613 seq_num=2614 snap_id=1
    wait times: snap=0.018269 sec, exc=0.018269 sec, total=0.018269 sec
    wait times: max=0.080000 sec
    wait counts: calls=1 os=1
    in_wait=1 iflags=0x5a8
    Wait State:
    auto_close=0 flags=0x22 boundary=0x0/-1
    Session Wait History:
    0: waited for 'ges remote message'
    waittime=40, loop=0, p3=44
    wait_id=2612 seq_num=2613 snap_id=1
    wait times: snap=0.160172 sec, exc=0.160172 sec, total=0.160172 sec
    wait times: max=0.080000 sec
    wait counts: calls=1 os=1
    occurred after 0.000008 sec of elapsed time
    1: waited for 'ges remote message'
    waittime=40, loop=0, p3=44
    wait_id=2611 seq_num=2612 snap_id=1
    wait times: snap=0.096359 sec, exc=0.096359 sec, total=0.096359 sec
    wait times: max=0.080000 sec
    wait counts: calls=1 os=1
    occurred after 0.000008 sec of elapsed time
    2: waited for 'ges remote message'
    waittime=40, loop=0, p3=44
    wait_id=2610 seq_num=2611 snap_id=1
    wait times: snap=0.098065 sec, exc=0.098065 sec, total=0.098065 sec
    wait times: max=0.080000 sec
    wait counts: calls=1 os=1
    occurred after 0.000007 sec of elapsed time
    3: waited for 'ges remote message'
    waittime=40, loop=0, p3=44
    wait_id=2609 seq_num=2610 snap_id=1
    wait times: snap=0.097831 sec, exc=0.097831 sec, total=0.097831 sec
    wait times: max=0.080000 sec
    wait counts: calls=1 os=1
    occurred after 0.000014 sec of elapsed time
    4: waited for 'ges remote message'
    waittime=40, loop=0, p3=44
    wait_id=2608 seq_num=2609 snap_id=1
    wait times: snap=0.095876 sec, exc=0.095876 sec, total=0.095876 sec
    wait times: max=0.080000 sec
    wait counts: calls=1 os=1
    occurred after 0.000008 sec of elapsed time
    5: waited for 'ges remote message'
    waittime=40, loop=0, p3=44
    wait_id=2607 seq_num=2608 snap_id=1
    wait times: snap=0.098788 sec, exc=0.098788 sec, total=0.098788 sec
    wait times: max=0.080000 sec
    wait counts: calls=1 os=1
    occurred after 0.000006 sec of elapsed time
    6: waited for 'ges remote message'
    waittime=40, loop=0, p3=44
    wait_id=2606 seq_num=2607 snap_id=1
    wait times: snap=0.098854 sec, exc=0.098854 sec, total=0.098854 sec
    wait times: max=0.080000 sec
    wait counts: calls=1 os=1
    occurred after 0.000007 sec of elapsed time
    7: waited for 'ges remote message'
    waittime=40, loop=0, p3=44
    wait_id=2605 seq_num=2606 snap_id=1
    wait times: snap=0.098040 sec, exc=0.098040 sec, total=0.098040 sec
    wait times: max=0.080000 sec
    wait counts: calls=1 os=1
    occurred after 0.000008 sec of elapsed time
    8: waited for 'ges remote message'
    waittime=40, loop=0, p3=44
    wait_id=2604 seq_num=2605 snap_id=1
    wait times: snap=0.097322 sec, exc=0.097322 sec, total=0.097322 sec
    wait times: max=0.080000 sec
    wait counts: calls=1 os=1
    occurred after 0.000007 sec of elapsed time
    9: waited for 'ges remote message'
    waittime=40, loop=0, p3=44
    wait_id=2603 seq_num=2604 snap_id=1
    wait times: snap=0.097334 sec, exc=0.097334 sec, total=0.097334 sec
    wait times: max=0.080000 sec
    wait counts: calls=1 os=1
    occurred after 0.000008 sec of elapsed time
    Sampled Session History
    The sampled session history is constructed by sampling
    the target session every 1 second. The sampling process
    captures at each sample if the session is in a non-idle wait,
    an idle wait, or not in a wait. If the session is in a
    non-idle wait then one interval is shown for all the samples
    the session was in the same non-idle wait. If the
    session is in an idle wait or not in a wait for
    consecutive samples then one interval is shown for all
    the consecutive samples. Though we display these consecutive
    samples in a single interval the session may NOT be continuously
    idle or not in a wait (the sampling process does not know).
    The history is displayed in reverse chronological order.
    sample interval: 1 sec, max history 120 sec
    KSFD PGA DUMPS
    Number of completed I/O requests=0 flags=0
    END OF PROCESS STATE
    LMON IPC context:
    ksxpdmp: facility 0 (?) (0x1, 0x0) counts 0, 0
    ksxpdmp: Dumping the osd context
    SKGXP: SKGXPCTX: 0x1103bfb58 ctx
    SKGXP:
    SKGXP: WAIT HISTORY
    SKGXP: Time(msec)     Wait Type     Return Code
    SKGXP: ----------     ---------     ------------
    SKGXP: 0          NORMAL          SUCC
    SKGXP: 0          NORMAL          SUCC
    SKGXP: 0          NORMAL          SUCC
    SKGXP: 0          NORMAL          SUCC
    SKGXP: 0          NORMAL          TIMEDOUT
    SKGXP: 12          NORMAL          TIMEDOUT
    SKGXP: 0          NORMAL          TIMEDOUT
    SKGXP: 20          NORMAL          TIMEDOUT
    SKGXP: 0          NORMAL          TIMEDOUT
    SKGXP: 19          NORMAL          TIMEDOUT
    SKGXP: 0          NORMAL          TIMEDOUT
    SKGXP: 20          NORMAL          TIMEDOUT
    SKGXP: 0          NORMAL          TIMEDOUT
    SKGXP: 19          NORMAL          TIMEDOUT
    SKGXP: 0          NORMAL          TIMEDOUT
    SKGXP: 20          NORMAL          TIMEDOUT
    SKGXP: wait delta 0 sec (27 msec) ctx ts 0x3e377 last ts 0x3e381
    SKGXP: user cpu time since last wait 0 sec 0 ticks
    SKGXP: system cpu time since last wait 0 sec 0 ticks
    SKGXP: locked 1
    SKGXP: blocked 51
    SKGXP: timed wait receives 0
    SKGXP: admno 0x485303b1 admport:
    SKGXP: SSKGXPT 0x103c0a74 flags sockno 12 IP 192.168.253.49 UDP 49777
    SKGXP: context timestamp 0x3e377
    SKGXP: buffers queued on port 1105aa950
    SKGXP:
    SKGXP: Dumping Connection Handle Table
    SKGXP: sconno accono ertt state seq# RcvPid TotCreditsSKGXP: sent rtrans acks
    SKGXP: CNH Table Bucket: 10
    SKGXP: 0x339d0248 0x6dd6841c 64 4 32838 589900 8SKGXP: 75d 5d 32838d
    SKGXP: CNH Table Bucket: 11
    SKGXP: 0x339d0249 0x75ef4c98 32 4 32811 1007758 8SKGXP: 48d 12d 32811d
    SKGXP: CNH Table Bucket: 12
    SKGXP: 0x339d024a 0x75703ec2 16 4 32763 524518 8SKGXP: 0d 0d 0d
    SKGXP: CNH Table Bucket: 13
    SKGXP: 0x339d024b 0x41094259 16 4 32763 520260 8SKGXP: 0d 0d 0d
    SKGXP: CNH Table Bucket: 14
    SKGXP: 0x339d024c 0x7c1c696c 16 4 32763 585808 8SKGXP: 0d 0d 0d
    SKGXP: CNH Table Bucket: 15
    SKGXP: 0x339d024d 0x138c8c4a 16 4 32763 843952 8SKGXP: 0d 0d 0d
    SKGXP:
    SKGXP: Dumping Accept Handle Table
    SKGXP: ach accono sconno admno state SndPid seq# rcv rtrans acks credits
    SKGXP: ACH Table Bucket: 1472
    SKGXP: 0x111088010 0x48cb4387 0x3365b236 0x1fe7dc68 40 1007758 32812 49 0 26 8
    SKGXP: ACH Table Bucket: 1474
    SKGXP: 0x11108b730 0x48cb4389 0x1c69654a 0x7183ff4c 40 589900 32838 75 0 52 8
    Incident 116865 created, dump file: /oradb/oracle/diag/asm/+asm/+ASM1/incident/incdir_116865/+ASM1_lmon_860280_i116865.trc
    ORA-29740: evicted by member 1, group incarnation 8
    error 29740 detected in background process
    ORA-29740: evicted by member 1, group incarnation 8
    *** 2010-03-01 16:54:46.430
    LMON (ospid: 860280): terminating the instance due to error 29740
    ksuitm: waiting up to [5] seconds before killing DIAG
    ==========
    DIAG trace files
    =========
    Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
    With the Partitioning, Real Application Clusters, OLAP, Data Mining
    and Real Application Testing options
    ORACLE_HOME = /oradb/oracle/product/11.1/asm_1
    System name:     AIX
    Node name:     host-node1
    Release:     1
    Version:     6
    Machine:     00C39EA44C00
    Instance name: +ASM1
    Redo thread mounted by this instance: 0 <none>
    Oracle process number: 4
    Unix process pid: 614488, image: oracle@host-node1 (DIAG)
    *** 2010-03-01 16:50:22.947
    *** SESSION ID:(222.1) 2010-03-01 16:50:22.947
    *** CLIENT ID:() 2010-03-01 16:50:22.947
    *** SERVICE NAME:() 2010-03-01 16:50:22.947
    *** MODULE NAME:() 2010-03-01 16:50:22.947
    *** ACTION NAME:() 2010-03-01 16:50:22.947
    Node id: 0
    List of nodes: 0, 1, 2,
    *** 2010-03-01 16:50:22.948
    Reconfiguration starts [incarn=0]
    *** 2010-03-01 16:50:22.948
    I'm the master node
    Group reconfiguration cleanup
    *** 2010-03-01 16:50:23.602
    A rcfg proposal from node 2 is received
    *** 2010-03-01 16:50:23.602
    A rcfg proposal from node 1 is received
    *** 2010-03-01 16:50:23.602
    Reconfiguration completes [incarn=3]
    *** 2010-03-01 16:53:00.877
    A dump event msg is rcv'd
    REQUEST:trace dump in directory cdmp_20100301165301
    *** 2010-03-01 16:53:00.877
    Trace dumping is performing id=[cdmp_20100301165301]....
    *** 2010-03-01 16:53:01.041
    Trace dumping is done
    *** 2010-03-01 16:54:46.560
    Instance is terminating by process 860280 [ospid=oracle@host-node1 (LMON)]
    Performing diagnostic data dump for this instance
    Incident 116833 created, dump file: /oradb/oracle/diag/asm/+asm/+ASM1/incident/incdir_116833/+ASM1_diag_614488_i116833.trc
    ORA-29740: evicted by member , group incarnation
    Error 29740 encountered during system state dump
    *** 2010-03-01 16:54:49.280
    ----- Error Stack Dump -----
    ORA-29740: evicted by member , group incarnation
    *** 2010-03-01 16:54:49.281
    Trace dumping is performing id=[cdmp_20100301165446]....
    *** 2010-03-01 16:54:49.433
    Trace dumping is done
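The key facts in the LMON trace above are which member issued the eviction and which process then terminated the instance. The following is a minimal sketch of pulling those fields out of trace text with regular expressions. The function name `summarize_eviction` and the sample string are illustrative assumptions, not part of any Oracle tool, and the patterns are written only against the message shapes visible in this trace.

```python
import re

# Patterns written against the message shapes seen in the trace above
# (an assumption: other Oracle versions may format these lines differently).
EVICT_RE = re.compile(r"ORA-29740: evicted by member (\d+), group incarnation (\d+)")
TERM_RE = re.compile(r"(\w+) \(ospid: (\d+)\): terminating the instance due to error (\d+)")


def summarize_eviction(trace_text):
    """Return the evicting member, incarnation and terminating process, if found."""
    summary = {}
    evict = EVICT_RE.search(trace_text)
    if evict:
        summary["evicted_by"] = int(evict.group(1))
        summary["incarnation"] = int(evict.group(2))
    term = TERM_RE.search(trace_text)
    if term:
        summary["process"] = term.group(1)
        summary["ospid"] = int(term.group(2))
        summary["error"] = int(term.group(3))
    return summary


# Lines copied from the LMON trace above.
sample = """\
ORA-29740: evicted by member 1, group incarnation 8
error 29740 detected in background process
LMON (ospid: 860280): terminating the instance due to error 29740
"""

print(summarize_eviction(sample))
```

The DIAG trace prints the same error with the member and incarnation left blank, so the first pattern deliberately does not match that line.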

  • Show vertical scrollbar when 1D array only shows one element

    When showing only one element of a one-dimensional array, only the horizontal scrollbar can be displayed.  The vertical scrollbar visible option is greyed out.  Even using property nodes does not make it visible.  I have an array of clusters with many controls in each cluster that are arranged horizontally across the front panel.  I'd like the user to see only one element (one set of controls) at a time and be able to scroll through them using a vertical scrollbar.

    I think Brent's suggestion is very good.  Another hack would be to cover the additional array elements with a decoration so that you only see one element.

  • Doubts about RAC infrastructure with one disk array

    Hello everybody,
    I'm writing to you because we have a doubt about the correct infrastructure to implement RAC.
    Please let me first explain the current design we are using for Oracle DB storage. We currently run several standalone instances on several servers, all of them connected to a SAN disk storage array. Because we know this is a single point of failure, we keep redundant control files, archived logs and redo logs both on the array and on the internal disk of each server, so if the array fails completely we “just” need to restore the nightly cold backup and apply the archived and redo logs, and everything is OK. This works because we have standalone instances and can accept this 1 hour of downtime.
    Now we want to use these servers and this array to implement a RAC solution. We know this array is our single point of failure, and we wonder whether it is possible to have a multi-node RAC solution (not RAC One Node) with redundant control files, archived logs and redo logs on internal disks. Is it possible to have each node write full RAC control files, archived logs and redo logs to internal disks and apply these files consistently when the ASM filesystem used for RAC is restored (i.e. with a softlink on an internal disk and using just one node)? Or maybe the recommended solution is to have a second array to avoid this single point of failure?
    Thanks a lot!

    cssl wrote:
    Or maybe the recommended solution is to have a second array to avoid this single point of failure?
    Correct. This is the proper solution.
    In this case you can also decide to simply use striping on both arrays, then mirror array1's data onto array2 using ASM redundancy options.
    Also keep in mind that redundancy is also needed for the connectivity. So you need at least 2 switches to connect to both arrays, and dual HBA ports on each server, with 2 fibres running, one to each switch. You will need multipath driver s/w on the server to deal with the multiple I/O paths to the same storage LUNs.
    Likewise you need to repeat this for your Interconnect. 2 private switches, 2 private NICs on each server that are bonded. Then connect these 2 NICs to the 2 switches, one NIC per switch.
    Also do not forget spares. Spare switches (one each for storage and Interconnect). Spare cables - fibre and whatever is used for the Interconnect.
    Bottom line - not a cheap solution to have full redundancy. What can be done is to combine the storage connection/protocol layer with the Interconnect layer and run both over the same architecture. Oracle's Database Machine and Exadata Storage Servers do this. You can run your storage protocol (e.g. SRP) and your Interconnect protocol (TCP or RDS) over the same 40Gb Infiniband infrastructure.
    Thus only 2 Infiniband switches are needed for redundancy, plus 1 spare. With each server running a dual port HCA and a cable to each of these 2 switches.
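The suggestion above to mirror array1's data onto array2 with ASM redundancy is usually expressed as one failure group per array. A minimal sketch follows, assuming a hypothetical disk group name `DATA` and hypothetical LUN paths; it only builds the text of a `CREATE DISKGROUP ... NORMAL REDUNDANCY` statement so the layout can be reviewed before anyone runs it.

```python
def build_diskgroup_sql(name, failgroups):
    """Build a CREATE DISKGROUP statement with one failure group per array.

    failgroups maps a failure group name to the list of disk paths it owns.
    """
    if len(failgroups) < 2:
        # Mirroring across arrays needs disks in at least two failure groups.
        raise ValueError("normal redundancy needs at least two failure groups")
    lines = [f"CREATE DISKGROUP {name} NORMAL REDUNDANCY"]
    for fg_name, disks in failgroups.items():
        disk_list = ", ".join(f"'{d}'" for d in disks)
        lines.append(f"  FAILGROUP {fg_name} DISK {disk_list}")
    return "\n".join(lines) + ";"


# Hypothetical names and paths, for illustration only.
layout = {
    "array1": ["/dev/rdisk/a1_lun1", "/dev/rdisk/a1_lun2"],
    "array2": ["/dev/rdisk/a2_lun1", "/dev/rdisk/a2_lun2"],
}
statement = build_diskgroup_sql("DATA", layout)
print(statement)
```

With each array in its own failure group, ASM places the mirror copy of an extent in the other failure group, so losing one whole array should leave a complete copy on the other.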

  • Disk repair failed how can I back up my files when I can't boot from the start up disk

    When disk repair fails and I can't boot from the hard drive, how can I get to my files to back them up? I have a Seagate 1 TB drive connected to the USB port.

    Perubbit,
    which model MacBook Pro do you have, and which version of OS X is installed on it?

  • Question on 11.2 RAC Networking Configuration

    I have a random question regarding the network interface configuration for (specifically 11.2) an Oracle RAC requirement. Is there a requirement anywhere that states that all nodes in the cluster must have the same named interface on each distinct node serving the same purpose in order for all nodes of the cluster to communicate properly?
    Here is an example of my question - My curiosity is, is this a requirement?
    Node 1
    eth1 - 10.0.0.1 (public)
    eth1:1 10.0.0.25 (vip)
    eth2: 192.168.1.25 (priv/interconnect)
    Node 2
    eth1 10.0.0.2 (public)
    eth1:2 10.0.0.26 (vip)
    eth5: 192.168.1.35 (priv/interconnect)
    Note: eth2 is private on node 1, where eth5 is private on node 2.
    Does node 2 have to follow the same pattern where eth1 is public, and eth2 is private, or for node 2, can eth5 be assigned to the private subnet and eth1 be assigned to the public subnet?
    Odd question I know. Any means to clarify for me would be most helpful and appreciated.
    Thanks,
    CJ

    Hi,
    DBA wrote:
    I have a random question regarding the network interface configuration for (specifically 11.2) an Oracle RAC requirement. Is there a requirement anywhere that states that all nodes in the cluster must have the same named interface on each distinct node serving the same purpose in order for all nodes of the cluster to communicate properly?
    Here is an example of my question - My curiosity is, is this a requirement?
    Node 1
    eth1 - 10.0.0.1 (public)
    eth1:1 10.0.0.25 (vip)
    eth2: 192.168.1.25 (priv/interconnect)
    Node 2
    eth1 10.0.0.2 (public)
    eth1:2 10.0.0.26 (vip)
    eth5: 192.168.1.35 (priv/interconnect)
    Note: eth2 is private on node 1, where eth5 is private on node 2.
    Does node 2 have to follow the same pattern where eth1 is public, and eth2 is private, or for node 2, can eth5 be assigned to the private subnet and eth1 be assigned to the public subnet?
    Doc 11.2.0.2 says:
    About Network Hardware Requirements
    * Public interface names must be the same for all nodes. If the public interface on one node uses the network adapter eth0, then you must configure eth0 as the public interface on all nodes. Network interface names are case-sensitive.
    * You should configure the same private interface names for all nodes as well. If eth1 is the private interface name for the first node, then eth1 should be the private interface name for your second node. Network interface names are case-sensitive.
    http://download.oracle.com/docs/cd/E11882_01/rac.112/e17264/preparing.htm#TDPRC123
    Network Hardware Requirements (Linux Installation)
    If you install Oracle Clusterware using OUI, then the public interface names associated with the network adapters for each network must be the same on all nodes, and the private interface names associated with the network adaptors should be the same on all nodes. This restriction does not apply if you use cloning, either to create a new cluster, or to add nodes to an existing cluster.
    For example: With a two-node cluster, you cannot configure network adapters on node1 with eth0 as the public interface, but on node2 have eth1 as the public interface. Public interface names must be the same, so you must configure eth0 as public on both nodes. You should configure the private interfaces on the same network adapters as well. If eth1 is the private interface for node1, then eth1 should be the private interface for node2.
    http://download.oracle.com/docs/cd/E11882_01/install.112/e17212/prelinux.htm#CWLIN209
    As the documentation says, this restriction applies when you install using OUI, but having different interface names is supported if configured manually.
    In any case, if you are installing using OUI you have the option of renaming the interface, as Bjoern said.
    Can We Rename a Solaris Network Interface To Fit a New Node Into an Oracle RAC ? [ID 1288614.1]
    As far as I know this is possible on Solaris/AIX/Linux, but depending on the OS and type of kernel it may not be supported.
    Regards,
    Levi Pereira
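The rule quoted above (public interface names must match, private interface names should match, and names are case-sensitive) can be checked mechanically before an OUI install. The following is a minimal sketch using the interface layout from the question. The function name `check_interface_names` and the dictionary shape are assumptions made for illustration, not an Oracle utility.

```python
def check_interface_names(nodes):
    """Compare interface names across nodes.

    nodes maps a node name to {"public": ifname, "private": ifname}.
    Returns a list of (severity, message) tuples; an empty list means consistent.
    """
    findings = []
    # Plain string comparison, so the check is case-sensitive like the rule.
    public = {cfg["public"] for cfg in nodes.values()}
    private = {cfg["private"] for cfg in nodes.values()}
    if len(public) > 1:
        findings.append(("error", f"public interface names differ: {sorted(public)}"))
    if len(private) > 1:
        findings.append(("warning", f"private interface names differ: {sorted(private)}"))
    return findings


# Layout taken from the question: eth2 is private on node 1, eth5 on node 2.
cluster = {
    "node1": {"public": "eth1", "private": "eth2"},
    "node2": {"public": "eth1", "private": "eth5"},
}
for severity, message in check_interface_names(cluster):
    print(severity, message)
```

For the layout in the question this reports only a warning for the private interfaces, which matches the "must" versus "should" wording in the documentation excerpt.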

  • Backup to Remote Disk Array

    Hi,
    I have a remote disk array of IDE drives (2TB) and want to use RMAN to back up my database to these disks. The question is how can I do this? I thought of using NFS or Samba, but I don't think this is a good idea. Does anyone have any suggestions? A plug-in library for RMAN? I'm using 9i with Red Hat Linux.
    Thanks in advance,
    Steve.

    Technically speaking, nothing is stopping you from using NFS, from what I know.
    As long as the master server is available during the backup you should not have any issues. The only issue could be the time it takes to back up the data.
    In your case you can attach those disk arrays to the database host, create some volumes on them, and back up your database to those volumes.
    There are no libraries for disks. The RMAN installation for that particular version of the O/S should take care of it.
    You only have to configure RMAN with media management software libraries when you want to back up the data to a tape drive.
    mukundan

  • Why do I get only one node when selecting many nodes of a tree in DWCS4 on Mac OS

    I use the <mm:treecontrol> tag to create a tree in DWCS4 on Mac OS.
    When I select several nodes in the tree, selectedNodes returns only one node.
    The code that creates the tree is as follows:
    <mm:treecontrol name='tree' size='20' multiple noheaders>
         <mm:treecolumn state='hidden'>
              <mm:treenode value='A' state='expanded'></mm:treenode>
              <mm:treenode value='B' state='expanded'></mm:treenode>
              <mm:treenode value='C' state='expanded'></mm:treenode>
    </mm:treecontrol>
    Can anyone tell me the reason?
    Thanks!
    Note: if I don't use the <mm:treecolumn> tag, the tree does not show on Mac OS.

    Hi macbig,
    I finally got to look at my sister's computer. The HDD "Repair Disk" found missing threads, missing directory records, etc. and ended with:
    Error: Disk Utility can't repair this disk. Back up as many of your files as possible, reformat the disk, and restore your backed-up files.
    Then, I tried "Verify Disk" and it found invalid volume file count and ended with:
    The volume Macintosh HD was found corrupted and needs to be repaired.
    Error: This disk needs to be repaired. Click Repair Disk
    I guess running Apple Hardware Test is not going to happen. :/
    I've ordered a new 2.5-inch disk online and plan to make a Mavericks boot USB and start from scratch. Do you have any other suggestions?
    As for the corrupted old hard drive, do you have any suggestions for how to get the data off it?
    Thank you so much!

  • RAC-DATA FILE ACCESSING ISSUE FROM ONE NODE

    Dear All,
    We have a two-node RAC (10.2.0.3) running on HP-UX. Since yesterday, accessing data in a specific data file from one instance raises the error below, whereas accessing the same data file from the other node works properly.
    Errors in file /oracle/product/admin/tap3plus/bdump/tap3plus4_dbw0_24950.trc:
    ORA-01157: cannot identify/lock data file 75 - see DBWR trace file
    ORA-01110: data file 75: '/dev/vg_rac/rraw_tap3plus_temp_live05'
    ORA-27041: unable to open file
    HPUX-ia64 Error: 19: No such device
    Additional information: 2
    Tue Jan 31 08:52:09 2012
    Errors in file /oracle/product/admin/tap3plus/bdump/tap3plus4_dbw0_24950.trc:
    ORA-01186: file 75 failed verification tests
    ORA-01157: cannot identify/lock data file 75 - see DBWR trace file
    ORA-01110: data file 75: '/dev/vg_rac/rraw_tap3plus_temp_live05'
    Tue Jan 31 08:52:09 2012
    File 75 not verified due to error ORA-01157
    Tue Jan 31 08:52:09 2012
    Thanks in Advance

    user585870 wrote:
    We have a two-node RAC (10.2.0.3) running on HP-UX. Since yesterday, accessing data in a specific data file from one instance raises the error below, whereas accessing the same data file from the other node works properly.
    That would be due to some kind of failure in the shared storage layer.
    RAC needs the very same storage layer to be visible and available on each RAC node - thus this needs to be some form of shared cluster storage.
    Should a piece of it fail on one node, that node would not be able to access the RAC database files on that shared storage layer - and will throw the type of errors you are seeing.
    So what does this shared storage layer look like? Fibre channels (HBAs) connected to a Fibre Channel Switch and SAN - making SAN LUNs available as shared storage devices?
    Typically a shared storage failure would throw errors in the kernel log. This is because the error is not an Oracle error, but a kernel error. As it is in your case. The bottom error on the error stack points to the root cause:
    ORA-01157: cannot identify/lock data file 75 - see DBWR trace file
    ORA-01110: data file 75: '/dev/vg_rac/rraw_tap3plus_temp_live05'
    ORA-27041: unable to open file
    HPUX-ia64 Error: 19: No such device
    So HP-UX on that node is not seeing a specific shared storage device.
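Following the reasoning above, one quick check is to take the device paths named in the ORA-01110 lines and test whether each one is visible on the affected node. The following is a minimal sketch, assuming the alert log text is available as a string. The helper names `datafiles_in_error_stack` and `missing_devices` are hypothetical, and `os.path.exists` only shows whether the path is present, not whether the device behind it is healthy.

```python
import os
import re

# Matches lines such as: ORA-01110: data file 75: '/dev/vg_rac/...'
DATAFILE_RE = re.compile(r"ORA-01110: data file (\d+): '([^']+)'")


def datafiles_in_error_stack(text):
    """Return {file_number: path} for every ORA-01110 line in the text."""
    found = {}
    for match in DATAFILE_RE.finditer(text):
        found[int(match.group(1))] = match.group(2)
    return found


def missing_devices(paths):
    """Return the paths that are not visible on the node running this script."""
    return [p for p in paths if not os.path.exists(p)]


# Error stack copied from the post above.
stack = """\
ORA-01157: cannot identify/lock data file 75 - see DBWR trace file
ORA-01110: data file 75: '/dev/vg_rac/rraw_tap3plus_temp_live05'
ORA-27041: unable to open file
HPUX-ia64 Error: 19: No such device
"""

files = datafiles_in_error_stack(stack)
print(files)
print(missing_devices(files.values()))
```

Running the same script on both nodes and comparing the results would show whether the device is missing only on the failing node, which is what the HP-UX error suggests.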
