Cluster errors on 1 node of a RAC

Hello All,
I installed Oracle RAC 11.2.0.1.0 on Oracle Enterprise Linux 5.5 (32-bit).
The installation and the database creation went fine, and no errors were generated.
My RAC has two nodes (RAC1 and RAC2).
On RAC1 the instance is up and working, but on RAC2 I cannot start it, and I cannot even connect with sqlplus from RAC2.
I issued crsctl stat res -t on RAC1, and the output is below:
[root@rac1 ~]# crsctl stat res -t
NAME           TARGET STATE        SERVER                   STATE_DETAILS
Local Resources
ora.DATA.dg
               ONLINE ONLINE       rac1
ora.LISTENER.lsnr
               ONLINE OFFLINE      rac1
ora.asm
               ONLINE ONLINE       rac1
ora.eons
               ONLINE ONLINE       rac1
ora.gsd
               OFFLINE OFFLINE      rac1
ora.net1.network
               ONLINE ONLINE       rac1
ora.ons
               ONLINE OFFLINE      rac1
ora.registry.acfs
               ONLINE UNKNOWN      rac1                     CHECK TIMED OUT
Cluster Resources
ora.LISTENER_SCAN1.lsnr
      1        ONLINE ONLINE       rac1
ora.oc4j
      1        OFFLINE OFFLINE
ora.orcl.db
      1        ONLINE ONLINE       rac1
      2        ONLINE OFFLINE
ora.rac1.vip
      1        ONLINE ONLINE       rac1
ora.rac2.vip
      1        ONLINE OFFLINE
ora.scan1.vip
      1        ONLINE ONLINE       rac1
But on RAC2, the output is:
[root@rac2 ~]# crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
When I tried to restart CRS on RAC2, the output was:
[root@rac2 ~]# crsctl stop crs
CRS-2796: The command may not proceed when Cluster Ready Services is not running
CRS-4687: Shutdown command has completed with error(s).
CRS-4000: Command Stop failed, or completed with errors.
When I try to start it:
[root@rac2 ~]# crsctl start crs
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.
[root@rac2 ~]#
Please help. What should I do? I am new to RAC administration.
Regards,

Hi,
I applied these steps and the output is below. I am still not able to communicate with CRS:
[root@rac2 ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac2'
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'rac2'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac2'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'rac2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac2'
CRS-2677: Stop of 'ora.cssdmonitor' on 'rac2' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.drivers.acfs' on 'rac2' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac2'
CRS-2677: Stop of 'ora.cssd' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac2'
CRS-2673: Attempting to stop 'ora.diskmon' on 'rac2'
CRS-2677: Stop of 'ora.gpnpd' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac2'
CRS-2677: Stop of 'ora.diskmon' on 'rac2' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'rac2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac2' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@rac2 ~]# pgrep -l d.bin
[root@rac2 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
[root@rac2 ~]# crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
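
For readers triaging a listing like the one from RAC1 above, a small Python sketch can pull out the resources whose TARGET is ONLINE but whose STATE is not. This is an illustration only: the function name find_unhealthy and the sample text are assumptions, and the parser assumes the layout shown in the post (an unindented resource name followed by indented status lines).

```python
def find_unhealthy(output):
    """Return (resource, target, state, server) for rows where TARGET is ONLINE but STATE is not."""
    problems = []
    resource = None
    for line in output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented lines are resource names, headers, or section titles.
            resource = line.strip()
            continue
        fields = line.split()
        # Cluster resources carry a leading instance number; local resources do not.
        if fields and fields[0].isdigit():
            fields = fields[1:]
        if len(fields) < 2:
            continue
        target, state = fields[0], fields[1]
        server = fields[2] if len(fields) > 2 else ""
        if target == "ONLINE" and state != "ONLINE":
            problems.append((resource, target, state, server))
    return problems


# Excerpt of the RAC1 listing from the post.
SAMPLE = """\
ora.LISTENER.lsnr
               ONLINE OFFLINE      rac1
ora.asm
               ONLINE ONLINE       rac1
ora.gsd
               OFFLINE OFFLINE      rac1
ora.registry.acfs
               ONLINE UNKNOWN      rac1                     CHECK TIMED OUT
ora.orcl.db
      1        ONLINE ONLINE       rac1
      2        ONLINE OFFLINE
ora.rac2.vip
      1        ONLINE OFFLINE
"""

for row in find_unhealthy(SAMPLE):
    print(row)
```

Resources such as ora.gsd, whose target is OFFLINE, are deliberately not reported, since they are not expected to be running.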

Similar Messages

Error While Adding Node on RAC

Hi Friends,
Environment: Sun Solaris 10
Cluster version: 10.2.0.3
Database version: 10.2.0.3.0
Due to a hardware failure, one of our RAC nodes (prod1) was reformatted.
We deleted the node (prod1) from the RAC successfully,
following the link below:
http://download.oracle.com/docs/cd/B19306_01/rac.102/b14197/adddelunix.htm#BEIBBADI
But now that I am trying to add the node again, I am facing the issue below
while following "Adding an Oracle Clusterware Home to a New Node Using OUI in Interactive Mode":
1. Ensure that you have successfully installed Oracle Clusterware on at least one node in your cluster environment. To use these procedures as shown, your $CRS_HOME environment variable must identify your successfully installed Oracle Clusterware home.
2. Go to CRS_home/oui/bin and run the addNode.sh script.
3. The Oracle Universal Installer (OUI) displays the Node Selection Page on which you should select the node or nodes that you want to add and click Next.
4. Verify the entries that OUI displays on the Summary Page and click Next.
5. Run the rootaddNode.sh script from the CRS_home/install/ directory on the node from which you are running OUI.
6. Run the orainstRoot.sh script on the new node if OUI prompts you to do so.
When I perform step 6 on the new node, I get the error below:
bash-3.00# /software/oracle/oraInventory/orainstRoot.sh
Changing permissions of /software/oracle/oraInventory to 770.
Changing groupname of /software/oracle/oraInventory to oinstall.
The execution of the script is complete
bash-3.00# /software/oracle/product/crs/root.sh
WARNING: directory '/software/oracle/product' is not owned by root
WARNING: directory '/software/oracle' is not owned by root
"/opt/oracle/voting/voting1" does not exist. Create it before proceeding.
Make sure that this file is shared across cluster nodes.
Please share your views and suggest what to do in this situation.
Regards
Umesh
Edited by: Umesh Gupta on Aug 22, 2011 12:43 PM

Helios- Gunes EROL wrote:
Hi;
Please check your voting disk
See:
http://www.oracledba.org/11g/rac/11g_RAC_Admin_Utilities.html#Viewing_Votedisk_Information:
Regards
Helios
Here are the results:
$./crsctl query css votedisk
0.     0    /opt/oracle/voting/voting1
located 1 votedisk(s).
$ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     301852
         Used space (kbytes)      :       3016
         Available space (kbytes) :     298836
         ID                       :   66623280
         Device/File Name         : /opt/oracle/ocr/ocr1
                                    Device/File integrity check succeeded
                                    Device/File not configured
         Cluster registry integrity check succeeded
But from this I could not identify which raw disk is used for the voting disk and the OCR.
Regards
Umesh

Using dbca to extend RAC cluster error

Hi all,
I'm trying to extend my 11gR2 RAC cluster (POC) using the Oracle documentation (http://vishalgupta.com/oracle/docs/Database11.2/rac.112/e10718/adddelunix.htm). I've already cloned and extended Clusterware and ASM (Grid Infrastructure) to the new node, and cloned the RAC database software to the new node as well. When I run the statement below to have dbca add a new instance on the new node, I get the error shown:
CMD:
$ORACLE_HOME/bin/dbca -silent -addInstance -nodeList newnode13 -gdbName racdb -instanceName racdb4 -sysDBAUserName sys -sysDBAPassword manager123
ERROR:
cat racdb0.log
"Adding instance" operation on the admin managed database racdb requires instance configured on local node. There is no instance configured on the local node "newnode13".
I set ORACLE_HOME before running dbca, and I've also tried setting ORACLE_SID to both racdb4 and racdb, no change. My environment is below, any help is appreciated.
OS: SLES 11.1
Database: 11.2.0.1
Existing Nodes: node01,node02, node03
New Node: newnode13
DB Name: racdb
Instances: racdb1, racdb2, racdb3
New Instance: racdb4
Thanks.

Silly me, I was running the command from the new node instead of an existing node. I guess it was a rough weekend after all. Thanks all!

Change hostname on nodes in STANDBY RAC Cluster

Hi,
I would like to know the steps required to modify the hostname configuration in Oracle RAC if I change the hostnames of the two nodes in the RAC.
I'm running Red Hat EL 5 and Oracle 11g.
Thanks in advance

I don't think you can change the hostname under CRS without reinstalling CRS.

Error INS-20802 when installing Oracle RAC 11.2.0.3

I'm trying to install Oracle Grid Infrastructure on AIX 6.1. At the end of the process, when it runs the "Oracle Cluster Verification Utility" check, I get
the error INS-20802. Does anyone know what the problem is?
The only service that is not online is "gsd".
$ ./crs_stat -t
Name           Type           Target    State     Host
ora....ER.lsnr ora....er.type ONLINE    ONLINE    rac1
ora....N1.lsnr ora....er.type ONLINE    ONLINE    rac2
ora....N2.lsnr ora....er.type ONLINE    ONLINE    rac1
ora....N3.lsnr ora....er.type ONLINE    ONLINE    rac1
ora.OCRVTD.dg  ora....up.type ONLINE    ONLINE    rac1
ora.asm        ora.asm.type   ONLINE    ONLINE    rac1
ora.cvu        ora.cvu.type   ONLINE    ONLINE    rac1
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    rac1
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    rac1
ora.ons        ora.ons.type   ONLINE    ONLINE    rac1
ora....ry.acfs ora....fs.type ONLINE    ONLINE    rac1
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    rac2
ora.scan2.vip  ora....ip.type ONLINE    ONLINE    rac1
ora.scan3.vip  ora....ip.type ONLINE    ONLINE    rac1
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....37.lsnr application    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    OFFLINE   OFFLINE
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   ora....t1.type ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    ONLINE    rac2
ora....38.lsnr application    ONLINE    ONLINE    rac2
ora.rac2.gsd   application    OFFLINE   OFFLINE
ora.rac2.ons   application    ONLINE    ONLINE    rac2
ora.rac2.vip   ora....t1.type ONLINE    ONLINE    rac2
This is the output of cluvfy.
$ ./cluvfy stage -post crsinst -n rac1,rac2 -verbose
Performing post-checks for cluster services setup
Checking node reachability...
Check: Node reachability from node "rac1"
Destination Node Reachable?
rac1 yes
rac2 yes
Result: Node reachability check passed from node "rac1"
Checking user equivalence...
Check: User equivalence for user "grid"
Node Name Status
rac2 passed
rac1 passed
Result: User equivalence check passed for user "grid"
Checking node connectivity...
Checking hosts config file...
Node Name Status
rac2 passed
rac1 passed
Verification of the hosts config file successful
Interface information for node "rac2"
Name IP Address Subnet Gateway Def. Gateway HW Address MTU
en0 192.168.255.58 192.168.255.0 192.168.255.58 192.168.255.1 EA:3A:6B:80:0A:51 1500
en1 192.168.171.15 192.168.171.0 192.168.171.15 192.168.255.1 EA:3A:6B:80:0A:97 1500
Interface information for node "rac1"
Name IP Address Subnet Gateway Def. Gateway HW Address MTU
en0 192.168.255.57 192.168.255.0 192.168.255.57 192.168.255.1 62:7E:A7:ED:03:51 1500
en1 192.168.171.14 192.168.171.0 192.168.171.14 192.168.255.1 62:7E:A7:ED:03:97 1500
Check: Node connectivity for interface "en0"
Source Destination Connected?
rac2[192.168.255.58] rac2[192.168.255.58] yes
rac1[192.168.255.57] rac1[192.168.255.57] yes
rac1[192.168.255.57] rac1[192.168.255.57] yes
Result: Node connectivity passed for interface "en0"
Check: TCP connectivity of subnet "192.168.255.0"
Source Destination Connected?
rac1:192.168.255.57 rac2:192.168.255.58 passed
Result: TCP connectivity check passed for subnet "192.168.255.0"
Check: Node connectivity for interface "en1"
Source Destination Connected?
rac2[192.168.171.15] rac2[192.168.171.15] yes
rac1[192.168.171.14] rac1[192.168.171.14] yes
Result: Node connectivity passed for interface "en1"
Check: TCP connectivity of subnet "192.168.171.0"
Source Destination Connected?
rac1:192.168.171.14 rac2:192.168.171.15 passed
Result: TCP connectivity check passed for subnet "192.168.171.0"
Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "192.168.255.0".
Subnet mask consistency check passed for subnet "192.168.171.0".
Subnet mask consistency check passed.
Result: Node connectivity check passed
Checking multicast communication...
Checking subnet "192.168.255.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.255.0" for multicast communication with multicast group "230.0.1.0" passed.
Checking subnet "192.168.171.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.171.0" for multicast communication with multicast group "230.0.1.0" passed.
Check of multicast communication passed.
Check: Time zone consistency
Result: Time zone consistency check passed
Checking Oracle Cluster Voting Disk configuration...
ASM Running check passed. ASM is running on all specified nodes
Oracle Cluster Voting Disk configuration check passed
Checking Cluster manager integrity...
Checking CSS daemon...
Node Name Status
rac2 running
rac1 running
Oracle Cluster Synchronization Services appear to be online.
Cluster manager integrity check passed
Check default user file creation mask
Node Name Available Required Comment
rac2 022 0022 passed
rac1 022 0022 passed
Result: Default user file creation mask check passed
Checking cluster integrity...
Node Name
rac1
rac2
Cluster integrity check passed
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations
ASM Running check passed. ASM is running on all specified nodes
Checking OCR config file "/etc/oracle/ocr.loc"...
OCR config file "/etc/oracle/ocr.loc" check successful
Disk group for ocr location "+OCRVTD" available on all the nodes
NOTE:
This check does not verify the integrity of the OCR contents. Execute 'ocrcheck' as a privileged user to verify the contents of OCR.
OCR integrity check passed
Checking CRS integrity...
Clusterware version consistency passed
The Oracle Clusterware is healthy on node "rac2"
The Oracle Clusterware is healthy on node "rac1"
CRS integrity check passed
Checking node application existence...
Checking existence of VIP node application (required)
Node Name Required Running? Comment
rac2 yes yes passed
rac1 yes yes passed
VIP node application check passed
Checking existence of NETWORK node application (required)
Node Name Required Running? Comment
rac2 yes yes passed
rac1 yes yes passed
NETWORK node application check passed
Checking existence of GSD node application (optional)
Node Name Required Running? Comment
rac2 no no exists
rac1 no no exists
GSD node application is offline on nodes "rac2,rac1"
Checking existence of ONS node application (optional)
Node Name Required Running? Comment
rac2 no yes passed
rac1 no yes passed
ONS node application check passed
Checking Single Client Access Name (SCAN)...
SCAN Name Node Running? ListenerName Port Running?
txora11gr202-scan rac2 true LISTENER_SCAN1 1521 true
txora11gr202-scan rac1 true LISTENER_SCAN2 1521 true
txora11gr202-scan rac1 true LISTENER_SCAN3 1521 true
Checking TCP connectivity to SCAN Listeners...
Node ListenerName TCP connectivity?
rac1 LISTENER_SCAN1 yes
rac1 LISTENER_SCAN2 yes
rac1 LISTENER_SCAN3 yes
TCP connectivity to SCAN Listeners exists on all cluster nodes
Checking name resolution setup for "txora11gr202-scan"...
SCAN Name IP Address Status Comment
txora11gr202-scan 192.168.255.62 passed
txora11gr202-scan 192.168.255.61 passed
txora11gr202-scan 192.168.255.63 passed
Verification of SCAN VIP and Listener setup passed
Checking OLR integrity...
Checking OLR config file...
OLR config file check successful
Checking OLR file attributes...
OLR file check successful
WARNING:
This check does not verify the integrity of the OLR contents. Execute 'ocrcheck -local' as a privileged user to verify the contents of OLR.
OLR integrity check passed
OCR detected on ASM. Running ACFS Integrity checks...
Starting check to see if ASM is running on all cluster nodes...
ASM Running check passed. ASM is running on all specified nodes
Starting Disk Groups check to see if at least one Disk Group configured...
Disk Group Check passed. At least one Disk Group configured
Task ACFS Integrity check passed
Checking to make sure user "grid" is not in "system" group
Node Name Status Comment
rac2 failed exists
rac1 failed exists
Result: User "grid" is part of group "system". Check failed
Checking if Clusterware is installed on all nodes...
Check of Clusterware install passed
Checking if CTSS Resource is running on all nodes...
Check: CTSS Resource running on all nodes
Node Name Status
rac2 passed
rac1 passed
Result: CTSS resource check passed
Querying CTSS for time offset on all nodes...
Result: Query of CTSS for time offset passed
Check CTSS state started...
Check: CTSS state
Node Name State
rac2 Observer
rac1 Observer
CTSS is in Observer state. Switching over to clock synchronization checks using NTP
Starting Clock synchronization checks using Network Time Protocol(NTP)...
NTP Configuration file check started...
The NTP configuration file "/etc/ntp.conf" is available on all nodes
NTP Configuration file check passed
Checking daemon liveness...
Check: Liveness for "xntpd"
Node Name Running?
rac2 yes
rac1 yes
Result: Liveness check passed for "xntpd"
Check for NTP daemon or service alive passed on all nodes
Checking NTP daemon command line for slewing option "-x"
Check: NTP daemon command line
Node Name Slewing Option Set?
rac2 yes
rac1 yes
Result:
NTP daemon slewing option check passed
Checking NTP daemon's boot time configuration, in file "/etc/rc.tcpip", for slewing option "-x"
Check: NTP daemon's boot time configuration
Node Name Slewing Option Set?
rac2 yes
rac1 yes
Result:
NTP daemon's boot time configuration check for slewing option passed
Checking whether NTP daemon or service is using UDP port 123 on all nodes
Check for NTP daemon or service using UDP port 123
Node Name Port Open?
rac2 yes
rac1 yes
Result: Clock synchronization check using Network Time Protocol(NTP) passed
Oracle Cluster Time Synchronization Services check passed
Checking VIP configuration.
Checking VIP Subnet configuration.
Check for VIP Subnet configuration passed.
Checking VIP reachability
Check for VIP reachability passed.
Post-check for cluster services setup was unsuccessful on all the nodes.
Thanks

Hi;
I suggest you close your issue here as answered, then move it to Forum Home » High Availability » RAC, ASM & Clusterware Installation, which is the dedicated RAC forum.
Regards
Helios
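
The verbose cluvfy listing in this thread is long, and the failing check is easy to miss. A hedged Python sketch like the following reduces such a log to the lines that report a failure. The function name failed_checks and the keyword pattern are assumptions chosen to match the wording of this particular log, not an official cluvfy format.

```python
import re

# Phrases that mark a failure in the log quoted above (an assumption, not a cluvfy specification).
FAILURE_PATTERN = re.compile(r"\bfailed\b|\bunsuccessful\b", re.IGNORECASE)


def failed_checks(log_text):
    """Return (line_number, line) pairs for lines that report a failure."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if FAILURE_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits


# Excerpt copied from the cluvfy output in the post.
SAMPLE = """\
Task ACFS Integrity check passed
Checking to make sure user "grid" is not in "system" group
Node Name Status Comment
rac2 failed exists
rac1 failed exists
Result: User "grid" is part of group "system". Check failed
Checking if Clusterware is installed on all nodes...
Check of Clusterware install passed
Post-check for cluster services setup was unsuccessful on all the nodes.
"""

for number, line in failed_checks(SAMPLE):
    print(number, line)
```

In the log posted here, the hits point at the check that user "grid" must not belong to group "system", which appears to be the only check reported as failed; the offline GSD application is marked optional.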

Private Interconnect: Should any nodes other than RAC nodes have one?

The contractors that set up our four-node production 10g RAC (and a standalone development server) also assigned private interconnect addresses to 2 Apache/ApEx servers and a standalone development database server.
There are service names in the tnsnames.ora on all servers in our infrastructure referencing these private interconnects, even on the servers that are not RAC members. The NICs on these servers are not bound for failover with the NICs bound to the public/VIP addresses. These NICs are isolated on their own switch.
Could this configuration be related to lost heartbeats or voting disk errors? We experience rac node expulsions and even arbitrary bounces (reboots!) of all the rac nodes.

I do not have access to the contractors. . . .can only look at what they have left behind and try to figure out their intention. . .
I am reading the Ault/Tumma book Oracle 10g Grid and Real Application Clusters, looking through our own settings and config files, and learning srvctl and crsctl commands from their examples. I am also googling and searching OTN through the library full of documentation. . .
I have yet to figure out whether the private interconnect mentioned so frequently in cluster configuration documents is the binding to the node.vip addresses specified in the tnsnames.ora (bound to the first eth adapter along with the public IP addresses for the nodes), or the binding on the second eth adapter to the node.prv addresses, which are not found in the local pfile, the tnsnames.ora, or the listener.ora (but are found at the operating system level in ifconfig). If the node.prv addresses are not the private interconnect, can anyone tell me what they are for?

OUI-25031 Cluster Configuration Assistant Fails Windows 2003 server RAC

Hi All
I am installing RAC on Windows 2003 Server X64, Dell PowerEdge 2950 Servers with iSCSI Storage.
I have 3 Network cards on each node which are configured like this
# Public
10.2.0.8 USTISDB-1.USTIS.Brgpoint.com USTISDB-1 --- in DNS
10.2.0.9 USTISDB-2.USTIS.Brgpoint.com USTISDB-2
#Virtual
10.2.0.10 USTISDB-1-vip.USTIS.Brgpoint.com USTISDB-1-vip --- in DNS
10.2.0.11 USTISDB-2-vip.USTIS.Brgpoint.com USTISDB-2-vip
#Private
192.168.2.2 USTISDB-1-priv.USTIS.Brgpoint.com USTISDB-1-priv --- in Hosts file
192.168.2.3 USTISDB-2-priv.USTIS.Brgpoint.com USTISDB-2-priv
#iSCSI Storage
192.168.1.2 Storage --- in DNS
Everything Pings everything.
When I start the installation, it installs Clusterware and completes the remote operations, but it fails at the cluster configuration window.
After checking the logs I found the list of failed commands and it says
INFO: Command = C:\WINDOWS\system32\cmd /c call E:\oracle\product\10.2.0\crs/install/crssetup.config.bat
PROT-1: Failed to initialize ocrconfig
Step 1: checking status of CRS cluster
Step 2: creating directories (E:\oracle\product\10.2.0\crs)
Step 3: configuring OCR repository
ocr upgrade failed with (-1)
Execution of the plugin was aborted
INFO: Configuration assistant "Oracle Clusterware Configuration Assistant" was canceled.
Then I checked E:\oracle\product\10.2.0\crs\cfgtoollogs/configToolFailedCommands, and it says
rem Copyright (c) 1999, 2005, Oracle. All rights reserved.
C:\WINDOWS\system32\cmd /c call E:\oracle\product\10.2.0\crs/install/crssetup.config.bat
E:\oracle\product\10.2.0\crs/bin/racgons.exe add_config USTISDB-1.USTIS.Brgpoint.com:6200 USTISDB-2.USTIS.Brgpoint.com:6200
E:\oracle\product\10.2.0\crs/bin/oifcfg.exe setif -global "public"/10.0.0.0:public "private"/192.168.2.0:cluster_interconnect
C:\WINDOWS\system32\cmd /c call E:\oracle\product\10.2.0\crs/bin/vipca.bat -silent -nodelist "USTISDB-1,USTISDB-2" -nodevips "USTISDB-1//255.0.0.0/public,USTISDB-2//255.0.0.0/public"
C:\WINDOWS\system32\cmd /c call E:\oracle\product\10.2.0\crs/bin/cluvfy.bat stage -post crsinst -n "USTISDB-1,USTISDB-2"
Please help. Any help will be highly appreciated.
Edited by: user651560 on Nov 13, 2008 3:24 PM

E:\Staging\102010_win64_x64_clusterware\clusterware\cluvfy>runcluvfy stage -pre crsinst -n ustisdb-1,ustisdb-2 -verbose
The system cannot find the file specified.
Performing pre-checks for cluster services setup
Checking node reachability...
Check: Node reachability from node "USTISDB-1"
Destination Node Reachable?
ustisdb-1 yes
ustisdb-2 yes
Result: Node reachability check passed from node "USTISDB-1".
Checking user equivalence...
Check: User equivalence for user "Administrator"
Node Name Comment
ustisdb-2 passed
ustisdb-1 passed
Result: User equivalence check passed for user "Administrator".
Checking administrative privileges...
Checking node connectivity...
Interface information for node "ustisdb-2"
Interface Name IP Address Subnet
public 10.2.0.9 10.0.0.0
private 192.168.2.3 192.168.2.0
san 192.168.1.6 192.168.1.0
Interface information for node "ustisdb-1"
Interface Name IP Address Subnet
public 10.2.0.8 10.0.0.0
private 192.168.2.2 192.168.2.0
san 192.168.1.5 192.168.1.0
Check: Node connectivity of subnet "10.0.0.0"
Source Destination Connected?
ustisdb-2:public ustisdb-1:public yes
Result: Node connectivity check passed for subnet "10.0.0.0" with node(s) ustisdb-2,ustisdb-1.
Check: Node connectivity of subnet "192.168.2.0"
Source Destination Connected?
ustisdb-2:private ustisdb-1:private yes
Result: Node connectivity check passed for subnet "192.168.2.0" with node(s) ustisdb-2,ustisdb-1.
Check: Node connectivity of subnet "192.168.1.0"
Source Destination Connected?
ustisdb-2:san ustisdb-1:san yes
Result: Node connectivity check passed for subnet "192.168.1.0" with node(s) ustisdb-2,ustisdb-1.
Suitable interfaces for the private interconnect on subnet "10.0.0.0":
ustisdb-2 public:10.2.0.9
ustisdb-1 public:10.2.0.8
Suitable interfaces for the private interconnect on subnet "192.168.2.0":
ustisdb-2 private:192.168.2.3
ustisdb-1 private:192.168.2.2
Suitable interfaces for the private interconnect on subnet "192.168.1.0":
ustisdb-2 san:192.168.1.6
ustisdb-1 san:192.168.1.5
ERROR:
Could not find a suitable set of interfaces for VIPs.
Result: Node connectivity check failed.
Checking system requirements for 'crs'...
Check: Operating system version
Node Name Available Required Comment
ustisdb-2 Windows Server 2003 Windows Server 2003 passed
ustisdb-1 Windows Server 2003 Windows Server 2003 passed
Result: Operating system version check passed.
Check: Total memory
Node Name Available Required Comment
ustisdb-2 15.99GB (16771724KB) 512MB (524288KB) passed
ustisdb-1 15.99GB (16771724KB) 512MB (524288KB) passed
Result: Total memory check passed.
Check: Swap space
Node Name Available Required Comment
ustisdb-2 17.4GB (18241648KB) 1GB (1048576KB) passed
ustisdb-1 17.4GB (18241648KB) 1GB (1048576KB) passed
Result: Swap space check passed.
Check: System architecture
Node Name Available Required Comment
ustisdb-2 64-bit 64-bit passed
ustisdb-1 64-bit 64-bit passed
Result: System architecture check passed.
Check: Free disk space in "C:\DOCUME~1\ADMINI~1.UST\LOCALS~1\Temp\1" dir
Node Name Available Required Comment
ustisdb-2 23.7GB (24848248KB) 400MB (409600KB) passed
ustisdb-1 23.65GB (24797157KB) 400MB (409600KB) passed
Result: Free disk space check passed.
System requirement passed for 'crs'
Pre-check for cluster services setup was unsuccessful on all the nodes.
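
The only error in the pre-check above is "Could not find a suitable set of interfaces for VIPs." A commonly reported explanation for this message with the 10.2 cluvfy is that it treats private address ranges (10.x, 172.16.x to 172.31.x, 192.168.x) as unsuitable for VIPs. Nothing in this thread confirms that, so treat it as an assumption. The sketch below only shows, using Python's standard ipaddress module, that every address listed in the post falls in a private range; the labels and the function name private_addresses are made up for illustration.

```python
import ipaddress

# Addresses copied from the post; the labels are only for printing.
ADDRESSES = {
    "USTISDB-1 public": "10.2.0.8",
    "USTISDB-2 public": "10.2.0.9",
    "USTISDB-1 vip": "10.2.0.10",
    "USTISDB-2 vip": "10.2.0.11",
    "USTISDB-1 priv": "192.168.2.2",
    "USTISDB-2 priv": "192.168.2.3",
}


def private_addresses(addresses):
    """Return the labels whose address lies in a private (non-routable) range."""
    return [
        label
        for label, ip in addresses.items()
        if ipaddress.ip_address(ip).is_private
    ]


print(private_addresses(ADDRESSES))
```

If that explanation applies here, the failed connectivity check would be a side effect of the address plan rather than a sign of a broken network, which fits the observation that everything pings everything.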

Guest Cluster error in Hyper-V Cluster

Hello everybody,
In my environment I have an issue with failover clusters (Exchange, file server) when performing a live migration of one virtual cluster node: the cluster group goes offline.
The environment is the following:
2x Hyper-V Clusters: Hyper-V-Cluster1 and Hyper-V-Cluster2 (Windows Server 2012 R2) with 5 Nodes per Cluster
1x Scaleout Fileserver (Windows Server 2012 R2) with 2 Nodes
1x Exchange Cluster (Windows Server 2012 R2) with EX01 VM running on Hyper-V-Cluster1 and EX02 VM running on Hyper-V-Cluster2
1x Fileserver Failover Cluster (Windows Server 2012 R2) with FS01 VM running on Hyper-V-Cluster1 and FS02 VM running on Hyper-V-Cluster2
The physical networks on the Hyper-V Nodes are redundant with 2x 10Gb/s uplinks to 2x physical switches for VMs in a LBFO Team:
New-NetLbfoTeam -Name 10Gbit_TEAM -TeamMembers 10Gbit_01,10Gbit_02 -TeamingMode SwitchIndependent -LoadBalancingAlgorithm HyperVPort
The SMB 3 traffic runs on 2x 10Gb/s NIC without NIC-Teaming (SMB-Multichannel).
SMB is used for livemigrations.
The VMs for clustering were installed according to the technet guideline:
http://technet.microsoft.com/en-us/library/dn265980.aspx
Because my Hyper-V uplinks are already redundant, I am using one NIC inside the VM.
As I understand it, there is no advantage in using two NICs inside the VM as long as they are connected to the same vSwitch.
Now, when I want to perform hardware maintenance, I have to live-migrate the EX01 VM from Hyper-V-Cluster1-Node-1 to Hyper-V-Cluster1-Node-2.
The EX02 VM keeps running untouched on Hyper-V-Cluster2-Node-1.
At the end of the live migration I see error 1135 (source: FailoverClustering) on the EX01 VM, which says that the EX02 VM was removed from the failover cluster and that I should check my network.
The Exchange cluster group is offline after that event, and I have to bring it online again manually.
Any ideas what could cause this behavior?
Thanks.
Greetings,
torsten

Hello again,
I found the cause and the solution :-)
In the article here: http://technet.microsoft.com/en-us/library/dn440540.aspx
is the description of my cluster failure:
########## relevant part from article #######################
Protect against short-term network interruptions
Failover cluster nodes use the network to send heartbeat packets to other nodes of the cluster. If a node does not receive a response from another node for a specified period of time, the cluster removes the node from cluster membership. By default, a guest cluster node is considered down if it does not respond within 5 seconds. Other nodes that are members of the cluster will take over any clustered roles that were running on the removed node.
Typically, during the live migration of a virtual machine there is a fast final transition when the virtual machine is stopped on the source node and is running on the destination node. However, if something causes the final transition to take longer than the configured heartbeat threshold settings, the guest cluster considers the node to be down even though the live migration eventually succeeds. If the live migration final transition is completed within the TCP time-out interval (typically around 20 seconds), clients that are connected through the network to the virtual machine seamlessly reconnect.
To make the cluster heartbeat time-out more consistent with the TCP time-out interval, you can change the SameSubnetThreshold and CrossSubnetThreshold cluster properties from the default of 5 seconds to 20 seconds. By default, the cluster sends a heartbeat every 1 second. The threshold specifies how many heartbeats to miss in succession before the cluster considers the cluster node to be down.
After changing both parameters in the failover cluster as described, the error is gone.
Greetings,
torsten
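
The arithmetic behind the quoted article is simple: the time before a silent node is removed is the heartbeat interval multiplied by the number of missed heartbeats allowed. A minimal Python sketch, with the function name heartbeat_timeout_seconds and its parameter names made up for illustration:

```python
def heartbeat_timeout_seconds(interval_seconds, threshold):
    """Seconds of silence tolerated before the cluster drops a node."""
    return interval_seconds * threshold


# Values from the quoted article: one heartbeat per second,
# threshold raised from 5 to 20 missed heartbeats.
default_timeout = heartbeat_timeout_seconds(1, 5)
raised_timeout = heartbeat_timeout_seconds(1, 20)
print(default_timeout, raised_timeout)  # 5 20
```

With the raised threshold the cluster tolerates roughly as long a pause as the 20-second TCP time-out mentioned in the article, so a slow final live-migration transition no longer looks like a dead node.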

DBCA does not show the remote nodes on a RAC 10.2.0.2 HP-ITANIUM

Hi all,
I have a problem and I hope someone can help me.
A colleague has installed CRS and the Oracle home for a RAC 10.2.0.2 on HP Itanium (two nodes).
I'm using dbca to create a cluster database.
I can see the RAC option pages, but on the "select nodes" page I find only the local node and not the remote node.
Any suggestion is appreciated.
thank you
Adriano
$ crs_stat
NAME=ora.itmicz50.LISTENER_ITMICZ50.lsnr
TYPE=application
TARGET=ONLINE
STATE=ONLINE on itmicz50
NAME=ora.itmicz50.gsd
TYPE=application
TARGET=ONLINE
STATE=ONLINE on itmicz50
NAME=ora.itmicz50.ons
TYPE=application
TARGET=ONLINE
STATE=ONLINE on itmicz50
NAME=ora.itmicz50.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE on itmicz50
NAME=ora.itmicz51.LISTENER_ITMICZ51.lsnr
TYPE=application
TARGET=ONLINE
STATE=ONLINE on itmicz51
NAME=ora.itmicz51.gsd
TYPE=application
TARGET=ONLINE
STATE=ONLINE on itmicz51
NAME=ora.itmicz51.ons
TYPE=application
TARGET=ONLINE
STATE=ONLINE on itmicz51
NAME=ora.itmicz51.vip
TYPE=application
TARGET=ONLINE
STATE=ONLINE on itmicz51
and the olsnodes command :
$ olsnodes
itmicz50
itmicz51
Message was edited by:
user549224

Hi Chandra,
Thank you very much for your reply. I suppose the oraInventory is the same; here are the contents of the inventory.xml file:
<?xml version="1.0" standalone="yes" ?>


<INVENTORY>
<VERSION_INFO>
<SAVED_WITH>10.2.0.1.0</SAVED_WITH>
<MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="OUIHome1" LOC="/work/app/oracle/product/CRS" TYPE="O" IDX="1" CRS="true">
<NODE_LIST>
<NODE NAME="itmicz50"/>
<NODE NAME="itmicz51"/>
</NODE_LIST>
</HOME>
<HOME NAME="OUIHome2" LOC="/work/app/oracle/product/RAC10g" TYPE="O" IDX="2">
<NODE_LIST>
<NODE NAME="itmicz50"/> <===== EVIDENCE
</NODE_LIST>
</HOME>
</HOME_LIST>
</INVENTORY>
I can see that the NODE_LIST tag for the RAC home contains only one hostname. Is that correct?
thank you in advance
Adriano Capruzzi
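
To compare node lists across homes without reading the XML by eye, something like the following Python sketch could be used. It relies only on the standard xml.etree.ElementTree module. The function name nodes_per_home is an assumption, and the sample is the inventory from the post with the "<===== EVIDENCE" annotation and the header elements left out so that it parses.

```python
import xml.etree.ElementTree as ET


def nodes_per_home(inventory_xml):
    """Map each HOME's LOC attribute to the list of node names registered for it."""
    root = ET.fromstring(inventory_xml)
    result = {}
    for home in root.iter("HOME"):
        result[home.get("LOC")] = [node.get("NAME") for node in home.iter("NODE")]
    return result


# Inventory from the post, trimmed to the HOME_LIST section.
SAMPLE = """<INVENTORY>
<HOME_LIST>
<HOME NAME="OUIHome1" LOC="/work/app/oracle/product/CRS" TYPE="O" IDX="1" CRS="true">
<NODE_LIST>
<NODE NAME="itmicz50"/>
<NODE NAME="itmicz51"/>
</NODE_LIST>
</HOME>
<HOME NAME="OUIHome2" LOC="/work/app/oracle/product/RAC10g" TYPE="O" IDX="2">
<NODE_LIST>
<NODE NAME="itmicz50"/>
</NODE_LIST>
</HOME>
</HOME_LIST>
</INVENTORY>"""

for loc, nodes in nodes_per_home(SAMPLE).items():
    print(loc, nodes)
```

For the inventory in the post, this shows the CRS home registered on both nodes and the RAC10g home registered only on itmicz50, which matches what the poster spotted.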

Will the Application Scope be shared across the cluster in a multi-node OC4

Hi,
I have the following requirement:
Users of the application can have only a single (browser) session. When a user who already has a session connects again, he should no longer be allowed to access the older session.
My proposed implementation is:
- After a successful login (possibly using a session listener), an entry is made in a HashMap, UserSessions, that lives in the application scope. The key is the username; the value is the session id (HttpSession.getId()).
- For every request, a ServletFilter checks whether the session is still in the UserSessions HashMap for the current user. If a new session has been created for the same user, the id of that new session is in the UserSessions map and the filter will not find the current session. In that case, the filter should invalidate the session and forward the user to an error page.
However, the application will run on a multi-node OC4J cluster. I am starting to wonder:
Will the Application Scope be shared across the cluster in a multi-node OC4J environment?
I know session state can be shared. But what about application state/scope?
Does anyone know? Do I have to do anything special in the cluster or the application to get this to work?
Thanks for your help.
Lucas

gday Lucas --
Application scope is not replicated across JVM boundaries with OC4J.
I'm sure this used to be described in the doc, but I can't find it now from a quick scan.
If you wanted to use this type of pattern, you could look at using a Coherence cache as a distribution mechanism to share objects across multiple JVMs/nodes.
-steve-
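
To make the consequence concrete, here is a small sketch of the UserSessions idea, written in Python rather than Java purely for brevity. SessionRegistry is a made-up stand-in for a map held in the application scope of one JVM; it is not an OC4J or servlet API. Two separate registries model two nodes that do not share application scope.

```python
class SessionRegistry:
    """Hypothetical stand-in for a username -> session id map in ONE JVM."""

    def __init__(self):
        self._latest = {}

    def register_login(self, username, session_id):
        # A new login replaces any session id stored earlier for this user.
        self._latest[username] = session_id

    def is_current(self, username, session_id):
        # The check the filter would run on every request.
        return self._latest.get(username) == session_id


def single_node_demo():
    """Both logins hit the same JVM: the older session is rejected."""
    node = SessionRegistry()
    node.register_login("lucas", "s1")
    node.register_login("lucas", "s2")
    return node.is_current("lucas", "s1"), node.is_current("lucas", "s2")


def two_node_demo():
    """Logins land on different JVMs: each node only knows its own entry."""
    node_a, node_b = SessionRegistry(), SessionRegistry()
    node_a.register_login("lucas", "s1")
    node_b.register_login("lucas", "s2")  # node_a never hears about this
    return node_a.is_current("lucas", "s1"), node_b.is_current("lucas", "s2")
```

On a single node the first session stops being current as soon as the second login is registered. With two unshared maps, both sessions keep passing the check, which is why the map would have to live in something that is actually distributed.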

SAP 4.6C Installation in Cluster error - Oracle 10g HPUX 11.23

Hi all,
We are installing SAP 4.6C on Oracle 10g / HP-UX 11.23 in a cluster environment, on the primary node.
During the last phase of the installation, I get the error below.
INFO 2008-05-05 10:53:02
    Starting up the SAP System
INFO 2008-05-05 10:53:02 DBR3START_IND_ORA SyCoprocessCreateAsUser:300
    Creating coprocess /bin/sh /home/r3padm/startsap_dbciR3Pd_00 as
    user r3padm and group sapsys ...
INFO 2008-05-05 10:53:02 DBR3START_IND_ORA SyGroupIDGet:100
    Group id for group sapsys is 201.
INFO 2008-05-05 10:53:02 DBR3START_IND_ORA SyUserIDGet:300
    User id for user r3padm is 7029.
INFO 2008-05-05 10:53:02 DBR3START_IND_ORA ExecuteDo:0
    RC code form SyCoprocessWait = 127 .
ERROR 2008-05-05 10:53:02 DBR3START_IND_ORA ExecuteCheck:0
    Exit code from /bin/sh: 127.
ERROR 2008-05-05 10:53:02 DBR3START_IND_ORA InternalInstallationDo:0
    Phase failed.
ERROR 2008-05-05 10:53:02 DBR3START_IND_ORA InstallationDo:0
    Phase failed.
ERROR 2008-05-05 10:53:02 InstController Action:0
    Step DBR3START_IND_ORA could not be performed.
ERROR 2008-05-05 10:53:03 Main
    Installation failed.
ERROR 2008-05-05 10:53:03 Main
    Installation aborted.
However, I am able to start SAP manually using startsap, and that is successful.
Can someone help me understand what the above error means?
Thanks & Regards
Senthil
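
A note on the exit code: 127 is the status a POSIX shell conventionally returns when it cannot find the command it was asked to run. It may therefore be worth checking that the generated script /home/r3padm/startsap_dbciR3Pd_00 exists and that every command it calls is on the PATH of r3padm. That is a guess from the exit code alone, not a confirmed diagnosis. The sketch below only reproduces the status with a deliberately bogus command name; sh_exit_code and the command name are made up for illustration.

```python
import subprocess


def sh_exit_code(command):
    """Run `command` through /bin/sh -c and return the shell's exit status."""
    completed = subprocess.run(
        ["/bin/sh", "-c", command],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode


if __name__ == "__main__":
    # A command that exists and succeeds.
    print(sh_exit_code("exit 0"))
    # A command name that should not exist on any system (hypothetical name).
    print(sh_exit_code("no_such_command_for_this_demo_12345"))
```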

Markus,
I read the note. I have also installed 4.6C / Oracle 10g / HP-UX 11.23 on more than 4 servers, and those installations were successful.
But this time I am installing in a cluster environment, and that is where I am facing this issue. When I checked the install directory with ls -lrt, the last file was database.log:
INFO 2008-05-06 16:02:43 RFCRSWBOINI_IND_IND CRfcOpen:0
    RfcOpen() was successful.
INFO 2008-05-06 16:02:44 RFCRSWBOINI_IND_IND CRfcPing:0
    Pinging of RFC destination was successful.
INFO 2008-05-06 16:02:45 RFCRSWBOINI_IND_IND GetSaprelease:0
    46C
INFO 2008-05-06 16:02:46 RFCRSWBOINI_IND_IND ReadInstvers:0
    INSTVERS loaded successfully.
INFO 2008-05-06 16:02:48 RFCRSWBOINI_IND_IND ReadInstvers:0
    INSTVERS loaded successfully.
INFO 2008-05-06 16:02:49 RFCRSWBOINI_IND_IND DeleteRow:0
    Deleted: 1 rows
INFO 2008-05-06 16:02:50 RFCRSWBOINI_IND_IND ReadInstvers:0
    INSTVERS loaded successfully.
INFO 2008-05-06 16:02:50 RFCRSWBOINI_IND_IND CleanupSAPEntries:0
    Deletion of 0 SAP rows
INFO 2008-05-06 16:02:52 RFCRSWBOINI_IND_IND ReadInstvers:0
    INSTVERS loaded successfully.
ERROR 2008-05-06 16:02:53 RFCRSWBOINI_IND_IND StartJob:0
    Job RSWBOINS_JOB could not be started.
ERROR 2008-05-06 16:02:53 RFCRSWBOINI_IND_IND InstallationDo:0
    Phase failed.
ERROR 2008-05-06 16:02:53 InstController Action:0
    Step RFCRSWBOINI_IND_IND could not be performed.
ERROR 2008-05-06 16:02:53 Main
    Installation failed.
ERROR 2008-05-06 16:02:53 Main
    Installation aborted.
dcecpn0:root >
Also, when I log in to SAP, I can see some error messages in SM21:
18:18:23 MS                           Q0U Client dbciR3Pd_R3P_00 is not known to the message server
18:18:23 MS                           Q0U Client dbciR3Pd_R3P_00 is not known to the message server
18:18:23 DIA 02 000 SAPSYS            BY4 Database error 2289 at INS access to table DDLOG
18:18:23 DIA 02 000 SAPSYS            BY0 > ORA-02289: sequence does not exist#
18:18:23 MS                           Q0U Client dbciR3Pd_R3P_00 is not known to the message server
18:18:23 MS                           Q0U Client dbciR3Pd_R3P_00 is not known to the message server
18:18:23 MS                           Q0U Client dbciR3Pd_R3P_00 is not known to the message server
I am able to display SCC4, but when I switch to change mode I get the error message below:
"System error: Unable to lock table/view T000"
Can you help?
Thanks
Senthil

Cluster Error SAP ECC EHP5 using DB2

Hi, I'm installing SAP ECC EHP5 on Windows Server 2008 R2 in MSCS, using DB2 as the database.
I followed the steps in the installation guide, and the first node was installed successfully. My problem is in the second step of the installation (database installation), in the sub-step "Now cluster the database": after installing the database on the second node and running the db2mscs utility, the following error occurs when I try to initialize the DB2 service:
"An error occurred while attempting to bring the resource 'DB2 Server' online"
Error code: 0x8007138f
The cluster resource could not be found
Could you please give any hint to continue the installation?
Thanks a lot.
Kind regards
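
The error code can be split into its standard HRESULT fields, which at least confirms where the message comes from. A minimal sketch in Python (decode_hresult is a made-up helper name): facility 7 is the Win32 facility, and the low 16 bits give Win32 error 5007, whose system text is the "cluster resource could not be found" message quoted above. This only decodes the number; it does not say which resource the cluster service failed to find.

```python
def decode_hresult(value):
    """Split a 32-bit HRESULT into (failure flag, facility, code)."""
    failure = bool((value >> 31) & 0x1)   # bit 31: severity (1 = failure)
    facility = (value >> 16) & 0x7FF      # bits 16-26: facility
    code = value & 0xFFFF                 # bits 0-15: facility-specific code
    return failure, facility, code


if __name__ == "__main__":
    failure, facility, code = decode_hresult(0x8007138F)
    print(failure, facility, code)
```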

I made a mistake and opened this thread twice. The other one was answered:
Cluster Error SAP ECC EHP5 using DB2
Edited by: Esteban REyes on Oct 18, 2011 10:32 AM

Web-dynpro application -ERROR: ICF service node "/sap/bc/webdynpro/sap/zqm_cto_arr_general1" does not exist (see SAP Note 1109215) (termination: ERROR_MESSAGE_STATE)

I created my Web Dynpro application in development and transported it to quality. Whenever I execute the application in quality, I get this message:
ERROR: ICF service node "/sap/bc/webdynpro/sap/zqm_cto_arr_general1" does not exist (see SAP Note 1109215) (termination: ERROR_MESSAGE_STATE)
When I look in transaction SICF, my Web Dynpro application is not listed. The application name is longer than 15 characters. What should I do? Please give me your suggestions.

Hi Ashok,
The error means the application does not exist at the expected location, i.e. the Web Dynpro application was saved in a different package or a different location.
Please change the Web Dynpro component name, save it in a transport request in the package, and then transport it to quality (from the development server).
Then go to transaction SICF, navigate to sap -> bc -> webdynpro -> sap, find your application, and activate the service of your Web Dynpro application.
Now test it. This solution might be helpful to you.
Regards,
Naveen M

Error in BPM - Error when processing node '0000000065' ParForEach index

Hi All,
I have an issue. I have done a 1:n mapping successfully and would like to place the Send step in a loop instead of a Block. The reason is that I have the count of how many times the Send step should be executed for the multiline. I need to receive an acknowledgement for each Send step. If I don't receive the required number of acknowledgements, I need to revert some creations, which is a business requirement.
So I have initialized a container operation variable i to '0'. The loop condition is i < count. In the loop I send the multi-container and receive the response. If I don't receive the desired response for any one of the multiline entries, I need to run a cancellation process in a loop again. Now I am getting the exception "Error when processing node '0000000065' (ParForEach index 000000)
Message no. SWP088" in the loop step.
Suggestions for alternative logic are welcome as well, but my first preference is to use a loop, which consumes fewer system resources.
Kindly look into the issue.
Regards,
Raj
Edited by: raj2112 on Sep 21, 2010 2:34 PM

If you use a loop step, you will send one message at a time. Using a block step, you get parallel processing with ParForEach. Now, if for some reason an item cannot be created, do a rollback in the target system. The problem here is that you will only do the rollback once the last message has arrived.
The other possibility is to handle the application acknowledgement in the send step. It will let you know whether the message was processed successfully, and this ack could be the end condition of your block step. But you cannot use this with a loop step.
Take a look at this:
http://help.sap.com/saphelp_nw04/helpdata/en/55/65c844539349e9b1450581ab44a5e6/frameset.htm
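
As a rough illustration of the sequential variant discussed in this thread (not BPM syntax, only the control flow, written in Python): send and cancel are hypothetical callables standing in for the Send step with its acknowledgement and for the cancellation call in the target system. The function name is made up as well.

```python
def send_all_then_rollback(items, send, cancel):
    """Send items one at a time, then roll back if any acknowledgement is missing.

    send(item) must return True when the target acknowledges the item.
    cancel(item) must undo a creation that was acknowledged earlier.
    Returns the items that remain created (empty list after a rollback).
    """
    acknowledged = []
    for item in items:            # one message per loop pass, as in a loop step
        if send(item):
            acknowledged.append(item)

    # The comparison can only happen after the last message has been sent.
    if len(acknowledged) < len(items):
        for item in acknowledged:
            cancel(item)
        return []
    return acknowledged
```

Note that the rollback decision is taken only after the whole loop has finished, which matches the remark above that the rollback happens once the last message has arrived.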

DAC message while running execution plan - "Error while loading nodes"

I have just installed and set up Informatica 8.6.1, DAC, and BI Apps 7.9.6 for an Oracle EBS R12.1.1 source instance.
In Informatica I have defined 2 relational sources, "DataWarehouse" and "ORA_R1211", with the same names as the physical data sources in DAC.
I have set the flat file parameter to "ORA_R1211_Flatfile" in DAC.
After a successful build, when I run the ETL, the error "Error while loading nodes" occurs.
The log file shows the following details:
START OF ETL
20 SEVERE Sat Apr 23 21:57:58 GST 2011
ANOMALY INFO::: Error while loading nodes.
EXCEPTION CLASS::: java.lang.NullPointerException
com.siebel.etl.engine.core.SessionHandler.getNodes(SessionHandler.java:2842)
com.siebel.etl.engine.core.SessionHandler.loadNodes(SessionHandler.java:473)
com.siebel.etl.engine.core.ETL.thisETLProcess(ETL.java:372)
com.siebel.etl.engine.core.ETL.run(ETL.java:658)
com.siebel.etl.engine.core.ETL.execute(ETL.java:910)
com.siebel.etl.etlmanager.EtlExecutionManager$1.executeEtlProcess(EtlExecutionManager.java:210)
com.siebel.etl.etlmanager.EtlExecutionManager$1.run(EtlExecutionManager.java:165)
java.lang.Thread.run(Thread.java:619)
21 SEVERE Sat Apr 23 21:57:58 GST 2011
* CLOSING THE CONNECTION POOL DataWarehouse
22 SEVERE Sat Apr 23 21:57:58 GST 2011
* CLOSING THE CONNECTION POOL ORA_R1211
23 SEVERE Sat Apr 23 21:57:58 GST 2011
END OF ETL
*****************

Hi,
Mark the current execution plan (EP) as completed, then reassemble the subject area, generate the parameters, build the EP, and run the load.
Thanks,
Navin KumarBolla
