Archive REDO When One RAC Node is Down

I have a question about how redo logs get archived when one of the instances in a two-node RAC cluster is down (not open, not mounted).
For example, let's assume instance 1 was shut down and only instance 2 is running.
I have 6 redo logs:
SQL> SELECT GROUP#, THREAD#, SEQUENCE#, STATUS FROM V$LOG;
    GROUP#    THREAD#  SEQUENCE# STATUS
         1          1       3390 INACTIVE
         2          1       3389 INACTIVE
         3          1       3391 ACTIVE
         5          2       3886 INACTIVE
         4          2       3887 INACTIVE
         6          2       3888 CURRENT
If I run the following statement, what will happen?
SQL> ALTER SYSTEM ARCHIVE LOG CURRENT;
The documentation says: "Specify CURRENT to manually archive the current redo log file group of the specified thread, forcing a log switch. If you omit the THREAD parameter, then Oracle Database archives all redo log file groups from all enabled threads, including logs previous to current logs. You can specify CURRENT only when the database is open."
Would Oracle archive sequence# 3391 from thread 1 even though the instance is not open?

When an instance is not running, it does not have any CURRENT redo log. So when you force a log switch, archiving is launched only for the current redo log, and only for the running instance. The 3391 redo log is not current; it has ACTIVE status, which means it could still be needed for recovery, but it should already have been archived earlier.
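
Rather than rely on the reasoning above, you can check the actual state from the running instance. The queries below are only a sketch: they assume you can read the V$ views, and the thread and sequence numbers are the ones from the question. Run the last query before and after ALTER SYSTEM ARCHIVE LOG CURRENT to see whether anything new shows up for thread 1.

```sql
-- Which redo threads exist, and is each one OPEN or CLOSED?
SELECT thread#, status, enabled, instance
FROM   v$thread;

-- Has each online log group already been archived (ARCHIVED = YES/NO)?
SELECT group#, thread#, sequence#, archived, status
FROM   v$log
ORDER  BY thread#, sequence#;

-- Archived logs recorded for thread 1 around sequence 3391
-- (numbers taken from the question; adjust for your system).
SELECT thread#, sequence#, name, completion_time
FROM   v$archived_log
WHERE  thread# = 1
AND    sequence# >= 3389
ORDER  BY sequence#;
```

If the ARCHIVED column already shows YES for sequence 3391, there is nothing left for the statement to do for that log.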

Similar Messages

What happens when one RAC node's public NIC is down?

Dear all,
There is a two-node RAC in my office. From what I observe, when one RAC node's public NIC is down, the "crs_stat -t" command does not show any resource as OFFLINE. So at that moment, if I use sqlplus to connect to my RAC db, it will still choose the node1 or node2 instance randomly, right? And if I am assigned to the instance on the node whose public NIC is down, will my sqlplus fail to connect to the db?
So, can we say that Oracle RAC can't provide HA if one node's public NIC is down? Or is there another way to solve this issue? Any suggestion would be appreciated.

If the public NIC on one node is down, only the other node is able to take the load, i.e. in your case Node 2. It may happen that CRS is unable to record the status; in that case connections routed to the affected node may fail.

One RAC node is down and gives the following error when starting the database

When trying to start the database in a RAC environment:
SQL> connect sys as sysdba
Enter password:
Connected to an idle instance.
SQL> startup
ORA-27102: out of memory
HPUX-ia64 Error: 12: Not enough space
SQL> exit
Disconnected
When we tried to start the database, bdf showed:
$ bdf
Filesystem         kbytes     used    avail %used Mounted on
/dev/vg00/lvol3   2097152   240944  1841896   12% /
/dev/vg00/lvol1    344064   115376   226944   34% /stand
/dev/vg00/lvol8  10485760  9370960  1106232   89% /var
/dev/vg00/lvol7   4866048  2557680  2290400   53% /usr
/dev/vg00/u02    10485760  3502229  6547116   35% /u02
/dev/vg00/u01    10485760 10476596     9164  100% /u01
/dev/vg00/lvol4   2097152   601872  1483944   29% /tmp
/dev/vg00/lvol6   4194304  3231000   955792   77% /opt
/dev/vg00/lvol5    524288   311520   211136   60% /home
/u01 was 100% full. I then freed some space in /u01:
$ bdf
Filesystem         kbytes     used    avail %used Mounted on
/dev/vg00/lvol3   2097152   240944  1841896   12% /
/dev/vg00/lvol1    344064   115376   226944   34% /stand
/dev/vg00/lvol8  10485760  9370960  1106232   89% /var
/dev/vg00/lvol7   4866048  2557680  2290400   53% /usr
/dev/vg00/u02    10485760  3502229  6547116   35% /u02
/dev/vg00/u01    10485760  9508934   930943   91% /u01
/dev/vg00/lvol4   2097152   601872  1483944   29% /tmp
/dev/vg00/lvol6   4194304  3231000   955792   77% /opt
/dev/vg00/lvol5    524288   311520   211136   60% /home
When trying to start the db again, it still gives the following error:
SQL> connect sys as sysdba
Enter password:
Connected to an idle instance.
SQL> startup
ORA-27102: out of memory
HPUX-ia64 Error: 12: Not enough space
SQL> exit
Disconnected
Here I changed sga_target, and now it says:
ORACLE instance started.
Total System Global Area 436207616 bytes
Fixed Size 1297912 bytes
Variable Size 148648456 bytes
Database Buffers 285212672 bytes
Redo Buffers 1048576 bytes
ORA-01105: mount is incompatible with mounts by other instances
ORA-19808: recovery destination parameter mismatch
What could be the issue?
Your help would be highly appreciated.

Hello
SQL> startup
ORA-27102: out of memory
HPUX-ia64 Error: 12: Not enough space
SQL> exit
This error is not related to space on your mount point; it is related to memory.
If you are getting this error, check at the OS level whether something is consuming so much memory that Oracle cannot allocate the SGA.
Check top/sar/glance to see what is consuming the memory.
Total System Global Area 436207616 bytes
Fixed Size 1297912 bytes
Variable Size 148648456 bytes
Database Buffers 285212672 bytes
Redo Buffers 1048576 bytes
ORA-01105: mount is incompatible with mounts by other instances
ORA-19808: recovery destination parameter mismatch
It is not best practice to maintain different parameters for each instance in a RAC environment. Also check that db_recovery_file_dest and db_recovery_file_dest_size are the same on all nodes. The recovery destination should be the same everywhere, i.e. it should be a shared location.
Anil Malkai
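
To see where the two parameters differ, something along these lines can be run from the instance that is already up. It is a sketch, not a verified fix for this system: it assumes the instances share an spfile. The instance that fails to mount is still started (NOMOUNT), so SHOW PARAMETER db_recovery_file_dest there gives the other side of the comparison.

```sql
-- Values in effect on every running instance.
SELECT inst_id, name, value
FROM   gv$parameter
WHERE  name IN ('db_recovery_file_dest', 'db_recovery_file_dest_size')
ORDER  BY name, inst_id;

-- Entries stored in the spfile; a SID other than '*' is an
-- instance-specific override and a likely source of the mismatch.
SELECT sid, name, value
FROM   v$spparameter
WHERE  name IN ('db_recovery_file_dest', 'db_recovery_file_dest_size')
AND    isspecified = 'TRUE';
```

If an instance-specific entry turns up, setting the parameter once with SID='*' (and resetting the per-instance entry) keeps all nodes on the same value.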

How to execute DBMS_JOB at exactly one RAC node

Hello,
After searching unsuccessfully for "RAC" and "DBMS_JOB", I am opening this thread.
Can you tell me how to dedicate one RAC node to my batch jobs,
which are started using dbms_job (so no tnsnames.ora is used)?
Thanks in advance

hi,
Let's say the instances are named: I1, I2, I3, I4
Issue:
ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0 SCOPE=BOTH SID='I1';
ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0 SCOPE=BOTH SID='I2';
ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0 SCOPE=BOTH SID='I3';
ALTER SYSTEM SET JOB_QUEUE_PROCESSES=10 SCOPE=BOTH SID='I4';
So that only instance I4 will run jobs.
Regards,
Yoann.
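
If you would rather not switch off the job queue on the other instances, DBMS_JOB also lets you bind individual jobs to one instance. A minimal sketch, assuming instance number 4 corresponds to I4; the procedure name my_batch_proc and job number 42 are placeholders that do not come from this thread. JOB_QUEUE_PROCESSES still has to be greater than 0 on the target instance.

```sql
-- Look up the instance number that belongs to each instance name.
SELECT instance_number, instance_name FROM gv$instance;

-- Submit a new job that may only run on instance 4.
DECLARE
  l_job BINARY_INTEGER;
BEGIN
  DBMS_JOB.SUBMIT(
    job       => l_job,
    what      => 'my_batch_proc;',   -- hypothetical procedure
    next_date => SYSDATE,
    interval  => 'SYSDATE + 1',      -- once a day
    instance  => 4);                 -- assumed instance number of I4
  COMMIT;
END;
/

-- Or move an existing job (hypothetical job number 42) to instance 4.
BEGIN
  DBMS_JOB.INSTANCE(job => 42, instance => 4);
  COMMIT;
END;
/

-- Verify the affinity; INSTANCE = 0 means the job may run on any instance.
SELECT job, instance, what FROM dba_jobs;
```

The trade-off is that affinity has to be set per job, whereas the JOB_QUEUE_PROCESSES approach above covers every job at once.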

JDBC read stuck if RAC node goes down

We ran several tests with Java applications against our RAC DB and see the application hang if we power off the RAC node that is executing the current (long-running) query.
We can see that the application receives HA-events via UCP:
2015-01-22 13:02:11 | r-thread-1 | WARN | o.ucp.jdbc.oracle.ONSDatabaseFailoverEvent    | NO timezone in HA event
However, the application had started a query before the event, and that query is not aborted with an exception. A thread dump after about 7 minutes shows that the application is hanging in a socket read call:
"pool-1-thread-1" #32 prio=5 os_prio=0 tid=0x00007fedf45b2000 nid=0xbc4 runnable [0x00007fee00cd3000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:150)
    at java.net.SocketInputStream.read(SocketInputStream.java:121)
    at oracle.net.ns.Packet.receive(Packet.java:283)
    at oracle.net.ns.DataPacket.receive(DataPacket.java:103)
    at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:230)
    at oracle.net.ns.NetInputStream.read(NetInputStream.java:175)
    at oracle.net.ns.NetInputStream.read(NetInputStream.java:100)
    at oracle.net.ns.NetInputStream.read(NetInputStream.java:85)
    at oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket(T4CSocketInputStreamWrapper.java:123)
    at oracle.jdbc.driver.T4CSocketInputStreamWrapper.read(T4CSocketInputStreamWrapper.java:79)
    at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1122)
    at oracle.jdbc.driver.T4CMAREngine.unmarshalSB1(T4CMAREngine.java:1099)
    at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:288)
    at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:191)
    at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:523)
    at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:207)
    at oracle.jdbc.driver.T4CPreparedStatement.executeForDescribe(T4CPreparedStatement.java:863)
    at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1153)
    at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1275)
    at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3576)
    at oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedStatement.java:3620)
    - locked <0x00000000c0ddcb20> (a oracle.jdbc.driver.T4CConnection)
    at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeQuery(OraclePreparedStatementWrapper.java:1491)
    at org.springframework.jdbc.core.JdbcTemplate$1.doInPreparedStatement(JdbcTemplate.java:703)
The expected behaviour would be that a running query is aborted with an exception. (BTW: this is what happens if the service is taken down with "shutdown immediate". All is OK in that case.)
We are considering implementing custom ONS listeners [1], but we actually expect UCP to handle such situations or to let us register strategies/callbacks for certain events.
Our config:
Oracle Enterprise 11.2.0.4.0 with RAC
ons.jar 12.1.0.1
ojdbc6.jar 11.2.0.2
ucp.jar 12.1.0.1
Server JRE 1.8.0_25
Any hints appreciated.
[1] http://docs.oracle.com/cd/E11882_01/java.112/e16548/apxracfan.htm#JJDBC28945

Your concept isn't right:
http://docs.oracle.com/cd/E11882_01/server.112/e25494/restart.htm#ADMIN13178
Overview of Fast Application Notification
FAN is a notification mechanism that Oracle Restart can use to notify other processes about configuration changes that include service status changes, such as UP or DOWN events. FAN provides the ability to immediately terminate inflight transaction when an instance or server fails. Integrated Oracle clients receive the events and respond. Applications can respond either by propagating the error to the user or by resubmitting the transactions and masking the error from the application user. When a DOWN event occurs, integrated clients immediately clean up connections to the terminated database. When an UP event occurs, the clients create new connections to the new primary database instance.
Also, take a look at these docs: http://docs.oracle.com/cd/E11882_01/java.112/e12265/rac.htm#JJUCP08100 ; and https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=890204623685515&id=566573.1&_afrWindowMode=0&_adf.ctrl-s…
Also run a test: execute a query that takes about 1 minute and, once it is running, power down the node where it is executing to see whether it still returns the results.
Regards.

With MSSQ, client gets TPEOS error when an Oracle RAC node reboots

Hi, all
In my Tuxedo application, if we use Single Server, Single Queue mode, the application is fine when any Oracle RAC node reboots, and the client gets the correct result. If we use MSSQ (Multi Server, Single Queue), the application is also fine as long as the Oracle RAC nodes are up. But if we reboot any Oracle RAC node, the client program can continue to run and gets the correct result, yet it always gets a TPEOS error. In this situation the server receives the client request, but the client cannot get the server reply and only gets the TPEOS error.
Our environment is:
Oracle RAC 10g 10.2.0.4, two instances (rac1, rac2) and two DTP services (s1 and s2); TAF for s1 and s2 is set to basic
Tuxedo 10R3, two nodes, working in MP mode, using XA to access the Oracle RAC database; some services are transactional and some are not
OS is Linux AS4 U5, 64-bit
Service programs use OCI
Has anyone encountered this problem?

Hi, first of all, thank you.
The ULOG file contains only failover information and no other error message; the client side shows no other error either.
When not using MSSQ, the SERVERS section of the ubb file is:
*SERVERS
DEFAULT:
CLOPT="-A "
sinUpdate_server SRVGRP=GROUP11 SRVID=80 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinUpdate_server SRVGRP=GROUP12 SRVID=160 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinCount_server SRVGRP=GROUP11 SRVID=240 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinCount_server SRVGRP=GROUP12 SRVID=320 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinSelect_server SRVGRP=GROUP11 SRVID=360 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinSelect_server SRVGRP=GROUP12 SRVID=400 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinInsert_server SRVGRP=GROUP11 SRVID=520 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinInsert_server SRVGRP=GROUP12 SRVID=560 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinDelete_server SRVGRP=GROUP11 SRVID=600 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinDelete_server SRVGRP=GROUP12 SRVID=640 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinDdl_server SRVGRP=GROUP11 SRVID=700 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinDdl_server SRVGRP=GROUP12 SRVID=740 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
lockselect_server SRVGRP=GROUP11 SRVID=800 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
lockselect_server SRVGRP=GROUP12 SRVID=840 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
#mulup_server SRVGRP=GROUP11 SRVID=1 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
#mulup_server SRVGRP=GROUP12 SRVID=60 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinUpdate_server SRVGRP=GROUP13 SRVID=83 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinUpdate_server SRVGRP=GROUP14 SRVID=164 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinCount_server SRVGRP=GROUP13 SRVID=243 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinCount_server SRVGRP=GROUP14 SRVID=324 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinSelect_server SRVGRP=GROUP13 SRVID=363 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinSelect_server SRVGRP=GROUP14 SRVID=404 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinInsert_server SRVGRP=GROUP13 SRVID=523 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinInsert_server SRVGRP=GROUP14 SRVID=564 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinDelete_server SRVGRP=GROUP13 SRVID=603 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinDelete_server SRVGRP=GROUP14 SRVID=644 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinDdl_server SRVGRP=GROUP13 SRVID=703 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
sinDdl_server SRVGRP=GROUP14 SRVID=744 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
lockselect_server SRVGRP=GROUP13 SRVID=803 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
lockselect_server SRVGRP=GROUP14 SRVID=844 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
#mulup_server SRVGRP=GROUP13 SRVID=13 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
#mulup_server SRVGRP=GROUP14 SRVID=64 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y
WSL SRVGRP=GROUP11 SRVID=1000
CLOPT="-A -- -n//120.3.8.237:7200 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
WSL SRVGRP=GROUP12 SRVID=1001
CLOPT="-A -- -n//120.3.8.238:7200 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
WSL SRVGRP=GROUP13 SRVID=1003
CLOPT="-A -- -n//120.3.8.237:7203 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
WSL SRVGRP=GROUP14 SRVID=1004
CLOPT="-A -- -n//120.3.8.238:7204 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
When we use MSSQ, the SERVERS section of the ubb file is:
*SERVERS
DEFAULT:
CLOPT="-A -p 1,60:1,30"
sinUpdate_server SRVGRP=GROUP11 SRVID=80 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinUpdate11 REPLYQ=Y
sinUpdate_server SRVGRP=GROUP12 SRVID=160 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinUpdate12 REPLYQ=Y
sinCount_server SRVGRP=GROUP11 SRVID=240 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinCount11 REPLYQ=Y
sinCount_server SRVGRP=GROUP12 SRVID=320 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinCount12 REPLYQ=Y
sinSelect_server SRVGRP=GROUP11 SRVID=360 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinSelec11 REPLYQ=Y
sinSelect_server SRVGRP=GROUP12 SRVID=400 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinSelect12 REPLYQ=Y
sinInsert_server SRVGRP=GROUP11 SRVID=520 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinInsert11 REPLYQ=Y
sinInsert_server SRVGRP=GROUP12 SRVID=560 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinInsert12 REPLYQ=Y
sinDelete_server SRVGRP=GROUP11 SRVID=600 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDelete11 REPLYQ=Y
sinDelete_server SRVGRP=GROUP12 SRVID=640 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDelete12 REPLYQ=Y
sinDdl_server SRVGRP=GROUP11 SRVID=700 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDdl11 REPLYQ=Y
sinDdl_server SRVGRP=GROUP12 SRVID=740 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDdl12 REPLYQ=Y
lockselect_server SRVGRP=GROUP11 SRVID=800 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=lockselect11 REPLYQ=Y
lockselect_server SRVGRP=GROUP12 SRVID=840 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=lockselect12 REPLYQ=Y
#mulup_server SRVGRP=GROUP11 SRVID=1 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=mulup11 REPLYQ=Y
#mulup_server SRVGRP=GROUP12 SRVID=60 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=mulup12 REPLYQ=Y
sinUpdate_server SRVGRP=GROUP13 SRVID=83 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinUpdate13 REPLYQ=Y
sinUpdate_server SRVGRP=GROUP14 SRVID=164 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinUpdate14 REPLYQ=Y
sinCount_server SRVGRP=GROUP13 SRVID=243 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinCount13 REPLYQ=Y
sinCount_server SRVGRP=GROUP14 SRVID=324 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinCount14 REPLYQ=Y
sinSelect_server SRVGRP=GROUP13 SRVID=363 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinSelec13 REPLYQ=Y
sinSelect_server SRVGRP=GROUP14 SRVID=404 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinSelect14 REPLYQ=Y
sinInsert_server SRVGRP=GROUP13 SRVID=523 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinInsert13 REPLYQ=Y
sinInsert_server SRVGRP=GROUP14 SRVID=564 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinInsert14 REPLYQ=Y
sinDelete_server SRVGRP=GROUP13 SRVID=603 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDelete13 REPLYQ=Y
sinDelete_server SRVGRP=GROUP14 SRVID=644 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDelete14 REPLYQ=Y
sinDdl_server SRVGRP=GROUP13 SRVID=703 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDdl13 REPLYQ=Y
sinDdl_server SRVGRP=GROUP14 SRVID=744 MIN=5 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=sinDdl14 REPLYQ=Y
lockselect_server SRVGRP=GROUP13 SRVID=803 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=lockselect13 REPLYQ=Y
lockselect_server SRVGRP=GROUP14 SRVID=844 MIN=10 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=lockselect14 REPLYQ=Y
#mulup_server SRVGRP=GROUP13 SRVID=13 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=mulup13 REPLYQ=Y
#mulup_server SRVGRP=GROUP14 SRVID=64 MIN=2 MAX=30 MAXGEN=10 GRACE=10 RESTART=Y RQADDR=mulup14 REPLYQ=Y
WSL SRVGRP=GROUP11 SRVID=1000
CLOPT="-A -- -n//120.3.8.237:7200 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
WSL SRVGRP=GROUP12 SRVID=1001
CLOPT="-A -- -n//120.3.8.238:7200 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
WSL SRVGRP=GROUP13 SRVID=1003
CLOPT="-A -- -n//120.3.8.237:7203 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
WSL SRVGRP=GROUP14 SRVID=1004
CLOPT="-A -- -n//120.3.8.238:7204 -I 60 -T 60 -w WSH -m 50 -M 100 -x 6 -N 3600"
Is there any error in the ubb file above, or are we using MSSQ incorrectly?
Looking forward to your answer, thanks.

Dbconsole failed to start on one RAC node

Hi
I have 2 RAC nodes (RHEL 4) running 10.2.0.1. dbconsole is running on one node, and on the other I get the following. Earlier, dbconsole ran perfectly fine on both nodes. I would appreciate any suggestions to rectify this problem.
Regards
oracle@rac01<18>:/u01/app/oracle/product/10.2/db_1/rac01_RACDB1/sysman/log> emctl start dbconsole
TZ set to Canada/Newfoundland
Oracle Enterprise Manager 10g Database Control Release 10.2.0.1.0
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
http://rac01:1158/em/console/aboutApplication
Agent Version : 10.1.0.4.1
OMS Version : Unknown
Protocol Version : 10.1.0.2.0
Agent Home : /u01/app/oracle/product/10.2/db_1/rac01_RACDB1
Agent binaries : /u01/app/oracle/product/10.2/db_1
Agent Process ID : 23329
Parent Process ID : 21132
Agent URL : http://rac01:3938/emd/main
Started at : 2007-07-25 11:37:32
Started by user : oracle
Last Reload : 2007-07-25 11:37:32
Last successful upload : (none)
Last attempted upload : (none)
Total Megabytes of XML files uploaded so far : 0.00
Number of XML files pending upload : 371
Size of XML files pending upload(MB) : 7.66
Available disk space on upload filesystem : 44.78%
Agent is already started. Will restart the agent
Stopping agent ... stopped.
Starting Oracle Enterprise Manager 10g Database Control ............................................................................................. failed.
Logs are generated in directory /u01/app/oracle/product/10.2/db_1/rac01_RACDB1/sysman/log
oracle@rac01<19>:/u01/app/oracle/product/10.2/db_1/rac01_RACDB1/sysman/log>
ON OTHER NODE:
oracle@rac02<2>:/u01/app/oracle> emctl start dbconsole
TZ set to Canada/Newfoundland
Oracle Enterprise Manager 10g Database Control Release 10.2.0.1.0
Copyright (c) 1996, 2005 Oracle Corporation. All rights reserved.
http://rac01:1158/em/console/aboutApplication
Starting Oracle Enterprise Manager 10g Database Control .................................... started.
Logs are generated in directory /u01/app/oracle/product/10.2/db_1/rac02_RACDB2/sysman/log
oracle@rac02<3>:/u01/app/oracle>

Thanks for your time and reply.
Well, here is what I got; I couldn't make anything out from it.
Regards
oracle@rac01<19>:/u01/app/oracle/product/10.2/db_1/rac01_RACDB1/sysman/log> ls -lart
total 13500
drwxr----- 7 oracle dba 4096 Jul 14 10:48 ..
-rw-r----- 1 oracle dba 0 Jul 14 10:48 emdctl.log
drwxrwx--- 2 oracle dba 4096 Jul 14 10:54 nmcRACDB11521
-rw-r----- 1 oracle dba 4655792 Jul 24 23:01 emoms.trc
-rw-r----- 1 oracle dba 4655792 Jul 24 23:01 emoms.log
drwxr----- 3 oracle dba 4096 Jul 25 11:35 .
-rw-r----- 1 oracle dba 4096 Jul 25 12:05 emdb.nohup.lr
-rw-r----- 1 oracle dba 1074 Jul 25 12:05 emagent_perl.trc
-rw-r----- 1 oracle dba 1731 Jul 25 12:06 emagent.log
-rw-r----- 1 oracle dba 1080 Jul 25 12:07 emagentfetchlet.trc
-rw-r----- 1 oracle dba 1080 Jul 25 12:07 emagentfetchlet.log
-rw-r----- 1 oracle dba 81089 Jul 25 13:28 emdctl.trc
-rw-r----- 1 oracle dba 3309143 Jul 25 13:28 emdb.nohup
-rw-r----- 1 oracle dba 1044518 Jul 25 13:28 emagent.trc
oracle@rac01<20>:/u01/app/oracle/product/10.2/db_1/rac01_RACDB1/sysman/log> cat emagent.log
2007-07-14 10:50:44 Thread-3086936288 Starting Agent 10.1.0.4.1 from /u01/app/oracle/product/10.2/db_1 (00701)
2007-07-14 10:51:16 Thread-3086936288 EMAgent started successfully (00702)
2007-07-14 14:38:21 Thread-3086935744 Starting Agent 10.1.0.4.1 from /u01/app/oracle/product/10.2/db_1 (00701)
2007-07-14 14:39:00 Thread-3086935744 EMAgent started successfully (00702)
2007-07-24 07:05:06 Thread-3086935744 Starting Agent 10.1.0.4.1 from /u01/app/oracle/product/10.2/db_1 (00701)
2007-07-24 07:07:11 Thread-3086935744 target {+ASM1_rac01, osm_instance} is broken: cannot compute dynamic properties in time. (00155)
2007-07-24 07:07:14 Thread-3086935744 EMAgent started successfully (00702)
2007-07-24 12:06:27 Thread-3086935744 EMAgent normal shutdown (00703)
2007-07-24 12:08:26 Thread-3086935744 Starting Agent 10.1.0.4.1 from /u01/app/oracle/product/10.2/db_1 (00701)
2007-07-24 12:08:51 Thread-3086935744 EMAgent started successfully (00702)
2007-07-25 11:35:35 Thread-3086935744 EMAgent normal shutdown (00703)
2007-07-25 11:37:32 Thread-3086935744 Starting Agent 10.1.0.4.1 from /u01/app/oracle/product/10.2/db_1 (00701)
2007-07-25 11:39:29 Thread-3086935744 target {+ASM1_rac01, osm_instance} is broken: cannot compute dynamic properties in time. (00155)
2007-07-25 11:39:30 Thread-3086935744 EMAgent started successfully (00702)
2007-07-25 12:03:36 Thread-3086935744 EMAgent normal shutdown (00703)
2007-07-25 12:05:15 Thread-3086935744 Starting Agent 10.1.0.4.1 from /u01/app/oracle/product/10.2/db_1 (00701)
2007-07-25 12:06:23 Thread-3086935744 target {+ASM1_rac01, osm_instance} is broken: cannot compute dynamic properties in time. (00155)
2007-07-25 12:06:24 Thread-3086935744 EMAgent started successfully (00702)
oracle@rac01<21>:/u01/app/oracle/product/10.2/db_1/rac01_RACDB1/sysman/log> cat emagentfetchlet.log
2007-07-14 11:01:44,208 [main] WARN track.OracleInventory collectInventory.439 - ECM: The inventory location file for the special Windows NT case does not exist or is unreadable.
2007-07-14 14:40:29,096 [main] WARN track.OracleInventory collectInventory.439 - ECM: The inventory location file for the special Windows NT case does not exist or is unreadable.
2007-07-24 07:10:44,123 [main] WARN track.OracleInventory collectInventory.439 - ECM: The inventory location file for the special Windows NT case does not exist or is unreadable.
2007-07-24 12:12:48,187 [main] WARN track.OracleInventory collectInventory.439 - ECM: The inventory location file for the special Windows NT case does not exist or is unreadable.
2007-07-25 11:41:25,628 [main] WARN track.OracleInventory collectInventory.439 - ECM: The inventory location file for the special Windows NT case does not exist or is unreadable.
2007-07-25 12:07:30,335 [main] WARN track.OracleInventory collectInventory.439 - ECM: The inventory location file for the special Windows NT case does not exist or is unreadable.
oracle@rac01<22>:/u01/app/oracle/product/10.2/db_1/rac01_RACDB1/sysman/log>
oracle@rac01<22>:/u01/app/oracle/product/10.2/db_1/rac01_RACDB1/sysman/log> tail -40 emagentfetchlet.trc
2007-07-14 11:01:44,208 [main] WARN track.OracleInventory collectInventory.439 - ECM: The inventory location file for the special Windows NT case does not exist or is unreadable.
2007-07-14 14:40:29,096 [main] WARN track.OracleInventory collectInventory.439 - ECM: The inventory location file for the special Windows NT case does not exist or is unreadable.
2007-07-24 07:10:44,123 [main] WARN track.OracleInventory collectInventory.439 - ECM: The inventory location file for the special Windows NT case does not exist or is unreadable.
2007-07-24 12:12:48,187 [main] WARN track.OracleInventory collectInventory.439 - ECM: The inventory location file for the special Windows NT case does not exist or is unreadable.
2007-07-25 11:41:25,628 [main] WARN track.OracleInventory collectInventory.439 - ECM: The inventory location file for the special Windows NT case does not exist or is unreadable.
2007-07-25 12:07:30,335 [main] WARN track.OracleInventory collectInventory.439 - ECM: The inventory location file for the special Windows NT case does not exist or is unreadable.
oracle@rac01<25>:/u01/app/oracle/product/10.2/db_1/rac01_RACDB1/sysman/log> tail -10 emdctl.trc
2007-07-25 13:01:02 Thread-3086935744 WARN http: snmehl_connect: connect failed to (rac01:1158): Connection refused (error = 111)
2007-07-25 13:04:41 Thread-3086935744 WARN http: snmehl_connect: connect failed to (rac01:1158): Connection refused (error = 111)
2007-07-25 13:07:12 Thread-3086935744 WARN http: snmehl_connect: connect failed to (rac01:1158): Connection refused (error = 111)
2007-07-25 13:10:50 Thread-3086935744 WARN http: snmehl_connect: connect failed to (rac01:1158): Connection refused (error = 111)
2007-07-25 13:14:32 Thread-3086935744 WARN http: snmehl_connect: connect failed to (rac01:1158): Connection refused (error = 111)
2007-07-25 13:18:09 Thread-3086935744 WARN http: snmehl_connect: connect failed to (rac01:1158): Connection refused (error = 111)
2007-07-25 13:20:40 Thread-3086935744 WARN http: snmehl_connect: connect failed to (rac01:1158): Connection refused (error = 111)
2007-07-25 13:24:27 Thread-3086935744 WARN http: snmehl_connect: connect failed to (rac01:1158): Connection refused (error = 111)
2007-07-25 13:28:06 Thread-3086935744 WARN http: snmehl_connect: connect failed to (rac01:1158): Connection refused (error = 111)
2007-07-25 13:31:43 Thread-3086935744 WARN http: snmehl_connect: connect failed to (rac01:1158): Connection refused (error = 111)
oracle@rac01<28>:/u01/app/oracle/product/10.2/db_1/rac01_RACDB1/sysman/log> tail -10 emagent.trc
2007-07-25 13:31:44 Thread-43162528 WARN http: snmehl_connect: connect failed to (rac01:1158): Connection refused (error = 111)
2007-07-25 13:31:44 Thread-43162528 ERROR pingManager: nmepm_pingReposURL: Cannot connect to http://rac01:1158/em/upload/: retStatus=-32
2007-07-25 13:32:14 Thread-74791840 WARN http: snmehl_connect: connect failed to (rac01:1158): Connection refused (error = 111)
2007-07-25 13:32:14 Thread-74791840 ERROR pingManager: nmepm_pingReposURL: Cannot connect to http://rac01:1158/em/upload/: retStatus=-32
2007-07-25 13:32:14 Thread-74791840 WARN http: snmehl_connect: connect failed to (rac01:1158): Connection refused (error = 111)
2007-07-25 13:32:14 Thread-74791840 ERROR pingManager: nmepm_pingReposURL: Cannot connect to http://rac01:1158/em/upload/: retStatus=-32
2007-07-25 13:32:44 Thread-74791840 WARN http: snmehl_connect: connect failed to (rac01:1158): Connection refused (error = 111)
2007-07-25 13:32:44 Thread-74791840 ERROR pingManager: nmepm_pingReposURL: Cannot connect to http://rac01:1158/em/upload/: retStatus=-32
2007-07-25 13:32:44 Thread-74791840 WARN http: snmehl_connect: connect failed to (rac01:1158): Connection refused (error = 111)
2007-07-25 13:32:44 Thread-74791840 ERROR pingManager: nmepm_pingReposURL: Cannot connect to http://rac01:1158/em/upload/: retStatus=-32
Message was edited by:
Singh

ORA-12514 on R12 EBS Apps server when 1 DB RAC node crashed/down

Node 1 of our production 11.2.0.2 RAC DB on Windows 2008 Server has just crashed. On Node 2 all services are up and running, including the database. But on the EBS R12.1.2 application server, connecting from SQL*Plus as username/password throws an ORA-12514 error.
It does connect if I give username/password@TNSNAME, but not without @TNSNAME. Because of this, none of the application services will start.
Please help/advise. Thank you.
Following is the tnsnames.ora,
# This file is automatically generated by AutoConfig. It will be read and
# overwritten. If you were instructed to edit this file, or if you are not
# able to use the settings created by AutoConfig, refer to Metalink Note
# 387859.1 for assistance.
#$Header: NetServiceHandler.java 120.19.12010000.6 2010/03/09 08:11:36 jmajumde ship $
ORCL=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=orcldbscan)(PORT=1521))
            (CONNECT_DATA=
                (SERVICE_NAME=ORCL)
                (INSTANCE_NAME=ORCL1)
            )
        )
ORCL1=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=orcldbscan)(PORT=1521))
            (CONNECT_DATA=
                (SERVICE_NAME=ORCL)
                (INSTANCE_NAME=ORCL1)
            )
        )
ORCL1_FO=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=orcldbscan)(PORT=1521))
            (CONNECT_DATA=
                (SERVICE_NAME=ORCL)
                (INSTANCE_NAME=ORCL1)
            )
        )
ORCL_FO=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=orcldbscan)(PORT=1521))
            (CONNECT_DATA=
                (SERVICE_NAME=ORCL)
                (INSTANCE_NAME=ORCL1)
            )
        )
ORCL1=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=orcldbscan)(PORT=1521))
            (CONNECT_DATA=
                (SERVICE_NAME=ORCL)
                (INSTANCE_NAME=ORCL1)
            )
        )
ORCL1_FO=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=orcldbscan)(PORT=1521))
            (CONNECT_DATA=
                (SERVICE_NAME=ORCL)
                (INSTANCE_NAME=ORCL1)
            )
        )
ORCL2=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=orcldb2-vip.sa.company.net)(PORT=1521))
            (CONNECT_DATA=
                (SERVICE_NAME=ORCL)
                (INSTANCE_NAME=ORCL2)
            )
        )
ORCL2_FO=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=orcldb2-vip.sa.company.net)(PORT=1521))
            (CONNECT_DATA=
                (SERVICE_NAME=ORCL)
                (INSTANCE_NAME=ORCL2)
            )
        )
ORCL_BALANCE=
        (DESCRIPTION=
            (ADDRESS_LIST=
                (LOAD_BALANCE=YES)
                (FAILOVER=YES)
                (ADDRESS=(PROTOCOL=tcp)(HOST=orcldb1-vip.sa.company.net)(PORT=1521))
                (ADDRESS=(PROTOCOL=tcp)(HOST=orcldb2-vip.sa.company.net)(PORT=1521))
            )
            (CONNECT_DATA=
                (SERVICE_NAME=ORCL)
            )
        )
FNDFS_orclAPPL=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=orclAPPL.sa.company.net)(PORT=1626))
            (CONNECT_DATA=
                (SID=FNDFS)
            )
        )
FNDFS_orclAPPL.sa.company.net=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=orclAPPL.sa.company.net)(PORT=1626))
            (CONNECT_DATA=
                (SID=FNDFS)
            )
        )
FNDFS_ORCL_orclAPPL=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=orclAPPL.sa.company.net)(PORT=1626))
            (CONNECT_DATA=
                (SID=FNDFS)
            )
        )
FNDFS_ORCL_orclAPPL.sa.company.net=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=orclAPPL.sa.company.net)(PORT=1626))
            (CONNECT_DATA=
                (SID=FNDFS)
            )
        )
FNDSM_orclAPPL_ORCL=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=orclAPPL.sa.company.net)(PORT=1626))
            (CONNECT_DATA=
                (SID=FNDSM)
            )
        )
FNDSM_orclAPPL.sa.company.net_ORCL=
        (DESCRIPTION=
                (ADDRESS=(PROTOCOL=tcp)(HOST=orclAPPL.sa.company.net)(PORT=1626))
            (CONNECT_DATA=
                (SID=FNDSM)
            )
        )
FNDFS_APPLTOP_orclappl=
        (DESCRIPTION=
            (ADDRESS_LIST=
                (ADDRESS=(PROTOCOL=tcp)(HOST=orclAPPL.sa.company.net)(PORT=1626))
            )
            (CONNECT_DATA=
                (SID=FNDFS)
            )
        )
IFILE=E:\ORACLE\ORCL\INST\APPS\ORCL_orclappl\ora\10.1.2\network\admin\ORCL_orclappl_ifile.ora

A database/OS crash shouldn't have such an impact on your configuration. Please review the following docs and verify your setup.
Using Oracle 11g Release 2 Real Application Clusters with Oracle E-Business Suite Release 12 (Doc ID 823587.1)
Configuring and Managing E-Business Application Tier for RAC (Doc ID 1311528.1)
Thanks,
Hussein

WLC 5508 : session disconnected when one lag-port is down.

Hello,
I have a WLC 5508 (version 6.0.182).
When port 1 and port 2 are both connected (the switch is configured with an EtherChannel in forced mode), everything works fine: there is traffic on both ports.
When I disconnect one of the two ports, I can still ping outside from my client PC, but all my TCP sessions go down and I cannot even restart a session. The only workaround I have found is to disconnect and reconnect the wireless connection on my PC.
Is this a known problem?
Is there a way to avoid it?
Michel Misonne

CSCth12513 LAG fail-over does not work on CT5508
This bug is fixed in the special release available through TAC : 6.0.199.157 and 7.0.xxxx
Hope this helps.
Nicolas
===
Don't forget to rate answers that you find useful

Scan-vip running only on one RAC node

Hi,
While setting up RAC 11.2 on CentOS 5.7, I got these errors during the grid installation:
PRCR-1079 : Failed to start resource ora.scan1.vip
CRS-5005: IP Address: 192.168.100.208 is already in use in the network
CRS-2674: Start of 'ora.scan1.vip' on 'falcen6b' failed
CRS-2632: There are no more servers to try to place resource 'ora.scan1.vip' on that would satisfy its placement policy
PRCR-1079 : Failed to start resource ora.scan2.vip
CRS-5005: IP Address: 192.168.100.209 is already in use in the network
CRS-2674: Start of 'ora.scan2.vip' on 'falcen6b' failed
CRS-2632: There are no more servers to try to place resource 'ora.scan2.vip' on that would satisfy its placement policy
PRCR-1079 : Failed to start resource ora.scan3.vip
CRS-5005: IP Address: 192.168.100.210 is already in use in the network
CRS-2674: Start of 'ora.scan3.vip' on 'falcen6b' failed
CRS-2632: There are no more servers to try to place resource 'ora.scan3.vip' on that would satisfy its placement policy
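The addresses reported as already in use can be collected from the PRCR-1079 and CRS-5005 lines before checking what else on the network is holding them. A minimal sketch in Python, assuming the installer messages have been saved as plain text; the function name `scan_vip_conflicts` is made up for this illustration and is not part of any Oracle tool:

```python
import re

# Matches lines such as:
#   CRS-5005: IP Address: 192.168.100.208 is already in use in the network
IN_USE = re.compile(r"CRS-5005: IP Address: (\d{1,3}(?:\.\d{1,3}){3}) is already in use")
# Matches lines such as:
#   PRCR-1079 : Failed to start resource ora.scan1.vip
RESOURCE = re.compile(r"PRCR-1079 : Failed to start resource (\S+)")


def scan_vip_conflicts(log_text):
    """Pair each failed resource with the IP address reported as in use.

    Assumption: each PRCR-1079 line is followed, before the next
    PRCR-1079 line, by the CRS-5005 line that belongs to it.
    """
    conflicts = {}
    current = None
    for line in log_text.splitlines():
        res = RESOURCE.search(line)
        if res:
            current = res.group(1)
            continue
        ip = IN_USE.search(line)
        if ip and current is not None:
            conflicts[current] = ip.group(1)
    return conflicts
```

For the messages above this maps ora.scan1.vip, ora.scan2.vip and ora.scan3.vip to 192.168.100.208, 192.168.100.209 and 192.168.100.210 respectively.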
I figured out that the SCAN service is able to run on only one node at a time: when I stopped it on rac1 and started it on rac2, it started.
But I think the grid installation needs the SCAN service running on both nodes simultaneously.
How do I resolve this?
Any suggestions, please?
PS: I am planning to try patch set 11.2.0.3, but it will be a while until I get access to it.
Until then, can someone suggest a workaround?

Hi Balazs Papp and onedbguru,
I was able to resolve that error by running the following command on rac2, and that part of the installer now passes:
crsctl start res ora.scan1.vip
However, the cluster verification utility fails at the end of the installer.
When I ran the command below, this was the output:
[oracle@falcen6a grid]$ ./runcluvfy.sh stage -post crsinst -n falcen6a,falcen6b -verbose
Performing post-checks for cluster services setup
Checking node reachability...
Check: Node reachability from node "falcen6a"
Destination Node Reachable?
falcen6a yes
falcen6b yes
Result: Node reachability check passed from node "falcen6a"
Checking user equivalence...
Check: User equivalence for user "oracle"
Node Name Comment
falcen6b passed
falcen6a passed
Result: User equivalence check passed for user "oracle"
Checking time zone consistency...
Time zone consistency check passed.
Checking Cluster manager integrity...
Checking CSS daemon...
Node Name Status
falcen6b running
falcen6a running
Oracle Cluster Synchronization Services appear to be online.
Cluster manager integrity check passed
UDev attributes check for OCR locations started...
Result: UDev attributes check passed for OCR locations
UDev attributes check for Voting Disk locations started...
Result: UDev attributes check passed for Voting Disk locations
Check default user file creation mask
Node Name Available Required Comment
falcen6b 0022 0022 passed
falcen6a 0022 0022 passed
Result: Default user file creation mask check passed
Checking cluster integrity...
Cluster is divided into 2 partitions
Partition 1 consists of the following members:
Node Name
falcen6b
Partition 2 consists of the following members:
Node Name
falcen6a
Cluster integrity check failed. Cluster is divided into 2 partition(s).
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations
ERROR:
PRVF-4193 : Asm is not running on the following nodes. Proceeding with the remaining nodes.
Checking OCR config file "/etc/oracle/ocr.loc"...
OCR config file "/etc/oracle/ocr.loc" check successful
ERROR:
PRVF-4195 : Disk group for ocr location "+DATA" not available on the following nodes:
Checking size of the OCR location "+DATA" ...
Size check for OCR location "+DATA" successful...
OCR integrity check failed
Checking CRS integrity...
ERROR:
PRVF-5316 : Failed to retrieve version of CRS installed on node "falcen6b"
The Oracle clusterware is healthy on node "falcen6b"
The Oracle clusterware is healthy on node "falcen6a"
CRS integrity check failed
Checking node application existence...
Checking existence of VIP node application
Node Name Required Status Comment
falcen6b yes unknown failed
falcen6a yes unknown failed
Result: Check failed.
Checking existence of ONS node application
Node Name Required Status Comment
falcen6b no unknown ignored
falcen6a no online passed
Result: Check ignored.
Checking existence of GSD node application
Node Name Required Status Comment
falcen6b no unknown ignored
falcen6a no does not exist ignored
Result: Check ignored.
Checking existence of EONS node application
Node Name Required Status Comment
falcen6b no unknown ignored
falcen6a no online passed
Result: Check ignored.
Checking existence of NETWORK node application
Node Name Required Status Comment
falcen6b no unknown ignored
falcen6a no online passed
Result: Check ignored.
Checking Single Client Access Name (SCAN)...
SCAN VIP name Node Running? ListenerName Port Running?
falcen6-scan unknown false LISTENER 1521 false
WARNING:
PRVF-5056 : Scan Listener "LISTENER" not running
Checking name resolution setup for "falcen6-scan"...
SCAN Name IP Address Status Comment
falcen6-scan 192.168.100.210 passed
falcen6-scan 192.168.100.208 passed
falcen6-scan 192.168.100.209 passed
Verification of SCAN VIP and Listener setup failed
OCR detected on ASM. Running ACFS Integrity checks...
Starting check to see if ASM is running on all cluster nodes...
PRVF-5137 : Failure while checking ASM status on node "falcen6b"
Starting Disk Groups check to see if at least one Disk Group configured...
Disk Group Check passed. At least one Disk Group configured
Task ACFS Integrity check failed
Checking Oracle Cluster Voting Disk configuration...
Oracle Cluster Voting Disk configuration check passed
Checking to make sure user "oracle" is not in "root" group
Node Name Status Comment
falcen6b does not exist passed
falcen6a does not exist passed
Result: User "oracle" is not part of "root" group. Check passed
Checking if Clusterware is installed on all nodes...
Check of Clusterware install passed
Checking if CTSS Resource is running on all nodes...
Check: CTSS Resource running on all nodes
Node Name Status
falcen6b passed
falcen6a passed
Result: CTSS resource check passed
Querying CTSS for time offset on all nodes...
Result: Query of CTSS for time offset passed
Check CTSS state started...
Check: CTSS state
Node Name State
falcen6b Observer
falcen6a Observer
CTSS is in Observer state. Switching over to clock synchronization checks using NTP
Starting Clock synchronization checks using Network Time Protocol(NTP)...
NTP Configuration file check started...
The NTP configuration file "/etc/ntp.conf" is available on all nodes
NTP Configuration file check passed
Checking daemon liveness...
Check: Liveness for "ntpd"
Node Name Running?
falcen6b yes
falcen6a yes
Result: Liveness check passed for "ntpd"
Checking NTP daemon command line for slewing option "-x"
Check: NTP daemon command line
Node Name Slewing Option Set?
falcen6b yes
falcen6a yes
Result:
NTP daemon slewing option check passed
Checking NTP daemon's boot time configuration, in file "/etc/sysconfig/ntpd", for slewing option "-x"
Check: NTP daemon's boot time configuration
Node Name Slewing Option Set?
falcen6b yes
falcen6a yes
Result:
NTP daemon's boot time configuration check for slewing option passed
NTP common Time Server Check started...
NTP Time Server "133.243.236.19" is common to all nodes on which the NTP daemon is running
NTP Time Server "133.243.236.18" is common to all nodes on which the NTP daemon is running
NTP Time Server "210.173.160.86" is common to all nodes on which the NTP daemon is running
NTP Time Server ".LOCL." is common to all nodes on which the NTP daemon is running
Check of common NTP Time Server passed
Clock time offset check from NTP Time Server started...
Checking on nodes "[falcen6b, falcen6a]"...
Check: Clock time offset from NTP Time Server
Time Server: 133.243.236.19
Time Offset Limit: 1000.0 msecs
Node Name Time Offset Status
falcen6b 15.332 passed
falcen6a -1.503 passed
Time Server "133.243.236.19" has time offsets that are within permissible limits for nodes "[falcen6b, falcen6a]".
Time Server: 133.243.236.18
Time Offset Limit: 1000.0 msecs
Node Name Time Offset Status
falcen6b 15.115 passed
falcen6a -1.614 passed
Time Server "133.243.236.18" has time offsets that are within permissible limits for nodes "[falcen6b, falcen6a]".
Time Server: 210.173.160.86
Time Offset Limit: 1000.0 msecs
Node Name Time Offset Status
falcen6b 15.219 passed
falcen6a -1.527 passed
Time Server "210.173.160.86" has time offsets that are within permissible limits for nodes "[falcen6b, falcen6a]".
Time Server: .LOCL.
Time Offset Limit: 1000.0 msecs
Node Name Time Offset Status
falcen6b 0.0 passed
falcen6a 0.0 passed
Time Server ".LOCL." has time offsets that are within permissible limits for nodes "[falcen6b, falcen6a]".
Clock time offset check passed
Result: Clock synchronization check using Network Time Protocol(NTP) passed
Oracle Cluster Time Synchronization Services check passed
Post-check for cluster services setup was unsuccessful on all the nodes.
[oracle@falcen6a grid]$
Any suggestions?
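A run this long is easier to read once the failures are pulled out. A rough sketch in Python, assuming the cluvfy output above has been saved as text; `summarize_cluvfy` is a hypothetical helper written for this post, not something shipped with cluvfy:

```python
import re

# A PRVF message line starts with its code, e.g. "PRVF-4193 : Asm is not running ..."
PRVF_CODE = re.compile(r"^(PRVF-\d+)\s*:")


def summarize_cluvfy(output):
    """Return (codes, failed_lines) found in saved cluvfy output.

    codes        -- PRVF message codes, in order of appearance
    failed_lines -- every other line containing the word "failed"
                    (this includes table rows whose status is "failed")
    """
    codes = []
    failed_lines = []
    for raw in output.splitlines():
        line = raw.strip()
        match = PRVF_CODE.match(line)
        if match:
            codes.append(match.group(1))
        elif re.search(r"\bfailed\b", line):
            failed_lines.append(line)
    return codes, failed_lines
```

The match is deliberately simple and case-sensitive, so a summary line such as "Result: Check failed." is kept, while "passed" lines and the surrounding tables are skipped.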

Ocmstart.sh is not starting in one RAC node

I am having trouble starting "oracm" on one node. I can see it working on the second node but not on the first. Is anyone else having the same issue?

Hi Chandra, sorry for the delay. Here is the output from the cm.out file during a restart of the ocmstart.sh script:
=========================================================
TRACE: HandleJoin(): JOIN from node(1)->(1), tid = ClusterListener:3076 file = nmlisten.c, line = 362 {Thu Apr 10 21:39:10 2003 }
TRACE: HandleStatus(): node(0) UNKNOWN, tid = ClusterListener:3076 file = nmlisten.c, line = 404 {Thu Apr 10 21:39:11 2003 }
TRACE: HandleStatus(): src[0] dest[1] dom[0] seq[33] sync[10], tid = ClusterListener:3076 file = nmlisten.c, line = 415 {Thu Apr 10 21:39:11 2003 }
TRACE: HandleSync(): src[0] dest[1] dom[0] seq[34] sync[10], tid = ClusterListener:3076 file = nmlisten.c, line = 506 {Thu Apr 10 21:39:11 2003 }
TRACE: SendAck(): node(0) domain(0) syncSeqNo(10) type(11), tid = ClusterListener:3076 file = nmmember.c, line = 1913 {Thu Apr 10 21:39:11 2003 }
TRACE: HandleVote(): src[0] dest[1] dom[0] seq[35] sync[10], tid = ClusterListener:3076 file = nmlisten.c, line = 643 {Thu Apr 10 21:39:12 2003 }
TRACE: SendVoteInfo(): node(0) domain(0) syncSeqNo(10), tid = ClusterListener:3076 file = nmmember.c, line = 1727 {Thu Apr 10 21:39:12 2003 }
TRACE: HandleShutdown(): src[0] dest[1] dom[0] seq[0] sync[10] type[4], tid = ClusterListener:3076 file = nmlisten.c, line = 1087 {Thu Apr 10 21:39:13 2003 }
TRACE: IncrementEventValue: *(80f2920) = (1, 1), tid = ClusterListener:3076 file = unixinc.c, line = 253 {Thu Apr 10 21:39:13 2003 }
--- End Dump ---
oracm, version[ 9.2.0.2.0.47 ] started {Thu Apr 10 21:40:07 2003 }
KernelModuleName is hangcheck-timer {Thu Apr 10 21:40:07 2003 }
OemNodeConfig(): Network Address of node0: 192.168.0.50 (port 9998)
{Thu Apr 10 21:40:07 2003 }
OemNodeConfig(): Network Address of node1: 192.168.0.51 (port 9998)
{Thu Apr 10 21:40:07 2003 }
WARNING: OemInit2: Opened file(/var/opt/oracle/oradata/orcl/CMQuorumFile 8), tid = main:1024 file = oem.c, line = 491 {Thu Apr 10 21:40:07 2003 }
Debug Hang : ClusterListener (PID=2541) Registered withwatchdog daemon. {Thu Apr 10 21:40:09 2003 }
InitializeCM: ModuleName = hangcheck-timer {Thu Apr 10 21:40:09 2003 }
InitializeCM: Kernel module hangcheck-timer is already loaded {Thu Apr 10 21:40:09 2003 }
Debug Hang : CmConnectListener (PID=2542):Registered with watchdog daemon. {Thu Apr 10 21:40:09 2003 }
Debug Hang :StartNMMon (PID=2536) Registered with watchdog daemon. {Thu Apr 10 21:40:09 2003 }
CreateLocalEndpoint(): Network Address: 192.168.0.51
{Thu Apr 10 21:40:09 2003 }
Debug Hang :PollingThread (PID=135159169): Registered with {Thu Apr 10 21:40:09 2003 }
Debug Hang : DiskPingThread (PID=135159169): Registered with {Thu Apr 10 21:40:09 2003 }
Debug Hang :SendingThread (PID=135159169): Registered with {Thu Apr 10 21:40:09 2003 }
--- DUMP GROUP STATE DB ---
--- END OF GROUP STATE DUMP ---
--- Begin Dump ---
TRACE: LogListener: Spawned with tid 0x803., tid = 2051 file = logging.c, line = 116 {Thu Apr 10 21:40:07 2003 }
oracm, version[ 9.2.0.2.0.47 ] started {Thu Apr 10 21:40:07 2003 }
TRACE: Can't read registry value for WatchdogTimerMargin, tid = main:1024 file = unixinc.c, line = 1011 {Thu Apr 10 21:40:07 2003 }
TRACE: Can't read registry value for WatchdogSafetyMargin, tid = main:1024 file = unixinc.c, line = 1011 {Thu Apr 10 21:40:07 2003 }KernelModuleName is hangcheck-timer {Thu Apr 10 21:40:07 2003 }
TRACE: Can't read registry value for ClientTimeout, tid = main:1024 file = unixinc.c, line = 1011 {Thu Apr 10 21:40:07 2003 }
TRACE: InitNMInfo: setting clientTimeout to 215s based on MissCount 215 and PollInterval 1000ms, tid = main:1024 file = nmconfig.c, line = 137 {Thu Apr 10 21:40:07 2003 }
TRACE: InitClusterDb(): getservbyname on CMSrvr failed - 22 : assigning 9998, tid = main:1024 file = nmconfig.c, line = 212 {Thu Apr 10 21:40:07 2003 }OemNodeConfig(): Network Address of node0: 192.168.0.50 (port 9998)
{Thu Apr 10 21:40:07 2003 }
OemNodeConfig(): Network Address of node1: 192.168.0.51 (port 9998)
{Thu Apr 10 21:40:07 2003 }
TRACE: OemCreateListenPort: bound at 9998, tid = main:1024 file = oem.c, line = 858 {Thu Apr 10 21:40:07 2003 }
TRACE: InitClusterDb(): found my node info at 1 name rac2, priv rac2priv, port 3623, tid = main:1024 file = nmconfig.c, line = 265 {Thu Apr 10 21:40:07 2003 }
TRACE: InitClusterDb(): Local Node(1) NodeName[rac2], tid = main:1024 file = nmconfig.c, line = 283 {Thu Apr 10 21:40:07 2003 }
TRACE: InitClusterDb(): Cluster(Oracle) with (2) Defined Nodes, tid = main:1024 file = nmconfig.c, line = 286 {Thu Apr 10 21:40:07 2003 }
TRACE: OEMInits(): CM Disk File (/var/opt/oracle/oradata/orcl/CMQuorumFile), tid = main:1024 file = oem.c, line = 243 {Thu Apr 10 21:40:07 2003 }
WARNING: OemInit2: Opened file(/var/opt/oracle/oradata/orcl/CMQuorumFile 8), tid = main:1024 file = oem.c, line = 491 {Thu Apr 10 21:40:07 2003 }
TRACE: ReadOthersDskInfo(): node(0) rcfg(11) wrtcnt(1052) lastcnt(0) alive(1052), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(1) rcfg(10) wrtcnt(2) lastcnt(0) alive(1), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(2) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(3) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(4) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(5) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(6) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(7) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(8) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(9) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(10) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(11) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(12) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(13) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(14) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(15) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(16) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(17) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(18) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(19) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(20) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(21) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(22) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(23) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(24) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(25) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(26) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(27) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(28) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(29) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(30) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(31) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(32) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(33) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(34) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(35) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(36) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(37) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(38) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(39) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(40) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(41) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(42) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(43) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(44) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(45) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(46) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(47) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(48) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(49) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(50) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(51) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(52) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(53) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(54) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(55) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(56) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(57) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(58) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(59) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(60) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(61) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(62) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ReadOthersDskInfo(): node(63) rcfg(0) wrtcnt(0) lastcnt(0) alive(0), tid = main:1024 file = oem.c, line = 1442 {Thu Apr 10 21:40:09 2003 }
TRACE: ClusterListener: Spawned with tid 0xc04., tid = 3076 file = nmlistener.c, line = 53 {Thu Apr 10 21:40:09 2003 }Debug Hang : ClusterListener (PID=2541) Registered withwatchdog daemon. {Thu Apr 10 21:40:09 2003 }
TRACE: ClusterListener (pid=2541, tid=3076): Registered with watchdog daemon., tid = 3076 file = nmlistener.c, line = 75 {Thu Apr 10 21:40:09 2003 }InitializeCM: ModuleName = hangcheck-timer {Thu Apr 10 21:40:09 2003 }
InitializeCM: Kernel module hangcheck-timer is already loaded {Thu Apr 10 21:40:09 2003 }
TRACE: CmConnectListener: Spawned with tid 0x1005., tid = 4101 file = cmclient.c, line = 215 {Thu Apr 10 21:40:09 2003 }Debug Hang : CmConnectListener (PID=2542):Registered with watchdog daemon. {Thu Apr 10 21:40:09 2003 }
TRACE: CmConnectListener (pid=2542, tid=4101): Registered with watchdog daemon., tid = 4101 file = cmclient.c, line = 246 {Thu Apr 10 21:40:09 2003 }Debug Hang :StartNMMon (PID=2536) Registered with watchdog daemon. {Thu Apr 10 21:40:09 2003 }
TRACE: StartNMMon (pid=2536, tid=1024): Registered with watchdog daemon., tid = main:1024 file = cmnodemon.c, line = 251 {Thu Apr 10 21:40:09 2003 }CreateLocalEndpoint(): Network Address: 192.168.0.51
{Thu Apr 10 21:40:09 2003 }
TRACE: StartClusterJoin(): clusterState(0) nodeState(0), tid = main:1024 file = nmmember.c, line = 277 {Thu Apr 10 21:40:09 2003 }
TRACE: PollingThread: Spawned with tid 0x1406., tid = 5126 file = nmmember.c, line = 742 {Thu Apr 10 21:40:09 2003 }Debug Hang :PollingThread (PID=135159169): Registered with {Thu Apr 10 21:40:09 2003 }
TRACE: PollingThread (pid=2543, tid=5126): Registered with watchdog daemon., tid = 5126 file = nmmember.c, line = 760 {Thu Apr 10 21:40:09 2003 }
TRACE: DiskPingThread: Spawned with tid 0x1807., tid = 6151 file = nmmember.c, line = 1050 {Thu Apr 10 21:40:09 2003 }Debug Hang : DiskPingThread (PID=135159169): Registered with {Thu Apr 10 21:40:09 2003 }
TRACE: DiskPingThread (pid=2544, tid=6151): Registered with watchdog daemon., tid = 6151 file = nmmember.c, line = 1074 {Thu Apr 10 21:40:09 2003 }
TRACE: SendingThread: Spawned with tid 0x1c08, 0x0., tid = 7176 file = nmmember.c, line = 511 {Thu Apr 10 21:40:09 2003 }Debug Hang :SendingThread (PID=135159169): Registered with {Thu Apr 10 21:40:09 2003 }
TRACE: SendingThread (pid=2545, tid=7176): Registered with watchdog daemon., tid = 7176 file = nmmember.c, line = 576 {Thu Apr 10 21:40:09 2003 }
TRACE: HandleJoin(): src[1] dest[1] dom[0] seq[1] sync[0], tid = ClusterListener:3076 file = nmlisten.c, line = 346 {Thu Apr 10 21:40:09 2003 }
TRACE: HandleJoin(): JOIN from node(1)->(1), tid = ClusterListener:3076 file = nmlisten.c, line = 362 {Thu Apr 10 21:40:09 2003 }
TRACE: HandleStatus(): node(0) UNKNOWN, tid = ClusterListener:3076 file = nmlisten.c, line = 404 {Thu Apr 10 21:40:10 2003 }
TRACE: HandleStatus(): src[0] dest[1] dom[0] seq[36] sync[11], tid = ClusterListener:3076 file = nmlisten.c, line = 415 {Thu Apr 10 21:40:10 2003 }
TRACE: HandleSync(): src[0] dest[1] dom[0] seq[37] sync[11], tid = ClusterListener:3076 file = nmlisten.c, line = 506 {Thu Apr 10 21:40:10 2003 }
TRACE: SendAck(): node(0) domain(0) syncSeqNo(11) type(11), tid = ClusterListener:3076 file = nmmember.c, line = 1913 {Thu Apr 10 21:40:10 2003 }
TRACE: HandleVote(): src[0] dest[1] dom[0] seq[38] sync[11], tid = ClusterListener:3076 file = nmlisten.c, line = 643 {Thu Apr 10 21:40:11 2003 }
TRACE: SendVoteInfo(): node(0) domain(0) syncSeqNo(11), tid = ClusterListener:3076 file = nmmember.c, line = 1727 {Thu Apr 10 21:40:11 2003 }
TRACE: HandleShutdown(): src[0] dest[1] dom[0] seq[0] sync[11] type[4], tid = ClusterListener:3076 file = nmlisten.c, line = 1087 {Thu Apr 10 21:40:12 2003 }
TRACE: IncrementEventValue: *(80f2920) = (1, 1), tid = ClusterListener:3076 file = unixinc.c, line = 253 {Thu Apr 10 21:40:12 2003 }
--- End Dump ---
=========================================================
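Most of the ReadOthersDskInfo lines in the dump are all-zero slots. A small sketch in Python that keeps only the slots with non-zero counters, assuming cm.out has been read into a string. The function name is hypothetical, and the field names are copied from the trace as printed, without any claim about what oracm means by them:

```python
import re

# One trace entry looks like:
#   ReadOthersDskInfo(): node(0) rcfg(11) wrtcnt(1052) lastcnt(0) alive(1052), tid = ...
DSK_INFO = re.compile(
    r"ReadOthersDskInfo\(\): node\((\d+)\) rcfg\((\d+)\) "
    r"wrtcnt\((\d+)\) lastcnt\((\d+)\) alive\((\d+)\)"
)


def nodes_with_activity(trace_text):
    """Return {node: counters} for slots whose counters are not all zero."""
    active = {}
    for m in DSK_INFO.finditer(trace_text):
        node, rcfg, wrtcnt, lastcnt, alive = (int(g) for g in m.groups())
        if rcfg or wrtcnt or lastcnt or alive:
            active[node] = {
                "rcfg": rcfg,
                "wrtcnt": wrtcnt,
                "lastcnt": lastcnt,
                "alive": alive,
            }
    return active
```

On the dump above this leaves only node(0) and node(1); every slot from node(2) to node(63) is zero throughout.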

Exchange SP1 ECP hangs when one mailbox server is down

I have a native Exchange SP1 installation with one CAS server and two mailbox servers.
When I log on to the ECP, everything works and I can see mailbox servers 1 and 2.
But if I shut down mailbox server 1 or mailbox server 2, the ECP stops responding.
I closed IE and rebooted the CAS server. When I open the ECP again, authentication works, but the menu is very slow and often hangs when I select exchange admin center/server/serverdatabase.
The event log shows the CAS server trying to connect to the powered-off server in a loop.
I thought the CAS would simply no longer see the emails on the powered-off server, not that the whole system would go wild.
If I restart the mailbox server that was off, the ECP immediately works perfectly again.
This happens whether I turn off mailbox server 1 or 2.
Thanks for your help.

Hi,
Any server in a DAG can host a copy of a mailbox database from any other server in the DAG. When a server is added to a DAG, it works with the other servers in the DAG to provide automatic recovery from failures that affect mailbox databases, such as a disk, server, or network failure.
You can find more information in this document:
https://technet.microsoft.com/en-us/library/dd979799(v=exchg.150).aspx
Best Regards.
Lynn-Li
TechNet Community Support

Health Service Heartbeat Failure alerts generated when one Management Server is down

Hi,
I have two Management Servers, each managing about 100 servers. When one Management Server goes down unexpectedly, I receive 100 Health Service Heartbeat Failure alerts, one for each of its managed servers.
My question: why does a Management Server outage raise a Health Service Heartbeat Failure for every agent it manages?
Is there a way to change this?

SCOM 2012 agents fail over automatically when their primary management server is down. You can check the failover management servers with the following PowerShell script:
#Verify Failover for Agents reporting to MS1
$Agents = Get-SCOMAgent | where {$_.PrimaryManagementServerName -eq 'MS1.DOMAIN.COM'}
$Agents | sort | foreach {
    Write-Host "";
    "Agent :: " + $_.Name;
    "--Primary MS :: " + ($_.GetPrimaryManagementServer()).ComputerName;
    $failoverServers = $_.getFailoverManagementServers();
    # List every failover management server configured for this agent
    foreach ($managementServer in $failoverServers) {
        "--Failover MS :: " + ($managementServer.ComputerName);
    }
}
Write-Host "";
http://www.systemcentercentral.com/how-does-the-failover-process-work-in-opsmgr-2012-scom-sysctr/

Remove RAC node on Windows

I have done all the steps to remove one RAC node but got stuck at the step of running the rootdelete.sh file from the $CRS_HOME/install directory, as I don't have this file in a Windows environment.
What is the equivalent of rootdelete.sh on the Windows platform? I want to run it to remove the node info from the clusterware entry.
Is there a good document that explains how to remove a node on the Windows platform?

Hello,
You need to perform the following steps to remove a node from a RAC cluster on the Windows platform:
Perform the following steps on a node other than the node you want to delete:
1. Run the Database Configuration Assistant (DBCA) utility to delete the instance.
2. Then run the Net Configuration Assistant (NetCA) to delete the listener.
3. If the node that you are deleting has an ASM instance, then delete the ASM instance using the srvctl stop asm and srvctl remove asm commands.
4. Run the command srvctl stop nodeapps -n nodename, where nodename is the node to be deleted, to stop the node applications.
5. Run the command srvctl remove nodeapps -n nodename, where nodename is the node to be deleted, to remove the node applications.
6. Stop isqlplus if it is running.
7. Run the command setup.exe -updateNodeList ORACLE_HOME=Oracle_home ORACLE_HOME_NAME=Oracle_home_name CLUSTER_NODES=remaining nodes where remaining nodes is a list of the nodes that are to remain part of the cluster.
Perform the following steps on the deleted RAC node:
1. Run the command setup.exe -updateNodeList -local -noClusterEnabled ORACLE_HOME=Oracle_home ORACLE_HOME_NAME=Oracle_home_name CLUSTER_NODES="".
Note that you do not need a value for "" after the CLUSTER_NODES= entry in this command. If you delete more than one node, then you must run this command on every deleted node to remove the Oracle home if you have a non-shared Oracle home (non-cluster file system) installation.
2. On the same node, delete the Windows Registry entries and ASM services using Oradim.
3. From the deleted RAC node, run the command Oracle_home\oui\bin\setup.exe to start the Oracle Universal Installer (OUI). Select Deinstall Products and select the Oracle home that you want to de-install.
4. Then to delete the CRS node, from a remaining node run the command crssetup del -nn node_name of the deleted node, node number
5. Then run the command setup.exe -updateNodeList ORACLE_HOME=CRS home ORACLE_HOME_NAME=CRS home name CLUSTER_NODES=remaining nodes where remaining nodes is a list of the nodes that are to remain in the cluster.
6. Then on the deleted CRS node, run the command setup.exe -updateNodeList -local -noClusterEnabled ORACLE_HOME=CRS home ORACLE_HOME_NAME=CRS home name CLUSTER_NODES=""
7. Remove the Oracle home manually from the new node if the home is not shared and then manually remove the HKLM/software/Oracle registry keys and the Oracle services.
8. After adding or deleting nodes from your Oracle Database 10g with RAC environment, and after you are sure that your system is functioning properly, make a backup of the contents of the voting disk using the dd.exe utility. The dd.exe utility is part of the MKS toolkit.
ASM Instance Cleanup Procedures after Node Deletion on Windows-Based Platforms
The delete node procedure requires the following additional steps on Windows-based systems to remove the ASM instances:
1. If this is the Oracle home from which the node-specific listener named LISTENER_nodename runs, then use NetCA to remove this listener and its CRS resources. If necessary, re-create this listener in another home.
2. If this is the Oracle home from which the ASM instance runs, then remove the ASM configuration by running the following command for all nodes on which this Oracle home exists:
srvctl stop asm -n node
Then run the following command for the nodes that you are removing:
srvctl remove asm -n node
3. If you are using a cluster file system for your ASM Oracle home, then run the following commands on the local node:
rd -s -q %ORACLE_BASE%\admin\+ASM
delete %ORACLE_HOME%\database\*ASM*
4. If you are not using a cluster file system for your ASM Oracle home, then run the delete command mentioned in the previous step on each node on which the Oracle home exists.
5. Run the following command on each node that has an ASM instance:
oradim -delete -asmsid +ASMnode_number
Source:
Oracle® Real Application Clusters Administrator's Guide
10g Release 1 (10.1)
Part Number B10765-02
Chapter 5: Adding and Deleting Nodes and Instances
Hope this helps,
Ben Prusinski, Oracle 10g OCP
http://oracle-magician.blogspot.com

Backing up the RAC DB when either one of the node is down

11.2.0.2/Solaris 10 (x86-64bit)
For our 2-node production RAC DB, I configured the RMAN backup to run from Node1 via a cron job. Last weekend Node1 went down. Our notification system, which sends SMS alerts to our mobiles, went down that weekend as well. Only on Monday at noon did we learn that Node1 was down and that there were no backups for Saturday and Sunday.
How can I make sure that the RMAN backup of the DB is taken even if either one of the nodes goes down? A friend suggested the IBM TWS scheduler. Can Tivoli Workload Scheduler detect a dead RAC node and fire the RMAN backup from the surviving node?

I don't know the answer regarding TWS, but since you run the backup from crontab, I guess you don't have any third-party tool at the moment.
I think the easiest solution is to have the script and the crontab job on both servers and let the script decide which one runs the backup.
For example, the script that is scheduled in the crontab could do one of the following:
1. If $HOSTNAME is node1, run the backup. If $HOSTNAME is node2, check whether node1 is up and, if it is not, run the backup.
2. More elegant: check the output of "crsctl status resource" for some resource and run the backup accordingly. For example, the script could check where the SCAN1 VIP is located and treat that node as the one that runs the backup.
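The first option could be sketched roughly as follows. This is only an illustration under several assumptions that are not from the thread: the short hostnames are node1 and node2, a TCP connection to the SSH port is treated as "node1 is up", and /path/to/rman_backup.sh is a hypothetical wrapper script around RMAN.

```python
import socket
import subprocess

PRIMARY = "node1"    # assumed short hostname of the node that normally runs the backup
SECONDARY = "node2"  # assumed short hostname of the other RAC node
CHECK_PORT = 22      # assumption: SSH answers on node1 whenever the host is up
BACKUP_SCRIPT = "/path/to/rman_backup.sh"  # hypothetical wrapper around RMAN


def should_run_backup(hostname, primary_is_up):
    """Return True if this host should run the backup (option 1 above)."""
    short = hostname.split(".")[0].lower()
    if short == PRIMARY:
        return True
    if short == SECONDARY:
        # The secondary only takes over when the primary is unreachable.
        return not primary_is_up
    return False


def primary_reachable(timeout=5.0):
    """Crude liveness check: try to open a TCP connection to the primary node."""
    try:
        with socket.create_connection((PRIMARY, CHECK_PORT), timeout=timeout):
            return True
    except OSError:
        return False


def main():
    """Entry point for the cron job installed on both nodes."""
    if should_run_backup(socket.gethostname(), primary_reachable()):
        subprocess.run([BACKUP_SCRIPT], check=True)
```

The cron entry on each node would call main() at the same scheduled time. Note that a reachability check like this cannot tell a dead node from a broken network path, so both nodes might run the backup if only the link between them fails. The second option avoids that, because the VIP is hosted on exactly one node at a time.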
HTH
Liron
