VIP failover in Oracle RAC

Dear all,
I am using Oracle RAC 10gR2 running on top of Sun Cluster 3.2u3.
I ran a test to check VIP failover in Oracle RAC, but the result was not what I expected.
The test scenario was:
- Power on the two nodes and wait for all services, both Sun Cluster and Oracle RAC, to come online.
- Use SQL Navigator to connect to the database through the VIP on node1 (VIP1).
- Shut down node1.
- All services and resources on node2 stayed online; however, even after a long time (about 10 minutes), VIP1 had not failed over to the surviving node.
- The "crs_stat -t" command did not show VIP1 online on node2 (the surviving node).
- SQL Navigator could no longer connect to the database through VIP1.
The output of "crs_stat -t" command before shutting down the node1:
oracle@t5120-02 $ crs_stat -t
Name Type Target State Host
ora.orcl.db application ONLINE ONLINE t5120-02
ora....l1.inst application ONLINE ONLINE t5120-01
ora....l2.inst application ONLINE ONLINE t5120-02
ora....01.lsnr application ONLINE ONLINE t5120-01
ora....-01.gsd application ONLINE ONLINE t5120-01
ora....-01.ons application ONLINE ONLINE t5120-01
ora....-01.vip application ONLINE ONLINE t5120-01
ora....02.lsnr application ONLINE ONLINE t5120-02
ora....-02.gsd application ONLINE ONLINE t5120-02
ora....-02.ons application ONLINE ONLINE t5120-02
ora....-02.vip application ONLINE ONLINE t5120-02
The output of "crs_stat -t" command after shutting down the node1:
oracle@t5120-02 $ crs_stat -t
Name Type Target State Host
ora.orcl.db application ONLINE ONLINE t5120-02
ora....l1.inst application OFFLINE OFFLINE
ora....l2.inst application ONLINE ONLINE t5120-02
ora....01.lsnr application OFFLINE OFFLINE
ora....-01.gsd application OFFLINE OFFLINE
ora....-01.ons application OFFLINE OFFLINE
ora....-01.vip application OFFLINE OFFLINE
ora....02.lsnr application ONLINE ONLINE t5120-02
ora....-02.gsd application ONLINE ONLINE t5120-02
ora....-02.ons application ONLINE ONLINE t5120-02
ora....-02.vip application ONLINE ONLINE t5120-02
So my questions are:
- Was my test scenario correct to check the failover ability of VIP in Oracle RAC?
- Is there any additional configuration needed to perform on the system to achieve the VIP failover?
Please help me in this case as I am new to Oracle RAC.
Thanks.
HuyNQ.
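
To compare the two listings above without reading them line by line, the "crs_stat -t" output can be parsed with a few lines of Python. This is only a sketch: it assumes the whitespace-separated five-column layout shown in this post, and the function names parse_crs_stat and vip_hosts are made up for illustration.

```python
def parse_crs_stat(text):
    """Parse `crs_stat -t` tabular output into a list of dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the shell prompt line and the column header.
        if len(parts) < 4 or not parts[0].startswith("ora."):
            continue
        name, rtype, target, state = parts[:4]
        host = parts[4] if len(parts) > 4 else None  # no host when OFFLINE
        rows.append({"name": name, "type": rtype, "target": target,
                     "state": state, "host": host})
    return rows


def vip_hosts(text):
    """Map each VIP resource to the host it is ONLINE on, or None."""
    return {r["name"]: (r["host"] if r["state"] == "ONLINE" else None)
            for r in parse_crs_stat(text) if r["name"].endswith(".vip")}


# VIP rows copied from the two listings in this post.
before_shutdown = """\
Name Type Target State Host
ora....-01.vip application ONLINE ONLINE t5120-01
ora....-02.vip application ONLINE ONLINE t5120-02
"""
after_shutdown = """\
Name Type Target State Host
ora....-01.vip application OFFLINE OFFLINE
ora....-02.vip application ONLINE ONLINE t5120-02
"""

print(vip_hosts(before_shutdown))
print(vip_hosts(after_shutdown))
```

If the VIP had failed over, the second dictionary would map the node1 VIP to t5120-02. Here it maps to None, which matches what the test observed.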

Dear Rajesh,
Sorry for the late reply.
I have now tested two cases: shutting down a node and crashing a node. Below is the output of the log files for the two test cases.
Once again, when shutting down a node, the VIP did not fail over, although CRS on that node was shut down before all the other Sun Cluster services and resources.
Please check the log files and advise me if you see anything abnormal.
Thanks.
* In case of shutting down the node 1: (at about 09:05 Sep 17)
Shutdown node 1:
root@t5120-01 # shutdown -y -g0 -i0
Shutdown started. Fri Sep 17 09:04:55 ICT 2010
Changing to init state 0 - please wait
Broadcast Message from root (console) on t5120-01 Fri Sep 17 09:04:55...
THE SYSTEM t5120-01 IS BEING SHUT DOWN NOW ! ! !
Log off now or risk your files being damaged
crsd.log file on node 2:
root@t5120-02 # more /u01/app/oracle/10.2.0/crs/log/t5120-02/crsd/crsd.log
2010-09-16 16:35:56.281: [  CRSRES][1326] t5120-02 : CRS-1019: Resource ora.t5120-01.gsd (application) cannot run on t5120-02
2010-09-16 16:35:56.320: [  CRSRES][1325] t5120-02 : CRS-1019: Resource ora.t5120-01.LISTENER_T5120-01.lsnr (application) cannot run on t5120-02
2010-09-16 16:35:56.346: [  CRSRES][1327] t5120-02 : CRS-1019: Resource ora.t5120-01.ons (application) cannot run on t5120-02
2010-09-16 17:06:10.202: [  CRSRES][1520] StopResource: setting CLI values
2010-09-17 09:06:10.567: [ CRSCOMM][5709] CLEANUP: Searching for connections to failed node t5120-01
2010-09-17 09:06:10.577: [  CRSEVT][5709] Processing member leave for t5120-01, incarnation: 11
2010-09-17 09:06:10.665: [    CRSD][5709] SM: recovery in process: 8
2010-09-17 09:06:10.665: [  CRSEVT][5709] Do failover for: t5120-01
2010-09-17 09:06:10.826: [  CRSEVT][5709] Post recovery done evmd event for: t5120-01
2010-09-17 09:06:10.898: [    CRSD][5709] SM: recoveryDone: 0
2010-09-17 09:06:10.918: [  CRSEVT][5710] Processing RecoveryDone
crs_stat -t on node 2:
oracle@t5120-02 $ crs_stat -t
Name Type Target State Host
ora.orcl.db application ONLINE ONLINE t5120-02
ora....l1.inst application OFFLINE OFFLINE
ora....l2.inst application ONLINE ONLINE t5120-02
ora....01.lsnr application OFFLINE OFFLINE
ora....-01.gsd application OFFLINE OFFLINE
ora....-01.ons application OFFLINE OFFLINE
ora....-01.vip application OFFLINE OFFLINE
ora....02.lsnr application ONLINE ONLINE t5120-02
ora....-02.gsd application ONLINE ONLINE t5120-02
ora....-02.ons application ONLINE ONLINE t5120-02
ora....-02.vip application ONLINE ONLINE t5120-02
* In case of crashing the node 1: (at about 09:32 Sep 17)
Crash the node 1:
root@t5120-01 # Sep 17 09:31:16 t5120-01 Cluster.CCR: pmmd: fsync_core_files: could not get any core file paths: pcorefile error Invalid argument, gcorefile error Invalid argument, zcorefile error Invalid argument
Sep 17 09:31:16 t5120-01 Cluster.CCR: [ID 408757 daemon.alert] pmmd: fsync_core_files: could not get any core file paths: pcorefile error Invalid argument, gcorefile error Invalid argument, zcorefile error Invalid argument
Notifying cluster that this node is panicking
crsd.log file on node 2:
root@t5120-02 # tail -30 /u01/app/oracle/10.2.0/crs/log/t5120-02/crsd/crsd.log
2010-09-16 16:35:56.281: [  CRSRES][1326] t5120-02 : CRS-1019: Resource ora.t5120-01.gsd (application) cannot run on t5120-02
2010-09-16 16:35:56.320: [  CRSRES][1325] t5120-02 : CRS-1019: Resource ora.t5120-01.LISTENER_T5120-01.lsnr (application) cannot run on t5120-02
2010-09-16 16:35:56.346: [  CRSRES][1327] t5120-02 : CRS-1019: Resource ora.t5120-01.ons (application) cannot run on t5120-02
2010-09-16 17:06:10.202: [  CRSRES][1520] StopResource: setting CLI values
2010-09-17 09:06:10.567: [ CRSCOMM][5709] CLEANUP: Searching for connections to failed node t5120-01
2010-09-17 09:06:10.577: [  CRSEVT][5709] Processing member leave for t5120-01, incarnation: 11
2010-09-17 09:06:10.665: [    CRSD][5709] SM: recovery in process: 8
2010-09-17 09:06:10.665: [  CRSEVT][5709] Do failover for: t5120-01
2010-09-17 09:06:10.826: [  CRSEVT][5709] Post recovery done evmd event for: t5120-01
2010-09-17 09:06:10.898: [    CRSD][5709] SM: recoveryDone: 0
2010-09-17 09:06:10.918: [  CRSEVT][5710] Processing RecoveryDone
2010-09-17 09:32:08.810: [ CRSCOMM][5837] CLEANUP: Searching for connections to failed node t5120-01
2010-09-17 09:32:08.811: [  CRSEVT][5837] Processing member leave for t5120-01, incarnation: 13
2010-09-17 09:32:08.824: [    CRSD][5837] SM: recovery in process: 8
2010-09-17 09:32:08.824: [  CRSEVT][5837] Do failover for: t5120-01
2010-09-17 09:32:09.036: [  CRSRES][5837] startup = 0
2010-09-17 09:32:09.075: [  CRSRES][5837] startup = 0
2010-09-17 09:32:09.106: [  CRSRES][5837] startup = 0
2010-09-17 09:32:09.132: [  CRSRES][5837] startup = 0
2010-09-17 09:32:09.153: [  CRSRES][5837] startup = 0
2010-09-17 09:32:09.565: [  CRSRES][5839] startRunnable: setting CLI values
2010-09-17 09:32:09.575: [  CRSRES][5839] Attempting to start `ora.t5120-01.vip` on member `t5120-02`
2010-09-17 09:32:16.276: [  CRSRES][5839] Start of `ora.t5120-01.vip` on member `t5120-02` succeeded.
2010-09-17 09:32:16.340: [  CRSEVT][5837] Post recovery done evmd event for: t5120-01
2010-09-17 09:32:16.342: [    CRSD][5837] SM: recoveryDone: 0
2010-09-17 09:32:16.348: [  CRSEVT][5846] Processing RecoveryDone
crs_stat -t on node 2:
oracle@t5120-02 $ crs_stat -t
Name Type Target State Host
ora.orcl.db application ONLINE ONLINE t5120-02
ora....l1.inst application ONLINE OFFLINE
ora....l2.inst application ONLINE ONLINE t5120-02
ora....01.lsnr application ONLINE OFFLINE
ora....-01.gsd application ONLINE OFFLINE
ora....-01.ons application ONLINE OFFLINE
ora....-01.vip application ONLINE ONLINE t5120-02
ora....02.lsnr application ONLINE ONLINE t5120-02
ora....-02.gsd application ONLINE ONLINE t5120-02
ora....-02.ons application ONLINE ONLINE t5120-02
ora....-02.vip application ONLINE ONLINE t5120-02
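
The difference between the two cases shows up in crsd.log. After the clean shutdown, "Do failover for: t5120-01" is followed directly by the recovery-done messages. After the crash, the same event is followed by an attempt to start ora.t5120-01.vip on t5120-02. The crs_stat listings differ in the same way: the node1 resources have Target OFFLINE after the shutdown and Target ONLINE after the crash. A rough way to pull this out of the log is sketched below; the regular expressions only assume the message formats visible in the excerpts above, and the function name is hypothetical.

```python
import re

FAILOVER_RE = re.compile(r"Do failover for: (\S+)")
START_RE = re.compile(r"Start of `([^`]+)` on member `([^`]+)` (succeeded|failed)")


def starts_after_failover(log_text, failed_node):
    """Return (resource, member, outcome) tuples logged after the last
    'Do failover for: <failed_node>' line, or [] if there are none."""
    lines = log_text.splitlines()
    last = None
    for i, line in enumerate(lines):
        m = FAILOVER_RE.search(line)
        if m and m.group(1) == failed_node:
            last = i
    if last is None:
        return []
    return [m.groups() for m in map(START_RE.search, lines[last + 1:]) if m]


# Lines copied from the two crsd.log excerpts in this post.
shutdown_log = """\
2010-09-17 09:06:10.665: [  CRSEVT][5709] Do failover for: t5120-01
2010-09-17 09:06:10.826: [  CRSEVT][5709] Post recovery done evmd event for: t5120-01
2010-09-17 09:06:10.898: [    CRSD][5709] SM: recoveryDone: 0
"""
crash_log = """\
2010-09-17 09:32:08.824: [  CRSEVT][5837] Do failover for: t5120-01
2010-09-17 09:32:09.575: [  CRSRES][5839] Attempting to start `ora.t5120-01.vip` on member `t5120-02`
2010-09-17 09:32:16.276: [  CRSRES][5839] Start of `ora.t5120-01.vip` on member `t5120-02` succeeded.
2010-09-17 09:32:16.340: [  CRSEVT][5837] Post recovery done evmd event for: t5120-01
"""

print(starts_after_failover(shutdown_log, "t5120-01"))
print(starts_after_failover(crash_log, "t5120-01"))
```

An empty result for the shutdown case means CRS on node 2 never tried to start the VIP, so there is no failed start to debug there; the question is rather why no start was attempted.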

Similar Messages

  • JDBC connection creation with ORACLE RAC

    Hello All,
    My scenario: whenever one of my VIP instances in Oracle RAC goes down, WebLogic/Java (JDBC) takes close to 3 minutes to fail over to the secondary host. I am looking for a way to reduce this connect-time failover delay.
    jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=HOST1)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=HOST2)(PORT=1521))(FAILOVER=on)(LOAD_BALANCE=off))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=SERVICE)))

    Hi,
    In this case failover depends on RAC rather than on WebLogic.
    Oracle does not recommend this approach.
    Please try using a multi data source to implement the failover instead. In older versions the failover time is 60 seconds.
    It can be changed by applying the fix for a bug (I do not remember the number at the moment).
    Regards,
    Kal
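
For what it is worth, the idea behind connect-time failover across an address list can be shown outside JDBC. The sketch below is not how the Oracle driver or WebLogic implements it. It only illustrates that a short, explicit connect timeout per address keeps a dead host from holding up the attempt on the next one. The function name and the 3-second default are assumptions made for this example.

```python
import socket


def first_reachable(addresses, timeout=3.0):
    """Try each (host, port) in order and return the first one that accepts
    a TCP connection within `timeout` seconds, or None if all of them fail."""
    for host, port in addresses:
        try:
            # create_connection raises OSError on refusal, timeout or
            # name-resolution failure; in every case we move on.
            with socket.create_connection((host, port), timeout=timeout):
                return (host, port)
        except OSError:
            continue
    return None
```

With the address list from the URL above, this would be called as first_reachable([("HOST1", 1521), ("HOST2", 1521)]). A successful TCP connect only shows that something is listening on the port, not that the database service is available there.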

  • Failover not happening the Oracle RAC 10g

    Hi All,
    I am new to RAC.
    I have installed Oracle RAC 10g on Red Hat Linux 4.0. Until yesterday failover was working: when I stopped one instance on node01, the VIP of node01 moved to node02, as shown by ifconfig -a. Now that is no longer happening, and I do not know what has changed. Can you please help me out?
    The relevant information is below:
    [oracle@node01 ~]$ crs_stat -t
    Name Type Target State Host
    ora.hitesh.db application ONLINE ONLINE node02
    ora....h1.inst application ONLINE ONLINE node01
    ora....h2.inst application OFFLINE OFFLINE
    ora....SM1.asm application ONLINE ONLINE node01
    ora....01.lsnr application ONLINE ONLINE node01
    ora.node01.gsd application ONLINE ONLINE node01
    ora.node01.ons application ONLINE ONLINE node01
    ora.node01.vip application ONLINE ONLINE node01
    ora....SM2.asm application ONLINE ONLINE node02
    ora....02.lsnr application ONLINE ONLINE node02
    ora.node02.gsd application ONLINE ONLINE node02
    ora.node02.ons application ONLINE ONLINE node02
    ora.node02.vip application ONLINE ONLINE node02
    Listener status on node01:
    [oracle@node01 ~]$ lsnrctl status
    LSNRCTL for Linux: Version 10.2.0.1.0 - Production on 06-APR-2013 12:59:29
    Copyright (c) 1991, 2005, Oracle. All rights reserved.
    Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
    STATUS of the LISTENER
    Alias LISTENER_NODE01
    Version TNSLSNR for Linux: Version 10.2.0.1.0 - Production
    Start Date 06-APR-2013 11:59:03
    Uptime 0 days 1 hr. 0 min. 25 sec
    Trace Level off
    Security ON: Local OS Authentication
    SNMP OFF
    Listener Parameter File /home/oracle/oracle/product/10.2.0/db_1/network/admin/listener.ora
    Listener Log File /home/oracle/oracle/product/10.2.0/db_1/network/log/listener_node01.log
    Listening Endpoints Summary...
    (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.131)(PORT=1521)))
    (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=1521)))
    Services Summary...
    Service "+ASM" has 1 instance(s).
    Instance "+ASM1", status BLOCKED, has 1 handler(s) for this service...
    Service "+ASM_XPT" has 1 instance(s).
    Instance "+ASM1", status BLOCKED, has 1 handler(s) for this service...
    Service "PLSExtProc" has 1 instance(s).
    Instance "PLSExtProc", status UNKNOWN, has 1 handler(s) for this service...
    Service "hitesh" has 2 instance(s).
    Instance "hitesh1", status READY, has 2 handler(s) for this service...
    Instance "hitesh2", status READY, has 1 handler(s) for this service...
    Service "hiteshXDB" has 2 instance(s).
    Instance "hitesh1", status READY, has 1 handler(s) for this service...
    Instance "hitesh2", status READY, has 1 handler(s) for this service...
    Service "hitesh_XPT" has 2 instance(s).
    Instance "hitesh1", status READY, has 2 handler(s) for this service...
    Instance "hitesh2", status READY, has 1 handler(s) for this service...
    The command completed successfully
    [root@node01 oracle]# crsctl check crs
    CSS appears healthy
    CRS appears healthy
    EVM appears healthy
    [root@node01 oracle]# ps -ef | grep lmon
    oracle 5741 1 0 12:07 ? 00:00:03 ora_lmon_hitesh1
    root 22582 20805 0 13:01 pts/2 00:00:00 grep lmon
    oracle 23643 1 0 11:58 ? 00:00:01 asm_lmon_+ASM1
    Please let me know what other information is required.
    Edited by: user12924280 on Apr 6, 2013 12:36 AM

    Since you didn't say "thank you", I assumed my time was of no value to you.
    However, I shall try again.
    There is no relationship between instance failure and VIP failover. How can there be? What if you are running ten instances on each node, and one fails? Would you want the VIP to relocate? And I've already told you how to test it: kill the node. Just reboot it.

  • Oracle RAC 11g R1 Release Connection Failover Problem

    Hi All,
    In our Architecture we are using Oracle RAC 11g R1. Below is the JDBC URL :
    JDBCURL = jdbc:oracle:thin:@(DESCRIPTION =(ADDRESS = (PROTOCOL = TCP)(HOST = Host1-vip)(PORT = 1521))(ADDRESS = (PROTOCOL = TCP)(HOST = Host2-vip)(PORT = 1521))(LOAD_BALANCE = ON)(FAILOVER=ON)(CONNECT_DATA =(SERVER = DEDICATED)(SERVICE_NAME = <Service_name>)))
    We are using a two-node RAC. The problem is that whenever we reboot a node and it rejoins the cluster, the application servers do not recognize it.
    Suppose we have node1 and node2. I take down node1 (freezing the cluster), then reboot node1 and bring it back up so that it rejoins the cluster. At this point my application servers do not recognize that a database server (node1) has joined the cluster until I restart them.
    Please suggest a solution. Thanks a lot to everyone in advance.
    Edited by: 877010 on Aug 4, 2011 2:00 PM
    Edited by: 877010 on Aug 8, 2011 10:19 AM

    Please try using this
    JDBCURL = jdbc:oracle:thin:@(DESCRIPTION =(ADDRESS = (PROTOCOL = TCP)(HOST = Host1-vip)(PORT = 1521))(ADDRESS = (PROTOCOL = TCP)(HOST = Host2-vip)(PORT = 1521))(LOAD_BALANCE = YES)(FAILOVER=YES)(CONNECT_DATA =(SERVER = DEDICATED)(SERVICE_NAME = <Service_name>)))

  • Oracle11g r2 Grid/RAC  VIP failover instead of SCAN VIP failover

    Dear Experts and Gurus
    Our platform: 2-node Oracle 11g R2 RAC/Grid 11.2.0.1.0
    Red Hat Enterprise Linux 5.3, 64-bit
    We do not have a DNS server available for the SCAN feature of Oracle 11g R2 Grid/RAC.
    We have successfully deployed the setup at our production site using a scan-vip entry in /etc/hosts.
    We want to use Oracle 11g R2 Grid/RAC the way Oracle 10g R2 RAC and Oracle 11g R1 RAC are used (VIP failover).
    Please find the default configuration of my setup below.
    cat /etc/hosts
    #public
    xxx.xxx.0.1 xyz-ch-aaadb-01
    xxx.xxx.0.2 xyzl-ch-aaadb-02
    #Virtual
    xxx.xxx.0.3 xyz-ch-aaadb-01-vip
    xxx.xxx.0.4 xyz-ch-aaadb-02-vip
    #Private
    10.10.0.1 xyz-ch-aaadb-01-priv
    10.10.0.2 xyz-ch-aaadb-01-priv
    #Scan
    xxx.xxx.0.5 rac-scan
    cat listener.ora
    listener.ora in both the RAC nodes
    LISTENER=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))) # line added by Agent
    LISTENER_SCAN1=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1)))) # line added by Agent
    ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_SCAN1=ON # line added by Agent
    ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=ON # line added by Agent
    cat tnsnames.ora.
    AAADB =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = rac-scan)(PORT = 1521))
        (CONNECT_DATA =
          (SERVER = DEDICATED)
          (SERVICE_NAME = aaadb)
        )
      )
    aaadb2 =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = xyz-ch-aaadb-02-vip)(PORT = 1521))
        (CONNECT_DATA =
          (SERVER = DEDICATED)
          (SERVICE_NAME = aaadb)
          (INSTANCE_NAME = aaadb2)
        )
      )
    aaadb1 =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = xyz-ch-aaadb-01-vip)(PORT = 1521))
        (CONNECT_DATA =
          (SERVER = DEDICATED)
          (SERVICE_NAME = aaadb)
          (INSTANCE_NAME = aaadb1)
        )
      )
    listener parameters
    RAC-NODE1
    SQL> show parameter listener
    NAME TYPE VALUE
    listener_networks string
    local_listener string (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=xyz-ch-aaadb-01-vip)(PORT=1521))))
    remote_listener string rac-scan:1521
    RAC-NODE2
    SQL> show parameter listener
    NAME TYPE VALUE
    listener_networks string
    local_listener string (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=xyz-ch-aaadb-02-vip)(PORT=1521))))
    remote_listener string rac-scan:1521
    listener status
    RAC Node-1
    [oracle@aaarac1 ~]$ lsnrctl status
    LSNRCTL for Linux: Version 11.2.0.1.0 - Production on 30-AUG-2011 23:43:50
    Copyright (c) 1991, 2009, Oracle. All rights reserved.
    Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
    STATUS of the LISTENER
    Alias LISTENER
    Version TNSLSNR for Linux: Version 11.2.0.1.0 - Production
    Start Date 30-AUG-2011 22:31:34
    Uptime 0 days 1 hr. 12 min. 15 sec
    Trace Level off
    Security ON: Local OS Authentication
    SNMP OFF
    Listener Parameter File /u01/app/11.2.0/grid/network/admin/listener.ora
    Listener Log File /u01/app/oracle/diag/tnslsnr/aaarac1/listener/alert/log.xml
    Listening Endpoints Summary...
    (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
    (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xyz-ch-aaadb-01-vip)(PORT=1521)))
    (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xxx.xxx.0.1)(PORT=1521)))
    Services Summary...
    Service "+ASM" has 1 instance(s).
    Instance "+ASM1", status READY, has 1 handler(s) for this service...
    Service "aaadb" has 1 instance(s).
    Instance "aaadb1", status READY, has 1 handler(s) for this service...
    Service "aaadbXDB" has 1 instance(s).
    Instance "aaadb1", status READY, has 1 handler(s) for this service...
    The command completed successfully
    RAC Node-2
    [oracle@aaarac2 ~]$ lsnrctl status
    LSNRCTL for Linux: Version 11.2.0.1.0 - Production on 30-AUG-2011 23:44:27
    Copyright (c) 1991, 2009, Oracle. All rights reserved.
    Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
    STATUS of the LISTENER
    Alias LISTENER
    Version TNSLSNR for Linux: Version 11.2.0.1.0 - Production
    Start Date 30-AUG-2011 22:08:45
    Uptime 0 days 1 hr. 35 min. 42 sec
    Trace Level off
    Security ON: Local OS Authentication
    SNMP OFF
    Listener Parameter File /u01/app/11.2.0/grid/network/admin/listener.ora
    Listener Log File /u01/app/oracle/diag/tnslsnr/aaarac2/listener/alert/log.xml
    Listening Endpoints Summary...
    (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
    (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xyz-ch-aaadb-02-vip)(PORT=1521)))
    (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xxx.xxx.0.2)(PORT=1521)))
    Services Summary...
    Service "+ASM" has 1 instance(s).
    Instance "+ASM2", status READY, has 1 handler(s) for this service...
    Service "aaadb" has 1 instance(s).
    Instance "aaadb2", status READY, has 1 handler(s) for this service...
    Service "aaadbXDB" has 1 instance(s).
    Instance "aaadb2", status READY, has 1 handler(s) for this service...
    The command completed successfully
    Please suggest the steps to configure listener.ora and tnsnames.ora so that Oracle 11g R2 Grid/RAC uses VIP failover instead of SCAN-VIP failover.
    Regards
    Hitgon
    Edited by: hitgon on Aug 31, 2011 12:14 AM

    hitgon wrote:
    Dear Experts and Gurus
    plz suggest the provide the step to configure the listener.ora and tnsnames.ora for use the Oracle11g r2 Grid/RAC to use as
    VIP failover instead of SCAN-VIP failover.
    Regards
    Hitgon
    Hi,
    Have a read of http://download.oracle.com/docs/cd/E11882_01/network.112/e10836/advcfg.htm#NETAG348
    Hope it helps.
    Cheers

  • In oracle rac, If user query a select query and in processing data is fetched but in the duration of fetching the particular node is evicted then how failover to another node internally?

    In Oracle RAC, if a user runs a SELECT query and the node is evicted while the rows are still being fetched, how does the session fail over to another node internally?

    The query is re-issued as a flashback query and the client process can continue to fetch from the cursor. This is described in the Net Services Administrators Guide, the section on Transparent Application Failover.
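
The behaviour described in this answer can be pictured with a toy model. This is not Oracle code and makes no claim about the real internals. It only assumes what the answer states: the query is run again against the same point-in-time data, and the client carries on from the row it had reached. All names here are invented for the illustration.

```python
class NodeEvicted(Exception):
    """Raised by the toy cursor when its node disappears mid-fetch."""


def cursor(snapshot, fail_at=None):
    """Yield rows from a fixed snapshot; optionally fail before row `fail_at`."""
    for i, row in enumerate(snapshot):
        if fail_at is not None and i == fail_at:
            raise NodeEvicted()
        yield row


def fetch_all_with_failover(snapshot, fail_at=None):
    """Fetch every row, re-executing once on a 'surviving node' after a failure."""
    fetched = []
    try:
        for row in cursor(snapshot, fail_at):
            fetched.append(row)
    except NodeEvicted:
        already = len(fetched)
        # Re-run the same query against the same snapshot, then discard
        # the rows the client has already received.
        for i, row in enumerate(cursor(snapshot)):
            if i >= already:
                fetched.append(row)
    return fetched


rows = ["r1", "r2", "r3", "r4", "r5"]
print(fetch_all_with_failover(rows, fail_at=2))
```

The client ends up with the same five rows whether or not the failure happens, which is the point of re-running the query against consistent data rather than against the current state.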

  • Oracle RAC installation failover

    Hi,
    I have an Oracle RAC installation with 2 nodes, with the data stored on a shared OCFS partition. I had a client test the connection using a JDBC connection string for RAC failover. When I shut down one of the nodes in the RAC installation, the client could not connect to the Oracle cluster database for the next 5 to 10 minutes.
    I understand that the client should fail over to the next available listener (on the next connection retry) if the node it is currently connected to has failed. Is there any configuration I should change to make failover faster?
    Thanks for any advice.

    Hi,
    Server side failover is arranged by setting the remote_listener parameter.
    Client side failover is set by using T(ransparent) A(pplication) F(ailover) (9i and higher)
    or F(ast)C(onnection)F(ailover). Both are documented in the Net administrators manual for the version you didn't care to mention.
    As far as I know, both TAF and FCF are not supported by the JDBC thin driver.
    Sybrand Bakker
    Senior Oracle DBA

  • Oracle RAC nodeapps startup error - vip:IP:192.168.2.200 is already up in the network

    Environment: Red Hat Linux 5, Oracle RAC 10.2.0.5.
    Starting nodeapps fails with the following errors:
    [oracle@rac1 ~]$ srvctl start nodeapps -n rac1
    CRS-0210: Could not find resource ora.rac1.gsd.
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    Excerpt from crsd.log:
    2012-11-21 20:10:16.831: [  CRSRES][2717907856]0startRunnable: setting CLI values
    2012-11-21 20:10:16.967: [  CRSRES][2717907856]0Attempting to start `ora.rac1.vip` on member `rac1`
    2012-11-21 20:10:44.246: [  CRSAPP][2717907856]0StartResource error for ora.rac1.vip error code = 1
    2012-11-21 20:10:47.007: [  CRSRES][2717907856]0Start of `ora.rac1.vip` on member `rac1` failed.
    2012-11-21 20:10:47.529: [  CRSRES][2717907856]0Attempting to start `ora.rac1.vip` on member `rac2`
    2012-11-21 20:11:18.649: [  CRSRES][2717907856]0Start of `ora.rac1.vip` on member `rac2` failed.
    2012-11-21 20:11:18.897: [  CRSRES][2717907856]0CRS-1006: No more members to consider
    2012-11-21 20:11:20.986: [  CRSRES][2717907856]0startRunnable: setting CLI values
    2012-11-21 20:11:21.190: [  CRSRES][2717907856]0Attempting to start `ora.rac1.vip` on member `rac1`
    2012-11-21 20:11:48.846: [  CRSAPP][2717907856]0StartResource error for ora.rac1.vip error code = 1
    2012-11-21 20:11:51.203: [  CRSRES][2717907856]0Start of `ora.rac1.vip` on member `rac1` failed.
    2012-11-21 20:11:51.492: [  CRSRES][2717907856]0rac2 : CRS-1019: Resource ora.rac1.LISTENER_RAC1.lsnr (application) cannot run on rac2
    How can I further troubleshoot the "rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)" error? Thanks.

    ping output:
    [oracle@rac1 ~]$ ping 192.168.2.200
    PING 192.168.2.200 (192.168.2.200) 56(84) bytes of data.
    64 bytes from 192.168.2.200: icmp_seq=1 ttl=64 time=0.043 ms
    64 bytes from 192.168.2.200: icmp_seq=2 ttl=64 time=0.126 ms
    64 bytes from 192.168.2.200: icmp_seq=3 ttl=64 time=0.059 ms
    64 bytes from 192.168.2.200: icmp_seq=4 ttl=64 time=0.158 ms
    64 bytes from 192.168.2.200: icmp_seq=5 ttl=64 time=0.643 ms
    64 bytes from 192.168.2.200: icmp_seq=6 ttl=64 time=0.034 ms
    64 bytes from 192.168.2.200: icmp_seq=7 ttl=64 time=0.046 ms
    64 bytes from 192.168.2.200: icmp_seq=8 ttl=64 time=0.043 ms
    64 bytes from 192.168.2.200: icmp_seq=9 ttl=64 time=0.048 ms
    64 bytes from 192.168.2.200: icmp_seq=10 ttl=64 time=0.031 ms
    telnet output:
    [oracle@rac1 ~]$ telnet 192.168.2.200
    Trying 192.168.2.200...
    Connected to 192.168.2.200.
    Escape character is '^]'.
    Red Hat Enterprise Linux Server release 5 (Tikanga)
    Kernel 2.6.18-8.el5xen on an i686
    login:
    ssh output:
    [oracle@rac1 ~]$ ssh 192.168.2.200
    Last login: Sun Nov 18 13:37:10 2012 from rac2-vip
    ifconfig -a output:
    [root@rac1 ~]# ifconfig -a
    eth0 Link encap:Ethernet HWaddr 00:0C:29:B6:CE:6B
    inet addr:192.168.2.100 Bcast:192.168.2.255 Mask:255.255.255.0
    inet6 addr: fe80::20c:29ff:feb6:ce6b/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:142396 errors:0 dropped:0 overruns:0 frame:0
    TX packets:172561 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:41403126 (39.4 MiB) TX bytes:96009307 (91.5 MiB)
    Interrupt:18 Base address:0x1480
    eth1 Link encap:Ethernet HWaddr 00:0C:29:B6:CE:75
    inet addr:192.168.2.200 Bcast:192.168.2.255 Mask:255.255.255.0
    inet6 addr: fe80::20c:29ff:feb6:ce75/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:14082 errors:0 dropped:0 overruns:0 frame:0
    TX packets:29 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:9789756 (9.3 MiB) TX bytes:1434 (1.4 KiB)
    Interrupt:19 Base address:0x1800
    eth2 Link encap:Ethernet HWaddr 00:0C:29:B6:CE:7F
    inet addr:192.168.3.100 Bcast:192.168.3.255 Mask:255.255.255.0
    inet6 addr: fe80::20c:29ff:feb6:ce7f/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:12665 errors:0 dropped:0 overruns:0 frame:0
    TX packets:32728 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:6590271 (6.2 MiB) TX bytes:28437643 (27.1 MiB)
    Interrupt:16 Base address:0x1880
    lo Link encap:Local Loopback
    inet addr:127.0.0.1 Mask:255.0.0.0
    inet6 addr: ::1/128 Scope:Host
    UP LOOPBACK RUNNING MTU:16436 Metric:1
    RX packets:30893 errors:0 dropped:0 overruns:0 frame:0
    TX packets:30893 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:0
    RX bytes:9402131 (8.9 MiB) TX bytes:9402131 (8.9 MiB)
    sit0 Link encap:IPv6-in-IPv4
    NOARP MTU:1480 Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:0
    RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
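
One thing worth checking in the ifconfig -a output above is which interface already holds the VIP address. A small parser makes that explicit. It is a sketch that assumes the older net-tools ifconfig layout shown here ("Link encap:" and "inet addr:" fields), and the function name is made up.

```python
import re


def interfaces_with_ip(ifconfig_text, ip):
    """Return the names of interfaces whose `inet addr:` equals `ip`."""
    owners = []
    current = None
    for line in ifconfig_text.splitlines():
        line = line.strip()
        m = re.match(r"(\S+)\s+Link encap:", line)
        if m:
            current = m.group(1)  # start of a new interface stanza
            continue
        m = re.search(r"inet addr:(\S+)", line)  # does not match "inet6 addr:"
        if m and m.group(1) == ip and current is not None:
            owners.append(current)
    return owners


# Lines copied from the ifconfig -a output in this post.
sample = """\
eth0 Link encap:Ethernet HWaddr 00:0C:29:B6:CE:6B
inet addr:192.168.2.100 Bcast:192.168.2.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:feb6:ce6b/64 Scope:Link
eth1 Link encap:Ethernet HWaddr 00:0C:29:B6:CE:75
inet addr:192.168.2.200 Bcast:192.168.2.255 Mask:255.255.255.0
eth2 Link encap:Ethernet HWaddr 00:0C:29:B6:CE:7F
inet addr:192.168.3.100 Bcast:192.168.3.255 Mask:255.255.255.0
"""

print(interfaces_with_ip(sample, "192.168.2.200"))
```

Here the VIP address 192.168.2.200 is the primary address of the physical interface eth1, not an alias of the public interface. That suggests, though it does not prove, that the address is configured statically on eth1 outside of CRS, which would be consistent with the "is already up in the network" message.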

  • Oracle 11g clusterware VIP failover failed

    I installed Oracle 11g Clusterware successfully, without any errors, following this link:
    http://www.oracle-base.com/articles/11g/OracleDB11gR1RACInstallationOnRHEL5UsingVMwareESXAndNFS.php
    After that, I did a VIP failover test:
    I rebooted node-2.
    Before reboot,
    [root@advansrac1 bin]# ./crs_stat -t
    Name Type Target State Host
    ora....ac1.gsd application ONLINE ONLINE rac1
    ora....ac1.ons application ONLINE ONLINE rac1
    ora....ac1.vip application ONLINE ONLINE rac1
    ora....ac2.gsd application ONLINE ONLINE rac2
    ora....ac2.ons application ONLINE ONLINE rac2
    ora....ac2.vip application ONLINE ONLINE rac2
    After reboot,
    Name Type Target State Host
    ora....ac1.gsd application ONLINE ONLINE rac1
    ora....ac1.ons application ONLINE ONLINE rac1
    ora....ac1.vip application ONLINE ONLINE rac1
    ora....ac2.gsd application ONLINE ONLINE rac2
    ora....ac2.ons application ONLINE ONLINE rac2
    ora....ac2.vip application ONLINE UNKNOWN rac1
    [root@rac1 bin]# ./crs_stop ora.rac2.vip
    Attempting to stop `ora.rac2.vip` on member `advansrac1`
    Stop of `ora.rac2.vip` on member `advansrac1` succeeded.
    [root@rac1 bin]# ./crs_stat -t
    Name Type Target State Host
    ora....ac1.gsd application ONLINE ONLINE rac1
    ora....ac1.ons application ONLINE ONLINE rac1
    ora....ac1.vip application ONLINE ONLINE rac1
    ora....ac2.gsd application ONLINE ONLINE rac2
    ora....ac2.ons application ONLINE ONLINE rac2
    ora....ac2.vip application OFFLINE OFFLINE
    [root@rac1 bin]# ./crs_start ora.rac2.vip
    Attempting to start `ora.rac2.vip` on member `rac2`
    Start of `ora.rac2.vip` on member `rac2` succeeded.
    [root@rac1 bin]# ./crs_stat -t
    Name Type Target State Host
    ora....ac1.gsd application ONLINE ONLINE rac1
    ora....ac1.ons application ONLINE ONLINE rac1
    ora....ac1.vip application ONLINE ONLINE rac1
    ora....ac2.gsd application ONLINE ONLINE rac2
    ora....ac2.ons application ONLINE ONLINE rac2
    ora....ac2.vip application ONLINE ONLINE rac2
    I have only 1.5G on each node
    The issues are:
    1. Actual result: during failover, ora.rac2.vip shows state UNKNOWN on member rac1.
    Expected result: during failover, ora.rac2.vip should be ONLINE on member rac1.
    2. I have to start ora.rac2.vip manually once node-2 is up. I want the VIP to fail back automatically when node-2 returns to its normal online state.
    Please help me with this issue.

    VMware is unsupported but that is likely not your issue.
    1. Run cluster verify and report the results
    2. Did you create a failover service? How?
    3. Post your TNSNAMES.ORA

  • Handling Oracle RAC Failover at the application layer

    Hi All,
    I am currently researching best practices for handling Oracle RAC failover at the application layer, since transactional statements (INSERT, UPDATE and DELETE) are not handled transparently. So I have a few questions for the community:
    1. In case of a node failure, would I need to roll back all of the statements that are part of the transaction, or would I only have to re-execute the one that failed?
    2. Do things change with XA two-phase commit transactions?
    Any input and/or architecture suggestions would be much appreciated.
    Regards,
    Dmitriy Frolov

    Hi,
    The Oracle RAC stack works very well on its own, without the need for third-party clusterware. It will be aware of failures on the nodes.
    The database will use ASM, raw or NFS devices for shared storage.
    Oracle Clusterware can also be configured to monitor your applications and to fail them over to other nodes.
    However, if you need a shared filesystem for your own applications, you will have to use QFS or similar (which in turn requires Sun Cluster).
    See: http://www.oracle.com/technology/products/database/clusterware/index.html
    Regards
    Sebastian
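
    On the original question of which statements to re-run: as far as I know, when a session is lost or failed over in the middle of a transaction, the uncommitted work is rolled back, and with TAF the session typically gets ORA-25402 until it issues a ROLLBACK. So the usual approach is to replay the whole transaction, not only the statement that failed. Below is a minimal Java sketch of that idea. The class TransactionReplay, the interfaces TxWork and ConnSource, and the list of retryable error codes are illustrative assumptions, not part of any Oracle API. XA transactions are not covered.

    ```java
    import java.sql.Connection;
    import java.sql.SQLException;
    import java.util.Set;

    public class TransactionReplay {
        /** Every statement of ONE transaction goes in here, so it can be replayed as a unit. */
        public interface TxWork { void run(Connection c) throws SQLException; }

        /** Hypothetical source of fresh connections (for example a pool). */
        public interface ConnSource { Connection get() throws SQLException; }

        // Assumption: Oracle error codes treated as "session lost or failed over".
        // 3113/3114 = connection lost, 25402 = transaction must roll back,
        // 25408 = cannot safely replay call. Drivers may report other codes.
        static final Set<Integer> RETRYABLE = Set.of(3113, 3114, 25402, 25408);

        public static boolean isRetryable(SQLException e) {
            return RETRYABLE.contains(e.getErrorCode());
        }

        /** Runs the work in a transaction; returns the attempt number that succeeded. */
        public static int runWithReplay(ConnSource source, TxWork work, int maxAttempts)
                throws SQLException {
            if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
            SQLException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Connection c = source.get();
                try {
                    c.setAutoCommit(false);
                    try {
                        work.run(c);
                    } catch (SQLException e) {
                        // Roll back everything, then replay the whole unit of work.
                        try { c.rollback(); } catch (SQLException ignored) { }
                        if (!isRetryable(e)) throw e;
                        last = e;
                        continue;
                    }
                    // A failure during commit is NOT retried: the outcome is unknown.
                    c.commit();
                    return attempt;
                } finally {
                    try { c.close(); } catch (SQLException ignored) { }
                }
            }
            throw last;
        }
    }
    ```

    The sketch deliberately lets a failed COMMIT propagate instead of replaying, because the application cannot tell from the error alone whether the commit reached the database; replaying blindly could apply non-idempotent changes twice.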

  • Oracle RAC Failover (CTF) support

    Hi, everyone.
    I am running Portal 6.0.
    In the WAS 6.40 JDBC configuration, is this JDBC driver the THIN driver?
    And is this connection string effective for failover (CTF) with Oracle RAC?
    Below is our current configuration for reference.
    Connection conn = DriverManager.getConnection(
        "jdbc:oracle:thin:@(DESCRIPTION="
        + "(ADDRESS_LIST="
        + "(ADDRESS=(PROTOCOL=TCP)(HOST=*.*.*.11)(PORT=1521))"
        + "(ADDRESS=(PROTOCOL=TCP)(HOST=*.*.*.12)(PORT=1521)))"
        + "(CONNECT_DATA=(SERVICE_NAME=HIDB)))",
        "scott", "tiger");
    Do I need anything else in my WAS 6.40 configuration?
    Thanks in advance.

    Considering Oracle wrote DIF/DIX support for Linux, yes it does. It requires the use of ASMlib as well.
    See Preventing Silent Data Corruption in Oracle Linux for more information.
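
    Regarding the connection string in the question: the "jdbc:oracle:thin:" prefix does select the thin driver, and the thin driver accepts a full Oracle Net descriptor with several ADDRESS entries, which is what connect-time failover (presumably what CTF means here) relies on. As far as I know, TAF (FAILOVER_MODE) is an OCI feature and is not provided by the thin driver. The sketch below only builds such a URL; the class RacJdbcUrl and the host names rac1-vip and rac2-vip are hypothetical placeholders, and FAILOVER/LOAD_BALANCE are spelled out explicitly for readability.

    ```java
    import java.util.List;

    public class RacJdbcUrl {
        /** Builds a thin-driver URL that lists every listener address for connect-time failover. */
        public static String build(List<String> hosts, int port, String serviceName) {
            StringBuilder sb = new StringBuilder(
                "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(LOAD_BALANCE=ON)(FAILOVER=ON)");
            for (String host : hosts) {
                sb.append("(ADDRESS=(PROTOCOL=TCP)(HOST=").append(host)
                  .append(")(PORT=").append(port).append("))");
            }
            // Close ADDRESS_LIST, then add CONNECT_DATA and close DESCRIPTION.
            sb.append(")(CONNECT_DATA=(SERVICE_NAME=").append(serviceName).append(")))");
            return sb.toString();
        }

        public static void main(String[] args) {
            // Hypothetical VIP host names; HIDB is the service name from the question.
            System.out.println(build(List.of("rac1-vip", "rac2-vip"), 1521, "HIDB"));
        }
    }
    ```

    Building the descriptor in code avoids the line-wrapped string literal from the question, which Java does not accept, and keeps the parentheses balanced however many nodes are listed.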

  • VIP Failover Testing

    Hello,
    I am new to Oracle RAC. We have a 2-node 11gR2 cluster and are in the process of doing some failover testing. For database deployments we use an internal third-party tool called the deployer, which has tokens for DB configuration; its DBHost token holds the hostname of either node 1 or node 2. This means we are not actually using the HA feature: the connection goes to either node 1 or node 2, and if something happens to that node the deployment cannot connect to the database, so the cluster is effectively treated as a single node.
    Instead of pointing the DBHost value at the physical hostname of a server in the cluster, I was thinking of using the VIP address (i.e. ipaddress-VIP) of one of the nodes. After making that change I would like to do some manual failover testing, and this is where I am stuck. How do I go about the testing scenarios?
    For example, if the DBHost token value is the VIP for node 2 and connections are coming in to node 2 via the deployer, how do I proceed with the testing?
    Should I bring down node 2? If I reboot it, how can I see whether or not it failed over to the surviving node?
    Any help/suggestions much appreciated.
    Thanks!

    What you describe is having a RAC cluster, possibly working possibly not, and no actual use of the value of the licensing you paid for.
    My first advice to you is to read the docs and learn what RAC is, how it works, how to define and use services, and how a properly configured LISTENER.ORA and TNSNAMES.ORA should be constructed so you can compare that to what you have. With 11gR2 you should connect to the SCAN not the VIP.
    Here's how I would test RAC:
    1. Walk up to one of the servers while half the users are connected to each instance and do a SHUTDOWN ABORT. See what happens. Restart the killed node. Try it with the other node.
    2. With everything running properly and load on both machines disconnect the switch that provides the cache fusion interconnect or pull one of the cables out of the server. When you reestablish the connection what happens?
    3. Repeat #2 but this time with the connection to storage.
    The above should get you started.

  • TAF Failover issue when RAC node shutdown

    Dear all,
    We have a two-node RAC database. We use sqlplus from a client laptop to test RAC TAF failover when one node is shut down. The client laptop has a tnsnames.ora file with TAF settings.
    First we connect to the RAC database via sqlplus. At the "SQL>" prompt we run "select instance_name from v$instance;" to see which instance we are actually connected to. Then we shut down the node we are connected to. If we immediately run "select instance_name from v$instance;" again, sqlplus sometimes hangs with no response; but if we wait until the VIP has failed over to the other node and then run the query, it always shows the other node's instance name, so we know the session has failed over successfully to the healthy node.
    My question is:
    Can RAC TAF always fail the session over to another healthy node with no downtime? Or are there circumstances in which the session hangs and has to reconnect?
    Any help would be appreciated.

    Hi, thanks for your help.
    > There are many things you have to do but if you don't have the knowledge will be difficult.
    Right. The cluster was set up by consultants, but we're still trying to pick up basic Oracle knowledge by self-study...
    Found some messages about eviction in old cssd logs in $ORA_CRS_HOME/log/cssd/. Will further dig into it.
    Yes, we tried rebooting different nodes many times in the clusters before, without any problem.
    Thanks a lot.
    /ST Wong
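
    The manual check described in the question (re-running "select instance_name from v$instance" by hand) can be automated. Below is a rough sketch, assuming an Oracle JDBC driver is on the classpath and the URL, user and password are passed as arguments; InstanceProbe and its method names are made up for illustration. It opens a new connection for every sample with a login timeout, so it shows how long new connections go unanswered and which instance answers afterwards. It does not show whether an existing TAF session migrates.

    ```java
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;
    import java.util.ArrayList;
    import java.util.List;

    public class InstanceProbe {
        /** Indexes where the instance name differs from the previous answered sample (null = no answer). */
        public static List<Integer> switchPoints(List<String> samples) {
            List<Integer> result = new ArrayList<>();
            String previous = null;
            for (int i = 0; i < samples.size(); i++) {
                String current = samples.get(i);
                if (current == null) continue;          // probe failed, skip it
                if (previous != null && !previous.equals(current)) result.add(i);
                previous = current;
            }
            return result;
        }

        /** Returns the instance name, or null if the query fails. */
        static String currentInstance(Connection c) {
            try (Statement st = c.createStatement();
                 ResultSet rs = st.executeQuery("select instance_name from v$instance")) {
                return rs.next() ? rs.getString(1) : null;
            } catch (SQLException e) {
                return null;
            }
        }

        public static void main(String[] args) throws InterruptedException {
            DriverManager.setLoginTimeout(5);           // seconds; avoids hanging forever
            List<String> samples = new ArrayList<>();
            for (int i = 0; i < 60; i++) {              // one sample per second for a minute
                String name = null;
                try (Connection c = DriverManager.getConnection(args[0], args[1], args[2])) {
                    name = currentInstance(c);
                } catch (SQLException e) {
                    // connection refused or timed out: recorded as "no answer"
                }
                samples.add(name);
                System.out.println(i + ": " + (name == null ? "no answer" : name));
                Thread.sleep(1000);
            }
            System.out.println("instance changed at samples " + switchPoints(samples));
        }
    }
    ```

    Shutting down the node while this runs should produce a run of "no answer" lines followed by the surviving instance's name; the length of that run is a rough measure of the outage seen by new connections.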

  • Oracle RAC interfaces

    HI,
    During the Oracle 10g R2 CRS installation, why do we have to provide public interface information?
    I understand why we give private interface information: it is used for inter-instance communication.
    But why do we need to specify a public interface? Is the public interface going to be used for Cache Fusion at all?
    Thanks
    Pramod

    When one server goes down, the other server will take up both virtual IPs on the public interface, ensuring there is no delay in failover.
    The interface type indicates the purpose for which the network is configured. The supported interface types are:
    Public—An interface that can be used for communication with components external to Oracle RAC instances, such as Oracle Net and Virtual Internet Protocol (VIP) addresses.
    Cluster_interconnect—A private interface used for the cluster interconnect to provide interinstance or Cache Fusion communication.
    If you set the interface type to cluster_interconnect, it affects instances as they start up and changes do not take effect until you restart the instances.

  • Oracle RAC 11G - Service configuration

    Hi,
    I have been reading a lot of documentation on Oracle services and have an OK understanding of how they work. However, I have a general question about configuring services with Oracle RAC. Say I have a 2-node Oracle 11gR2 RAC on Red Hat Linux servers, and an application that connects to a service I have created. I create the service as follows.
    srvctl add service -d ORCL_RAC -s APP_SERVICE -r ORCL_RAC1,ORCL_RAC2
    The tnsnames contains:
    APP_OLTP =
      (DESCRIPTION =
        (LOAD_BALANCE = ON)
        (FAILOVER = ON)
        (ADDRESS = (PROTOCOL = TCP)(HOST = server01)(PORT = 1521))
        (ADDRESS = (PROTOCOL = TCP)(HOST = server02)(PORT = 1521))
        (CONNECT_DATA =
          (SERVICE_NAME = APP_SERVICE)
          (FAILOVER_MODE =
            (TYPE = SELECT)
            (METHOD = BASIC)
            (RETRIES = 20)
            (DELAY = 1)
          )
        )
      )
    My questions are as follows:
    1) When I do a 'srvctl status service -d ORCL_RAC', should I see the service running on both nodes of the RAC? Or does it run only one node, then it will fail over to the other when needed?
    2) If I have a RAC environment where I see two services created (RAC_SRV1 and RAC_SRV2). I see that RAC_SRV1 is only running on node1 and RAC_SRV2 is only running on node2. There are two applications sharing the same database, one application is using RAC_SRV1 and the other application is using RAC_SRV2. Am I correct in thinking that there is no failover available here? If node1 goes down, the application connecting to RAC_SRV1 will not be able to connect to node2 right?
    3) In the case of the scenario in question 2 above, would it be best practise to simply create one service and have both applications connecting to the one service? Could I configure the one service to point connections from one application to node1 and connections from the other application to node2?

    > 1) When I do a 'srvctl status service -d ORCL_RAC', should I see the service running on both nodes of the RAC? Or does it run only one node, then it will fail over to the other when needed?
    You will see it running on both nodes.
    Use the -a option in srvctl (a list of available instances to which the service fails over when the database is administrator-managed).
    http://docs.oracle.com/cd/E11882_01/rac.112/e16795/srvctladmin.htm#i1008562
    > 2) If I have a RAC environment where I see two services created (RAC_SRV1 and RAC_SRV2). I see that RAC_SRV1 is only running on node1 and RAC_SRV2 is only running on node2. There are two applications sharing the same database, one application is using RAC_SRV1 and the other application is using RAC_SRV2. Am I correct in thinking that there is no failover available here? If node1 goes down, the application connecting to RAC_SRV1 will not be able to connect to node2 right?
    It all depends on your service configuration. Check it with srvctl config.
    > 3) In the case of the scenario in question 2 above, would it be best practise to simply create one service and have both applications connecting to the one service? Could I configure the one service to point connections from one application to node1 and connections from the other application to node2?
    It is better to create two services, one for each application, each with its specific node and the other node in the available list.
