ONS and GSD offline on one RAC node

crs_start ora.whdb02.ons
Attempting to start `ora.whdb02.ons` on member `whdb02`
Start of `ora.whdb02.ons` on member `whdb02` failed.
whdb01 : CRS-1019: Resource ora.whdb02.ons (application) cannot run on whdb01
CRS-0215: Could not start resource 'ora.whdb02.ons'
oracle@whdb02:/oracle/app/oracle/product/10.2.0/db1/network/admin>$CRS_HOME/bin/onsctl start
ksh: /bin/onsctl: not found.
oracle@whdb02:/oracle/app/oracle/product/10.2.0/db1/network/admin>$ORA_CRS_HOME/bin/onsctl start
clsrons_init failed, stat = 504, ocrerr = 32
clsrons_init failed, stat = 504, ocrerr = 32
onsctl: ons failed to start
oracle@whdb02:/oracle/app/oracle/product/10.2.0/db1/network/admin>
oracle@whdb02:/oracle/app/oracle/product/10.2.0/db1/network/admin>
oracle@whdb02:/oracle/app/oracle/product/10.2.0/db1/network/admin>A_CRS_HOME/bin/onsctl STOP
ksh: A_CRS_HOME/bin/onsctl: not found.
oracle@whdb02:/oracle/app/oracle/product/10.2.0/db1/network/admin>$ORA_CRS_HOME/bin/onsctl stop
onsctl: shutting down ons daemon ...
clsrons_init failed, stat = 504, ocrerr = 32
onsctl: shutdown of ons failed!
oracle@whdb02:/oracle/app/oracle/product/10.2.0/db1/network/admin>$ORA_CRS_HOME/bin/onsctl start
clsrons_init failed, stat = 504, ocrerr = 32
clsrons_init failed, stat = 504, ocrerr = 32
onsctl: ons failed to start
oracle@whdb02:/oracle/app/oracle/product/10.2.0/db1/network/ad
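
The first attempt in the transcript above failed with `ksh: /bin/onsctl: not found` because `$CRS_HOME` was not set and expanded to an empty string. A minimal Python sketch of guarding against that follows. `onsctl_path` is a hypothetical helper, not part of any Oracle tool, and the sample CRS home is the one that appears in the GSD log paths.

```python
import posixpath

def onsctl_path(env, var="ORA_CRS_HOME"):
    """Build the path to onsctl, refusing to continue if the variable is unset or empty."""
    home = env.get(var, "").strip()
    if not home:
        # An empty value would silently turn "$VAR/bin/onsctl" into "/bin/onsctl".
        raise ValueError(f"{var} is not set")
    return posixpath.join(home, "bin", "onsctl")

# Illustration with an explicit dict; in a real session this would be os.environ.
sample_env = {"ORA_CRS_HOME": "/oracle/app/oracle/product/10.2.0/crs"}
print(onsctl_path(sample_env))
```

This only addresses the "not found" error; the later `clsrons_init failed, stat = 504, ocrerr = 32` message is a separate problem that the sketch does not diagnose.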

GSD log:
2010-11-10 13:16:45.515: [    RACG][1] [274628][1][ora.whdb02.gsd]: clsrcexecut: cmd = /oracle/app/oracle/product/10.2.0/crs/bin/racgeut -e USRORA_DEBUG=0 540 /oracle/app/oracle/product/10.2.0/crs/bin/gsdctl stat
2010-11-10 13:16:45.515: [    RACG][1] [274628][1][ora.whdb02.gsd]: clsrcexecut: rc = 1, time = 2.703s
2010-11-10 13:16:45.515: [    RACG][1] [274628][1][ora.whdb02.gsd]: end for resource = ora.whdb02.gsd, action = start, status = 1, time = 54.109s
2010-11-10 13:16:45.989: [    RACG][1] [274634][1][ora.whdb02.gsd]: clsrcgetprsrctx: prsr_init_ext returned rc = 3
2010-11-10 13:16:48.694: [    RACG][1] [274634][1][ora.whdb02.gsd]: GSD is not running on the local node
2010-11-10 14:09:27.008: [    RACG][1] [573578][1][ora.whdb02.gsd]: clsrcgetprsrctx: prsr_init_ext returned rc = 3
2010-11-10 14:10:18.016: [    RACG][1] [573578][1][ora.whdb02.gsd]: Failed to start GSD on local node
2010-11-10 14:10:18.016: [    RACG][1] [573578][1][ora.whdb02.gsd]: clsrcexecut: cmd = /oracle/app/oracle/product/10.2.0/crs/bin/racgeut -e USRORA_DEBUG=0 540 /oracle/app/oracle/product/10.2.0/crs/bin/gsdctl start
2010-11-10 14:10:18.016: [    RACG][1] [573578][1][ora.whdb02.gsd]: clsrcexecut: rc = 1, time = 51.008s
2010-11-10 14:10:20.720: [    RACG][1] [573578][1][ora.whdb02.gsd]: GSD is not running on the local node
2010-11-10 14:10:20.720: [    RACG][1] [573578][1][ora.whdb02.gsd]: clsrcexecut: cmd = /oracle/app/oracle/product/10.2.0/crs/bin/racgeut -e USRORA_DEBUG=0 540 /oracle/app/oracle/product/10.2.0/crs/bin/gsdctl stat
2010-11-10 14:10:20.720: [    RACG][1] [573578][1][ora.whdb02.gsd]: clsrcexecut: rc = 1, time = 2.704s
2010-11-10 14:10:20.720: [    RACG][1] [573578][1][ora.whdb02.gsd]: end for resource = ora.whdb02.gsd, action = start, status = 1, time = 54.109s
2010-11-10 14:10:21.195: [    RACG][1] [573584][1][ora.whdb02.gsd]: clsrcgetprsrctx: prsr_init_ext returned rc = 3
2010-11-10 14:10:23.900: [    RACG][1] [573584][1][ora.whdb02.gsd]: GSD is not running on the local node
2010-11-10 15:33:02.434: [    RACG][1] [618716][1][ora.whdb02.gsd]: clsrcgetprsrctx: prsr_init_ext returned rc = 3
2010-11-10 15:33:56.467: [    RACG][1] [618716][1][ora.whdb02.gsd]: Failed to start GSD on local node
2010-11-10 15:33:56.467: [    RACG][1] [618716][1][ora.whdb02.gsd]: clsrcexecut: cmd = /oracle/app/oracle/product/10.2.0/crs/bin/racgeut -e USRORA_DEBUG=0 540 /oracle/app/oracle/product/10.2.0/crs/bin/gsdctl start
2010-11-10 15:33:56.467: [    RACG][1] [618716][1][ora.whdb02.gsd]: clsrcexecut: rc = 1, time = 54.024s
2010-11-10 15:33:59.171: [    RACG][1] [618716][1][ora.whdb02.gsd]: GSD is not running on the local node
2010-11-10 15:33:59.171: [    RACG][1] [618716][1][ora.whdb02.gsd]: clsrcexecut: cmd = /oracle/app/oracle/product/10.2.0/crs/bin/racgeut -e USRORA_DEBUG=0 540 /oracle/app/oracle/product/10.2.0/crs/bin/gsdctl stat
2010-11-10 15:33:59.171: [    RACG][1] [618716][1][ora.whdb02.gsd]: clsrcexecut: rc = 1, time = 2.704s
2010-11-10 15:33:59.171: [    RACG][1] [618716][1][ora.whdb02.gsd]: end for resource = ora.whdb02.gsd, action = start, status = 1, time = 57.130s
2010-11-10 15:33:59.646: [    RACG][1] [618722][1][ora.whdb02.gsd]: clsrcgetprsrctx: prsr_init_ext returned rc = 3
2010-11-10 15:34:02.351: [    RACG][1] [618722][1][ora.whdb02.gsd]: GSD is not running on the local node
2010-11-10 15:40:29.176: [    RACG][1] [503954][1][ora.whdb02.gsd]: clsrcgetprsrctx: prsr_init_ext returned rc = 3
2010-11-10 15:41:20.184: [    RACG][1] [503954][1][ora.whdb02.gsd]: Failed to start GSD on local node
2010-11-10 15:41:20.184: [    RACG][1] [503954][1][ora.whdb02.gsd]: clsrcexecut: cmd = /oracle/app/oracle/product/10.2.0/crs/bin/racgeut -e USRORA_DEBUG=0 540 /oracle/app/oracle/product/10.2.0/crs/bin/gsdctl start
2010-11-10 15:41:20.184: [    RACG][1] [503954][1][ora.whdb02.gsd]: clsrcexecut: rc = 1, time = 51.007s
2010-11-10 15:41:22.888: [    RACG][1] [503954][1][ora.whdb02.gsd]: GSD is not running on the local node
2010-11-10 15:41:22.889: [    RACG][1] [503954][1][ora.whdb02.gsd]: clsrcexecut: cmd = /oracle/app/oracle/product/10.2.0/crs/bin/racgeut -e USRORA_DEBUG=0 540 /oracle/app/oracle/product/10.2.0/crs/bin/gsdctl stat
2010-11-10 15:41:22.889: [    RACG][1] [503954][1][ora.whdb02.gsd]: clsrcexecut: rc = 1, time = 2.703s
2010-11-10 15:41:22.889: [    RACG][1] [503954][1][ora.whdb02.gsd]: end for resource = ora.whdb02.gsd, action = start, status = 1, time = 54.106s
2010-11-10 15:41:23.373: [    RACG][1] [290992][1][ora.whdb02.gsd]: clsrcgetprsrctx: prsr_init_ext returned rc = 3
2010-11-10 15:41:26.078: [    RACG][1] [290992][1][ora.whdb02.gsd]: GSD is not running on the local node
2010-11-10 15:50:06.328: [    RACG][1] [442492][1][ora.whdb02.gsd]: clsrcgetprsrctx: prsr_init_ext returned rc = 3
2010-11-10 15:50:57.336: [    RACG][1] [442492][1][ora.whdb02.gsd]: Failed to start GSD on local node
2010-11-10 15:50:57.336: [    RACG][1] [442492][1][ora.whdb02.gsd]: clsrcexecut: cmd = /oracle/app/oracle/product/10.2.0/crs/bin/racgeut -e USRORA_DEBUG=0 540 /oracle/app/oracle/product/10.2.0/crs/bin/gsdctl start
2010-11-10 15:50:57.336: [    RACG][1] [442492][1][ora.whdb02.gsd]: clsrcexecut: rc = 1, time = 51.008s
2010-11-10 15:51:00.043: [    RACG][1] [442492][1][ora.whdb02.gsd]: GSD is not running on the local node
2010-11-10 15:51:00.043: [    RACG][1] [442492][1][ora.whdb02.gsd]: clsrcexecut: cmd = /oracle/app/oracle/product/10.2.0/crs/bin/racgeut -e USRORA_DEBUG=0 540 /oracle/app/oracle/product/10.2.0/crs/bin/gsdctl stat
2010-11-10 15:51:00.043: [    RACG][1] [442492][1][ora.whdb02.gsd]: clsrcexecut: rc = 1, time = 2.706s
2010-11-10 15:51:00.043: [    RACG][1] [442492][1][ora.whdb02.gsd]: end for resource = ora.whdb02.gsd, action = start, status = 1, time = 54.114s
2010-11-10 15:51:01.361: [    RACG][1] [618710][1][ora.whdb02.gsd]: clsrcgetprsrctx: prsr_init_ext returned rc = 3
2010-11-10 15:51:04.066: [    RACG][1] [618710][1][ora.whdb02.gsd]: GSD is not running on the local node
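
The log above repeats the same pattern for every start attempt, so it can help to reduce it to one row per attempt. The following is a rough Python sketch that extracts the `end for resource` lines; the regular expression is written against the lines quoted here and is an assumption about the format, not a documented Oracle log grammar. `summarize_actions` is a hypothetical name.

```python
import re

# Matches lines such as:
# 2010-11-10 13:16:45.515: [    RACG][1] [274628][1][ora.whdb02.gsd]: end for resource = ..., action = start, status = 1, time = 54.109s
END_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): .*"
    r"end for resource = (?P<res>\S+), action = (?P<action>\w+), "
    r"status = (?P<status>\d+), time = (?P<secs>[\d.]+)s"
)

def summarize_actions(lines):
    """Return (timestamp, resource, action, status, seconds) for each completed action."""
    rows = []
    for line in lines:
        m = END_RE.match(line.strip())
        if m:
            rows.append((m["ts"], m["res"], m["action"], int(m["status"]), float(m["secs"])))
    return rows

SAMPLE = [
    "2010-11-10 13:16:45.515: [    RACG][1] [274628][1][ora.whdb02.gsd]: end for resource = ora.whdb02.gsd, action = start, status = 1, time = 54.109s",
    "2010-11-10 13:16:48.694: [    RACG][1] [274634][1][ora.whdb02.gsd]: GSD is not running on the local node",
    "2010-11-10 15:33:59.171: [    RACG][1] [618716][1][ora.whdb02.gsd]: end for resource = ora.whdb02.gsd, action = start, status = 1, time = 57.130s",
]

for row in summarize_actions(SAMPLE):
    print(row)
```

Every attempt in the log ends with `status = 1` after roughly 54 to 57 seconds, which is consistent with `gsdctl start` itself failing rather than the resource flapping.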

Similar Messages

  • Rac One Node on Rac Servers

    Hi experts,
    We have this environment:
    Two RAC nodes running 11.2.0.3 Enterprise Edition on Oracle Linux 5.9.
    We have one production database on this RAC, and the users have asked us to create two single-instance databases, one on each node, something like this:
    Node1 -> RAC Prod1, single Test
    Node2 -> RAC Prod2, single Dev
    I want to create RAC One Node databases for these (Dev, Test) and create a new disk group for each database.
    Can I create a RAC One Node database on those servers with DBCA?
    Do I need to install new database software?
    Will the existing RAC installation be affected?
    I just want to be sure about this procedure before doing anything.
    Thank you
    J.A.

    Hi J.A.
    Yes, you can! However, you need Grid Infrastructure (GI) installed for a cluster on both nodes, and then the database software. Either during the software installation or afterwards, DBCA lets you create 1) a single-instance database, 2) a RAC database, or 3) a RAC One Node database. Keep in mind that RAC One Node is an option (additional license) to the Enterprise Edition of Oracle Database.
    I talked about this topic at the Bulgarian Oracle Users Group in 2011; here is the link to the presentation, which you may find useful. I might upload the videos as well if you need something like a proof of concept or just for your own reference:
    http://sve.to/download/1112/
    Also, I would go with one disk group for both databases. As long as they share the same physical disks, I don't see the point of separate disk groups. Having one disk group would let you make better use of your disk space.
    The procedure would be:
    1. Install GI in cluster.
    2. Install software libraries.
    3. Patch up to 11.2.0.4.
    4. Create RAC One Node database using either command line or using Custom Template of DBCA.
    At the end of the day, if you have a Standard Edition license you can still install GI for a cluster and create single-instance databases on each server. The downside is that you need to fail the database over manually to the remaining node in case of disaster.
    Regards,
    Sve

  • Ora.reco.acfsvol.acfs only on one node on RAC on ODA

    We have an ODA (old model), and after a power failure in the data center both boot disks in one node went faulty.
    After the chassis, RAID controllers and disks were replaced (by an Oracle Field Engineer), crsctl stat res -t reports the following:
    [grid@XXXXXXXXA ~]$ crsctl stat res -t
    NAME TARGET STATE SERVER STATE_DETAILS
    Local Resources
    ora.reco.acfsvol.acfs
                    ONLINE ONLINE XXXXXXXXXA mounted on /cloudfs
                    OFFLINE OFFLINE XXXXXXXXXB volume /cloudfs off
    Is that correct?
    Oracle support referred me to MOS 1319263.1, but that's for Exadata ....
    Thx
    Christoph
    (I masked the hostname.)

    No, this is not correct.  Your resource should be online on both nodes.
    What happens if you try and start the resource manually using srvctl start filesystem?
    Have you checked to see if your volume is online?

  • Oracle binaries corrupted on one node, 11g RAC

    Hi Team,
    The /oracle (Oracle home) folder is corrupted on one node.
    IBM AIX, 11.2.0.1
    How do I resolve this?
    Thanks
    Manohar.

    1. Take a backup of the current, corrupted oracle_home.
    2. Tar the oracle_home on the other node as below, and copy it to the corrupted node:
    tar cvf /location/<tarname>.tar oraInventory product
    3. Delete the existing corrupted oracle_home on the corrupted node and extract the tar there:
    tar xvf /location/<tarname>.tar
    Now complete the Oracle RDBMS cloning process on the home untarred in the earlier steps:
    cd $ORACLE_HOME/clone/bin
    perl clone.pl ORACLE_HOME="<LOCATION>" ORACLE_HOME_NAME="OraDb10g_home1"
    Finally, run "relink all" manually.

  • Why RAW partitions are visible only on one node in RAC?

    I have a 2-node RAC on Windows 2003 Server.
    Can anybody tell me why the RAW partitions are visible on RAC-2 but not on RAC-1?
    I shut down RAC-2, thinking all the RAW devices would then appear on RAC-1, but it didn't work; it displays only the LOCAL DRIVE.
    Why?

    I am using Microsoft Virtual Server and I have configured SCSI-type shared disks.
    I ran the following command:
    C:\oracle\product\10.2.0\crs\bin>cluvfy comp ssa -n rac-1,rac-2
    Result: shared storage check was successful on both nodes.
    Both nodes are working properly, but I want to know why the shared disks are not visible on node 1.
    Thanks
    Sushil

  • Error starting listener on one node in RAC:Error listening on....TNS-12545:

    LSNRCTL> start LISTENER_CORPNG04
    Starting /ora00/app/oracle/product/11/db1/bin/tnslsnr: please wait...
    TNSLSNR for HPUX: Version 11.1.0.7.0 - Production
    System parameter file is /ora00/app/oracle/product/11/db1/network/admin/listener.ora
    Error listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=vir_corpng04)(PORT=1521)(IP=FIRST)))
    TNS-12545: Connect failed because target host or object does not exist
    TNS-12560: TNS:protocol adapter error
    TNS-00515: Connect failed because target host or object does not exist
    HPUX Error: 227: Can't assign requested address
    Listener failed to start. See the error message(s) above...
    LSNRCTL>
    Please help, it's a production system.
    Thanks in Advance
    Gagan

    See link
    Rgds

  • When one node reboots the other node in RAC

    Hi Friends,
    I faced a situation where one node of a RAC cluster was rebooted by the other node. This happened because of fluctuation on the network interconnect link.
    Sep 13 16:23:48 kkvs1a su: [ID 810491 auth.crit] 'su admin' failed for wipro1 on /dev/pts/3
    Sep 14 00:22:17 kkvs1a ixgbe: [ID 611667 kern.info] NOTICE: ixgbe3: link down
    Sep 14 00:22:21 kkvs1a ixgbe: [ID 611667 kern.info] NOTICE: ixgbe3: link up, , full duplex
    Sep 14 00:22:31 kkvs1a ixgbe: [ID 611667 kern.info] NOTICE: ixgbe1: link down
    Sep 14 00:22:31 kkvs1a ixgbe: [ID 611667 kern.info] NOTICE: ixgbe3: link down
    /opt/oracle/product/10.2.0/crs/log/node1/alertkk1a.log
    ==============================================
    2013-09-14 00:22:05.180
    [cssd(12561)]CRS-1612:node kk1b (2) at 50% heartbeat fatal, eviction in 14.251 seconds
    2013-09-14 00:22:12.180
    [cssd(12561)]CRS-1611:node kk1b (2) at 75% heartbeat fatal, eviction in 7.251 seconds
    2013-09-14 00:22:13.180
    [cssd(12561)]CRS-1611:node kk1b (2) at 75% heartbeat fatal, eviction in 6.251 seconds
    2013-09-14 00:22:17.179
    [cssd(12561)]CRS-1610:node kk1b (2) at 90% heartbeat fatal, eviction in 2.251 seconds
    2013-09-14 00:22:18.180
    [cssd(12561)]CRS-1610:node kkvs1b (2) at 90% heartbeat fatal, eviction in 1.251 seconds
    This clearly shows that CSSD on node kkvs1a issued the eviction of node kkvs1b.
    I got following messages on the instance which got rebooted:
    ASM alert log:
    Sat Sep 14 00:22:25 IST 2013
    Error: KGXGN aborts the instance (6)
    Sat Sep 14 00:22:25 IST 2013
    Errors in file /opt/oracle/admin/+ASM/bdump/+asm2_lmon_8527.trc:
    ORA-29702: error occurred in Cluster Group Service operation
    LMON: terminating instance due to error 29702
    A network fluctuation shouldn't cause a reboot like this. Why did Oracle design it this way? Is this a bug? My Oracle version is 10.2.0.5.0.
    Could you tell me the other situations in which one RAC instance reboots another RAC instance?

    What you are describing is the expected behaviour: if your interconnect fails, you will have a node eviction. Releases < 11.2.0.2 evict a node by reboot, which can fix the problem: the NIC may come up correctly when the machine re-starts. Releases >= 11.2.0.2 can often evict without a re-boot. But either way, if your interconnect goes down, a node must be evicted to prevent uncoordinated disc writes.
    If you are interested, you can find some discussion and demos of this in a series of webcasts I've recorded:
    Free Oracle Database Tutorials for Administration and Developers
    If you really don't like this behaviour and the problems are transient, you can try raising the CSS MISSCOUNT parameter.
    John Watson
    Oracle Certified Master DBA
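
The CSSD warnings quoted in the question above count down to the eviction, and pulling the numbers out makes it easier to see how long the heartbeat was missing before the node was evicted. A small Python sketch follows, assuming the message layout shown in that alert log; `parse_heartbeat_warnings` is a hypothetical helper and the pattern has not been checked against other Clusterware releases.

```python
import re

HEARTBEAT_RE = re.compile(
    r"CRS-(?P<code>161[0-2]):node (?P<node>\S+) \((?P<num>\d+)\) "
    r"at (?P<pct>\d+)% heartbeat fatal, eviction in (?P<secs>\d+(?:\.\d+)?) seconds"
)

def parse_heartbeat_warnings(lines):
    """Return (message code, node name, percent of timeout elapsed, seconds to eviction)."""
    rows = []
    for line in lines:
        m = HEARTBEAT_RE.search(line)
        if m:
            rows.append(("CRS-" + m["code"], m["node"], int(m["pct"]), float(m["secs"])))
    return rows

SAMPLE = [
    "[cssd(12561)]CRS-1612:node kk1b (2) at 50% heartbeat fatal, eviction in 14.251 seconds",
    "[cssd(12561)]CRS-1611:node kk1b (2) at 75% heartbeat fatal, eviction in 7.251 seconds",
    "[cssd(12561)]CRS-1610:node kk1b (2) at 90% heartbeat fatal, eviction in 2.251 seconds",
]

for code, node, pct, secs in parse_heartbeat_warnings(SAMPLE):
    print(f"{code}: {node} at {pct}%, {secs}s until eviction")
```

Comparing the first warning's timestamp with the `link down` entries in the OS log shows whether the heartbeat loss started before the interface was reported down.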

  • Can RAC and RAC One Node share the same servers ?

    Does anyone know if it is possible for RAC and the new 11gR2 RAC One Node to share the same set of physical servers i.e. in effect having 2 clusters sharing the same set of servers ( though you could argue RAC One node is a different type of clustering or even that it is not real clustering at all - more instance transporting ).
    Or does standard RAC always require exclusive use of the physical servers it is using as its nodes?
    Any thoughts appreciated
    Jim

    Jimbo wrote:
    Does anyone know if it is possible for RAC and the new 11gR2 RAC One Node to share the same set of physical servers i.e. in effect having 2 clusters sharing the same set of servers ( though you could argue RAC One node is a different type of clustering or even that it is not real clustering at all - more instance transporting ).
    Or does standard RAC always require exclusive use of the physical servers it is using as its nodes?
    Hi Jim,
    To deploy RAC we first need Oracle Grid Infrastructure for a cluster (aka Oracle Clusterware).
    What determines whether a database is single instance, RAC or RAC One Node is the Oracle Database installation.
    So Oracle Clusterware supports all of them (RAC, RAC One Node, single instance) on the same cluster.
    You will need one installation for each feature.
    E.g. on the same cluster:
    ---> Grid Infrastructure GRID_HOME=/u01/app/11.2.0/grid
    --->> RAC one Node /ORACLE_HOME = /u01/app/oracle/product/11.2.0/racone_11203
    --->> RAC /ORACLE_HOME = /u01/app/oracle/product/11.2.0/rac_11203
    --->> SINGLE /ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_11203
    To me, RAC One Node and RAC on the same cluster makes no sense,
    because RAC One Node is RAC with fewer features.
    Regards,
    Levi Pereira

  • ORA-24544 When Creating a Standby From RAC One Node

    I have a RAC One Node database running on node A and am attempting to create a Data Guard standby on node B using the following RMAN command -
    duplicate target database for standby from active database
    Error returned is -
    RMAN-00571: ===========================================================
    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
    RMAN-00571: ===========================================================
    RMAN-03002: failure of Duplicate Db command at 03/10/2015 11:11:06
    RMAN-05501: aborting duplication of target database
    RMAN-03015: error occurred in stored script Memory Script
    RMAN-04014: startup failed: ORA-24544: Oracle RAC One Node instance is already running.
    Can anyone advise ?
    Thanks in advance

    Hi,
    ORA-24544: Oracle RAC One Node instance is already running.
    Cause: An instance startup failed because an instance of the Oracle RAC One Node database was already running on one of the cluster nodes.
    Action: In Oracle RAC One Node, avoid any attempt to start a second instance by any means while the instance is already running.
    I have never used RAC One Node, but it looks like RMAN is trying to start an offline instance of the current RAC One Node database on node B.
    HTH,
    Pradeep

  • Oracle RAC one node-11.2.0.1 with patch 9004119

    Hi,
    I'm slightly stuck here. I have successfully converted an 11.2.0.1 two-node RAC to RAC One Node with patch 9004119 (which contains the scripts required for RAC One Node). My single instances are up and running, and relocation of the databases works fine.
    When I launch OEM, it fails with "ORA-12505:TNS:listener does not currently know of SID given in connect descriptor". Evidently ./raconeinit renames the DB instances, but the change is not picked up in OEM, which still lists my old instances.
    I manually changed ORACLE_SID to the new instance names on all my nodes, so I'm able to connect through SQL, but how do I configure this in DB Control so that the change is seen everywhere? Any application I run against the new instance gives me the error above.
    Please respond!

    Hi,
    Following up on my query above, I did some searching and looked at the 'raconeinit' script. It basically runs certain commands to determine the current server, the free servers and the currently running DB, stores the values in variables and validates them against the input I provide, and that validation is failing.
    When I try to run the commands from the script manually, I get errors about invalid -i and -f options.
    To see whether the DB is up:
    ./crsctl status resource ora.$LOWER_DB_NAME.db | $GREP -i "^STATE" | $GREP -i "ONLINE" | $WC -l
    To see the current server:
    ./crsctl status resource ora.$LOWER_DB_NAME.db | $GREP -i "STATE" | $AWK -F'ONLINE on ' '{print $2}'
    To see the free servers:
    ./crsctl stat serverpool Free -v|$GREP -i active_servers|$AWK -F'=' '{print $2}'
    It does not accept my input (the free server) and hence exits. :(
    Need help badly
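
To check what the three pipelines above are meant to extract, the same logic can be reproduced in Python against captured command output. This is only a sketch: the sample text is hypothetical (node names `node1`/`node2` and database name `orcl` are invented) and assumes the KEY=VALUE layout that `crsctl status resource` and `crsctl stat serverpool` print in 11.2; the function names are illustrative and are not taken from the raconeinit script.

```python
def is_online(status_text):
    """Mirror of: grep -i "^STATE" | grep -i "ONLINE" | wc -l  (non-zero means up)."""
    return any(
        line.upper().startswith("STATE") and "ONLINE" in line.upper()
        for line in status_text.splitlines()
    )

def current_server(status_text):
    """Mirror of: grep -i "STATE" | awk -F'ONLINE on ' '{print $2}'."""
    for line in status_text.splitlines():
        if line.upper().startswith("STATE=") and "ONLINE on " in line:
            return line.split("ONLINE on ", 1)[1].strip()
    return None

def free_servers(serverpool_text):
    """Mirror of: grep -i active_servers | awk -F'=' '{print $2}'."""
    for line in serverpool_text.splitlines():
        if line.upper().startswith("ACTIVE_SERVERS="):
            return line.split("=", 1)[1].split()
    return []

# Hypothetical captured output, for illustration only.
RESOURCE_SAMPLE = "NAME=ora.orcl.db\nTYPE=ora.database.type\nTARGET=ONLINE\nSTATE=ONLINE on node1\n"
POOL_SAMPLE = "NAME=Free\nACTIVE_SERVERS=node2\n"

print(is_online(RESOURCE_SAMPLE), current_server(RESOURCE_SAMPLE), free_servers(POOL_SAMPLE))
```

If the Free pool has no active servers, `free_servers` returns an empty list, in which case no server name entered at the prompt could match it.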

  • Oracle RAC one node

    Why did Oracle introduce Oracle RAC One Node in 11gR2?
    I have gone through the Oracle documentation on RAC One Node, but I couldn't find much advantage beyond instance failover. We already have technologies such as Data Guard and some third-party software for instance failover.
    So why did Oracle introduce RAC One Node in the new release, 11gR2?
    What exactly does Oracle want to provide with RAC One Node?
    Thanks...
    Bharath

    Why RAC One Node?
    Oracle RAC One Node is a single instance of Oracle RAC that runs on one node in a cluster. The benefit of the RAC One Node option is that it lets you consolidate many databases into one cluster without a lot of overhead, while also providing the high-availability benefit of failover protection, as well as online rolling patch application and rolling upgrades for Oracle Clusterware.
    RAC One Node also lets you limit the CPU utilization of individual database instances within the cluster through a Resource Manager feature called instance caging, which lets you change the limit dynamically as required.
    Furthermore, RAC One Node places no limit on server scalability: if applications outgrow the resources that a single node can supply, you can upgrade them online to Oracle RAC.
    If the node running Oracle RAC One Node becomes saturated and runs out of resources, you can migrate the instance to another node in the cluster using a new utility called Omotion. Omotion for Oracle RAC 11g R2 lets you migrate a running instance to another server without downtime or disruption of service for your environment.
    Hope this helps.

  • One node RAC pause/hang/block on other node shutdown

    Hi,
    We have a Java application running on Linux servers connecting to a 10.2.0.1 RAC cluster, also on Linux. When the application starts, it opens a pool of connections to the database, and these are used throughout the lifetime of the application. One server connects to one RAC node.
    AppA - DBA
    AppB - DBB
    When we shut down one node, the application connecting to that node stops, which is what we would expect in this configuration.
    What is strange is that the other application blocks for 63 seconds and then continues. So it is like the database is blocking, or the database connections are blocking.
    We are not using TAF, FAN, FCN, LB, VIPs or any special features, just simple lightweight JDBC from one server to one database. In fact, I do not think we are unwittingly using any of these features; we have them switched off.
    john

    user1788323 wrote:
    What is strange is that the other application blocks for 63 seconds and then continues. So it is like the database is blocking, or the database connections are blocking.
    How have you determined/diagnosed the 63s blocking? (More details in this regard may shed some light on the problem.)
    Assuming that the block is server side, two basic reasons come to mind.
    Networking issue - the CRS on the surviving node has to perform certain functions, like switching the VIP of the node that left the cluster to a surviving cluster node. The listener may need to re-register services. A local firewall may need to be dynamically reconfigured for supporting the new failed-over VIP. Etc.
    Thus these could result in some kind of delay or issue in the network layer that you are seeing from the client side.
    Infrastructure issue. If the actual client request via JDBC reaches the server process, and it is slow in responding, then that is not a network issue - instead some underlying service or s/w layer that the server process needs to use to perform the client request is busy for those 63s.
    This could be related to the Interconnect, the shared I/O storage layer or something along those lines. For example, how do the Interconnect and/or SAN switch react when a server node is powered down or rebooted?
    There's not really sufficient information to make anything but guesses. You will need to isolate the problem with further testing.
    I have seen similar problems with 10.1.0.3 CRS and RAC when a node is evicted from the cluster. In this case the "hung" period was in excess of 15 minutes and only for new connections (Listener unable to hand off to dedicated servers or dispatchers). Existing connections worked fine however and were unaware of any problems. But part of the issue in this case was a poor (outdated) driver layer - and also the last time we used proprietary binary drivers (kernel modules) from 3rd party vendors that results in a tainted (and very fixed and rigid) Linux kernel. Today we're sticking with an OpenSource driver layer only for Linux.

  • Removing one node, re-install and join cluster 3.2/RAC/QFS

    Hi all,
    I have a cluster with 2 nodes (Cluster 3.2, Oracle RAC, QFS). One node has failed and cannot be recovered, so I have to reinstall the faulty node.
    How can I remove the faulty node from the active node?
    Can I reinstall it and rejoin the new node to the cluster?
    Thanks
    Nguyen

    The instructions to orderly remove a node from the cluster (http://docs.sun.com/app/docs/doc/820-4679/gcfso?l=en&a=view) do assume that the cluster node itself is still healthy.
    If you lost a node due to failure/disaster, then you would need to rebuild the same hardware and restore it from your backup. This is described at
    http://docs.sun.com/app/docs/doc/820-4679/cfhdbgbc?l=en&a=view
    Regards
    Thorsten

  • RAC One Node: downtime after HW-outage?

    Hi ,
    RAC One Node seems to be a good alternative compared to a "regular" RAC.
    Since we don't have any experience with RAC One Node, we are currently going through the available documentation and trying to get a picture of the advantages and disadvantages.
    One major point is the question "what will exactly happen when an HW-failure or outage occurs?". It's my understanding - and what Oracle has documented - that the Oracle Clusterware provides failover protection to Oracle RAC One Node. If the node fails, Oracle Clusterware will automatically detect the failure and restart the Oracle RAC One Node database on another server in the pool.
    In our case we would have the active node in Datacenter 1, the passive node in Datacenter 2. Therefore if the node in Datacenter 1 fails, the database should then get started on the node in Datacenter 2.
    My questions now are:
    - will that really automatically be done?
    - what happens to the sessions which were connected to the database before the HW-failure?
    - is it possible to determine how long this failover will take?
    I'm sure that somebody has already tested this situation - any help will be appreciated.
    Rgds
    JH
    Edited by: VivaLaVida on 05.12.2012 08:57

    Hi,
    My questions now are:
    - will that really automatically be done?
    Yes, automatically and transparently.
    - what happens to the sessions which were connected to the database before the HW-failure?
    It depends: if you are using Omotion, the connections will be redirected to another node; with a cold failover, the clients will see a short downtime.
    - is it possible to determine how long this failover will take?
    It depends on the failure, but a normal (cold) failover takes in the range of 1 to 3 minutes; using Omotion (hot failover) there is no downtime.
    Read this:
    http://www.oracle.com/technetwork/products/clustering/overview/ug-raconenode-2009-130760.pdf?ssSourceSiteId=otncn
    Maybe this can clarify your issues.
    Any question just ask.
    Regards,
    Levi Pereira
    Edited by: Levi Pereira on Dec 5, 2012 11:58 AM
    PS: Ignore use of Omotion if you have "downtime after HW-outage". Consider only cold failover.

  • Oracle Applications 11i Load Balancing does not work with RAC one Node

    Hi all,
    Could you help me to resolve this issue.
    Architecture environment is :
    - One APPS tier node
    - A two-node Oracle Database Appliance (primary node 1 holds INSTANCE_1 and the secondary node is configured to hold INSTANCE_2), i.e. RAC One Node.
    - The primary node has instance_name SIGM_1 and the secondary node has instance_name SIGM_2, but in RAC One Node the secondary instance is not running.
    We converted our EBS 11i environment to RAC following the note Using Oracle 11g Release 2 Real Application Clusters with Oracle E-Business Suite Release 11i [ID 823586.1].
    When testing database failover, Oracle Applications 11i load balancing no longer works.
    The root cause is that, when the primary node of the RAC One Node database is down, INSTANCE_NAME_1 is automatically relocated to the surviving node.
    During the failover test, we expected that when the primary node went down, the secondary node would start or relocate the database with instance_name SIGM_2, in which case Oracle Applications load balancing would work.
    Currently, when the primary node goes down, instance SIGM_1 is relocated to the secondary node, which causes Oracle Applications load balancing to fail.
    Thank you for your advice.
    Moussa

    This is something I observed a long time ago for Safari (ie: around version 1). I'm not sure this is Safari, per se, but OpenSSL that is responsible for the behavior. I'm pretty sure Chrome does this and I've seen some Linux browsers do it.
    What I have done at the last two companies I've worked for is recommend that our clients do not use SSL SessionID as the way of tracking sticky sessions on web servers, but instead use the IP address. This works in nearly all cases and has few downsides. Another solution is to use some sort of session sharing on your web servers to mitigate the issue (which also means that your web servers aren't a point of failure for your users' sessions). (One of the products I supported had no session information stored on the web servers, so we could safely round-robin requests; the other product could be implemented with a Session State Server... but in most cases we just used the IP address to load balance with.) A third solution is to configure your load balancer to terminate the SSL tunnel. You get some other benefits from this, such as allowing your load balancer to reduce the number of actual connections to the web servers. I've seen many devices set up this way.
    One thing to consider through this is that, due to the way internet standards work, this really can't be termed a bug on anyone's part. There is no guarantee in the SSL/TLS standards that a client will return the same SSL Session ID for each request, and there is no requirement that subsequent requests will even use the same tunnel. Remember, HTTP is a stateless protocol. Each request is considered a new request by the web server and everything else is just trickery to try and get it to work the way you want. You can be annoyed at Safari's behavior, but it's been this way for over 5 years by my count, so I don't expect it to change.
