Oracle 10g RAC on Solaris Node Eviction

I've been having periodic node evictions on my server. I've found several threads regarding RAC node reboots, but nothing specific to my case, where the node eviction warning appears to be "immediate":
[cssd(9530)]CRS-1612:node mbdmb2 (0) at 50% heartbeat fatal, eviction in 0.000 seconds
[cssd(9530)]CRS-1612:node mbdmb2 (0) at 50% heartbeat fatal, eviction in 0.000 seconds
[cssd(9530)]CRS-1611:node mbdmb2 (0) at 75% heartbeat fatal, eviction in 0.000 seconds
[cssd(9530)]CRS-1611:node mbdmb2 (0) at 75% heartbeat fatal, eviction in 0.000 seconds
[cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
[cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
[cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
[cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
[cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
[cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
[cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
[cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
[cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
[cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
[cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
[cssd(9530)]CRS-1610:node mbdmb2 (0) at 90% heartbeat fatal, eviction in 0.000 seconds
[cssd(9530)]CRS-1607:CSSD evicting node mbdmb2. Details in /u01/crs/oracle/product/10.2/app/log/mbdmb1/cssd/
ocssd.log.
Other people's logs seem to show time to recover, and the node only reboots when the countdown eventually runs out:
2009-08-31 16:05:41.405
[cssd(4968)]CRS-1612:node simsd1 (1) at 50% heartbeat fatal, eviction in 29.611 seconds
2009-08-31 16:05:42.403
[cssd(4968)]CRS-1612:node simsd1 (1) at 50% heartbeat fatal, eviction in 28.613 seconds
2009-08-31 16:05:56.412
[cssd(4968)]CRS-1611:node simsd1 (1) at 75% heartbeat fatal, eviction in 14.604 seconds
2009-08-31 16:05:57.411
[cssd(4968)]CRS-1611:node simsd1 (1) at 75% heartbeat fatal, eviction in 13.605 seconds
2009-08-31 16:06:05.413
[cssd(4968)]CRS-1610:node simsd1 (1) at 90% heartbeat fatal, eviction in 5.603 seconds
2009-08-31 16:06:06.412
[cssd(4968)]CRS-1610:node simsd1 (1) at 90% heartbeat fatal, eviction in 4.604 seconds
2009-08-31 16:06:07.410
[cssd(4968)]CRS-1610:node simsd1 (1) at 90% heartbeat fatal, eviction in 3.606 seconds
2009-08-31 16:06:08.409
[cssd(4968)]CRS-1610:node simsd1 (1) at 90% heartbeat fatal, eviction in 2.607 seconds
2009-08-31 16:06:09.407
[cssd(4968)]CRS-1610:node simsd1 (1) at 90% heartbeat fatal, eviction in 1.609 seconds
2009-08-31 16:06:10.405
[cssd(4968)]CRS-1610:node simsd1 (1) at 90% heartbeat fatal, eviction in 0.611 seconds
2009-08-31 16:06:11.061
[cssd(4968)]CRS-1609:CSSD detected a network split. Details in C:\product\11.1.0\crs\log\simsd2\cssd\ocssd.log.
2009-08-31 16:14:37.873
I'm led to think this is due to the setting of the heartbeat loss window. Some threads suggest the hangcheck-timer, but that does not appear to apply to Solaris. I'm wondering where, if anywhere, I can check or change this setting.
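(For reference: on 10.2 the two heartbeat windows are CSS parameters, misscount for the network heartbeat and disktimeout for the voting-disk heartbeat, and they can usually be read with crsctl. A minimal sketch follows, assuming the CRS home path seen in the log above; changing these values should only be done following Oracle Support guidance.)
# run as root on one cluster node; the CRS home path is an example
/u01/crs/oracle/product/10.2/app/bin/crsctl get css misscount
# disktimeout exists as a separate setting on 10.2.0.2 and later
/u01/crs/oracle/product/10.2/app/bin/crsctl get css disktimeout
# e.g. "crsctl set css misscount 60" would change the network heartbeat window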

Ah, thanks; it looks like just reading the right log yielded something different. I was grepping the alert log instead, which apparently doesn't show as much (and shows the time as 0). The ocssd.log shows the remaining time to eviction.
One more question: can you tell from this whether it is network heartbeat or disk heartbeat related, or something else?
Thanks!
[    CSSD]2010-02-24 06:46:17.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
[    CSSD]2010-02-24 06:46:17.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
[    CSSD]2010-02-24 06:46:22.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
[    CSSD]2010-02-24 06:46:22.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
[    CSSD]2010-02-24 06:46:27.685 [14] >TRACE: clssgmRegisterClient: proc(17/1009503f0), client(344/10097f
7f0)
[    CSSD]2010-02-24 06:46:27.685 [14] >TRACE: clssgmExecuteClientRequest: GRKJOIN recvd from client 344 (
10097f7f0)
[    CSSD]2010-02-24 06:46:27.685 [14] >TRACE: clssgmJoinGrock: grock DG_FLASH51 new client 10097f7f0 with
con 100932430, requested num -1
[    CSSD]2010-02-24 06:46:27.685 [14] >TRACE: clssgmAddGrockMember: adding member to grock DG_FLASH51
[    CSSD]2010-02-24 06:46:27.685 [14] >TRACE: clssgmAddMember: member (2/100921830) added. pbsz(123) prsz
(42) flags 0x0 to grock (100914210/DG_FLASH51)
[    CSSD]2010-02-24 06:46:27.686 [14] >TRACE: clssgmQueueGrockEvent: groupName(DG_FLASH51) count(3) maste
r(0) event(1), incarn 208505, mbrc 3, to member 0, events 0x0, state 0x0
[    CSSD]2010-02-24 06:46:27.686 [14] >TRACE: clssgmCommonAddMember: Local member(2) node(1) flags 0x0 0x
100 grock (2/100914210/DG_FLASH51)
[    CSSD]2010-02-24 06:46:27.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
[    CSSD]2010-02-24 06:46:27.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
[    CSSD]2010-02-24 06:46:28.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 50 2.123767e-314art
beat fatal, eviction in 59.577 seconds
[    CSSD]2010-02-24 06:46:28.941 [18] >TRACE: clssnmPollingThread: node mbdmb2 (2) is impending reconfig,
flag 1, misstime 60423
[    CSSD]2010-02-24 06:46:28.941 [18] >TRACE: clssnmPollingThread: diskTimeout set to (117000)ms impendin
g reconfig status(1)
[    CSSD]2010-02-24 06:46:29.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 50 2.123767e-314art
beat fatal, eviction in 58.577 seconds
[    CSSD]2010-02-24 06:46:30.363 [17] >TRACE: clssgmDispatchCMXMSG(): msg type(12) src(2) dest(1) size(36
0) tag(00d2002a) incarnation(88)
[    CSSD]2010-02-24 06:46:32.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
[    CSSD]2010-02-24 06:46:32.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
[    CSSD]2010-02-24 06:46:37.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
[    CSSD]2010-02-24 06:46:37.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
[    CSSD]2010-02-24 06:46:42.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
[    CSSD]2010-02-24 06:46:42.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
[    CSSD]2010-02-24 06:46:47.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
[    CSSD]2010-02-24 06:46:47.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
[    CSSD]2010-02-24 06:46:52.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
[    CSSD]2010-02-24 06:46:57.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
[    CSSD]2010-02-24 06:46:57.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
[    CSSD]2010-02-24 06:46:58.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 75 2.123767e-314art
beat fatal, eviction in 29.577 seconds
[    CSSD]2010-02-24 06:46:59.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 75 2.123767e-314art
beat fatal, eviction in 28.577 seconds
[    CSSD]2010-02-24 06:47:02.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
[    CSSD]2010-02-24 06:47:02.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
[    CSSD]2010-02-24 06:47:07.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
[    CSSD]2010-02-24 06:47:07.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
[    CSSD]2010-02-24 06:47:12.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
[    CSSD]2010-02-24 06:47:12.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
[    CSSD]2010-02-24 06:47:16.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
beat fatal, eviction in 11.577 seconds
[    CSSD]2010-02-24 06:47:17.725 [17] >TRACE: clssgmDispatchCMXMSG(): msg type(12) src(2) dest(1) size(36
0) tag(00d3002a) incarnation(88)
[    CSSD]2010-02-24 06:47:17.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
[    CSSD]2010-02-24 06:47:17.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
beat fatal, eviction in 10.577 seconds
[    CSSD]2010-02-24 06:47:17.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
[    CSSD]2010-02-24 06:47:18.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
beat fatal, eviction in 9.577 seconds
[    CSSD]2010-02-24 06:47:19.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
beat fatal, eviction in 8.577 seconds
[    CSSD]2010-02-24 06:47:20.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
beat fatal, eviction in 7.577 seconds
[    CSSD]2010-02-24 06:47:21.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
beat fatal, eviction in 6.577 seconds
[    CSSD]2010-02-24 06:47:22.941 [19] >TRACE: clssnmSendingThread: sending status msg to all nodes
[    CSSD]2010-02-24 06:47:22.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
beat fatal, eviction in 5.577 seconds
[    CSSD]2010-02-24 06:47:22.941 [19] >TRACE: clssnmSendingThread: sent 5 status msgs to all nodes
[    CSSD]2010-02-24 06:47:26.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
beat fatal, eviction in 1.577 seconds
[    CSSD]2010-02-24 06:47:26.941 [19] >TRACE: clssnmSendingThread: sent 4 status msgs to all nodes
[    CSSD]2010-02-24 06:47:27.703 [17] >TRACE: clssgmPeerEventHndlr: receive failed, node 2 (mbdmb2) (1009
0eb90), rc 11
[    CSSD]2010-02-24 06:47:27.704 [17] >TRACE: clssgmPeerDeactivate: node 2 (mbdmb2), death 0, state 0x800
00001 connstate 0xf
[    CSSD]2010-02-24 06:47:27.704 [17] >TRACE: clssgmPeerListener: discarded 0 future msgsfor 2
[    CSSD]2010-02-24 06:47:27.941 [18] >WARNING: clssnmPollingThread: node mbdmb2 (2) at 90 2.123767e-314art
beat fatal, eviction in 0.577 seconds
[    CSSD]2010-02-24 06:47:28.521 [18] >TRACE: clssnmPollingThread: Eviction started for node mbdmb2 (2),
flags 0x0001, state 3, wt4c 0

Similar Messages

  • Oracle 10g RAC on solaris 10 installation

    Hi,
    I want to know the default file locations of the following:
    oracle base
    oracle home
    control files
    redo log files
    data files
    Please tell me the exact default path for each of the above.

    I'm assuming you mean running RAC with or without Sun Cluster? If so, the answer is that there would be almost no difference in most cases. The only case I can think of would be if you had a sub-optimal interconnect without Sun Cluster, where Sun Cluster's clprivnet would have given you striping and availability for free.
    I am currently working on a Blueprint that describes the important differences between the two configurations (with and without SC). They can broadly be summarised as:
    Sun Cluster gives you:
    * Better data integrity protection
    * Faster, more reliable node failure detection
    * Makes your name space homogeneous, simplifying installation and device management (DID structure), so no need for messy symbolic links
    * Gives you a highly available, striped cluster interconnect for the Cache Fusion traffic. (No need for tricky IPMP or link aggregation configurations.)
    * Allows you to use volume managers like Solaris Volume Manager or VxVM
    * Provides support for shared QFS as a file system for all Oracle objects, data files included (while still allowing ASM)
    * A substantial collection of Sun-written and supported agents to manage other applications you might also have on your cluster, e.g. NFS, SAP, Apache, etc.
    Hope that helps,
    Tim
    ---

  • Oracle 10g RAC to 11g RAC Upgrade on Solaris

    Hi,
    We are planning to migrate a 4-node Oracle 10g RAC on Solaris 9 to 11g with Solaris 10. We'd like to know what would be the best path to take. We cannot afford any downtime!
    Options: Are these feasible? Which option is best? Any documents links?
    a) Do a rolling upgrade of Oracle from 10g to 11g. Then take down individual nodes, upgrade the Solaris OS from 9 to 10, and bring them back up into the cluster. Are there any known issues with taking this path? Is a rolling upgrade like this possible?
    b) Do an upgrade of the Solaris OS from 9 to 10 on each node and then bring them back up? Is this practical? Does Oracle allow different versions of OS running on different nodes?
    c) Use Dataguard with 2 different RAC environments (2 nodes each). How would this work? Is it the only possible way? Any steps please?
    Thanks

    a) Do a rolling upgrade of Oracle from 10g to 11g. Then take down individual nodes, upgrade the Solaris OS from 9 to 10, and bring them back up into the cluster. Are there any known issues with taking this path? Is a rolling upgrade like this possible?
    Hi,
    first of all, I would not change several components (OS, database version) at a time. My recommendation is to make small steps and start with the operating system first. My second recommendation is to test everything in your dev or test environment prior to doing the upgrades in the production environment. Trust me: you will face problems :-) So you had better try it out beforehand!
    b) Do an upgrade of the Solaris OS from 9 to 10 on each node and then bring them back up? Is this practical? Does Oracle allow different versions of OS running on different nodes?
    As far as I know, you can run different operating system versions on different nodes if they are supported (Solaris 9 and 10 are).
    Ronny Egner
    My blog: http://ronnyegner.wordpress.com

  • Problem with Oracle 10g RAC VIP network setting at Solaris 9

    Dear All,
    I have tried to set up an Oracle 10g RAC Release 2 cluster
    with Solaris 9 as the OS and 2 nodes.
    The node settings are as follows:
    nodes 1:
    Public address: 172.16.0.121
    Private address: 192.168.0.121, 192.168.1.121 (dual path for heartbeat)
    nodes 2:
    Public address: 172.16.0.122
    Private address: 192.168.0.122, 192.168.1.122
    And I have assigned two IP addresses, 172.16.0.131 and 172.16.0.132, as the VIP addresses for the
    RAC.
    And the following is the /etc/hosts file:
    root@shk01 # cat /etc/hosts
    # Internet host table
    127.0.0.1 localhost
    # public address
    172.16.0.121 shk01 loghost
    172.16.0.122 shk02
    # heart beat
    192.168.0.121 shk01-priv1
    192.168.1.121 shk01-priv2
    192.168.0.122 shk02-priv1
    192.168.1.122 shk02-priv2
    # VIP address
    172.16.1.131 vip-shk01
    172.16.1.132 vip-shk02
    But when I run the command "$ ./runcluvfy.sh comp nodecon -n shk01,shk02 -verbose"
    it shows the error:
    ERROR:
    Could not find a suitable set of interfaces for VIPs.
    Result: Node connectivity check failed.
    I did try to add the VIP address on bge0:1, as I am using bge0 for the public address.
    On both nodes I used the same interface name for it.
    Does anyone have an idea how I can track down this error?
    Also, I have another question, about raw devices.
    There is an option to use either ASM or raw devices. If I choose raw devices, does that mean I just need to
    format the storage disk without running newfs on it, and then the Oracle software will be able to handle it?
    Thanks,
    Xentar

    You don't seem to state categorically that you are using Solaris Cluster, so I'll assume it since this is mainly a forum about Solaris Cluster (and IMHO, Solaris Cluster with Clusterware is better than Clusterware on its own).
    Clusterware has to see the same device names from all cluster nodes. This is why Solaris Cluster (SC) is a positive benefit over Clusterware, because SC provides an automatically managed, consistent name space. Clusterware on its own forces you to manage symbolic links (or, worse, mknods) to create a consistent namespace!
    So, given the SC consistent namespace, you simply add the raw devices into the ASM configuration, i.e. /dev/did/rdsk/dXsY. If you are using Solaris Volume Manager, you would use /dev/md/<setname>/rdsk/dXXX and if you were using CVM/VxVM you would use /dev/vx/rdsk/<dg_name>/<dev_name>.
    Of course, if you genuinely are using Clusterware on its own, then you have somewhat of a management issue! ... time to think about installing SC?
    Tim
    ---
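    (As a side note on the VIP error itself: cluvfy generally expects the public interface to have the same name and subnet on every node, and the VIPs are expected to sit on that same public subnet. A quick sanity check, using the interface name from the post, might look like this:)
    # on each node, confirm the public interface name and netmask match
    root@shk01 # ifconfig bge0
    root@shk02 # ifconfig bge0
    # the VIP entries in /etc/hosts should be on the same subnet as the
    # public addresses (172.16.0.x in this post); a mismatch is one of the
    # first things to check when cluvfy reports "Could not find a suitable
    # set of interfaces for VIPs"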

  • Solaris 10 and Hitachi LUN mapping with Oracle 10g RAC and ASM?

    Hi all,
    I am working on an Oracle 10g RAC and ASM installation with Sun E6900 servers attached to a Hitachi SAN for shared storage, with Sun Solaris 10 as the server OS. We are using Oracle 10g Release 2 (10.2.0.3) RAC Clusterware
    for the clustering software, raw devices for shared storage, and the Veritas VxFS 4.1 filesystem.
    My question is this:
    How do I map the raw devices and LUNs on the Hitachi SAN to Solaris 10 OS and Oracle 10g RAC ASM?
    I am aware that with an Oracle 10g RAC and ASM instance, one needs to configure the ASM instance initialization parameter file to set the asm_diskstring setting to recognize the LUNs that are presented to the host.
    I know that Sun Solaris 10 uses the /dev/rdsk/cWtXdYsZ naming convention at the OS level for disks. However, how would I map this to the Oracle 10g ASM settings?
    I cannot find this critical piece of information ANYWHERE!!!!
    Thanks for your help!

    You don't seem to state categorically that you are using Solaris Cluster, so I'll assume it since this is mainly a forum about Solaris Cluster (and IMHO, Solaris Cluster with Clusterware is better than Clusterware on its own).
    Clusterware has to see the same device names from all cluster nodes. This is why Solaris Cluster (SC) is a positive benefit over Clusterware, because SC provides an automatically managed, consistent name space. Clusterware on its own forces you to manage symbolic links (or, worse, mknods) to create a consistent namespace!
    So, given the SC consistent namespace, you simply add the raw devices into the ASM configuration, i.e. /dev/did/rdsk/dXsY. If you are using Solaris Volume Manager, you would use /dev/md/<setname>/rdsk/dXXX and if you were using CVM/VxVM you would use /dev/vx/rdsk/<dg_name>/<dev_name>.
    Of course, if you genuinely are using Clusterware on its own, then you have somewhat of a management issue! ... time to think about installing SC?
    Tim
    ---
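    (To make the LUN-to-ASM mapping concrete, here is a minimal sketch using plain Solaris raw device paths; the device names and disk group name are hypothetical, and with Solaris Cluster you would substitute the /dev/did/rdsk paths mentioned above. The LUN slices must be character devices owned by the oracle user.)
    $ export ORACLE_SID=+ASM1
    $ sqlplus / as sysdba
    SQL> -- point ASM discovery at the candidate slices (assumes the ASM instance uses an spfile)
    SQL> ALTER SYSTEM SET asm_diskstring='/dev/rdsk/c*t*d*s6';
    SQL> -- once the slices are discovered, build a disk group from them
    SQL> CREATE DISKGROUP DATA1 EXTERNAL REDUNDANCY
      2  DISK '/dev/rdsk/c2t0d1s6', '/dev/rdsk/c2t0d2s6';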

  • Installation of Oracle 10g RAC on OpenSolaris for x86 machines

    Hi
    Can you please let me know the advantages of installing Oracle 10g RAC on OpenSolaris on an x86 machine? Also, please let me know the detailed steps involved in installing Oracle 10g RAC on OpenSolaris on x86, and please suggest recommendations for the installation.
    Thanks
    Saiprasath

    If by "advantages of installation of oracle 10g rac on open solaris on x86 machine" you mean advantages over RAC on Linux or AIX or HP/UX there aren't any.
    If you want to know the advantages of RAC over stand-alone they generally relate to:
    1. Eliminating the server as a single point of failure
    2. Transparent fail-over
    3. Incremental horizontal scalability

  • DO I need SFRAC for Oracle 10g RAC  on Sun Solaris

    My platform is Sun Solaris 10 64-bit on a V490 server. On the back-end side, we are using EMC CX500 storage. We will use the Veritas File System and Veritas CFS.
    I would like to ask: must I use SFRAC to configure Oracle 10g RAC, or can I just use Oracle CRS alone? I do not want to use ASM.
    Please advise
    Thanks,
    Sam..

    The VCFS usually requires a Veritas Cluster, thus you would have to use the product combination that is bundled as SFRAC as mentioned before. Metalink in addition says:
    Solaris 10 / 10gR2 64-bit / Veritas Storage Foundation for Oracle RAC / 5.0 / Certified
    There is one exception to this rule: Linux.
    Veritas and Oracle support a standalone version of the Veritas Cluster File System (and only this component) on Linux. This does not hold true for Solaris.
    In conclusion, if you want to use VCFS you need VCS, which basically means using SFRAC.
    However, in this case, as mentioned before, you would not need ASM.

  • Memory Leak when TOMCAT connects to Oracle 10g RAC using JDBC Thin driver.

    We experienced a memory leak when an Oracle 10g (10.2.0.3) RAC node was evicted. The Tomcat app server connects to the Oracle 10g RAC database instances using the JDBC 10.2.0.3 thin driver.
    Has anyone had a similar experience?
    Any ideas? Any bugs reported/fixed?
    Thanks,
    Raj

    If you're doing XA, we absolutely do not support driver-level load-balancing OR failover. Use neither.
    For non-XA, you can use driver-level failover. For non-XA, you could set load-balancing, but it won't help, because we get connections from the driver and keep them indefinitely, so the driver never gets the chance to affect which connections the pool uses after that.
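    (For illustration, a thin-driver URL that uses connect-time failover without driver-level load-balancing, in line with the non-XA advice above; the VIP host names and service name are made up:)
    jdbc:oracle:thin:@(DESCRIPTION=
      (ADDRESS_LIST=
        (ADDRESS=(PROTOCOL=TCP)(HOST=rac1-vip)(PORT=1521))
        (ADDRESS=(PROTOCOL=TCP)(HOST=rac2-vip)(PORT=1521))
        (LOAD_BALANCE=off)(FAILOVER=on))
      (CONNECT_DATA=(SERVICE_NAME=myservice)))
    For XA data sources, per the advice above, use neither option and point each data source at a single instance.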

  • Oracle 10g RAC

    Hi All,
    I am a research student currently struggling to install Oracle 10g RAC on Windows. What I understand about an Oracle RAC database system is that it involves a configuration of multiple hosts or servers joined together with clustering software and accessing shared disk structures. What I also understand is that I need the CRS provided by Oracle for a Windows or Linux based RAC installation.
    What I don't understand is, for a two-node RAC system, do I need a third machine which has the shared storage device? If yes, do I have to install a copy of Oracle 10g Enterprise Edition and CRS on the shared storage device? Moreover, do I need to install a copy of Oracle 10g Enterprise Edition on each of the nodes, or is just the clusterware enough?

    Hi Salil
    No, you do not need a third machine. You can even set up RAC as a single node on one machine! What you do need is some sort of cluster-aware file system. Try looking at OCFS on Windows in your case.
    If you are interested in a Linux example please have a look at the examples at http://ocpdba.net/oracle9i_rac/cheaprac.html and http://ocpdba.net/oracle10g_rac for an OCFS implementation I am working on.
    regards,
    Ahbaid

  • Oracle 10G RAC - Public & Heartbeat on 1 NIC

    Hello all,
    I am currently installing Oracle 10g RAC on RHEL 4 (4-node cluster), but the Cluster Verification Utility aborts with errors. I checked configToolAllCommands and tried to run the failed commands manually:
    #/opt/oracle/crs/bin/oifcfg setif -global eth0.100/172.18.0.0:cluster_interconnect eth0.728/172.16.128.0:public eth0.498/172.17.1.0:cluster_interconnect
    PRIF-50: duplicate interface is given in the input
    PRIF-50: duplicate interface is given in the input
    PRIF-50: duplicate interface is given in the input
    Question:
    Is it possible to put Public & Heartbeat on one NIC (eth0.728 & eth0.498)?
    If not, is there any workaround for this issue?
    Output /etc/hosts
    # that require network functionality will fail.
    127.0.0.1 localhost.localdomain localhost
    172.18.253.48 eu0266.[company].net cfmaster
    172.16.128.11 eu0200.[company].net eu0200
    172.16.128.12 eu0201.[company].net eu0201
    172.16.128.13 eu0202.[company].net eu0202
    172.16.128.14 eu0203.[company].net eu0203
    172.18.13.11 eu0200m.[company].net eu0200m
    172.18.13.12 eu0201m.[company].net eu0201m
    172.18.13.13 eu0202m.[company].net eu0202m
    172.18.13.14 eu0203m.[company].net eu0203m
    # Private section
    172.17.1.11 eu0200-priv.[company].net eu0200-priv
    172.17.1.12 eu0201-priv.[company].net eu0201-priv
    172.17.1.13 eu0202-priv.[company].net eu0202-priv
    172.17.1.14 eu0203-priv.[company].net eu0203-priv
    # Virtual section
    172.16.128.16 eu0200-vip.[company].net eu0200-vip
    172.16.128.17 eu0201-vip.l[company].net eu0201-vip
    172.16.128.18 eu0202-vip.[company].net eu0202-vip
    172.16.128.19 eu0203-vip.[company].net eu0203-vip
    Output install log:
    Checking existence of VIP node application (required)
    Check failed.
    Check failed on nodes:
    eu0203,eu0202,eu0201,eu0200
    Checking existence of ONS node application (optional)
    Check ignored.
    Checking existence of GSD node application (optional)
    Check ignored.
    Post-check for cluster services setup was unsuccessful on all the nodes.
    Command = /opt/[company]/oracle/crs/bin/cluvfy has failed
    INFO: Configuration assistant "Oracle Cluster Verification Utility" failed
    *** Starting OUICA ***
    Oracle Home set to /opt/[company]/oracle/crs
    Configuration directory is set to /opt/[company]/oracle/crs/cfgtoollogs. All xml files under the directory will be processed
    INFO: The "/opt/[company]/oracle/crs/cfgtoollogs/configToolFailedCommands" script contains all commands that failed, were skipped or were cancelled. This file may be used to run these configuration assistants outside of OUI. Note that you may have to update this script with passwords (if any) before executing the same.
    SEVERE: OUI-25031:Some of the configuration assistants failed. It is strongly recommended that you retry the configuration assistants at this time. Not successfully running any "Recommended" assistants means your system will not be correctly configured.
    1. Check the Details panel on the Configuration Assistant Screen to see the errors resulting in the failures.
    2. Fix the errors causing these failures.
    3. Select the failed assistants and click the 'Retry' button to retry them.
    Thanks in advance for your help!
    Regards

    Tads wrote:
    Hi all,
    I have a question about Oracle RAC and network interface.
    We're using Oracle 10gR2 RAC with two nodes on Linux Red Hat.
    Let's assume that the public network interface goes down.
    I would like to know what happens to existing connections
    on the node whose network interface has problems.
    Are the connections frozen, or still active?
    Can the users continue to use these existing connections via the other node of the RAC?
    If the interface is down, what do you think? All connections to this node will die. How does your application handle fail-over: does it attempt to reconnect, or just have a complete application failure?
    You should spend some time in a test lab where you can test this stuff for yourself. Read the documentation; there are tons of sites out there that purport to have answers to all of your RAC/TAF/FAN/FAF questions, but I would read and trust the documentation first.
    >
    I know that the listener goes down and no other connections are allowed.
    Thank you very much!!!!
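    (For testing this kind of fail-over, a typical client-side TAF entry in tnsnames.ora looks something like the sketch below; the VIP host names and service name are placeholders. With TYPE=SELECT, reconnecting sessions can resume open queries, but sessions on the failed interface still have to die or time out first.)
    ORCL_TAF =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521))
        (ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521))
        (LOAD_BALANCE = yes)
        (CONNECT_DATA =
          (SERVICE_NAME = orcl)
          (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 30)(DELAY = 5))
        )
      )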

  • Oracle 10g RAC using ASM - Storage Issue

    I’ve an issue related to Oracle 10g RAC.
    I have a 2-node cluster, each node being a Dell 2850 server with RHEL 4.0.
    I have EMC CX300 SAN storage with the following partitions:
    /orasoft        10 GB      OCFS2 file system
    /oracrs          2 GB      OCFS2 file system
    /orabackup     100 GB      OCFS2 file system
    The datafiles are on ASM, which is not directly visible in the OS.
    I have a common Oracle Home installed in /orasoft/db_1, which is shared by both nodes in the cluster.
    I recently faced an issue related to the EMC storage.
    The /orasoft partition shows 1.4 GB of space available with the df command.
    With both nodes sharing the common Oracle Home (/orasoft/db_1), whenever I try to touch a file I get a "No space left on device" error. I'm unable to start any service for the same reason.
    Is this setup correct?
    Can anyone help me with this storage issue?

    I need a clarification here: what do you mean by "storage system"? Do you mean a server/node, or the SAN storage system? If you are referring to a server/node's local storage, then it would NOT be possible to use it for RAC, since the disk space has to be shared among the nodes.
    Here is what you can do:
    - Create two partitions/devices (for example Disk_1 and Disk_2) in the SAN storage
    - Create an ASM disk group which mirrors Disk_1 to Disk_2 (see the sketch below).
    Again, please note that the partitions have to be visible and accessible read/write from both nodes/servers.
    HTH
    Thanks
    Chandra Pabba
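    (A minimal sketch of the mirrored disk group described above, assuming the two SAN devices are presented to both nodes at the hypothetical paths below; with array-side mirroring you could use EXTERNAL REDUNDANCY instead.)
    SQL> CREATE DISKGROUP DATA NORMAL REDUNDANCY
      2  FAILGROUP fg1 DISK '/dev/raw/raw1'
      3  FAILGROUP fg2 DISK '/dev/raw/raw2';
    SQL> -- NORMAL redundancy mirrors extents across the two failure groups,
    SQL> -- i.e. Disk_1 is mirrored to Disk_2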

  • Oracle 10g RAC+ASM - Storage Issue

    I’ve an issue related to Oracle 10g RAC.
    I have a 2-node cluster, each node being a Dell 2850 server with RHEL 4.0.
    I have EMC CX300 SAN storage with the following partitions:
    /orasoft        10 GB      OCFS2 file system
    /oracrs          2 GB      OCFS2 file system
    /orabackup     100 GB      OCFS2 file system
    The datafiles are on ASM, which is not directly visible in the OS.
    I have a common Oracle Home installed in /orasoft/db_1, which is shared by both nodes in the cluster.
    I recently faced an issue related to the EMC storage.
    The /orasoft partition shows 1.4 GB of space available with the df command.
    With both nodes sharing the common Oracle Home (/orasoft/db_1), whenever I try to touch a file I get a "No space left on device" error. I'm unable to start any service for the same reason.
    Is this setup correct?
    Can anyone help me with this storage issue?

    Hi,
    If you create a new disk group, you may need to add it to the parameter file or spfile, which will require downtime.
    Suggestion: instead of creating a new disk group, you should add a disk to the existing group. If you add an ASM disk to the existing group, all your problems will be solved and Oracle itself will manage everything. And then, I am sure, there is no need to add an entry in the parameter file or spfile like +db_create_file_dest=.....
    regards,
    Sher khan
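    (A minimal sketch of this suggestion, with a hypothetical disk path and disk group name; the new device must be visible to both nodes and matched by asm_diskstring.)
    SQL> ALTER DISKGROUP DATA ADD DISK '/dev/raw/raw3';
    SQL> -- ASM rebalances existing data onto the new disk automatically;
    SQL> -- the database parameter file does not change, because the disk group
    SQL> -- name (e.g. +DATA in db_create_file_dest) stays the same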

  • Oracle 10g RAC - Private Interconnect on Private non-routable VLAN

    In our data center there is an existing Oracle 10g RAC configured with private VLAN for Interconnect administered by a different group of DBAs.
    We are designing a new, separate Oracle 10g RAC environment to support our application.
    When we discussed setting up a private VLAN for our RAC interconnect with our data center folks, they suggested using the same existing private VLAN used by the other Oracle RAC configurations. In that case the interconnect IPs will be on the same subnet as the other Oracle RAC configurations.
    For example, if
    RAC1 with 2 nodes is using 192.168.1.1 and 192.168.1.2 in VLAN_1 for the interconnect, they want us to use the same VLAN_1 with interconnect IPs 192.168.1.3 and 192.168.1.4 for our 2-node RAC.
    Is sharing the same subnet on the same private VLAN for the interconnects of different RAC configurations supported?
    Will that cause any performance hit? It means the interconnect IPs of one RAC configuration are pingable from the other RAC configuration.
    Did anyone come across such a design?
    Could not find any info on this on Metalink.
    Thanks

    Yes,
    this is practically quite feasible, as you would have only 4 machines in the IP subnet, which is far fewer than on the public subnet (and you should refrain from using the public subnet for the interconnect).

  • Oracle database not starting up in oracle 10g RAC

    Hi!
    Recently I came across a problem with a one-node Oracle 10g RAC. When the Oracle database is started, while opening it gives an ORA-03113: end-of-file on communication channel error. When I looked at the alert log and other trace files I found a "disk group is exhausted" error, and it is not able to create .dbf files. Actually it is not a production server, and I put the archive log destination on the SAN. Even the spfile (the content of init_database.ora) is on the SAN.
    I tried the asmcmd utility to delete the archive log files. As Oracle is not available, I am not able to get to the asmcmd prompt.
    How can I change the archive log destination and remove the old archive log files from the SAN (as it is a testing environment, we can remove them)? Please let me know.
    Thanks & Regards
    Srikanth MVS

    keithrust wrote:
    On VMware there's a known issue with Oracle databases on a Windows client not starting up properly all the time, and a manual startup using oradim -start -sid <whatever> is required to get it fully running.
    Hmmm, I've done this several times and never seen such an issue. Which "known issue", and by whom, are you talking about?
    I created a brand new Oracle VM Windows 2003 32-bit server, installed the Oracle drivers for paravirtualization, and whammo, the problem is still here.
    I'm sure you are missing something somewhere in the config. Right now you're on a supported configuration; you could either raise an SR with support, or get help from your peers on the Oracle Database General forum.
    Ah, but it's not a Windows issue. On a non-VM Windows box the database starts just fine all the time. Again, this is a known issue acknowledged by Oracle on the VMware side; I'm just surprised it exists on the Oracle VM side.
    Again, give more details about this "known issue". I've never heard about it, even though I've been around for years.
    I was afraid you were going to ask that. I'll have to search for it again, but I think you can do the same as well....
    Well, I doubt you could find a Metalink note about Oracle on VMware. So far, Oracle has always refused to support databases on an OS virtualized on VMware (or any VM software other than Oracle VM). Based on that, you can be sure your "known issue" is not an issue on Oracle VM.
    If you want more help, again, give more details about your issue.
    Nicolas.

  • How to maintain Data availability in Oracle 10g RAC when LOADING the data?

    Hi
    We have an Oracle 10g server on Sun Solaris (64-bit) with 8 GB RAM.
    We need to move the database to Oracle 10g RAC (Real Application Clusters).
    Our question is: when loading data into the RAC server, will it affect the application?
    We have heard that the RAC server should be down when we load the data.
    Is this correct?
    If yes, then how do we maintain high data availability when we load data into the RAC server? Please help me.
    Thanks.

    First, is this the same question that a colleague of yours posted in this thread?
    Data Load in RAC
    Second, are other sessions querying the table that you're loading data into? If so, how are you loading the data?
    Justin
    Distributed Database Consulting, Inc.
    http://www.ddbcinc.com/askDDBC
