Link down on Oracle 10R2 - RAC

Dear all,
This is my first post. I have a question about a RAC environment.
I am setting up RAC with the specifications below:
- 3 HP servers with Intel Xeon CPUs and 4 GB RAM each
- Hitachi SAN
- Linux CentOS 4.2
- Oracle 10g Release 2
- ASM 2 and OCFS2
I found no problems during the installation steps. The IP configuration used is:
# Public Network - (eth0)
172.20.141.1 linux1
172.20.141.2 linux2
172.20.141.3 linux3
# Private Interconnect - (eth1)
192.168.0.1 int-linux1
192.168.0.2 int-linux2
192.168.0.3 int-linux3
# Public Virtual IP (VIP) addresses for - (eth0)
172.20.142.1 vip-linux1
172.20.142.2 vip-linux2
172.20.142.3 vip-linux3
==========================================
Hangcheck tick is 30 seconds, margin is 180 seconds
==========================================
[root@linux2 ~]# grep Hangcheck /var/log/messages | tail -5
Apr 24 13:09:59 linux2 kernel: Hangcheck: starting hangcheck timer 0.5.0 (tick is 30 seconds, margin is 180 seconds).
Apr 25 13:24:42 linux2 kernel: Hangcheck: starting hangcheck timer 0.5.0 (tick is 30 seconds, margin is 180 seconds).
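For reference, the two values in that kernel message can be pulled out and added up. As far as I understand the hangcheck-timer module, a node is reset when it stalls for longer than tick plus margin; that reading, and the function name below, are assumptions for illustration only.

```python
import re

# Matches the kernel message printed when the hangcheck-timer module loads.
HANGCHECK_RE = re.compile(r"tick is (\d+) seconds, margin is (\d+) seconds")


def hangcheck_limit(log_line: str) -> int:
    """Return tick + margin (in seconds) parsed from a hangcheck-timer log line."""
    match = HANGCHECK_RE.search(log_line)
    if match is None:
        raise ValueError("not a hangcheck-timer startup line")
    tick, margin = (int(value) for value in match.groups())
    return tick + margin


line = ("Apr 24 13:09:59 linux2 kernel: Hangcheck: starting hangcheck timer "
        "0.5.0 (tick is 30 seconds, margin is 180 seconds).")
print(hangcheck_limit(line))  # 210
```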
==========================================
Set O2CB_HEARTBEAT_THRESHOLD=301 in the file /etc/sysconfig/o2cb
==========================================
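As a rough check on what that setting means: the OCFS2 documentation describes the disk heartbeat timeout as (threshold - 1) * 2 seconds, because the heartbeat is written every 2 seconds. The sketch below just does that arithmetic; the function name is made up for illustration. Note that this is the disk heartbeat, which is a different timer from the 10 second o2net network idle timeout that appears in the error log.

```python
def o2cb_disk_timeout_seconds(threshold: int, interval_seconds: int = 2) -> int:
    """Disk heartbeat timeout implied by O2CB_HEARTBEAT_THRESHOLD.

    Assumes the documented relation: timeout = (threshold - 1) * interval.
    """
    if threshold < 2:
        raise ValueError("threshold must be at least 2")
    return (threshold - 1) * interval_seconds


print(o2cb_disk_timeout_seconds(301))  # 600
```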
But when I stress-tested the servers under a very heavy, production-like load
(transactions and reporting from an application built in Delphi), an error
occurred: eth0: link down. It happens at random times
(about 3 times a month on average).
This is the error log:
==========================================
linux2 error log
==========================================
Apr 22 07:55:01 linux2 crond(pam_unix)[4288]: session closed for user root
Apr 22 07:55:18 linux2 kernel: eth0: link down
Apr 22 07:55:19 linux2 kernel: eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
Apr 22 07:55:24 linux2 kernel: (0,0):o2net_idle_timer:1330 connection to node linux3 num 1 at 172.20.141.3:7777 has been idle for 10 seconds, shutting it down.
Apr 22 07:55:24 linux2 kernel: (0,0):o2net_idle_timer:1341 here are some times that might help debug the situation: (tmr 1145667314.798878 now 1145667324.796791 dr 1145667314.798856 adv 1145667314.798892:1145667314.798893 func (c9943f19:505) 1145429443.839194:1145429443.839197)
Apr 22 07:55:24 linux2 kernel: (2248,1):o2net_set_nn_state:421 no longer connected to node linux3 at 172.20.141.3:7777
Apr 22 07:55:26 linux2 kernel: (0,0):o2net_idle_timer:1330 connection to node linux1 num 2 at 172.20.141.1:7777 has been idle for 10 seconds, shutting it down.
Apr 22 07:55:26 linux2 kernel: (0,0):o2net_idle_timer:1341 here are some times that might help debug the situation: (tmr 1145667316.353964 now 1145667326.352424 dr 1145667316.353945 adv 1145667316.353965:1145667316.353966 func (c9943f19:505) 1145429355.325673:1145429355.325679)
Apr 22 07:55:26 linux2 kernel: (2248,1):o2net_set_nn_state:421 no longer connected to node linux1 at 172.20.141.1:7777
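The "tmr" and "now" values in the o2net_idle_timer debug lines are epoch timestamps, so the idle interval can be computed directly. A small sketch, assuming "tmr" is the moment the idle timer was last armed (my reading of the message, not something the log states):

```python
import re

# Pulls the "tmr" and "now" epoch timestamps out of an o2net_idle_timer debug line.
TIMES_RE = re.compile(r"tmr\s+(\d+\.\d+)\s+now\s+(\d+\.\d+)")


def idle_seconds(debug_line: str) -> float:
    """Seconds between the 'tmr' and 'now' timestamps in the debug line."""
    match = TIMES_RE.search(debug_line)
    if match is None:
        raise ValueError("no tmr/now pair found")
    armed, now = (float(value) for value in match.groups())
    return now - armed


line = "(tmr 1145667314.798878 now 1145667324.796791 dr 1145667314.798856"
print(round(idle_seconds(line), 3))  # 9.998
```

Counting back roughly ten seconds from the 07:55:24 message puts the last activity on the linux3 connection at about 07:55:14, a few seconds before the kernel logged eth0: link down at 07:55:18.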
============================================
linux1 error log
============================================
Apr 22 07:55:02 linux1 crond(pam_unix)[10395]: session closed for user root
Apr 22 07:57:58 linux1 kernel: (2319,0):o2net_set_nn_state:421 no longer connected to node linux2 at 172.20.141.2:7777
===========================================
linux3 error log
===========================================
Apr 22 07:55:03 linux3 crond(pam_unix)[14329]: session closed for user root
Apr 22 07:56:50 linux3 kernel: (2512,0):o2net_set_nn_state:421 no longer connected to node linux2 at 172.20.141.2:7777
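One thing the three logs have in common: every o2net peer address is 172.20.141.x on port 7777, which matches the public (eth0) host list rather than the 192.168.0.x private interconnect. That may be why a brief eth0 link flap drops the OCFS2 connections, although this is only a reading of the logs and not a confirmed diagnosis. A quick way to classify the addresses, assuming /24 netmasks (the post does not give them):

```python
import ipaddress

# Subnets taken from the host list in the post; the /24 prefix length is an assumption.
PUBLIC_NET = ipaddress.ip_network("172.20.141.0/24")
PRIVATE_NET = ipaddress.ip_network("192.168.0.0/24")


def classify(address: str) -> str:
    """Say which of the two configured networks an address belongs to."""
    ip = ipaddress.ip_address(address)
    if ip in PRIVATE_NET:
        return "private interconnect (eth1)"
    if ip in PUBLIC_NET:
        return "public network (eth0)"
    return "unknown"


# Peer addresses as they appear in the o2net log lines.
for peer in ("172.20.141.1", "172.20.141.2", "172.20.141.3"):
    print(peer, "->", classify(peer))
```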
Does anybody have advice or a solution for this problem?
Thanks for your help.
Best Regards,
Car Elcaro.

$cat opmn/conf/ons.config
localport=6101
remoteport=6200
loglevel=9
useocr=on
2007-02-09 13:43:16.441: [    RACG][1] [15217][1][ora.cgwdb02.ons]: Number of onsconfiguration retrieved, numcfg = 2
onscfg[0]
{node = cgwdb01, port = 6200}
Adding remote host cgwdb01:6200
onscfg[1]
{node = cgwdb02, port = 6200}
Adding remote host cgwdb02:6200
ons is not running ...
2007-02-09 13:43:16.454: [    RACG][1] [15217][1][ora.cgwdb02.ons]: clsrcexecut: env ORACLE_CONFIG_HOME=/u01/oracle/product/10.2.0/crs
2007-02-09 13:43:16.454: [    RACG][1] [15217][1][ora.cgwdb02.ons]: clsrcexecut: cmd = /u01/oracle/product/10.2.0/crs/bin/racgeut -e USRORA_DEBUG=0 540 /u01/oracle/product/10.2.0/crs/bin/onsctl ping
2007-02-09 13:43:16.454: [    RACG][1] [15217][1][ora.cgwdb02.ons]: clsrcexecut: rc = 1, time = 0.531s
2007-02-09 13:43:16.454: [    RACG][1] [15217][1][ora.cgwdb02.ons]: end for resource = ora.cgwdb02.ons, action = check, status = 1, time = 0.743s
2007-02-09 13:43:17.421: [    RACG][1] [15239][1][ora.cgwdb02.ons]: onsctl: shutting down ons daemon ...
Number of onsconfiguration retrieved, numcfg = 2
onscfg[0]
{node = cgwdb01, port = 6200}
Adding remote host cgwdb01:6200
onscfg[1]
{node = cgwdb02, port = 6200}
Adding remote host cgwdb02:6200

Similar Messages

  • Need link for official Oracle Clusterware & RAC installation on sol10 SPARC

    Hi Forum,
    I have been trying out every possible combination of installations for Oracle Clusterware and RAC on a Solaris 10 SPARC system, but because no document is exact in all its steps, I have not succeeded at all.
    It would be nice if anyone could give me a link for Oracle Clusterware and RAC installation on Solaris 10 SPARC, ideally a document that covers almost all the steps.
    Setup: Sun Fire 445 (2 units)
    Raw disk setup for storage.
    OS: Solaris 10 11/08
    Oracle: Oracle 10.2.0.1 for DB and Clusterware.
    Please help me out. It's urgent.
    Regards
    Prakash

    I have been using the same document, b14205, but the steps for the Oracle Clusterware OUI installation are not specified in it at all; it completes the whole OUI setup in 4 steps, whereas I ran into so many issues that I had to search Google for the solution and for a doc that is more precise about the steps. Well, thanks for the other document. It's given a new dimension to my efforts.
    regards
    prakash

  • RAW disks for Oracle 10R2 RAC NO SUN CLUSTER

    Yes you read it correctly....no Sun cluster. Then why am I on the Forum right? Well we have one Sun Cluster and another that is RAC only for testing. Between Oracle and Sun, neither accept any fault for problems with their perfectly honed products. Currently, I have multipathed fiber hba's to a Storedge 3510, and I've tried to get Oracle to use a raw lun for the ocr and voting disks. It doesn't see the disk. I've made sure they are stamped for oracle:dba, and tried oracle:oinstall. When presenting /dev/rdsk/C7t<long number>d0s6 for the ocr, I get a "can not find disk path." Does Oracle raw mean SVM raw? Should I create metadisks?

    "Between Oracle and Sun, neither accept any fault for problems with their perfectly honed products"...more specific:
    Not that the word "fault" is a characterization of any liability, but a technical characterization of acting like a responsible stakeholder when you sell your product to a corporation. I've been working on the same project for a year, as an engineer. Notwithstanding a huge expanse of management issues over the project, whenever our team has reached a technical gray area and tried to get information to solve the issue, the area has become a big bouncing hot potato. Specifically, when Oracle has a problem reading a storage device, according to Oracle, that is a Sun issue. According to Sun, they didn't certify the software on that piece of equipment, so go talk to Oracle. In the Sun Cluster arena, if starting the database causes a node eviction from the cluster, good luck getting any specific team to say, that's our problem. Sun will say that Oracle writes crappy cluster verify scripts, and Oracle will say that Sun has not properly certified the device for use with their product. Man, I've seen it. The first time I said, O.K., how do we avoid this in the future; the second time I said, how did I let this happen again; and after more issues, money spent, hours lost, and customers pissed, do the math.
    I've even gone as far as to say, find me a plug and play production model for this specific environment, but good luck getting two companies to sign the specs for it... neither wants to stamp their name on the product due to the liability. Yes, you're right, I should beat the account team, but as an engineer, man, that's not my area, and I have other problems that I was hired to deal with. I could go on. What really is a slap in the face is that no one wants to work on these projects if given the choice of doing a Windows deployment, because they can pop out mind-bending amounts of builds while we plod along figuring out why clusterware doesn't like slice 6 of a /device/scsi_vhci/. Try finding good documentation on that. ~You can deploy faster, but you can't pay more!

  • Failure at final check of Oracle CRS stack. AIX Oracle 10g RAC with GPFS

    Hi, I am installing Oracle 10gR2 RAC using GPFS on AIX 6.1. When I installed CRS, I ran into trouble while executing the second root.sh. The error information is as follows:
    Failure at final  check of Oracle CRS stack.
    10
    I looked at the log file, which says:
    The OCR location /data_gpfs/CRS/ocr_disk2 is inaccessible. Details in /oracleapp/product/10.2.0/crs/log/p615a/client/ocrconfig_6881286.log.
    The contents of ocrconfig_6881286.log:
    2014-01-24 01:32:20.361: [ OCRCONF][1]ocrconfig starts...
    2014-01-24 01:32:20.389: [ OCRCONF][1]Upgrading OCR data
    2014-01-24 01:32:20.391: [  OCROSD][1]utread:3: problem reading buffer 100f21d0 buflen 512 retval 0 phy_offset 102400 retry 0
    2014-01-24 01:32:20.391: [  OCROSD][1]utread:4: problem reading the buffer errno 2 errstring No such file or directory
    2014-01-24 01:32:20.391: [  OCROSD][1]utread:3: problem reading buffer 100f21d0 buflen 512 retval 0 phy_offset 102400 retry 0
    2014-01-24 01:32:20.391: [  OCROSD][1]utread:4: problem reading the buffer errno 2 errstring No such file or directory
    2014-01-24 01:32:20.391: [  OCROSD][1]utread:3: problem reading buffer ffffb1d0 buflen 4096 retval 0 phy_offset 102400 retry 0
    2014-01-24 01:32:20.392: [  OCROSD][1]utread:4: problem reading the buffer errno 2 errstring No such file or directory
    2014-01-24 01:32:20.392: [  OCRRAW][1]propriogid:1: INVALID FORMAT
    2014-01-24 01:32:20.392: [  OCROSD][1]utread:3: problem reading buffer ffffb1d0 buflen 4096 retval 0 phy_offset 102400 retry 0
    2014-01-24 01:32:20.392: [  OCROSD][1]utread:4: problem reading the buffer errno 2 errstring No such file or directory
    2014-01-24 01:32:20.392: [  OCRRAW][1]propriogid:1: INVALID FORMAT
    2014-01-24 01:32:20.392: [  OCRRAW][1]proprioini: both disks are not OCR formatted
    2014-01-24 01:32:20.392: [  OCRRAW][1]proprinit: Could not open raw device
    2014-01-24 01:32:20.392: [ default][1]a_init:7!: Backend init unsuccessful : [26]
    2014-01-24 01:32:20.393: [ OCRCONF][1]Exporting OCR data to [OCRUPGRADEFILE]
    2014-01-24 01:32:20.393: [  OCRAPI][1]a_init:7!: Backend init unsuccessful : [33]
    propriogid:1: INVALID FORMAT
    2014-01-24 01:32:20.516: [  OCRRAW][1]propriowv: Vote information on disk 0 [/data_gpfs/CRS/ocr_disk1] is adjusted from [0/0] to [1/2]
    2014-01-24 01:32:20.527: [  OCRRAW][1]propriowv: Vote information on disk 1 [/data_gpfs/CRS/ocr_disk2] is adjusted from [0/0] to [1/2]
    2014-01-24 01:32:20.960: [  OCRRAW][1]propriniconfig:No 92 configuration
    2014-01-24 01:32:20.960: [  OCRAPI][1]a_init:6a: Backend init successful
    2014-01-24 01:32:21.191: [ OCRCONF][1]Initialized DATABASE keys in OCR
    2014-01-24 01:32:21.349: [ OCRCONF][1]Successfully set skgfr block 0
    2014-01-24 01:32:21.351: [ OCRCONF][1]Exiting [status=success]...
    I don't know what causes this error and I am really stuck. Can anyone help me?
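    The utread lines in that log report errno 2 (No such file or directory) with retval 0 while reading the OCR files, so a first sanity check is whether both OCR locations exist, are readable and writable by the user running root.sh, and are non-empty on every node. A minimal sketch of such a check; the paths come from the log above, and the helper name is hypothetical:

```python
import os


def check_ocr_location(path: str) -> str:
    """Describe whether an OCR file path looks usable for the current user."""
    if not os.path.exists(path):
        return "missing"
    if not os.access(path, os.R_OK | os.W_OK):
        return "not readable/writable"
    if os.path.getsize(path) == 0:
        return "empty (0 bytes)"
    return "ok"


# OCR locations named in the ocrconfig log; run this on each cluster node.
for location in ("/data_gpfs/CRS/ocr_disk1", "/data_gpfs/CRS/ocr_disk2"):
    print(location, "->", check_ocr_location(location))
```

    This only inspects the files from the operating system side; it says nothing about whether the GPFS mount options are suitable for OCR.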

  • Oracle 10g RAC - public network interface down

    Hi all,
    I have a question about Oracle RAC and network interface.
    We're using Oracle 10gR2 RAC with two nodes on Linux Red Hat.
    Let's assume that the public network interface goes down.
    I would like to know what happens with existing connections
    on node with network interface with problems.
    Are the connections frozen or still active?
    Can the users continue to use these existing connections through the other node of the RAC?
    I know that the listener goes down and no other connections are allowed.
    Thank you very much!!!!

    Tads wrote:
    > Hi all,
    > I have a question about Oracle RAC and network interface.
    > We're using Oracle 10gR2 RAC with two nodes on Linux Red Hat.
    > Let's assume that the public network interface goes down.
    > I would like to know what happens with existing connections
    > on node with network interface with problems.
    > Are the connections frozen or still active?
    > Can the users continue to use these existing connections through the other node of the RAC?
    If the interface is down, what do you think? All connections to this node will die. How does your application handle failover: does it attempt to reconnect, or does the whole application fail?
    You should spend some time in a test lab where you can test this stuff for yourself. Read the documentation; there are tons of sites out there that purport to answer all of your RAC/TAF/FAN/FAF questions, but I would read and trust the documentation first.
    > I know that the listener goes down and no other connections are allowed.
    > Thank you very much!!!!
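    To make the "does it attempt to reconnect" question concrete, here is a minimal sketch of an application-side reconnect loop that tries each node address in turn. It is not Oracle TAF or FAN; the `connect` callable, the address names, and `fake_connect` are hypothetical stand-ins for whatever driver call the application really uses, and any session state on the failed node is still lost.

```python
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def connect_with_failover(
    addresses: Sequence[str],
    connect: Callable[[str], T],
    retries: int = 3,
    delay_seconds: float = 1.0,
) -> T:
    """Try each address in order, repeating the whole list up to `retries` times."""
    last_error: Optional[Exception] = None
    for _ in range(retries):
        for address in addresses:
            try:
                return connect(address)
            except ConnectionError as error:
                last_error = error  # remember the failure and try the next address
        time.sleep(delay_seconds)
    raise ConnectionError(f"all addresses failed: {last_error}")


def fake_connect(address: str) -> str:
    # Hypothetical stand-in for a real database driver call.
    if address == "vip-node1":
        raise ConnectionError("node1 public interface is down")
    return f"session on {address}"


print(connect_with_failover(["vip-node1", "vip-node2"], fake_connect, delay_seconds=0))
# session on vip-node2
```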

  • Oracle 10g RAC to 11g RAC Upgrade on Solaris

    Hi,
    We are planning to migrate a 4-node Oracle 10g RAC on Solaris 9 to 11g on Solaris 10. We'd like to know what the best path would be. We cannot afford any downtime!
    Options: Are these feasible? Which option is best? Any links to documents?
    a) Do a rolling upgrade of Oracle from 10g to 11g. Then take down individual nodes, upgrade the Solaris OS from 9 to 10, and bring them back up into the cluster. Are there any known issues with taking this path? Is a rolling upgrade like this possible?
    b) Do an upgrade of the Solaris OS from 9 to 10 on each node and then bring them back up. Is this practical? Does Oracle allow different OS versions to run on different nodes?
    c) Use Data Guard with 2 different RAC environments (2 nodes each). How would this work? Is it the only possible way? Any steps, please?
    Thanks

    > a) Do a rolling upgrade of Oracle from 10g to 11g. Then take down individual nodes, upgrade the Solaris OS from 9 to 10, and bring them back up into the cluster. Are there any known issues with taking this path? Is a rolling upgrade like this possible?
    Hi,
    First of all, I would not change several components (OS, database version) at a time. My recommendation is to take small steps and start with the operating system. My second recommendation is to test everything in your dev or test environment before doing the upgrades in the production environment. Trust me: you will face problems :-) So you had better try it beforehand!
    > b) Do an upgrade of the Solaris OS from 9 to 10 on each node and then bring them back up. Is this practical? Does Oracle allow different OS versions to run on different nodes?
    As far as I know, you can run different operating system versions on different nodes if they are supported (Solaris 9 and 10 are).
    Ronny Egner
    My blog: http://ronnyegner.wordpress.com

  • Private interconnect oracle 10g RAC configuration

    Can you please answer below questions?
    Why does a private interconnect need a switch, and why are straight-through cables not supported?

    Hi,
    "Why do you need a switch between the nodes?" When the network plugs are pulled out of one node in a two-node cluster, a split-brain scenario occurs. (That alone is reason enough.)
    If you are using a crossover cable and you shut down node (A), you will lose the private network link on node (B) (this happens on some servers). Oracle RAC will not work with the private network link down: both nodes will go down and will not start until the private network link is back. (Goodbye, high availability.)
    You will be very unhappy with the error ORA-29740. I suggest reading this note first: Troubleshooting ORA-29740 in a RAC Environment [ID 219361.1]
    There is no "why": Oracle RAC does not support crossover cables because RAC depends on a switch (it's a hardware requirement), and if there are any problems in your environment Oracle will force you to implement a supported solution.
    You will have implemented a poor environment if you do not use a gigabit switch for the private network.
    Oracle Words:
    *Physical Layout of the Private Interconnect*
    The basic requirements are described in the Installation Guide for each platform. Additional information about certification can be found on Metalink Certify.
    The interconnect as identified by both subnet number and interface name must be configured on all clustered nodes.
    *A switch between the clustered nodes is an absolute requirement.*
    *Cluster Interconnect in Oracle 10g and 11g [ID 787420.1]*
    Regards,
    Levi Pereira

  • Oracle 11gR2 - RAC to RAC replication via SAN

    Hi,
    Does anyone here have experience with an Oracle 11gR2 RAC-to-RAC setup via EMC SRDF? Please share some information, and also provide links if there are any white papers or technical papers.
    Thanks
    Edited by: 858013 on May 11, 2011 12:10 AM

    In SAN replication, the database instance at the remote site cannot be started and running, because the SAN does not allow read-write access to the filesystems.
    When the primary site goes down, you simply issue a STARTUP at the remote site, provided that the SAN replication has guaranteed write ordering; otherwise you will have a corrupted database. That is why it is very important to talk to your SAN storage vendors to get the replication set up correctly.
    (If the replication is done correctly, the STARTUP will "see" that the database files are fuzzy and that the online redo log files are available, and will attempt an instance recovery, as if the instance or server had crashed or suffered a shutdown abort.)
    Hemant K Chitale

  • How to configure Oracle 10g RAC on Windows Server 2003

    Hi all,
    Please tell me how to configure Oracle 10g RAC on Windows Server 2003. Can anybody help me or give me a link?
    Please, it is very important for me.
    Regards

    Hello,
    There is a good doc written by Philip Newlan at www.jobcestbon.com/oracle/RacOnWindows.pdf
    Regards,
    Rodrigo Mufalani
    http://mufalani.blogspot.com

  • Oracle 10g rac installation on IBM power server

    Dear Gurus,
    I am installing Oracle 10g RAC on an IBM Power server, but I am getting the following errors while running the CRS installation. Please help me resolve this issue.
    OUTPUT from Installation log:
    INFO: Start output from spawned process:
    INFO: INFO:
    INFO: /oracle/app/oracle/product/crs/bin/genclntsh
    INFO: /usr/bin/ld: cannot find -lxl
    INFO: collect2: ld returned 1 exit status
    INFO: genclntsh: Failed to link libclntsh.so.10.1
    INFO: make:
    INFO: *** [client_sharedlib] Error 1
    INFO: End output from spawned process.
    INFO: INFO: Exception thrown from action: make
    Exception Name: MakefileException
    Exception String: Error in invoking target 'client_sharedlib' of makefile '/oracle/app/oracle/product/crs/network/lib/ins_net_client.mk'. See '/oracle/app/oracle/oraInventory/logs/installActions2011-12-26_04-40-48PM.log' for details.
    Exception Severity: 1
    INFO: Exception handling set to prompt user with options to Retry Ignore
    SYSTEM:
    IBM power server
    Linux 5.3
    I have already tried changing ORACLE_HOME,CRS_HOME,LD_LIBRARY_PATH but nothing worked.
    Regards,
    Prajash

    Those errors remind me of errors I encountered when I was trying to do an installation that was not certified,
    so check whether the installation you are doing is actually supported.
    In the meantime, visit Metalink and read document [ID 460969.1], */usr/bin/ld: Cannot Find -lxml10, Genclntsh: Failed To Link Libclntsh.so.10.1*

  • Oracle 10G RAC - Public & Heartbeat on 1 NIC

    Hello all,
    I am currently installing Oracle 10g RAC on RHEL 4 (4-node cluster), but the Cluster Verification Utility aborts with errors. I checked configToolAllCommands and tried to run the failed commands manually:
    #/opt/oracle/crs/bin/oifcfg setif -global eth0.100/172.18.0.0:cluster_interconnect eth0.728/172.16.128.0:public eth0.498/172.17.1.0:cluster_interconnect
    PRIF-50: duplicate interface is given in the input
    PRIF-50: duplicate interface is given in the input
    PRIF-50: duplicate interface is given in the input
    Question:
    Is it possible to put Public & Heartbeat on one NIC (eth0.728 & eth0.498)?
    If not, is there any workaround for that issue?
    Output /etc/hosts
    # that require network functionality will fail.
    127.0.0.1 localhost.localdomain localhost
    172.18.253.48 eu0266.[company].net cfmaster
    172.16.128.11 eu0200.[company].net eu0200
    172.16.128.12 eu0201.[company].net eu0201
    172.16.128.13 eu0202.[company].net eu0202
    172.16.128.14 eu0203.[company].net eu0203
    172.18.13.11 eu0200m.[company].net eu0200m
    172.18.13.12 eu0201m.[company].net eu0201m
    172.18.13.13 eu0202m.[company].net eu0202m
    172.18.13.14 eu0203m.[company].net eu0203m
    # Private section
    172.17.1.11 eu0200-priv.[company].net eu0200-priv
    172.17.1.12 eu0201-priv.[company].net eu0201-priv
    172.17.1.13 eu0202-priv.[company].net eu0202-priv
    172.17.1.14 eu0203-priv.[company].net eu0203-priv
    # Virtual section
    172.16.128.16 eu0200-vip.[company].net eu0200-vip
    172.16.128.17 eu0201-vip.l[company].net eu0201-vip
    172.16.128.18 eu0202-vip.[company].net eu0202-vip
    172.16.128.19 eu0203-vip.[company].net eu0203-vip
    Output install log:
    Checking existence of VIP node application (required)
    Check failed.
    Check failed on nodes:
    eu0203,eu0202,eu0201,eu0200
    Checking existence of ONS node application (optional)
    Check ignored.
    Checking existence of GSD node application (optional)
    Check ignored.
    Post-check for cluster services setup was unsuccessful on all the nodes.
    Command = /opt/[company]/oracle/crs/bin/cluvfy has failed
    INFO: Configuration assistant "Oracle Cluster Verification Utility" failed
    *** Starting OUICA ***
    Oracle Home set to /opt/[company]/oracle/crs
    Configuration directory is set to /opt/[company]/oracle/crs/cfgtoollogs. All xml files under the directory will be processed
    INFO: The "/opt/[company]/oracle/crs/cfgtoollogs/configToolFailedCommands" script contains all commands that failed, were skipped or were cancelled. This file may be used to run these configuration assistants outside of OUI. Note that you may have to update this script with passwords (if any) before executing the same.
    SEVERE: OUI-25031:Some of the configuration assistants failed. It is strongly recommended that you retry the configuration assistants at this time. Not successfully running any "Recommended" assistants means your system will not be correctly configured.
    1. Check the Details panel on the Configuration Assistant Screen to see the errors resulting in the failures.
    2. Fix the errors causing these failures.
    3. Select the failed assistants and click the 'Retry' button to retry them.
    Thanks in advance for your help!
    Regards

  • Oracle 10g RAC+ASM - Storage Issue

    I’ve an issue related to Oracle 10g RAC.
    I’ve 2 node cluster each being Dell 2850 Server with RHEL 4.0
    I’ve EMC CX300 SAN storage with following partitions
    /orasoft      10 GB    OCFS2 file system
    /oracrs        2 GB    OCFS2 file system
    /orabackup   100 GB    OCFS2 file system
    The datafiles are on ASM which is not directly visible in OS.
    I’ve common Oracle Home installed in /orasoft/db_1 which is shared by both nodes in cluster.
    I’ve faced an issue recently related to EMC storage.
    The /orasoft partition displays 1.4 Gb space available using df command.
    With both nodes sharing the common Oracle Home (/orasoft/db_1), when ever I try to touch a file I get an error as No Space left on device. I’m unable to start any service with the same reason.
    Is this setup correct ??
    Can anyone help me with this storage issue ??
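
    When df reports free space but file creation fails with "No space left on device", one thing worth ruling out is that the file system cannot allocate new inodes even though data blocks are free. The sketch below reads both numbers with `os.statvfs`. It is a generic check, not an OCFS2 diagnosis: the helper names are made up for illustration, and whether OCFS2 reports meaningful inode counts through this call is something to verify on the affected system.

```python
import os


def space_report(path: str) -> dict:
    """Return free bytes and inode counts for the file system holding `path`."""
    st = os.statvfs(path)
    return {
        "free_bytes": st.f_bavail * st.f_frsize,  # space available to non-root users
        "free_inodes": st.f_favail,               # inodes available to non-root users
        "total_inodes": st.f_files,               # 0 on file systems that do not report inodes
    }


def likely_inode_exhaustion(report: dict) -> bool:
    """True when blocks are free but no inode can be allocated."""
    return (
        report["free_bytes"] > 0
        and report["total_inodes"] > 0
        and report["free_inodes"] == 0
    )


if __name__ == "__main__":
    # "/orasoft" is the mount point from the post; change it as needed.
    report = space_report("/orasoft")
    print(report, "inode exhaustion suspected:", likely_inode_exhaustion(report))
```

    If the inode numbers look healthy, the cause lies elsewhere and this check simply narrows the search.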

    Hi,
    If you create a new disk group, you may have to add it to the parameter file or spfile, which will require downtime.
    Suggestion: instead of creating a new disk group, add the disk to the existing group. If you add the ASM disk to the existing group, all your problems will be solved and Oracle itself will manage everything. Then I am sure there is no need to add an entry such as +db_create_file_dest=..... to the parameter file or spfile.
    regards,
    Sher khan

  • Oracle 11gR2 RAC in LDOM Network issue

    Hi, I am requesting your expert advice regarding this configuration.
    We are implementing LDOM 2.2 on two SPARC T4-4 servers for Oracle 11gR2 RAC, with Solaris 10 U10 on both the control and guest domains. The setup for each primary/control domain is: two 10G links aggregated, with four VLANs trunked on the aggregate. A vSwitch is created using the aggregate as the device, as follows, per T4-4:
    NOTE: VLAN 1501 is for the data connection and VLAN 10 is for the heartbeat of one RAC cluster; VLANs 1601 and 11 are for the other RAC cluster. Altogether there are four LDOMs.
    ldm add-vswitch vid=1501,1601,10,11 net-dev=aggr1 primary-vsw0 primary
    ldm add-vnet pvid=1501 vnetprod primary-vsw0 guest1
    ldm add-vnet pvid=10 vnethb primary-vsw0 guest1
    ldm add-vnet pvid=1601 vnetprod primary-vsw0 guest2
    ldm add-vnet pvid=11 vnethb primary-vsw0 guest2
    The vnets inside the LDOM are not tagged:
    vnet1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
    inet 10.220.128.20 netmask ffffff80 broadcast 10.220.128.127
    ether 0:14:4f:f9:ec:7f
    vnet2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
    inet 192.168.2.11 netmask ffffff80 broadcast 192.168.2.127
    ether 0:14:4f:fb:2b:8f
    Here is the whole configuration:
    root@gp-cpu-suh004 # ldm -V
    Logical Domains Manager (v 2.2.0.0)
    Hypervisor control protocol v 1.9
    Using Hypervisor MD v 1.4
    System PROM:
    Hostconfig v. 1.2.0. @(#)Hostconfig 1.2.0.a 2012/05/11 07:34
    Hypervisor v. 1.11.0. @(#)Hypervisor 1.11.0.a 2012/05/11 05:28
    OpenBoot v. 4.34.0 @(#)OpenBoot 4.34.0 2012/04/30 14:26
    root@gp-cpu-suh004 # ldm ls
    NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
    primary active -n-cv- UART 32 16G 2.5% 48m
    oidrac1 active -n---- 5000 32 16G 0.0% 27m
    root@gp-cpu-suh004 # ldm ls -l
    NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
    primary active -n-cv- UART 32 16G 3.1% 48m
    SOFTSTATE
    Solaris running
    UUID
    e73421fe-7003-e748-be7e-801fee5bfcc7
    MAC
    00:21:28:f1:95:26
    HOSTID
    0x85f19526
    CONTROL
    failure-policy=ignore
    extended-mapin-space=off
    cpu-arch=native
    DEPENDENCY
    master=
    CORE
    VCPU
    MEMORY
    RA PA SIZE
    0x20000000 0x20000000 16G
    CONSTRAINT
    threading=max-throughput
    VARIABLES
    auto-boot-on-error?=true
    auto-boot?=true
    boot-device=/pci@400/pci@1/pci@0/pci@0/LSI,sas@0/disk@w5000cca0251e7a29,0:a
    keyboard-layout=US-English
    nvramrc=." ChassisSerialNumber 1207BDYFFE " cr
    use-nvramrc?=true
    IO
    DEVICE PSEUDONYM OPTIONS
    VCC
    NAME PORT-RANGE
    primary-vcc0 5000-5100
    VSW
    NAME MAC NET-DEV ID DEVICE LINKPROP DEFAULT-VLAN-ID PVID VID MTU MODE INTER-VNET-LINK
    primary-vsw-mgmt 00:14:4f:fb:75:c0 igb1 0 switch@0 1 1 1500 on
    primary-vsw0 00:14:4f:fa:33:8b aggr1 1 switch@1 1 1 1501,1601,10,11 1500 on
    VDS
    NAME VOLUME OPTIONS MPGROUP DEVICE
    primary-vds0 rootoid /dev/dsk/c14t50060E8005BFAA04d1s2
    data_oid /dev/dsk/c14t50060E8005BFAA04d2s2
    ocr_oid /dev/dsk/c14t50060E8005BFAA04d3s2
    VCONS
    NAME SERVICE PORT
    UART
    NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
    oidrac1 active -n---- 5000 32 16G 0.0% 27m
    SOFTSTATE
    Solaris running
    UUID
    0fcbbf21-14a2-eb21-f544-d4424212f3ef
    MAC
    00:14:4f:f9:1b:d4
    HOSTID
    0x84f91bd4
    CONTROL
    failure-policy=ignore
    extended-mapin-space=off
    cpu-arch=native
    DEPENDENCY
    master=
    CORE
    CID CPUSET
    VCPU
    VID PID CID UTIL STRAND
    MEMORY
    RA PA SIZE
    0x20000000 0x420000000 16G
    CONSTRAINT
    threading=max-throughput
    VARIABLES
    auto-boot?=true
    boot-device=disk:a
    keyboard-layout=US-English
    NETWORK
    NAME SERVICE ID DEVICE MAC MODE PVID VID MTU LINKPROP
    vnet1 primary-vsw-mgmt@primary 0 network@0 00:14:4f:fa:61:77 1 1500
    vnetprod primary-vsw0@primary 1 network@1 00:14:4f:f9:ec:7f 1501 1500
    vnethb primary-vsw0@primary 2 network@2 00:14:4f:fb:2b:8f 10 1500
    DISK
    NAME VOLUME TOUT ID DEVICE SERVER MPGROUP
    oneidrootdisk rootoid@primary-vds0 0 disk@0 primary
    oid_data data_oid@primary-vds0 1 disk@1 primary
    oid_ocr ocr_oid@primary-vds0 2 disk@2 primary
    VCONS
    NAME SERVICE PORT
    oidrac1 primary-vcc0@primary 5000
    root@gp-cpu-suh004 # ldm ls-services
    VCC
    NAME LDOM PORT-RANGE
    primary-vcc0 primary 5000-5100
    VSW
    NAME LDOM MAC NET-DEV ID DEVICE LINKPROP DEFAULT-VLAN-ID PVID VID MTU MODE INTER-VNET-LINK
    primary-vsw-mgmt primary 00:14:4f:fb:75:c0 igb1 0 switch@0 1 1 1500 on
    primary-vsw0 primary 00:14:4f:fa:33:8b aggr1 1 switch@1 1 1 1501,1601,10,11 1500 on
    VDS
    NAME LDOM VOLUME OPTIONS MPGROUP DEVICE
    primary-vds0 primary rootoid /dev/dsk/c14t50060E8005BFAA04d1s2
    data_oid /dev/dsk/c14t50060E8005BFAA04d2s2
    ocr_oid /dev/dsk/c14t50060E8005BFAA04d3s2
    root@gp-cpu-suh004 # dladm show-link
    vsw0 type: non-vlan mtu: 1500 device: vsw0
    vsw1 type: non-vlan mtu: 1500 device: vsw1
    vsw1501001 type: vlan 1501 mtu: 1500 device: vsw1
    igb0 type: non-vlan mtu: 1500 device: igb0
    igb1 type: non-vlan mtu: 1500 device: igb1
    qlge0 type: non-vlan mtu: 1500 device: qlge0
    qlge1 type: non-vlan mtu: 1500 device: qlge1
    qlge2 type: non-vlan mtu: 1500 device: qlge2
    qlge3 type: non-vlan mtu: 1500 device: qlge3
    aggr1 type: non-vlan mtu: 1500 aggregation: key 1
    root@gp-cpu-suh004 # ifconfig -a
    lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
    inet 127.0.0.1 netmask ff000000
    igb0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
    inet 10.223.12.14 netmask ffffff00 broadcast 10.223.12.255
    ether 0:21:28:f1:95:26
    vsw1501001: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 3
    inet 10.220.128.9 netmask ffffff80 broadcast 10.220.128.127
    ether 0:14:4f:fa:33:8b
    root@gp-cpu-suh004 # netstat -nr
    Routing Table: IPv4
    Destination Gateway Flags Ref Use Interface
    default 10.220.128.1 UG 1 7
    10.220.128.0 10.220.128.9 U 1 5 vsw1501001
    10.223.0.0 10.223.12.1 UG 1 2
    10.223.12.0 10.223.12.14 U 1 1 igb0
    224.0.0.0 10.220.128.9 U 1 0 vsw1501001
    127.0.0.1 127.0.0.1 UH 8 261 lo0

    Yes, I can connect to the vswitch interface on the control domain. I didn't specify any PVID because my understanding is that the PVID tags any frame with the PVID VLAN by default. Basically, the PVID for this interface is 1.
    Here's the VNET config for the other LDOM in the RAC cluster:
    VSW
    NAME MAC NET-DEV ID DEVICE LINKPROP DEFAULT-VLAN-ID PVID VID MTU MODE INTER-VNET-LINK
    primary-vsw-mgmt 00:14:4f:f9:91:fa igb1 0 switch@0 1 1 1500 on
    primary-vsw0 00:14:4f:fa:8e:cf aggr1 1 switch@1 1 1 1501,1601,10,11 1500 on
    NETWORK
    NAME SERVICE ID DEVICE MAC MODE PVID VID MTU LINKPROP
    vnet1 primary-vsw-mgmt 0 00:14:4f:fb:65:6d 1
    vnetprod primary-vsw0 1 00:14:4f:fa:2b:02 1501
    vnethb primary-vsw0 2 00:14:4f:f8:12:c1 10
    Thanks for reviewing my configuration.
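
    As a quick sanity check on the addressing shown in the ifconfig output above, the Solaris hex netmasks can be decoded and the resulting subnets compared. This is only a sketch under the assumption that the guest's data address (10.220.128.20) and the control domain's vsw1501001 address (10.220.128.9) are meant to share VLAN 1501. The function name is made up for illustration, and the check says nothing about VLAN tagging itself.

```python
import ipaddress


def interface_from_solaris(addr: str, hex_netmask: str) -> ipaddress.IPv4Interface:
    """Build an interface object from an address and a Solaris-style hex netmask."""
    netmask = ipaddress.IPv4Address(int(hex_netmask, 16))  # "ffffff80" -> 255.255.255.128
    return ipaddress.ip_interface(f"{addr}/{netmask}")


# Values copied from the ifconfig output in the post.
guest_data = interface_from_solaris("10.220.128.20", "ffffff80")
control_vsw = interface_from_solaris("10.220.128.9", "ffffff80")
guest_hb = interface_from_solaris("192.168.2.11", "ffffff80")

if __name__ == "__main__":
    print("data network:      ", guest_data.network)
    print("data broadcast:    ", guest_data.network.broadcast_address)
    print("same data subnet:  ", guest_data.network == control_vsw.network)
    print("heartbeat network: ", guest_hb.network)
```

    Matching subnets only confirm that the IP layer is consistent; a tagging or trunking problem on the aggregate would still have to be checked on the switch side.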

  • Problem with Oracle 10g RAC VIP network setting at Solaris 9

    Dear All,
    I have tried to set up Oracle 10g RAC Release 2
    on Solaris 9 with 2 nodes.
    The node settings are as follows:
    node 1:
    Public address: 172.16.0.121
    Private address: 192.168.0.121, 192.168.1.121 (dual path for heartbeat)
    node 2:
    Public address: 172.16.0.122
    Private address: 192.168.0.122, 192.168.1.122
    I have assigned two IP addresses, 172.16.0.131 and 172.16.0.132, as the VIP addresses for the RAC.
    And the following is the /etc/hosts file:
    root@shk01 # cat /etc/hosts
    # Internet host table
    127.0.0.1 localhost
    # public address
    172.16.0.121 shk01 loghost
    172.16.0.122 shk02
    # heart beat
    192.168.0.121 shk01-priv1
    192.168.1.121 shk01-priv2
    192.168.0.122 shk02-priv1
    192.168.1.122 shk02-priv2
    # VIP address
    172.16.1.131 vip-shk01
    172.16.1.132 vip-shk02
    But when I run the command "$ ./runcluvfy.sh comp nodecon -n shk01,shk02 -verbose"
    it shows the error:
    ERROR:
    Could not find a suitable set of interfaces for VIPs.
    Result: Node connectivity check failed.
    I did try to add the VIP address on bge0:1, as I am using bge0 for the public address.
    I used the same interface name on both nodes.
    Does anyone have an idea of what I should check to track down the error?
    I also have another question, about raw devices.
    There is an option to choose either ASM or raw devices. If I choose raw devices, does that mean I just need to
    format the storage disk without running newfs on it, and the Oracle software will then be able to handle it?
    Thanks,
    Xentar
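
    One detail in the post is worth checking mechanically: the text gives the VIPs as 172.16.0.131 and 172.16.0.132, but the /etc/hosts file lists 172.16.1.131 and 172.16.1.132. A VIP normally has to sit on the same subnet as the node's public address, so the sketch below tests that. The /24 prefix is an assumption, since the post never shows the netmask, and the function name is made up for illustration.

```python
import ipaddress

# Addresses copied from the /etc/hosts file in the post.
PUBLIC = {"shk01": "172.16.0.121", "shk02": "172.16.0.122"}
VIPS = {"vip-shk01": "172.16.1.131", "vip-shk02": "172.16.1.132"}

ASSUMED_PREFIX = 24  # assumption: the real netmask is not given in the post


def vip_problems(public: dict, vips: dict, prefix: int) -> list:
    """List every VIP that falls outside all public subnets."""
    public_nets = {
        ipaddress.ip_interface(f"{addr}/{prefix}").network
        for addr in public.values()
    }
    problems = []
    for name, addr in vips.items():
        ip = ipaddress.ip_address(addr)
        if not any(ip in net for net in public_nets):
            problems.append(f"{name} ({addr}) is outside the public subnet")
    return problems


if __name__ == "__main__":
    for line in vip_problems(PUBLIC, VIPS, ASSUMED_PREFIX):
        print(line)
    # All public addresses here are also in an RFC 1918 private range.
    print({n: ipaddress.ip_address(a).is_private for n, a in PUBLIC.items()})
```

    With a /24 mask both hosts-file VIPs are flagged; with a /16 mask they are not, so the real netmask decides whether the hosts file is the problem. Separately, I believe the 10gR2 cluvfy can report this exact message when the public interface uses RFC 1918 private addresses such as 172.16.x.x, but treat that as something to confirm in Oracle's own notes.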

    You don't seem to state categorically that you are using Solaris Cluster, so I'll assume it since this is mainly a forum about Solaris Cluster (and IMHO, Solaris Cluster with Clusterware is better than Clusterware on its own).
    Clusterware has to see the same device names from all cluster nodes. This is why Solaris Cluster (SC) is a positive benefit over Clusterware because SC provides an automatically managed, consistent name space. Clusterware on its own forces you to manage either the symbolic links (or worse mknods) to create a consistent namespace!
    So, given the SC consistent namespace, you simply add the raw devices into the ASM configuration, i.e. /dev/did/rdsk/dXsY. If you are using Solaris Volume Manager, you would use /dev/md/<setname>/rdsk/dXXX, and if you were using CVM/VxVM you would use /dev/vx/rdsk/<dg_name>/<dev_name>.
    Of course, if you genuinely are using Clusterware on its own, then you have somewhat of a management issue! ... time to think about installing SC?
    Tim
    ---

  • Oracle 11gR2 RAC on Oracle Linux

    Folks, I need some help finding the correct ASMLib for this Linux box. I have already tried one and screwed up one box; now I'm trying on a second one.
    Here is the info: Oracle Linux Server release 5.6   x86_64
    Please advise. I'm looking at the page linked below and tried these packages, but they are not working either:
    Intel EM64T (x86_64) Architecture
    Library and Tools
        oracleasm-support-2.1.8-1.el5.x86_64.rpm
        oracleasmlib-2.0.4-1.el5.x86_64.rpm
    Link: http://www.oracle.com/technetwork/server-storage/linux/downloads/rhel5-084877.html#oracleasm_rhel5_amd64
    and this is my oracleasm status on the server:
    [root@rac1 ~]# rpm -qa | grep -i oracleasm
    oracleasm-support-2.1.4-1.el5
    oracleasm-2.6.18-238.el5xen-2.0.5-1.el5
    oracleasm-2.6.18-238.el5debug-2.0.5-1.el5
    oracleasm-2.6.18-238.el5-2.0.5-1.el5
    Please assist: do I need to upgrade, or how else do I fix this problem and proceed to complete the Oracle RAC setup?
    Thanks in advance.

    ASMLib is becoming "obsolete" (in the eyes of system admins, not Oracle Corp), and I believe it will soon disappear.
    For RHEL6 or Oracle Linux 6, Oracle will only provide ASMLib software and updates for the UEK Kernel and the Red Hat Compatible Kernel for Oracle Linux.
    I recommend you use udev rules instead.
    Configuring Storage for Oracle Grid Infrastructure for a Cluster and Oracle RAC
    ORACLE-BASE - UDEV SCSI Rules Configuration for ASM in Oracle Linux 5 and 6
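
    For anyone staying with ASMLib, the kernel driver package has to match the running kernel exactly: the package name embeds the kernel release that `uname -r` prints. The sketch below pulls the kernel releases out of the rpm list in the question and compares them with a given kernel. The helper names and the second kernel string in the example are hypothetical. Note also that no oracleasmlib package appears in that rpm output.

```python
import platform

# Package list copied from the "rpm -qa | grep -i oracleasm" output in the question.
INSTALLED = [
    "oracleasm-support-2.1.4-1.el5",
    "oracleasm-2.6.18-238.el5xen-2.0.5-1.el5",
    "oracleasm-2.6.18-238.el5debug-2.0.5-1.el5",
    "oracleasm-2.6.18-238.el5-2.0.5-1.el5",
]


def driver_kernels(packages: list) -> list:
    """Return the kernel releases that have an oracleasm driver package."""
    kernels = []
    for pkg in packages:
        if not pkg.startswith("oracleasm-"):
            continue
        # The name is oracleasm-<kernel release>-<version>-<release>.
        parts = pkg[len("oracleasm-"):].rsplit("-", 2)
        if len(parts) == 3 and parts[0][:1].isdigit():
            kernels.append(parts[0])  # skips oracleasm-support and similar
    return kernels


def has_driver_for(running_kernel: str, packages: list) -> bool:
    """True when a driver package was built for exactly this kernel release."""
    return running_kernel in driver_kernels(packages)


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r`
    print(running, "has matching driver:", has_driver_for(running, INSTALLED))
```

    If the box boots a kernel that is not in that list, the driver packages above will not load for it, which would explain a setup that "is not working" even though packages are installed.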
