Oracle RAC HAIP vs network interface bonding

Hi,
Oracle introduced a new feature called HAIP (11gR2 Grid Infrastructure Redundant Interconnect and ora.cluster_interconnect.haip [ID 1210883.1]) in Oracle 11.2.0.2 and later. Oracle GI supports HAIP natively, so users no longer need to configure network interface bonding for the interconnect.
It seems to me that HAIP can provide better performance than network bonding. With HAIP you can have up to four interconnect interfaces active at the same time, whereas a bond presents itself as a single interconnect IP (of course you can configure two bonds, but that is still half of what HAIP can potentially use).
Any thoughts on this?
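To see what HAIP actually puts on the wire, you can check the Grid Infrastructure resources and the link-local addresses it plumbs. A rough sketch; $GRID_HOME and the interface names are placeholders for your own environment:
$GRID_HOME/bin/crsctl stat res ora.cluster_interconnect.haip -init   # HAIP resource should be ONLINE
$GRID_HOME/bin/oifcfg getif                                          # interfaces registered as cluster_interconnect
/sbin/ifconfig -a | grep 169.254                                     # HAIP addresses show up as 169.254.x.x sub-interfaces (e.g. eth1:1)
# From SQL*Plus, the instances report which addresses they are actually using:
# SQL> select name, ip_address from gv$cluster_interconnects;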

I have implemented this at my site on a two node RAC.
We tested it by bringing the newly added interfaces down, and the failover happened as expected.
I personally think OS bonding is the safer choice, considering the number of bugs we run into as the complexity and the number of Oracle features involved increase.
I'm not saying that Oracle is unreliable; this is just a personal opinion.
There is already a situation you may face where, after a failover (and the failback following it), the IPs returned are not correct due to ARP caching if the IPs fall in the same network.
This can be avoided by using proven OS bonding (not that I'm aware of any issues with OS bonding :) ).
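If you do hit that stale-ARP symptom after a failover/failback, the neighbour caches on the peer nodes are usually where it shows up. A quick way to look at (and, if needed, clear) them on Linux; the interface name and peer address below are placeholders:
ip neigh show dev eth1                 # look for the peer's interconnect IP with an outdated MAC
ip neigh flush dev eth1                # drop the cached entries so they are re-learned
arping -c 3 -I eth1 192.168.10.2       # confirm which MAC now answers for the peer's address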

Similar Messages

  • Oracle RAC new feature for interconnect configuration HAIP vs BONDING

    Hi All:
    I would like to get some opinions about using Oracle HAIP (High Availability IP) for configuring the RAC interconnect versus using network interface bonding.
    This seems to be a new feature of Oracle GI from 11.2.0.2 onward. Has anyone had any experience with HAIP, and were there any issues?
    Thanks

    Hi
    Multiple private network adapters can be defined either during the installation phase or afterward using oifcfg. Grid Infrastructure can activate a maximum of four private network adapters at a time. With HAIP, interconnect traffic is load balanced across all active interconnect interfaces by default, and the corresponding HAIP address fails over transparently to another adapter if one fails or becomes non-communicative.
    It is quite helpful in the following scenario:
    1) If one private network adapter fails, the virtual private IP on that adapter is relocated to a healthy adapter.
    There is a very good document on Metalink (Doc ID 1210883.1).
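    For example, a second private interface can be registered with oifcfg so HAIP can use it; the interface name, subnet, and $GRID_HOME below are placeholders, so adjust them for your environment:
    $GRID_HOME/bin/oifcfg setif -global eth2/192.168.10.0:cluster_interconnect
    $GRID_HOME/bin/oifcfg getif      # verify both private interfaces are now listed as cluster_interconnect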
    Rgds
    Harvinder

  • Oracle RAC not starting

    Dear Experts....
    I have a two-node Oracle RAC 11g setup. It was working fine. We have only a single public network configured in the cluster. Both nodes had the CE0 interface configured for the RAC public IP. Due to a hardware failure, CE0 on node2 is not working. I have another interface, CE3, free on the system. I have configured an IP on CE3 (at the OS layer, using ifconfig), but I am unable to start Clusterware because that interface is not configured in RAC.
    My question is:
    Can I run different interfaces (although the interface types are the same) on the two nodes? What changes do I need to make for this?
    Quick help would be appreciated...
    Kind Rgds
    Rajendra

    You cannot have interfaces whose names differ between the nodes; that is, node2 must have CE0 if node1 was configured with CE0. If this condition is not met, Clusterware won't start.
    Read the following note from support.oracle.com, which says the network interface name should be the same on all nodes.
    RAC and Oracle Clusterware Best Practices and Starter Kit (Platform Independent) [ID 810394.1]
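    A quick way to compare what each node actually has against what Clusterware has registered (the CRS/GI home path below is just a placeholder):
    /u01/crs/bin/oifcfg iflist       # run on each node: OS-level interfaces and their subnets
    /u01/crs/bin/oifcfg getif        # the interface names/subnets registered with the clusterware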
    Hope this helps.

  • Channel Bonding? Oracle RAC with multiple redundant network

    We are planning on setting up Oracle RAC 10g on Linux.
    We have redundant switches, so I would like to set up the network to have some form of network redundancy.
    Under Windows it is called 'teaming'; on Linux it is 'channel bonding'.
    I just want to make sure that if I set up channel-bonded interfaces, they will be supported with RAC. Has anyone done this before?
    -Andy

    We are running our RAC on NIC bonding, although we need to do some more testing to be really sure about the functionality of the bond.
    Metalink Note 298891.1 would be a good place to start.
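    For reference, a minimal active-backup bond on RHEL-style Linux looks roughly like the sketch below; device names, the IP, and the file paths are examples only, and the exact configuration files differ between distributions and releases:
    # /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    IPADDR=192.168.10.1
    NETMASK=255.255.255.0
    ONBOOT=yes
    BOOTPROTO=none
    BONDING_OPTS="mode=active-backup miimon=100"
    # /etc/sysconfig/network-scripts/ifcfg-eth1  (repeat for eth2, the second slave)
    DEVICE=eth1
    MASTER=bond0
    SLAVE=yes
    ONBOOT=yes
    BOOTPROTO=none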
    hth,
    -S

  • Oracle 10g RAC - public network interface down

    Hi all,
    I have a question about Oracle RAC and network interface.
    We're using Oracle 10gR2 RAC with two nodes on Linux Red Hat.
    Let's assume that the public network interface goes down.
    I would like to know what happens to the existing connections
    on the node whose network interface has the problem.
    Are the connections frozen, or still active?
    Can users continue to use these existing connections through the other node of the RAC?
    I know that the listener goes down and no new connections are allowed.
    Thank you very much!!!!

    Tads wrote:
    > Can the users continue to use these existing connections using the other node of the RAC?
    If the interface is down? What do you think? All connections to this node will die. How does your application handle failover: does it attempt to reconnect, or just have a complete application failure?
    You should spend some time in a test lab where you can test this stuff for yourself. Read the documentation; there are tons of sites out there that purport to answer all of your RAC/TAF/FAN/FAF questions, but I would read and trust the documentation first.

  • Copper cable / GigE Copper Interface as Private Interconnect for Oracle RAC

    Hello Gurus
    Can someone confirm whether copper cables (Cat5/RJ45) can be used for Gigabit Ethernet, i.e. the private interconnects, when deploying Oracle RAC 9.x or 10gR2 on Solaris 9/10?
    I am planning to use 2 x GigE interfaces (one port each from X4445 quad-port Ethernet adapters) and to connect them with copper cables. All the documents I came across refer to fiber cables for the private interconnects connecting GigE interfaces, so I am getting a bit confused.
    I would appreciate it if someone could shed some light on this.
    regards,
    Nilesh Naik
    thanks

    Cat5/RJ45 can be used for Gig Ethernet private interconnects for Oracle RAC. I would recommend trunking the two or more interconnects for redundancy. The X4445 adapters are compatible with the Sun Trunking 1.3 software (http://www.sun.com/products/networking/ethernet/suntrunking/). If you have servers that support the Nemo framework (bge, e1000g, xge, nge, rge, ixgb), you can use the Solaris 10 trunking tool, dladm.
    We have a couple of SUN T2000 servers and are using the onboard GigE ports for the Oracle 10gR2 RAC interconnects. We upgraded the onboard NIC drivers to the e1000g and used the Solaris 10 trunking software. The next update of Solaris will have the e1000g drivers as the default for the SUN T2000 servers.
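    On Solaris 10 the link aggregation is created with dladm; a rough sketch, assuming two e1000g ports and an example address (adjust the device names, aggregation key, and IP for your setup):
    dladm create-aggr -d e1000g0 -d e1000g1 1
    dladm show-aggr
    ifconfig aggr1 plumb 192.168.10.1 netmask 255.255.255.0 up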

  • PRIF-33 and CRS-02307 while changing public network interface, RAC

    Hi,
    I'm working on an Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Partitioning, Real Application Clusters, Automatic Storage Management options. It's a 3-node RAC: mgvdb01/02/03.
    After the installation I had to change the IPs (different IP range and subnet) of the RAC:
    From 172.17.1.0/24 to 10.19.201.0/24.
    Node from 172.17.1.31/32/33 to 10.19.201.31/32/33.
    The same for VIP(s): from 172.17.1.131/132/133 to 10.19.201.131/132/133.
    oifcfg iflist shows the correct IP configuration:
    [oracle@mgvdb01 bin]$ ./oifcfg iflist -p -n
    eth0  10.19.201.0  PRIVATE  255.255.255.0
    eth1  172.17.100.0  PRIVATE  255.255.255.0
    I'm following doc 276434.1 from Metalink: How to Modify Public Network Information including VIP in Oracle Clusterware, starting from Case III "Changing public network interface, subnet or netmask".
    But at the first step I have a problem:
    [oracle@mgvdb01 bin]$ ./oifcfg getif
    eth0  172.17.1.0  global  public  **THIS HAS TO BE CHANGED**
    eth1  172.17.100.0  global  cluster_interconnect
    [oracle@mgvdb01 bin]# ./oifcfg delif -global eth0/172.17.1.0
    PRIF-33: Failed to set or delete interface because hosts could not be discovered
      CRS-02307: No GPnP services on requested remote hosts.
    PRIF-32: Error in checking for profile availability for host mgvdb02
      CRS-02306: GPnP service on host "mgvdb02" not found.
    PRIF-32: Error in checking for profile availability for host mgvdb03
      CRS-02306: GPnP service on host "mgvdb03" not found.
    [oracle@mgvdb01 bin]$ ./oifcfg delif -node mgvdb01 eth0/172.17.1.0
    [oracle@mgvdb01 bin]$ ./oifcfg setif -node mgvdb01 eth0/10.19.201.0:public
    PRIF-33: Failed to set or delete interface because hosts could not be discovered
      CRS-02307: No GPnP services on requested remote hosts.
    PRIF-32: Error in checking for profile availability for host mgvdb02
      CRS-02306: GPnP service on host "mgvdb02" not found.
    PRIF-32: Error in checking for profile availability for host mgvdb03
      CRS-02306: GPnP service on host "mgvdb03" not found.
    Then I restarted the Clusterware services, but after issuing
    [oracle@mgvdb01 bin]$ ./oifcfg getif
    eth0  172.17.1.0  global  public
    eth1  172.17.100.0  global  cluster_interconnect
    nothing seems to have changed.
    This is blocking the following operations, the ones in Case IV.
    Do you have any suggestion?
    Thanks in advance,
    Samuel
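    A hedged sketch of what is usually checked for PRIF-33/CRS-2307 ($GRID_HOME is a placeholder; the idea is to add the new public subnet before deleting the old one, and the global oifcfg operations need Clusterware, and therefore gpnpd, running on all nodes):
    $GRID_HOME/bin/crsctl check cluster -all       # CRS must be up on every node
    $GRID_HOME/bin/crsctl stat res -t -init        # ora.gpnpd should be ONLINE on each node
    $GRID_HOME/bin/oifcfg setif -global eth0/10.19.201.0:public
    $GRID_HOME/bin/oifcfg delif -global eth0/172.17.1.0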


  • Oracle RAC and crossover cable for private network

    Hi,
    I have the following configuration: two database servers, each with four network cards, two for the public network and two for the private cluster network. Each public card has its own IP address, and both share a virtual IP address defined in the operating system (SLES 9) for redundancy. Because I have only two machines in the cluster, I want to connect them for the private network with a crossover cable, without a switch. For redundancy I want to make two connections between the machines. Is this possible at all? How should I define all the network interfaces, and what should be included in /etc/hosts for the Oracle cluster to work properly?
    Best Regards,
    Jacek

    Hi,
    You can build a RAC with a crossover cable, but Oracle does not certify it.
    As you have 4 cards, 2 for the public network (redundancy) and 2 for the interconnect (redundancy), you need software to team them and create a virtual card that will carry one IP address.
    Eder
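    For orientation, the /etc/hosts layout for a two-node cluster with public, VIP, and private (crossover) addresses typically looks like the sketch below; all hostnames and addresses are examples only:
    # /etc/hosts
    192.168.1.11   rac1          # public
    192.168.1.12   rac2
    192.168.1.21   rac1-vip      # virtual IPs, same subnet as the public network, unused before install
    192.168.1.22   rac2-vip
    10.0.0.1       rac1-priv     # private interconnect over the crossover/bond
    10.0.0.2       rac2-priv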

  • Oracle RAC interfaces

    HI,
    During Oracle 10gR2 CRS installation, why do we have to provide public interface information?
    I understand that we give private interface information because it is used for inter-instance communication.
    But why do we need to specify the public interface? Is this public interface going to be used for Cache Fusion at all?
    Thanks
    Pramod

    When one server goes down, the other server will take up both virtual IPs on the public interface, ensuring there is no delay in failover.
    The interface type indicates the purpose for which the network is configured. The supported interface types are:
    Public—An interface that can be used for communication with components external to Oracle RAC instances, such as Oracle Net and Virtual Internet Protocol (VIP) addresses.
    Cluster_interconnect—A private interface used for the cluster interconnect to provide interinstance or Cache Fusion communication.
    If you set the interface type to cluster_interconnect, it affects instances as they start up and changes do not take effect until you restart the instances.
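    You can see how this is wired up on an existing 10gR2 cluster; the node name below is a placeholder:
    srvctl config nodeapps -n racnode1 -a     # shows the VIP and the public interface it is bound to
    oifcfg getif                              # lists which subnets are registered as public vs. cluster_interconnect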

  • Oracle RAC nodeapps startup error - vip:IP:192.168.2.200 is already up in the network

    Environment: Linux Red Hat 5, Oracle RAC 10.2.0.5.
    Starting nodeapps reports the following errors:
    [oracle@rac1 ~]$ srvctl start nodeapps -n rac1
    CRS-0210: Could not find resource ora.rac1.gsd.
    rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)
    (the same message is repeated many times)
    Part of the crsd.log is as follows:
    2012-11-21 20:10:16.831: [  CRSRES][2717907856]0startRunnable: setting CLI values
    2012-11-21 20:10:16.967: [  CRSRES][2717907856]0Attempting to start `ora.rac1.vip` on member `rac1`
    2012-11-21 20:10:44.246: [  CRSAPP][2717907856]0StartResource error for ora.rac1.vip error code = 1
    2012-11-21 20:10:47.007: [  CRSRES][2717907856]0Start of `ora.rac1.vip` on member `rac1` failed.
    2012-11-21 20:10:47.529: [  CRSRES][2717907856]0Attempting to start `ora.rac1.vip` on member `rac2`
    2012-11-21 20:11:18.649: [  CRSRES][2717907856]0Start of `ora.rac1.vip` on member `rac2` failed.
    2012-11-21 20:11:18.897: [  CRSRES][2717907856]0CRS-1006: No more members to consider
    2012-11-21 20:11:20.986: [  CRSRES][2717907856]0startRunnable: setting CLI values
    2012-11-21 20:11:21.190: [  CRSRES][2717907856]0Attempting to start `ora.rac1.vip` on member `rac1`
    2012-11-21 20:11:48.846: [  CRSAPP][2717907856]0StartResource error for ora.rac1.vip error code = 1
    2012-11-21 20:11:51.203: [  CRSRES][2717907856]0Start of `ora.rac1.vip` on member `rac1` failed.
    2012-11-21 20:11:51.492: [  CRSRES][2717907856]0rac2 : CRS-1019: Resource ora.rac1.LISTENER_RAC1.lsnr (application) cannot run on rac2
    How can I troubleshoot the "rac1:ora.rac1.vip:IP:192.168.2.200 is already up in the network (host=rac1)" problem further? Thanks.

    ping output:
    [oracle@rac1 ~]$ ping 192.168.2.200
    PING 192.168.2.200 (192.168.2.200) 56(84) bytes of data.
    64 bytes from 192.168.2.200: icmp_seq=1 ttl=64 time=0.043 ms
    64 bytes from 192.168.2.200: icmp_seq=2 ttl=64 time=0.126 ms
    64 bytes from 192.168.2.200: icmp_seq=3 ttl=64 time=0.059 ms
    64 bytes from 192.168.2.200: icmp_seq=4 ttl=64 time=0.158 ms
    64 bytes from 192.168.2.200: icmp_seq=5 ttl=64 time=0.643 ms
    64 bytes from 192.168.2.200: icmp_seq=6 ttl=64 time=0.034 ms
    64 bytes from 192.168.2.200: icmp_seq=7 ttl=64 time=0.046 ms
    64 bytes from 192.168.2.200: icmp_seq=8 ttl=64 time=0.043 ms
    64 bytes from 192.168.2.200: icmp_seq=9 ttl=64 time=0.048 ms
    64 bytes from 192.168.2.200: icmp_seq=10 ttl=64 time=0.031 ms
    telnet output:
    [oracle@rac1 ~]$ telnet 192.168.2.200
    Trying 192.168.2.200...
    Connected to 192.168.2.200.
    Escape character is '^]'.
    Red Hat Enterprise Linux Server release 5 (Tikanga)
    Kernel 2.6.18-8.el5xen on an i686
    login:
    ssh output:
    [oracle@rac1 ~]$ ssh 192.168.2.200
    Last login: Sun Nov 18 13:37:10 2012 from rac2-vip
    ifconfig -a output:
    [root@rac1 ~]# ifconfig -a
    eth0 Link encap:Ethernet HWaddr 00:0C:29:B6:CE:6B
    inet addr:192.168.2.100 Bcast:192.168.2.255 Mask:255.255.255.0
    inet6 addr: fe80::20c:29ff:feb6:ce6b/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:142396 errors:0 dropped:0 overruns:0 frame:0
    TX packets:172561 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:41403126 (39.4 MiB) TX bytes:96009307 (91.5 MiB)
    Interrupt:18 Base address:0x1480
    eth1 Link encap:Ethernet HWaddr 00:0C:29:B6:CE:75
    inet addr:192.168.2.200 Bcast:192.168.2.255 Mask:255.255.255.0
    inet6 addr: fe80::20c:29ff:feb6:ce75/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:14082 errors:0 dropped:0 overruns:0 frame:0
    TX packets:29 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:9789756 (9.3 MiB) TX bytes:1434 (1.4 KiB)
    Interrupt:19 Base address:0x1800
    eth2 Link encap:Ethernet HWaddr 00:0C:29:B6:CE:7F
    inet addr:192.168.3.100 Bcast:192.168.3.255 Mask:255.255.255.0
    inet6 addr: fe80::20c:29ff:feb6:ce7f/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:12665 errors:0 dropped:0 overruns:0 frame:0
    TX packets:32728 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:6590271 (6.2 MiB) TX bytes:28437643 (27.1 MiB)
    Interrupt:16 Base address:0x1880
    lo Link encap:Local Loopback
    inet addr:127.0.0.1 Mask:255.0.0.0
    inet6 addr: ::1/128 Scope:Host
    UP LOOPBACK RUNNING MTU:16436 Metric:1
    RX packets:30893 errors:0 dropped:0 overruns:0 frame:0
    TX packets:30893 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:0
    RX bytes:9402131 (8.9 MiB) TX bytes:9402131 (8.9 MiB)
    sit0 Link encap:IPv6-in-IPv4
    NOARP MTU:1480 Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:0
    RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
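    Based on the ifconfig output above, 192.168.2.200 is statically assigned to eth1 on rac1, so CRS refuses to start a VIP that uses the same address. A hedged sketch of the usual fix, assuming 192.168.2.200 really is meant to be the node VIP and should not be a permanent address on any interface:
    # as root on rac1
    ifconfig eth1                        # confirm the VIP address is plumbed here statically
    ifdown eth1                          # remove it (and take it out of ifcfg-eth1 so it does not return on reboot)
    srvctl start nodeapps -n rac1        # let CRS bring up the VIP; it should appear as eth0:1
    crs_stat -t | grep vip               # verify ora.rac1.vip is ONLINE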

  • Oracle RAC / Logical Data Guard causing network problems on VMware

    We have a VMware 5.0 cluster across 12 blades (6 per chassis) running a mixture of Red Hat and Windows 2008 R2 VMs. The Red Hat boxes are two two-node Oracle RACs (primary and secondary), plus Apache web servers and JBoss application servers. The Windows servers are for AV/DC/management/monitoring.
    The problem is that intermittent network connectivity to random Windows and Red Hat boxes occurs when the Oracle RAC builds up archive logs and then ships and applies them to the secondary nodes; it happens between ESX nodes on different blades in the same chassis, across chassis, and even when all RAC nodes are on the same ESX host.
    We are using NFS, Oracle 11g and Red Hat 6.2.
    Sorry if this info is a bit vague, I'm not an Oracle expert! :-)
    thanks,
    Dave

    Hi,
    1.) The calculation for standby redo logs is:
    (maximum number of logfiles per thread (instance) + 1) * maximum number of threads (instances)
    So if you have 4 redo log groups on your primary (which is 2 redo log groups per instance), it works out to:
    (2 + 1) * 2 = 6
    So you will actually only need 6 standby redo logs, not 8. But 2 more don't hurt. (See the example after this reply.)
    Your primary will need exactly the same number (6, or in your case 8), which is 3 per thread/instance, or in your case 4.
    2.) The SID_LIST in the listener.ora is a list of the SIDs the listener is listening for; it is not the listener name.
    Hence it is not "lsnrctl start guard_dgmgrl" but only "lsnrctl start LISTENER", and since LISTENER is the default, "lsnrctl start" would be sufficient.
    However, since this is Grid Infrastructure with the listener running out of the ASM home, be sure to set your environment to the GI home (not the DB_HOME) for the listener.ora entries, but to the DB_HOME for the tnsnames.ora entries needed for Data Guard.
    And since the listener runs under Clusterware, you should use "srvctl stop listener" and "srvctl start listener".
    Last but not least, the SID entries for Data Guard have to use DGMGRL, not dgmgrl.
    3.) Here is the whitepaper you are looking for:
    www.oracle.com/goto/maa
    Also for client failover best practices.
    (Here the direct link to the RAC whitepaper):
    http://www.oracle.com/technetwork/database/features/availability/maa-wp-10g-racprimarysingleinstance-131970.pdf
    However, since this is 10g, you should combine it with the 11g RAC standby paper (e.g. for the SCAN listener setup).
    Sebastian
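    A minimal SQL*Plus sketch for adding the standby redo logs discussed in point 1; the group numbers, size, and diskgroup name are examples only (size them at least as large as your online redo logs):
    SQL> alter database add standby logfile thread 1 group 11 ('+FRA') size 512m;
    SQL> alter database add standby logfile thread 2 group 12 ('+FRA') size 512m;
    SQL> select group#, thread#, bytes/1024/1024 as mb, status from v$standby_log;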

  • Gig Ethernet V/S  SCI as Cluster Private Interconnect for Oracle RAC

    Hello Gurus
    Can anyone please confirm whether it's possible to configure 2 or more Gigabit Ethernet interconnects (Sun Cluster 3.1 private interconnects) on an E6900 cluster?
    It's for a high-availability requirement of Oracle 9i RAC. I need to know:
    1) Can I use Gigabit Ethernet as the private cluster interconnect for deploying Oracle RAC on an E6900?
    2) What is the recommended private cluster interconnect for Oracle RAC: Gig Ethernet, or SCI with RSM?
    3) How about the scenario where one has, say, 3 x Gig Ethernet vs. 2 x SCI as the cluster's private interconnects?
    4) How does the interconnect traffic get distributed amongst the multiple Gigabit Ethernet interconnects (for Oracle RAC), and does anything need to be done at the Oracle RAC level for Oracle to recognise that there are multiple interconnect cards and start using all of the Gigabit Ethernet interfaces for transferring packets?
    5) What would happen to Oracle RAC if one of the Gigabit Ethernet private interconnects fails?
    I have tried searching for this info but could not locate any document that precisely clarifies these doubts.
    thanks for the patience
    Regards,
    Nilesh

    Answers inline...
    Tim
    > Can anyone please confirm whether it's possible to configure 2 or more Gigabit Ethernet interconnects (Sun Cluster 3.1 private interconnects) on an E6900 cluster?
    Yes, absolutely. You can configure up to 6 NICs for the private networks. Traffic is automatically striped across them if you specify clprivnet0 to Oracle RAC (9i or 10g). That is TCP connections and UDP messages.
    1) Yes, definitely.
    2) SCI is, or is in the process of being, EOL'ed. Gigabit is usually sufficient. Longer term you may want to consider InfiniBand or 10 Gigabit Ethernet with RDS.
    3) I would still go for 3 x GbE because it is usually cheaper and will probably work just as well. The latency and bandwidth differences are often masked by the performance of the software higher up the stack. In short, unless you have tuned the heck out of your application and just about everything else, don't worry too much about the difference between GbE and SCI.
    4) You don't need to do anything at the Oracle level. That's the beauty of using Oracle RAC with Sun Cluster as opposed to RAC on its own. The striping takes place automatically and transparently behind the scenes.
    5) It's completely transparent. Oracle will never see the failure.
    This is all covered in a paper that I have just completed and should be published after Christmas. Unfortunately, I cannot give out the paper yet.
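    If you want to see the striping from the OS side, the transport paths and the striped private interface can be checked with the standard Sun Cluster tools (output details vary by release):
    scstat -W               # status of the cluster transport paths
    ifconfig clprivnet0     # the single striped private interface Oracle RAC should be pointed at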

  • Oracle RAC Interconnect, PowerVM VLANs, and the Limit of 20

    Hello,
    Our company has a requirement to build a multitude of Oracle RAC clusters on AIX using Power VM on 770s and 795 hardware.
    We presently have 802.1q trunking configured on our Virtual I/O Servers, and have currently consumed 12 of 20 allowed VLANs for a virtual ethernet adapter. We have read the Oracle RAC FAQ on Oracle Metalink and it seems to otherwise discourage the use of sharing these interconnect VLANs between different clusters. This puts us in a scalability bind; IBM limits VLANs to 20 and Oracle says there is a one-to-one relationship between VLANs and subnets and RAC clusters. We must assume we have a fixed number of network interfaces available and that we absolutely have to leverage virtualized network hardware in order to build these environments. "add more network adapters to VIO" isn't an acceptable solution for us.
    Does anyone know if Oracle can afford any flexibility which would allow us to host multiple Oracle RAC interconnects on the same 802.1q trunk VLAN? We will independently guarantee the bandwidth, latency, and redundancy requirements are met for proper Oracle RAC performance, however we don't want a design "flaw" to cause us supportability issues in the future.
    We'd like it very much if we could have a bunch of two-node clusters all sharing the same private interconnect. For example:
    Cluster 1, node 1: 192.168.16.2 / 255.255.255.0 / VLAN 16
    Cluster 1, node 2: 192.168.16.3 / 255.255.255.0 / VLAN 16
    Cluster 2, node 1: 192.168.16.4 / 255.255.255.0 / VLAN 16
    Cluster 2, node 2: 192.168.16.5 / 255.255.255.0 / VLAN 16
    Cluster 3, node 1: 192.168.16.6 / 255.255.255.0 / VLAN 16
    Cluster 3, node 2: 192.168.16.7 / 255.255.255.0 / VLAN 16
    Cluster 4, node 1: 192.168.16.8 / 255.255.255.0 / VLAN 16
    Cluster 4, node 2: 192.168.16.9 / 255.255.255.0 / VLAN 16
    etc.
    Whereas the concern is that Oracle Corp will only support us if we do this:
    Cluster 1, node 1: 192.168.16.2 / 255.255.255.0 / VLAN 16
    Cluster 1, node 2: 192.168.16.3 / 255.255.255.0 / VLAN 16
    Cluster 2, node 1: 192.168.17.2 / 255.255.255.0 / VLAN 17
    Cluster 2, node 2: 192.168.17.3 / 255.255.255.0 / VLAN 17
    Cluster 3, node 1: 192.168.18.2 / 255.255.255.0 / VLAN 18
    Cluster 3, node 2: 192.168.18.3 / 255.255.255.0 / VLAN 18
    Cluster 4, node 1: 192.168.19.2 / 255.255.255.0 / VLAN 19
    Cluster 4, node 2: 192.168.19.3 / 255.255.255.0 / VLAN 19
    Which eats one VLAN per RAC cluster.

    Thank you for your answer!!
    I think I roughly understand the argument behind a 2-node RAC and a 3-node or greater RAC. We, unfortunately, were provided with two physical pieces of hardware to virtualize to support production (and two more to support non-production) and as a result we really have no place to host a third RAC node without placing it within the same "failure domain" (I hate that term) as one of the other nodes.
    My role is primarily as a system engineer, and, generally speaking, our main goals are eliminating single points of failure. We may be misusing 2-node RACs to eliminate single points of failure since it seems to violate the real intentions behind RAC, which is used more appropriately to scale wide to many nodes. Unfortunately, we've scaled out to only two nodes, and opted to scale these two nodes up, making them huge with many CPUs and lots of memory.
    Other options, notably the active-passive failover cluster we have in HACMP or PowerHA on the AIX / IBM Power platform is unattractive as the standby node drives no resources yet must consume CPU and memory resources so that it is prepared for a failover of the primary node. We use HACMP / PowerHA with Oracle and it works nice, however Oracle RAC, even in a two-node configuration, drives load on both nodes unlike with an active-passive clustering technology.
    All that aside, I am posing the question to both IBM and our Oracle DBAs (who will ask Oracle Support). Typically the answers we get vary widely depending on the experience and skill level of the support personnel we get on both the Oracle and IBM sides... so on a suggestion from a colleague (Hi Kevin!) I posted here. I'm concerned that the answer from Oracle Support will unthinkingly be "you can't do that, my script says to tell you the absolute most rigid interpretation of the support document" while all the time the same document talks of the use of NFS and/or iSCSI storage (eye roll).
    We have a massive deployment of Oracle EBS and honestly the interconnect doesn't even touch 100mbit speeds even though the configuration has been checked multiple times by Oracle and IBM and with the knowledge that Oracle EBS is supposed to heavily leverage RAC. I haven't met a single person who doesn't look at our environment and suggest jumbo frames. It's a joke at this point... comments like "OMG YOU DON'T HAVE JUMBO FRAMES" and/or "OMG YOU'RE NOT USING INFINIBAND WHATTA NOOB" are commonplace when new DBAs are hired. I maintain that the utilization numbers don't support this.
    I can tell you that we have 8Gb fiber channel storage and 10Gb network connectivity. I would probably assume that there were a bottleneck in the storage infrastructure first. But alas, I digress.
    Mainly I'm looking for a real-world answer to this question. Aside from violating every last recommendation and making Oracle support folk gently weep at the suggestion, are there any issues with sharing interconnects between RAC environments that will prevent its functionality and/or reduce its stability?
    We have rapid spanning tree configured, as far as I know, and our network folks have tuned the timers razor thin. We have Nexus 5k and Nexus 7k network infrastructure. The typical issues you'd find with standard spanning tree really don't affect us because our network people are just that damn good.

  • How many Standard Ethernet Network Interfaces per server?

    Hi
    Huh... I'd like to install a RAC configuration with 2 servers, but I have a doubt.
    When i list all network interfaces i've:
    Server1
    en0 Standard Ethernet Network Interface
    et0 IEEE 802.3 Ethernet Network Interface
    lo0 Loopback Network Interface
    Server2
    en0 Standard Ethernet Network Interface
    et0 IEEE 802.3 Ethernet Network Interface
    lo0 Loopback Network Interface
    Is this good or not? (Probably not; the documentation says that en0 and en1 must be present in the configuration.) OK.
    I'm on AIX 5.3, and in SMIT I have an option entitled "Add a Virtual IP Address Interface". Is it possible to create the second interface as a virtual IP address there?
    Thanks in advance
    Regards
    Den

    Den,
    Basically, you need two separate interface cards in each participating node: the first to connect to the public network and the second to use as the interconnect. I'm not sure whether et0 is really a separate network interface; if it is, you should be OK.
    The VIP is nothing but an unassigned IP address (in the same subnet as your public IP address) that is expected to be inactive before you install CRS. It should be listed in your /etc/hosts file and/or DNS server (if used). I'm not sure whether adding the VIP through SMIT makes it active; if it does, you might not want to use SMIT to set up the VIP.
    This may help you with more information - http://download.oracle.com/docs/cd/B19306_01/install.102/b14201/preaix.htm#sthref378
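    On AIX, a couple of quick checks for the interfaces and the VIP before the install (the VIP address below is only a placeholder for whichever address you reserved):
    lsdev -Cc if          # lists the defined network interfaces (en0, et0, lo0, ...)
    netstat -in           # shows which interfaces are actually up and their addresses
    ping 10.0.0.50        # the reserved VIP address should NOT answer before CRS is installed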
    HTH
    Thanks
    Chandra

  • Oracle RAC - Clusterware 11gR1

    Hi,
    We are starting a fresh installation of an Oracle RAC database on:
    IBM AIX 5.3
    Processor Type: PowerPC_POWER7
    Processor Implementation Mode: POWER 6
    Processor Version: PV_6_Compat
    Number Of Processors: 2
    Processor Clock Speed: 3108 MHz
    CPU Type: 64-bit
    Kernel Type: 64-bit
    We are changing the servers and installing new binaries: Clusterware, ASM, and RDBMS. After that, the database will be migrated to these servers. Oracle Clusterware will be on 11.1.0.7, ASM on 11.1.0.7, and the RDBMS on 10.2.0.4. We are installing exactly what is in production now.
    After the servers were released to the DBAs and the OCR, the voting disks, and the prerequisites were configured, we faced an error on the second node during the installation, while running root.sh. The installation itself went 100% successfully; on the first node root.sh finished successfully and started the services. On the second node, root.sh failed with the message:
    --> Failure at final check of oracle CRS stack. 10
    Checking the logs, we could see the following:
    [    CSSD]2011-09-25 02:59:51.633 [1030] >TRACE: clssnmReadDskHeartbeat: node 1, hodev001ler, has a disk HB, but no network HB, DHB has rcfg 212382813, wrtcnt, 2978, LATS 3178240991, lastSeqNo 2978, timestamp 1316930391/924708736
    [    CSSD]2011-09-25 02:59:51.984 [1287] >TRACE: clssnmReadDskHeartbeat: node 1, hodev001ler, has a disk HB, but no network HB, DHB has rcfg 212382813, wrtcnt, 2978, LATS 3178241341, lastSeqNo 2978, timestamp 1316930391/924708736
    [    CSSD]2011-09-25 02:59:52.304 [1801] >TRACE: clssnmReadDskHeartbeat: node 1, hodev001ler, has a disk HB, but no network HB, DHB has rcfg 212382813, wrtcnt, 2978, LATS 3178241662, lastSeqNo 2978, timestamp 1316930391/924708736
    [    CSSD]2011-09-25 02:59:52.635 [1030] >TRACE: clssnmReadDskHeartbeat: node 1, hodev001ler, has a disk HB, but no network HB, DHB has rcfg 212382813, wrtcnt, 2979, LATS 3178241992, lastSeqNo 2979, timestamp 1316930392/924709737
    [    CSSD]2011-09-25 02:59:52.771 [5399] >TRACE: clssnmLocalJoinEvent: node(1), state(0), cont (1), sleep (0), diskHB 1, diskinfo 110aa46f0
    [    CSSD]2011-09-25 02:59:52.771 [5399] >TRACE: clssnmLocalJoinEvent: node(2), state(1), cont (0), sleep (0), diskHB 1, diskinfo 110aa46f0
    All the notes seem to point to an interface/interconnect problem, but we don't have any clue about which parameters or checks we need to go through with the Unix team. Has anybody had this issue? Any clue about what may need to be adjusted or configured to solve it? Below are the interfaces of both servers, hodb001lernew and hodb002lernew:
    hodb001lernew
    root@hodev001lernew:/u01/crs/log/hodev001ler # ifconfig -a
    en1: flags=1e080863,480<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),CHAIN>
    inet 10.124.85.221 netmask 0xffffffe0 broadcast 10.124.85.223
    tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
    en2: flags=1e080863,480<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),CHAIN>
    inet 10.124.140.109 netmask 0xffffffe0 broadcast 10.124.140.127
    tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
    en3: flags=1e080863,480<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),CHAIN>
    inet 10.56.96.47 netmask 0xffffe000 broadcast 10.56.127.255
    tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
    en5: flags=1e080863,480<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),CHAIN>
    inet 10.124.251.4 netmask 0xffffffc0 broadcast 10.124.251.63
    tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
    lo0: flags=e08084b<UP,BROADCAST,LOOPBACK,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT>
    inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
    inet6 ::1/0
    tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
    hodb002lernew
    root@hodev002lernew:/u01/crs/log/hodev002ler # ifconfig -a
    en0: flags=1e080863,480<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),CHAIN>
    inet 172.16.1.11 netmask 0xffffff00 broadcast 172.16.1.255
    tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
    en1: flags=1e080863,480<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),CHAIN>
    inet 10.124.85.220 netmask 0xffffffe0 broadcast 10.124.85.223
    tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
    en2: flags=1e080863,480<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),CHAIN>
    inet 10.124.140.110 netmask 0xffffffe0 broadcast 10.124.140.127
    tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
    en3: flags=1e080863,480<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),CHAIN>
    inet 10.56.96.48 netmask 0xffffe000 broadcast 10.56.127.255
    tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
    en5: flags=1e080863,480<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD(ACTIVE),CHAIN>
    inet 10.124.251.7 netmask 0xffffffc0 broadcast 10.124.251.63
    tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
    lo0: flags=e08084b<UP,BROADCAST,LOOPBACK,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT>
    inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
    inet6 ::1/0
    tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
    Also, when we run cluvfy, the following warning appears in the interface verification:
    Checking node connectivity...
    Node connectivity check passed for subnet "172.16.1.0" with node(s) hodev002ler,hodev001ler.
    Node connectivity check passed for subnet "10.124.85.192" with node(s) hodev002ler,hodev001ler.
    Node connectivity check passed for subnet "10.124.140.96" with node(s) hodev002ler,hodev001ler.
    Node connectivity check passed for subnet "10.56.96.0" with node(s) hodev002ler,hodev001ler.
    Node connectivity check passed for subnet "10.124.107.224" with node(s) hodev002ler,hodev001ler.
    Node connectivity check passed for subnet "10.124.251.0" with node(s) hodev002ler,hodev001ler.
    Interfaces found on subnet "172.16.1.0" that are likely candidates for VIP:
    hodev002ler en0:172.16.1.11
    hodev001ler en0:172.16.1.13
    Interfaces found on subnet "10.124.85.192" that are likely candidates for VIP:
    hodev002ler en1:10.124.85.200
    hodev001ler en1:10.124.85.222
    Interfaces found on subnet "10.124.140.96" that are likely candidates for VIP:
    hodev002ler en2:10.124.140.97
    hodev001ler en2:10.124.140.101
    Interfaces found on subnet "10.56.96.0" that are likely candidates for VIP:
    hodev002ler en3:10.56.96.8
    hodev001ler en3:10.56.96.32
    Interfaces found on subnet "10.124.107.224" that are likely candidates for VIP:
    hodev002ler en4:10.124.107.231
    hodev001ler en4:10.124.107.241
    Interfaces found on subnet "10.124.251.0" that are likely candidates for VIP:
    hodev002ler en5:10.124.251.7
    hodev001ler en5:10.124.251.4
    WARNING:
    Could not find a suitable set of interfaces for the private interconnect.
    Any help?
    Thanks,
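    The "disk HB, but no network HB" messages and the cluvfy warning both point at the private interconnect. Note that the ifconfig output above for hodb001lernew shows no en0 at all, while hodb002lernew has en0 on 172.16.1.0; whichever subnet is meant to be the interconnect must exist with the same interface name on both nodes. A hedged checklist (the CRS home path is a placeholder; the address comes from the output above):
    /u01/crs/bin/oifcfg getif     # on both nodes: the same interface name and subnet registered as cluster_interconnect
    ifconfig en0                  # the private interface must exist and be up on BOTH nodes
    ping -c 3 172.16.1.11         # from node 1 to node 2's private address, and the reverse from node 2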

    Hi,
    Thanks for the reply. Here is what I have:
    Node 1:
    oracle@hodev001lernew:/home/oracle # ssh hodev002lernew date
    Tue Oct 4 19:09:16 GRNLNDST 2011
    oracle@hodev001lernew:/home/oracle # ssh hodev002lernew_pri date
    Tue Oct 4 19:09:25 GRNLNDST 2011
    Node 2:
    oracle@hodev002lernew:/home/oracle # ssh hodev001lernew date
    Tue Oct 4 19:10:24 GRNLNDST 2011
    oracle@hodev002lernew:/home/oracle # ssh hodev001lernew_pri date
    Tue Oct 4 19:10:57 GRNLNDST 2011
    Regarding the firewall point, I issued the lsfilt command but it didn't return anything. Does that mean it is not enabled? I'm on AIX 5.3; is there any other command to verify this?
    root@hodev001lernew:/ # /usr/sbin/lsfilt -a
    Can not open device /dev/ipsec4_filt.
    root@hodev002lernew:/ # lsfilt -a
    Can not open device /dev/ipsec4_filt.
    Thanks
