802.3ad (mode=4) bonding for RAC interconnects

Is anyone using 802.3ad (mode=4) bonding for their RAC interconnects? We have five Dell R710 RAC nodes and we're trying to use the four onboard Broadcom NetXtreme II NICs in an 802.3ad bond with src-dst-mac load balancing. Since we have the hardware to pull this off, we thought we'd give it a try and gain some extra bandwidth for the interconnect rather than deploying the traditional active/standby interconnect using just two of the NICs. Has anyone tried this config, and what was the outcome? Thanks.
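For reference, a minimal sketch of what such a setup typically looks like with the Linux bonding driver (interface names, the address, and the RHEL-style ifcfg paths are assumptions, not taken from the post):

# /etc/modprobe.d/bonding.conf -- mode 4 (802.3ad) with layer2 (src/dst MAC) hashing
options bonding mode=802.3ad miimon=100 xmit_hash_policy=layer2

# /etc/sysconfig/network-scripts/ifcfg-bond0 (address hypothetical)
DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes
IPADDR=192.168.100.11
NETMASK=255.255.255.0

# /etc/sysconfig/network-scripts/ifcfg-eth0 (repeat for eth1-eth3)
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes

The switch ports need a matching LACP port-channel, and it is worth noting that layer2 hashing pins each source/destination MAC pair to a single slave, so any one node-to-node interconnect conversation still tops out at one NIC's worth of bandwidth.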

I don't, but maybe these documents might help?
http://www.iop.org/EJ/article/1742-6596/119/4/042015/jpconf8_119_042015.pdf?request-id=bcddc94d-7727-4a8a-8201-4d1b837a1eac
http://www.oracleracsig.org/pls/apex/Z?p_url=RAC_SIG.download_my_file?p_file=1002938&p_id=1002938&p_cat=documents&p_user=nobody&p_company=994323795175833
http://www.oracle.com/technology/global/cn/events/download/ccb/10g_rac_bp_en.pdf

Similar Messages

  • Teamed NICs for RAC interconnect

    Hi there,
    We have an Oracle 10g RAC with 2 nodes. There is only one NIC for the RAC interconnect in each server.
    Now we want to add one redundant NIC to each server for the RAC interconnect as well.
    Could you please point me to some documents about "teamed NICs for RAC interconnect"?
    Your help is greatly appreciated!
    Thanks,
    Scott

    Search around for NIC bonding. The exact process will depend on your OS.
    Linux, see Metalink note 298891.1 - Configuring Linux for the Oracle 10g VIP or private interconnect using bonding driver
    Regards,
    Greg Rahn
    http://structureddata.org
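    As a rough illustration only (the Metalink note above is the authoritative reference; device names are hypothetical), a failover bond of two NICs on Linux typically comes down to something like:

    # /etc/modprobe.d/bonding.conf -- active-backup (failover) bonding
    options bonding mode=active-backup miimon=100
    # then define bond0 in ifcfg-bond0 with the private IP, mark eth1 and eth2
    # as MASTER=bond0 / SLAVE=yes in their ifcfg files, and point the RAC
    # private interconnect at bond0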

  • Dedicated switches needed for RAC interconnect or not?

    Currently working on an Extended RAC cluster design implementation, I asked the network engineer for dedicated switches for the RAC interconnects.
    Here is a little background:
    There are 28 RAC clusters over 2x13 physical RAC nodes, with a separate Oracle_Home for each instance and at least two instances on each RAC node. So 13 RAC nodes will be in each site (data center). This is basically an Extended RAC solution for SAP databases on RHEL 6 using ASM and Clusterware for Oracle 11gR2. The RAC nodes are blades in a c7000 enclosure (in each site). The distance between the sites is 55+ km.
    Oracle recommends Infiniband (20 Gbps) as the network backbone, but here DWDM will be used with 2x10 Gbps links for the RAC interconnect between the sites. There will be a separate 2x1 Gbps redundant link for the production network and 2x2 Gbps FC (Fibre Channel) redundant links for the SAN/storage network (the ASM traffic will go here). There will be dedicated switches for the public/production network and for the SAN network.
    Oracle recommends dedicated switches (which will give acceptable latency/bandwidth) with switch redundancy to route the dedicated/non-routable VLANs for the RAC interconnect (private/heartbeat/global cache transfer) network. Since the DWDM interlink is 2x10 Gbps, do I still need the dedicated switches?
    If yes, then how many?
    Your inputs will be greatly appreciated.. and help me take a decision.
    Many Thanks in advance..
    Abhijit

    Absolutely agree.. the chances of overload in an HA (RAC) solution, and ultimately RAC node eviction, are very high (with very high latency), and for exactly this reason I even suggested inexpensive switches to route the VLANs for the RAC interconnect. The ASM traffic will get routed through the 2x2 Gb FC links through SAN directors (one in each site).
    I suggested the network folks use uplinks from the c7000 enclosure and route the RAC VLAN through these inexpensive switches for the interconnect traffic. We have another challenge here: HP has certified the VirtualConnect/Flex-Fabric architecture for blades in the c7000 to allocate VLANs for the RAC interconnect, but this covers only one site and does not span production/DR sites separated over a distance.
    Btw, do you have any standard switch model to recommend, and how many would be needed for a configuration of 13 Extended RAC clusters, with each cluster hosting 2+ RAC instances, for a total of 28 SAP instances?
    Many Thanks again!
    Abhijit

  • RAC Interconnect Transfer rate vs NIC's Bandwidth

    Hi Guru,
    I need some clarification for RAC interconnect terminology between "private interconnect transfer rate" and "NIC bandwidth".
    We have 11gR2 RAC with multiple databases.
    So we need to find out what the current resource status is.
    We have two physical NICs on each node: 8G is for public and 2G is for private (interconnect).
    Technically, we have 4G of private network bandwidth.
    If I look at the "Private Interconnect Transfer Rate" through OEM or IPTraf (a Linux tool), it shows 20-30 MB/sec.
    There is no issue at all at the moment.
    Please correct me if I am wrong.
    The transfer rate should be fine up to 500M or 1G/sec, because the current NIC capacity is 4G. Does that make sense?
    I'm sure there are multiple things to consider, but I'm kind of stumped on the whole transfer rate vs. bandwidth question. Is there any way to calculate what a typical transfer rate would be?
    Or how do I determine that our interconnect is good enough, based on the transfer rate?
    Another question is ...
    In our case, how do I set up the warning threshold and critical threshold for "Private Interconnect Transfer Rate" in OEM?
    Any comments will be appreciated.
    Please advise.
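    As a back-of-the-envelope check (a sketch only; it assumes, say, an effective 2 Gbit/s private link, and scales accordingly for 4 Gbit/s), you can simply compare the observed rate against the link's byte-per-second ceiling:

    # 2 Gbit/s ~= 2000 / 8 = 250 MB/s theoretical ceiling
    # observed 20-30 MB/s is therefore roughly 8-12% of the available bandwidth
    echo "scale=2; 30 / 250 * 100" | bc    # 12.00 (percent of the ceiling)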

    Interconnect performance sways more to latency than bandwidth IMO. In simplistic terms, memory is shared across the Interconnect. What is important for accessing memory? The size of the pipe? Or the speed of the pipe?
    A very fast small pipe will typically perform significantly better than a large and slower pipe.
    Even the size of the pipe is not that straightforward. The standard IP MTU size is 1500. You can run jumbo and super-jumbo frame MTU sizes on the Interconnect - where, for example, an MTU size of 65K is significantly larger than a 1500-byte MTU. Which means significantly more data can be transferred over the Interconnect at a much reduced overhead.
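    As a rough sketch (the interface name and peer address are hypothetical, and the NIC and switch both have to allow large frames), jumbo frames are enabled per interface and should be verified end to end:

    # set a 9000-byte MTU on the interconnect interface
    ip link set dev bond0 mtu 9000
    # verify the path really carries it unfragmented: 9000 - 28 bytes of IP/ICMP headers = 8972
    ping -M do -s 8972 -c 3 10.0.1.2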
    Personally, I would not consider Ethernet (GigE included) for the Interconnect. Infiniband is faster, more scalable, and offers an actual growth path to 128Gb/s and higher.
    Oracle also uses Infiniband (QDR/40Gb) for their Exadata Database Machine product's Interconnect. Infiniband also enables one to run the Oracle Interconnect over RDS instead of UDP. I've seen Oracle reports to the OFED committee saying that using RDS instead of UDP reduced CPU utilisation by 50% and decreased latency by 50%.
    I also do not see the logic of having a faster public network and a slower Interconnect.
    IMO there are 2 very fundamental components in RAC that determine the speed and performance achievable with that RAC - the speed, performance and scalability of the I/O fabric layer and of the Interconnect layer.
    And Exadata btw uses Infiniband for both these critical layers. Not fibre. Not GigE.

  • RAC interconnect using UDP - default ports?

    Is there a default port used by each cluster member to listen for connections over UDP? We use IPTABLES firewalls on our hosts, and I need to ensure the cluster heartbeat traffic gets through the firewall properly.
    Thanks in advance.
    Jeff

    user2528460 wrote:
    I understood the UDP ports that are going to be used on the interconnect (clearly without a firewall). Is there a set of default ports?
    I did a quick count (using lsof to list the UDP ports opened on the Interconnect interface) that showed over 185 UDP ports in use. E.g.:
    [root ~]# lsof -n -i | grep UDP | grep "10.0.1.1"
    oracle     5577  oracle   10u  IPv4   130938       UDP 10.0.1.1:22747
    oracle     5577  oracle   15u  IPv4   130941       UDP 10.0.1.1:64265
    oracle     5579  oracle   10u  IPv4   130948       UDP 10.0.1.1:39566
    oracle     5579  oracle   15u  IPv4   130951       UDP 10.0.1.1:55454
    oracle     5579  oracle   21u  IPv4   130970       UDP 10.0.1.1:27897
    oracle     5581  oracle   10u  IPv4   130973       UDP 10.0.1.1:14118
    oracle     5581  oracle   15u  IPv4   130976       UDP 10.0.1.1:13774
    oracle     5583  oracle   10u  IPv4   130983       UDP 10.0.1.1:33277
    oracle     5583  oracle   15u  IPv4   130986       UDP 10.0.1.1:6886
    ..snipped..
    I would not be concerned about what ports are in use. The important decisions are whether you use bonding for the Interconnect, whether you use jumbo or super-jumbo frames (MTU sizes), and so on. The actual ports being used have no real bearing, as firewalling is not applicable.
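    If a host firewall must stay enabled anyway, the usual approach (a sketch; the interface name and subnet are hypothetical) is to open the whole private interface/subnet rather than chase individual ports:

    # allow all traffic arriving on the private interconnect interface from the private subnet
    iptables -A INPUT -i bond0 -s 10.0.1.0/24 -j ACCEPT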

  • [SOLVED] Configure LACP (802.3ad) Bond

    I've set this up before on Debian and RHEL/CentOS, but I'm having zero luck under Arch.  I get the interfaces bound, and I get a bond0, but it's always round-robin.  My /etc/modprobe.d/bonding.conf has this one line:  options bonding mode=4 miion=100.
    All the references I can find to bonding in Arch are for wired/wireless failover, has anyone got this to work?
    Thanks.
    Last edited by Hrast (2014-09-05 02:37:21)

    Hrast wrote:options bonding mode=4 miion=100
    Assuming it's a typo, but just in case... It's "miimon", not "miion"
    Are you using netctl? Share your /etc/network.d/ profile. Have you tried "mode=802.3ad" instead of "mode=4"? It shouldn't make a difference, but it might help show any problems.
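    For comparison, a minimal netctl bond profile might look like the following (interface names and the address are hypothetical; on current netctl the profile lives under /etc/netctl/, and the bonding mode still comes from the module options line, e.g. options bonding mode=802.3ad miimon=100 in /etc/modprobe.d/bonding.conf):

    Description='802.3ad bond of the two onboard NICs'
    Interface=bond0
    Connection=bond
    BindsToInterfaces=(eth0 eth1)
    IP=static
    Address=('192.168.1.10/24')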

  • NIC teaming/bonding for interconnection

    Hi to all,
    We are planning a two-node Oracle RAC implementation on Windows Server and we have some doubts about the interconnect. Because redundancy is not as important, and because we are limited by budget, a Gigabit switch will be used for the interconnect and for the public network.
    Can we gain bandwidth (load balancing) if we use NIC bonding on one switch? Which is best:
    - servers with two single-port 1Gb NICs, bonded for the interconnect, plus one single-port NIC for the public network, or
    - one dual-port NIC (with its ports bonded) for the interconnect and one single NIC for the public network?

    Nikoline wrote:
    We are planning a two-node Oracle RAC implementation on Windows Server and we have some doubts about the interconnect. Because redundancy is not as important, and because we are limited by budget, a Gigabit switch will be used for the interconnect and for the public network.
    Keep in mind that the h/w architecture used directly determines how robust RAC is, how well it performs, and how it meets redundancy, high availability and scalability requirements.
    The wrong h/w choices can and will hurt all of these factors.
    Can we gain bandwidth (load balancing) if we use NIC bonding on one switch?
    No. Bonding usually does not address load balancing. You need to check exactly what the driver stack used for bonding supports.
    The primary reason for bonding is redundancy. Not load balancing/load sharing.
    Which is best?
    - To buy servers with two single-port 1Gb NICs and bond these NICs for the interconnect, plus one single-port NIC for the public network, or
    - One dual-port NIC (with its ports bonded) for the interconnect and one single NIC for the public network?
    A server with dual 1Gb NICs will have them bonded into a single bonded NIC. And that is what then must be used - the logical bonded NIC, not the individual NICs.
    This means that 2x1Gb ports provide you with a single bonded port. And you need 2 ports for RAC - public and interconnect.
    So you either need 4 physical ports in total to create 2 bonded ports, or 3 ports in total for 1 bonded port and 1 unbonded port.
    And what about your cluster's shared storage?
    This, together with the Interconnect, determines the robustness and performance of RAC. 1Gb Interconnect is already pretty much a failure in providing a proper Interconnect infrastructure for RAC.
    Keep in mind that the Interconnect is used to share memory across cluster nodes (cache fusion). The purpose is to speed up I/O - making it faster for a cluster node to get data blocks from another node's cache instead of having to hit spinning rust to read that data.
    Old style fibre channel technology for shared storage is dual 2Gb fibre channels. Which will be faster than your 1Gb Interconnect. How does it make sense to use an Interconnect that is slower (less bandwidth and more latency) than the actual I/O fabric layer?
    Would you configure h/w for a Windows server (be that a database, web or mail server) where the logical I/O from the server's memory buffer cache for local disks is slower than actually reading that data off local disk?
    Then why do this for RAC?
    Last comment - over 90% of the 500 fastest and biggest clusters on this planet run Linux. 1% runs Windows. Windows is usually a poor choice when it comes to clustering. Even Oracle does not provide their Exadata Database Machine product (fastest RAC cluster in the world) on Windows - only on Linux and Solaris (and the latter only because Oracle now owns Solaris too).
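    Whichever layout is chosen, on Linux the bonding driver reports its own state, which is a quick way to confirm the mode and that every slave is up (interface name hypothetical):

    cat /proc/net/bonding/bond0
    # look for the "Bonding Mode:" line and an "MII Status: up" entry for each slave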

  • What is acceptable level of Private Interconnect Latency for RAC

    We have built a 3-node RAC on RHEL 5.4 on VMware.
    There is a node eviction problem due to loss of the network heartbeat.
    ocssd.log:[    CSSD]2010-03-05 17:48:21.908 [84704144] >TRACE: clssnmReadDskHeartbeat: node 3, vm-lnx-rds1173, has a disk HB, but no network HB, DHB has rcfg 0, wrtcnt, 2, LATS 1185024, lastSeqNo 2, timestamp 1267791501/1961474
    Ping statistics from Node2 to Node1 are as below
    --- rds1171-priv ping statistics ---
    443 packets transmitted, 443 received, 0% packet loss, time 538119ms
    rtt min/avg/max/mdev = 0.150/2.030/630.212/29.929 ms
    [root@vm-lnx-rds1172 oracle]#
    Can this be the reason for node eviction? What is an acceptable level of private interconnect latency for RAC?

    What is acceptable level of private interconnect latency for RAC?
    Normal local network latency should be enough. By the way, the latency settings are very generous.
    Can you check whether your to-be-evicted node is running and reachable when you see the node eviction messages?
    In addition to that, can you check the log files of the evicted node? Check for timestamps around "2010-03-05 17:48:21.908". Make sure all systems are NTP synchronized.
    Ronny Egner
    My Blog: http://blog.ronnyegner-consulting.de
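    A quick way to check the last two points from each node (a sketch, assuming the classic ntpd toolchain is installed):

    # is the peer reachable over its private name while the eviction is being reported?
    ping -c 5 rds1171-priv
    # are all nodes NTP synchronized? the peer marked with '*' is the current sync source
    ntpq -p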

  • Force 802.11b mode for a wireless adapter

    Hello, I would like to know how to force a wireless adapter to work on the 802.11b band only (that is, make it so it does not use g/n).
    Thanks

    Butcher wrote:
    I don't really think that is going to help. (It does not.)
    That's probably because your wifi device driver module doesn't support enforcing 802.11b mode, like my b43:
    # iwconfig wlan0 modu 11b
    Error for wireless request "Set modulation" (8B2F):
    SET failed on device wlan0 ; Operation not supported.
    Does forcing a slower rate without changing mode help, in your case?
    # iwconfig wlan0 rate 11M

  • Oracle RAC Interconnect, PowerVM VLANs, and the Limit of 20

    Hello,
    Our company has a requirement to build a multitude of Oracle RAC clusters on AIX using Power VM on 770s and 795 hardware.
    We presently have 802.1q trunking configured on our Virtual I/O Servers, and have currently consumed 12 of the 20 allowed VLANs for a virtual ethernet adapter. We have read the Oracle RAC FAQ on Oracle Metalink and it seems to discourage sharing these interconnect VLANs between different clusters. This puts us in a scalability bind; IBM limits VLANs to 20 and Oracle says there is a one-to-one relationship between VLANs, subnets and RAC clusters. We must assume we have a fixed number of network interfaces available and that we absolutely have to leverage virtualized network hardware in order to build these environments. "Add more network adapters to VIO" isn't an acceptable solution for us.
    Does anyone know if Oracle can afford any flexibility which would allow us to host multiple Oracle RAC interconnects on the same 802.1q trunk VLAN? We will independently guarantee the bandwidth, latency, and redundancy requirements are met for proper Oracle RAC performance, however we don't want a design "flaw" to cause us supportability issues in the future.
    We'd like it very much if we could have a bunch of two-node clusters all sharing the same private interconnect. For example:
    Cluster 1, node 1: 192.168.16.2 / 255.255.255.0 / VLAN 16
    Cluster 1, node 2: 192.168.16.3 / 255.255.255.0 / VLAN 16
    Cluster 2, node 1: 192.168.16.4 / 255.255.255.0 / VLAN 16
    Cluster 2, node 2: 192.168.16.5 / 255.255.255.0 / VLAN 16
    Cluster 3, node 1: 192.168.16.6 / 255.255.255.0 / VLAN 16
    Cluster 3, node 2: 192.168.16.7 / 255.255.255.0 / VLAN 16
    Cluster 4, node 1: 192.168.16.8 / 255.255.255.0 / VLAN 16
    Cluster 4, node 2: 192.168.16.9 / 255.255.255.0 / VLAN 16
    etc.
    Whereas the concern is that Oracle Corp will only support us if we do this:
    Cluster 1, node 1: 192.168.16.2 / 255.255.255.0 / VLAN 16
    Cluster 1, node 2: 192.168.16.3 / 255.255.255.0 / VLAN 16
    Cluster 2, node 1: 192.168.17.2 / 255.255.255.0 / VLAN 17
    Cluster 2, node 2: 192.168.17.3 / 255.255.255.0 / VLAN 17
    Cluster 3, node 1: 192.168.18.2 / 255.255.255.0 / VLAN 18
    Cluster 3, node 2: 192.168.18.3 / 255.255.255.0 / VLAN 18
    Cluster 4, node 1: 192.168.19.2 / 255.255.255.0 / VLAN 19
    Cluster 4, node 2: 192.168.19.3 / 255.255.255.0 / VLAN 19
    Which eats one VLAN per RAC cluster.

    Thank you for your answer!!
    I think I roughly understand the argument behind a 2-node RAC and a 3-node or greater RAC. We, unfortunately, were provided with two physical pieces of hardware to virtualize to support production (and two more to support non-production) and as a result we really have no place to host a third RAC node without placing it within the same "failure domain" (I hate that term) as one of the other nodes.
    My role is primarily as a system engineer, and, generally speaking, our main goals are eliminating single points of failure. We may be misusing 2-node RACs to eliminate single points of failure since it seems to violate the real intentions behind RAC, which is used more appropriately to scale wide to many nodes. Unfortunately, we've scaled out to only two nodes, and opted to scale these two nodes up, making them huge with many CPUs and lots of memory.
    Other options, notably the active-passive failover cluster we have in HACMP or PowerHA on the AIX / IBM Power platform is unattractive as the standby node drives no resources yet must consume CPU and memory resources so that it is prepared for a failover of the primary node. We use HACMP / PowerHA with Oracle and it works nice, however Oracle RAC, even in a two-node configuration, drives load on both nodes unlike with an active-passive clustering technology.
    All that aside, I am posing the question to both IBM and our Oracle DBAs (who will ask Oracle Support). Typically the answers we get vary widely depending on the experience and skill level of the support personnel we get on both the Oracle and IBM sides... so on a suggestion from a colleague (Hi Kevin!) I posted here. I'm concerned that the answer from Oracle Support will unthinkingly be "you can't do that, my script says to tell you the absolute most rigid interpretation of the support document" while all the time the same document talks of the use of NFS and/or iSCSI storage. *eye roll*
    We have a massive deployment of Oracle EBS and honestly the interconnect doesn't even touch 100mbit speeds even though the configuration has been checked multiple times by Oracle and IBM and with the knowledge that Oracle EBS is supposed to heavily leverage RAC. I haven't met a single person who doesn't look at our environment and suggest jumbo frames. It's a joke at this point... comments like "OMG YOU DON'T HAVE JUMBO FRAMES" and/or "OMG YOU'RE NOT USING INFINIBAND WHATTA NOOB" are commonplace when new DBAs are hired. I maintain that the utilization numbers don't support this.
    I can tell you that we have 8Gb fiber channel storage and 10Gb network connectivity. I would probably assume that there were a bottleneck in the storage infrastructure first. But alas, I digress.
    Mainly I'm looking for a real-world answer to this question. Aside from violating every last recommendation and making Oracle support folk gently weep at the suggestion, are there any issues with sharing interconnects between RAC environments that will prevent its functionality and/or reduce its stability?
    We have rapid spanning tree configured, as far as I know, and our network folks have tuned the timers razor thin. We have Nexus 5k and Nexus 7k network infrastructure. The typical issues you'd find with standard spanning tree really don't affect us because our network people are just that damn good.

  • Network adapter for the private interface for RAC

    Hi guys
    This is the specification for the Network adapter for the private interface in RAC:
    The network adapter for the private interface must support the user datagram protocol (UDP) using high-speed network adapters and a network switch that supports TCP/IP (Gigabit Ethernet or better).
    Do you have a document where I can have more deep details about it ?
    Thanks

    user2931261 wrote:
    This is the specification for the Network adapter for the private interface in RAC:
    The network adapter for the private interface must support the user datagram protocol (UDP) using high-speed network adapters and a network switch that supports TCP/IP (Gigabit Ethernet or better).
    Note that TCP is also used in addition to UDP over the Interconnect. As for the statement that the NIC must support UDP - that is the wrong statement to make. The NIC does not support UDP. The IP (Internet Protocol) stack does. The NIC supports the bottom layer of the OSI model. UDP is at layer 4.
    As for your Interconnect, you have two basic choices: Gigabit Ethernet or Infiniband.
    For GigE you need of course a GigE NIC and a GigE switch, and a cable to wire the NIC into the switch.
    For Infiniband you need an HCA card (e.g. a Mellanox InfiniHost PCI card) and an Infiniband switch. As HCA cards are dual port, it makes sense to get 2 cables and wire both ports into the switch and bond those 2 ports as a single logical NIC. You can run IPoIB (IP over Infiniband), which means that the 2 ports on the HCA will be seen by the o/s as NICs - and bonding is supported, which enables you to create a single logical NIC on top of these 2 NICs and thus have full redundancy.
    Also note that GigE typically is only 1Gb/s. 10 Gb/s is available, but expensive (especially the switch). Infiniband typically is QDR rate today that provides you with 40Gb/s pipes. So each port on the HCA will be 40Gb and the bonded port will thus be 2x40Gb/s.
    Infiniband is a lot faster and more scalable for an Interconnect than GigE.
    Also, because of the capacity and flexibility and low latency of Infiniband, you can also use it for your storage fabric layer (running your storage protocol over it).
    Typically you will need to get HBA cards (dual fibre port) to wire the RAC node to the storage array's fibre channel switch. These HBA ports are 2Gb/s.
    You can eliminate these HBAs and fibre cables by using a storage protocol like SRP (Scsi RDMA Protocol - RDMA=Remote Direct Memory Access) - running SRP over your Infiniband infrastructure, together with your Interconnect.
    This is btw what Oracle's Database Machine and Exadata Storage servers use (and one of the reasons they have broken all RAC performance records).
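    A minimal sketch of the IPoIB bonding described above (interface names hypothetical; note that bonds over IPoIB interfaces are generally limited to active-backup rather than 802.3ad):

    # /etc/modprobe.d/bonding.conf
    options bonding mode=active-backup miimon=100
    # enslave the two IPoIB interfaces exposed by the dual-port HCA
    # (bond0 must already be defined and up)
    ifenslave bond0 ib0 ib1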

  • RAC interconnect switch?

    Hi,
    We are in the process of migrating from a 10g single-instance database to a 2-node RAC (Windows Server 2008 OS, EMC storage with 2 SAN switches, ...) and we have some doubts about the interconnect.
    We are having difficulty selecting the correct speed for the interconnect network and selecting the switch/switches, ...
    1. Because there are 2 nodes and 4 Ethernet cables for the interconnect, should we use one or two switches? Using a single switch can be a solution, but it becomes a big single point of failure.
    2. Can we gain performance if we use 2 switches (bonding, ...)?
    3. As mentioned, there are 4 Ethernet cables; is it a good idea to use the existing 1Gb switches that we use for the public network, or to buy 1Gb switches that will be used only for the private interconnect?
    4. Can we use simple 16- or 8-port GigE switches?
    Maybe you can point me to some GigE and SAN switches (for the node-to-storage connection) which you have seen work without any problems with RAC.
    How can we best design the networks for the interconnects?
    Thank in advance!

    user9106065 wrote:
    So the best solution for interconnection would be InfiniBand or 10GigE.
    If you look at what Oracle itself chose for their RAC hardware product range, then yes, Infiniband is a better choice.
    What do you think about InfiniBand and Windows Server OS?
    Last used Windows as a server o/s back in the mid 90's. :-)
    No idea how robust the OFED driver stack is on Windows. It ships with Oracle Linux as Oracle uses it for their RAC products.
    What is the difference in price for InfiniBand and GigE switches?
    About the same, I would think. A 40Gb 12 to 24 port switch a few years ago was actually cheaper than a 10GigE switch of the same port count. Pricing has come down for both, though. We have recently bought a couple of 32-port QDR switches at far below $10,000 a switch.
    Cabling is needed, and HCA (PCI) cards too. The cards are, I think, cheaper than HBAs (fibre channel cards). The only issue we had in this regard was getting pizza-box/blade servers with 2 PCI slots to support both an HBA and an HCA. Recent server models often have only one PCI slot, as opposed to the prior models of a few years ago. So when choosing a newer server and you need both an HBA and an HCA, you need to make sure there are in fact 2 PCI slots in the server.
    And again, can you point me to some InfiniBand switches which you have seen work without any problems with a 2-node RAC?
    Oracle used Voltaire IB (Infiniband) switches for the first Oracle Database Machine/Exadata product. The only top500 cluster in Africa is basically (almost) next door to us here in Cape Town. They are also using Voltaire switches.
    If I'm not mistaken, the same Voltaire switches are OEM'ed and sold by Oracle/Sun and HP and others. I have an HP quote at about the same below-$10,000-per-switch price. Of course, you can get away with a much smaller switch for a 2-node RAC - and a 2nd switch is only a consideration if you can justify the cost of redundancy in the Interconnect layer.
    Voltaire pretty much seems to lead the market in this respect. Cisco used to sell IB switches too - but some of these were horribly buggy (especially the ones with FC gateways). Cisco acquired TopSpin back then - we still have a couple of old 10Gb TopSpin switches (bought from Cisco) and these have been pretty rock solid through the years. But QDR (40Gb) is what one should be looking at and not the older SDR or DDR technologies.
    You should be able to shop around your existing vendors (HP, Oracle/Sun, etc) for IB switches - with the exception of Cisco that no longer does IB switches (afaik).

  • How can I get my AE to work in 802.11n mode with my dlink router?

    Hi all,
    I have a dlink 655 wireless n router that I use for my network and a recent AE with 802.11n capability that I use to connect iTunes to my home stereo.
    I have a mix of computers at home and only my imac has 802.11n capability. I work alone in the day so to speed things up, I change the wireless setting on my dlink router to 802.11n only mode. Normally, it is in a mixed mode of 802.11g/n for the other non n computers on the network.
    My imac connects fine when in 802.11n only mode but the AE looses connection. As it is 802.11n capable, I was wondering why this happens?
    I tried checking Airport Utility for a 802.11n only mode or something similar, but didn't find anything. Will the AE only work in 802.11n mode when on the 5ghz frequency? if so, that would explain why it doesn't connect with the dlink 2.4ghz router in n mode.
    Any ideas?? Thanks!

    Hi. Thanks for the reply. It turns out that the router was not going into n mode because of the security I set on it. I finally figured this out and the AE did connect in n mode after all.
    Well at least it's working now but I feel a little bad for posting when it was my fault!!
    Cheers!

  • RAC Interconnect performance

    Hi,
    We are facing RAC Interconnect performance problems.
    Oracle Version: Oracle 9i RAC (9.2.0.7)
    Operating system: SunOS 5.8
    SQL> SELECT b1.inst_id, b2.value "RECEIVED",
    b1.value "RECEIVE TIME",
    ((b1.value / b2.value) * 10) "AVG RECEIVE TIME (ms)"
    FROM gv$sysstat b1, gv$sysstat b2
    WHERE b1.name = 'global cache cr block receive time'
    AND b2.name = 'global cache cr blocks received'
    AND b1.inst_id = b2.inst_id;
    INST_ID   RECEIVED  RECEIVE TIME  AVG RECEIVE TIME (ms)
          1     323849        172359             5.32220263
          2     675806         94537             1.39887778
    After a database restart the average receive time for Instance 1 increases again over time, while Instance 2 remains similar.
    Application performance degrades, and restarting the database solves the issue. This is a critical application and cannot have frequent downtime for restarts.
    What specific points should I check in order to improve interconnect performance?
    Thanks
    Dilip Patel.

    Hi,
    Configurations:
    Node: 1
    Hardware Model: Sun-Fire-V890
    OS: SunOS 5.8
    Release: Generic_117350-53
    CPU: 16 sparcv9 cpu(s) running at 1200 MHz
    Memory: 40.0GB
    Node: 2
    Hardware Model: Sun-Fire-V890
    OS: SunOS 5.8
    Release: Generic_117350-53
    CPU: 16 sparcv9 cpu(s) running at 1200 MHz
    Memory: 40.0GB
    CPU utilization on Node 1 never exceeds 40%.
    CPU utilization on Node 2 is between 20% and 30%.
    Application load is higher on Node 1 than on Node 2.
    I can observe the wait event "global cache cr request" in the top 5 wait events on most of the Statspack reports. Application performance degrades a few days after a database restart. No major changes have been made to the application recently.
    Statspack report for Node 1:
    DB Name         DB Id    Instance     Inst Num Release     Cluster Host
    XXXX          2753907139 xxxx1               1 9.2.0.7.0   YES    xxxxx
                  Snap Id     Snap Time      Sessions Curs/Sess Comment
    Begin Snap:     61688 17-Feb-09 09:10:06      253     299.4
      End Snap:     61698 17-Feb-09 10:10:06      285     271.6
       Elapsed:               60.00 (mins)
    Cache Sizes (end)
    ~~~~~~~~~~~~~~~~~
                   Buffer Cache:     2,048M      Std Block Size:          8K
               Shared Pool Size:       384M          Log Buffer:      2,048K
    Load Profile
    ~~~~~~~~~~~~                            Per Second       Per Transaction
                      Redo size:            102,034.92              4,824.60
                  Logical reads:             60,920.35              2,880.55
                  Block changes:                986.07                 46.63
                 Physical reads:              1,981.12                 93.67
                Physical writes:                 28.30                  1.34
                     User calls:              2,651.63                125.38
                         Parses:                500.89                 23.68
                    Hard parses:                 21.44                  1.01
                          Sorts:                 66.91                  3.16
                         Logons:                  3.69                  0.17
                       Executes:                553.34                 26.16
                   Transactions:                 21.15
      % Blocks changed per Read:    1.62    Recursive Call %:     22.21
    Rollback per transaction %:    2.90       Rows per Sort:      7.44
    Instance Efficiency Percentages (Target 100%)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                Buffer Nowait %:   99.99       Redo NoWait %:    100.00
                Buffer  Hit   %:   96.75    In-memory Sort %:    100.00
                Library Hit   %:   98.30        Soft Parse %:     95.72
             Execute to Parse %:    9.48         Latch Hit %:     99.37
    Parse CPU to Parse Elapsd %:   90.03     % Non-Parse CPU:     92.97
    Shared Pool Statistics        Begin   End
                 Memory Usage %:   94.23   94.93
        % SQL with executions>1:   74.96   74.66
      % Memory for SQL w/exec>1:   82.93   72.26
    Top 5 Timed Events
    ~~~~~~~~~~~~~~~~~~                                                     % Total
    Event                                               Waits    Time (s) Ela Time
    db file sequential read                         1,080,532      13,191    40.94
    CPU time                                                       10,183    31.60
    db file scattered read                            456,075       3,977    12.34
    wait for unread message on broadcast channel        4,195       2,770     8.60
    global cache cr request                         1,633,056         873     2.71
    Cluster Statistics for DB: EPIP  Instance: epip1  Snaps: 61688 -61698
    Global Cache Service - Workload Characteristics
    Ave global cache get time (ms):                            0.8
    Ave global cache convert time (ms):                        1.1
    Ave build time for CR block (ms):                          0.1
    Ave flush time for CR block (ms):                          0.2
    Ave send time for CR block (ms):                           0.3
    Ave time to process CR block request (ms):                 0.6
    Ave receive time for CR block (ms):                        4.4
    Ave pin time for current block (ms):                       0.2
    Ave flush time for current block (ms):                     0.0
    Ave send time for current block (ms):                      0.3
    Ave time to process current block request (ms):            0.5
    Ave receive time for current block (ms):                   2.6
    Global cache hit ratio:                                    3.9
    Ratio of current block defers:                             0.0
    % of messages sent for buffer gets:                        3.7
    % of remote buffer gets:                                   0.3
    Ratio of I/O for coherence:                                1.1
    Ratio of local vs remote work:                            10.9
    Ratio of fusion vs physical writes:                        0.0
    Global Enqueue Service Statistics
    Ave global lock get time (ms):                             0.1
    Ave global lock convert time (ms):                         0.0
    Ratio of global lock gets vs global lock releases:         1.0
    GCS and GES Messaging statistics
    Ave message sent queue time (ms):                          0.4
    Ave message sent queue time on ksxp (ms):                  1.8
    Ave message received queue time (ms):                      0.2
    Ave GCS message process time (ms):                         0.1
    Ave GES message process time (ms):                         0.0
    % of direct sent messages:                                 8.0
    % of indirect sent messages:                              49.4
    % of flow controlled messages:                            42.6
    GES Statistics for DB: EPIP  Instance: epip1  Snaps: 61688 -61698
    Statistic                                    Total   per Second    per Trans
    dynamically allocated gcs resourc                0          0.0          0.0
    dynamically allocated gcs shadows                0          0.0          0.0
    flow control messages received                   0          0.0          0.0
    flow control messages sent                       0          0.0          0.0
    gcs ast xid                                      0          0.0          0.0
    gcs blocked converts                         2,830          0.8          0.0
    gcs blocked cr converts                      7,677          2.1          0.1
    gcs compatible basts                             5          0.0          0.0
    gcs compatible cr basts (global)               142          0.0          0.0
    gcs compatible cr basts (local)            142,678         39.6          1.9
    gcs cr basts to PIs                              0          0.0          0.0
    gcs cr serve without current lock                0          0.0          0.0
    gcs error msgs                                   0          0.0          0.0
    gcs flush pi msgs                              798          0.2          0.0
    gcs forward cr to pinged instance                0          0.0          0.0
    gcs immediate (compatible) conver            9,296          2.6          0.1
    gcs immediate (null) converts               52,460         14.6          0.7
    gcs immediate cr (compatible) con          752,507        209.0          9.9
    gcs immediate cr (null) converts         4,047,959      1,124.4         53.2
    gcs msgs process time(ms)                  153,618         42.7          2.0
    gcs msgs received                        2,287,640        635.5         30.0
    gcs out-of-order msgs                            0          0.0          0.0
    gcs pings refused                           70,099         19.5          0.9
    gcs queued converts                              0          0.0          0.0
    gcs recovery claim msgs                          0          0.0          0.0
    gcs refuse xid                                   1          0.0          0.0
    gcs retry convert request                        0          0.0          0.0
    gcs side channel msgs actual                40,400         11.2          0.5
    gcs side channel msgs logical            4,039,700      1,122.1         53.1
    gcs write notification msgs                     46          0.0          0.0
    gcs write request msgs                         972          0.3          0.0
    gcs writes refused                               4          0.0          0.0
    ges msgs process time(ms)                    2,713          0.8          0.0
    ges msgs received                           73,687         20.5          1.0
    global posts dropped                             0          0.0          0.0
    global posts queue time                          0          0.0          0.0
    global posts queued                              0          0.0          0.0
    global posts requested                           0          0.0          0.0
    global posts sent                                0          0.0          0.0
    implicit batch messages received           288,801         80.2          3.8
    implicit batch messages sent               622,610        172.9          8.2
    lmd msg send time(ms)                        2,148          0.6          0.0
    lms(s) msg send time(ms)                         1          0.0          0.0
    messages flow controlled                 3,473,393        964.8         45.6
    messages received actual                   765,292        212.6         10.1
    messages received logical                2,360,972        655.8         31.0
    messages sent directly                     654,760        181.9          8.6
    messages sent indirectly                 4,027,924      1,118.9         52.9
    msgs causing lmd to send msgs               33,481          9.3          0.4
    msgs causing lms(s) to send msgs            13,220          3.7          0.2
    msgs received queue time (ms)              379,304        105.4          5.0
    msgs received queued                     2,359,723        655.5         31.0
    msgs sent queue time (ms)                1,514,305        420.6         19.9
    msgs sent queue time on ksxp (ms)        4,349,174      1,208.1         57.1
    msgs sent queued                         4,032,426      1,120.1         53.0
    msgs sent queued on ksxp                 2,415,381        670.9         31.7
    GES Statistics for DB: EPIP  Instance: epip1  Snaps: 61688 -61698
    Statistic                                    Total   per Second    per Trans
    process batch messages received            278,174         77.3          3.7
    process batch messages sent                913,611        253.8         12.0
    Wait Events for DB: EPIP  Instance: epip1  Snaps: 61688 -61698
    -> s  - second
    -> cs - centisecond -     100th of a second
    -> ms - millisecond -    1000th of a second
    -> us - microsecond - 1000000th of a second
    -> ordered by wait time desc, waits desc (idle events last)
                                                                       Avg
                                                         Total Wait   wait    Waits
    Event                               Waits   Timeouts   Time (s)   (ms)     /txn
    db file sequential read         1,080,532          0     13,191     12     14.2
    db file scattered read            456,075          0      3,977      9      6.0
    wait for unread message on b        4,195      1,838      2,770    660      0.1
    global cache cr request         1,633,056      8,417        873      1     21.4
    db file parallel write              8,243          0        260     32      0.1
    buffer busy waits                  16,811          0        168     10      0.2
    log file parallel write           187,783          0        158      1      2.5
    log file sync                      75,143          0        147      2      1.0
    buffer busy global CR               9,713          0        102     10      0.1
    global cache open x                31,157      1,230         50      2      0.4
    enqueue                            58,261         14         45      1      0.8
    latch free                         33,398      7,610         44      1      0.4
    direct path read (lob)              9,925          0         36      4      0.1
    library cache pin                   8,777          1         34      4      0.1
    SQL*Net break/reset to clien       82,982          0         32      0      1.1
    log file sequential read              409          0         31     75      0.0
    log switch/archive                      3          3         29   9770      0.0
    SQL*Net more data to client       201,538          0         16      0      2.6
    global cache open s                 8,585        342         14      2      0.1
    global cache s to x                11,098        148         11      1      0.1
    control file sequential read        6,845          0          8      1      0.1
    db file parallel read               1,569          0          7      4      0.0
    log file switch completion             35          0          7    194      0.0
    row cache lock                     15,780          0          6      0      0.2
    process startup                        69          0          6     82      0.0
    global cache null to x              1,759         48          6      3      0.0
    direct path write (lob)               685          0          5      7      0.0
    DFS lock handle                     8,713          0          3      0      0.1
    control file parallel write         1,350          0          2      2      0.0
    wait for master scn                 1,194          0          1      1      0.0
    CGS wait for IPC msg               30,830     30,715          1      0      0.4
    global cache busy                      14          1          1     75      0.0
    ksxr poll remote instances         30,997     12,692          1      0      0.4
    direct path read                      752          0          0      1      0.0
    switch logfile command                  3          0          0    148      0.0
    log file single write                  24          0          0     13      0.0
    library cache lock                    668          0          0      0      0.0
    KJC: Wait for msg sends to c        1,161          0          0      0      0.0
    buffer busy global cache               26          0          0      6      0.0
    IPC send completion sync              261        260          0      0      0.0
    PX Deq: reap credit                 3,477      3,440          0      0      0.0
    LGWR wait for redo copy             1,751          0          0      0      0.0
    async disk IO                       1,059          0          0      0      0.0
    direct path write                     298          0          0      0      0.0
    slave TJ process wait                   1          1          0     18      0.0
    PX Deq: Execute Reply                   3          1          0      3      0.0
    PX Deq: Join ACK                        8          4          0      1      0.0
    global cache null to s                  8          0          0      1      0.0
    ges inquiry response                   16          0          0      0      0.0
    Wait Events for DB: EPIP  Instance: epip1  Snaps: 61688 -61698
    -> s  - second
    -> cs - centisecond -     100th of a second
    -> ms - millisecond -    1000th of a second
    -> us - microsecond - 1000000th of a second
    -> ordered by wait time desc, waits desc (idle events last)
                                                                       Avg
                                                         Total Wait   wait    Waits
    Event                               Waits   Timeouts   Time (s)   (ms)     /txn
    PX Deq: Parse Reply                     6          2          0      1      0.0
    PX Deq Credit: send blkd                2          1          0      0      0.0
    PX Deq: Signal ACK                      3          1          0      0      0.0
    library cache load lock                 1          0          0      0      0.0
    buffer deadlock                         6          6          0      0      0.0
    lock escalate retry                     4          4          0      0      0.0
    SQL*Net message from client     9,470,867          0    643,285     68    124.4
    queue messages                     42,829     41,144     42,888   1001      0.6
    wakeup time manager                   601        600     16,751  27872      0.0
    gcs remote message                795,414    120,163     13,606     17     10.4
    jobq slave wait                     2,546      2,462      7,375   2897      0.0
    PX Idle Wait                        2,895      2,841      7,021   2425      0.0
    virtual circuit status                120        120      3,513  29273      0.0
    ges remote message                142,306     69,912      3,504     25      1.9
    SQL*Net more data from clien      206,559          0         19      0      2.7
    SQL*Net message to client       9,470,903          0         14      0    124.4
    PX Deq: Execution Msg                 313        103          2      7      0.0
    Background Wait Events for DB: EPIP  Instance: epip1  Snaps: 61688 -61698
    -> ordered by wait time desc, waits desc (idle events last)
                                                                       Avg
                                                         Total Wait   wait    Waits
    Event                               Waits   Timeouts   Time (s)   (ms)     /txn
    db file parallel write              8,243          0        260     32      0.1
    log file parallel write           187,797          0        158      1      2.5
    log file sequential read              316          0         22     70      0.0
    enqueue                            56,204          0         15      0      0.7
    control file sequential read        5,694          0          6      1      0.1
    DFS lock handle                     8,682          0          3      0      0.1
    db file sequential read               276          0          2      8      0.0
    control file parallel write         1,334          0          2      2      0.0
    wait for master scn                 1,194          0          1      1      0.0
    CGS wait for IPC msg               30,830     30,714          1      0      0.4
    ksxr poll remote instances         30,972     12,681          1      0      0.4
    latch free                            356         54          1      2      0.0
    direct path read                      752          0          0      1      0.0
    log file single write                  24          0          0     13      0.0
    LGWR wait for redo copy             1,751          0          0      0      0.0
    async disk IO                         812          0          0      0      0.0
    global cache cr request                69          0          0      1      0.0
    row cache lock                         45          0          0      1      0.0
    direct path write                     298          0          0      0      0.0
    library cache pin                      29          0          0      1      0.0
    rdbms ipc reply                        29          0          0      0      0.0
    buffer busy waits                      10          0          0      0      0.0
    library cache lock                      2          0          0      0      0.0
    global cache open x                     2          0          0      0      0.0
    rdbms ipc message                 179,764     36,258     29,215    163      2.4
    gcs remote message                795,409    120,169     13,605     17     10.4
    pmon timer                          1,388      1,388      3,508   2527      0.0
    ges remote message                142,295     69,912      3,504     25      1.9
    smon timer                            414          0      3,463   8366      0.0
              -------------------------------------------------------------

  • RAC - Interconnect traffic

    At the client site the architecture team wants to implement a two-node RAC cluster on Sun Solaris with Oracle 10g. But in order to minimize the interconnect traffic, they want applications to connect to only one node, with the other node providing failover capability.
    Though this can be achieved with less complexity through a physical standby, theoretically, in a 2-node RAC cluster, if we just use one node, will that eliminate interconnect traffic?
    I guess there will not be any requests from the second node for blocks held by the first node, so no transfer of blocks over the interconnect will take place. But the second node still has to know about the blocks held by the first node in order to recover in case the first node crashes (cache fusion?). Right?
    TIA
    RadKrish

    Just my two cents on this.. Why would you want to invest in a high availability solution (RAC) and attempt to achieve disaster recovery (failover) through it?
    Why would you want to minimize the Interconnect traffic in the first place? A decent switch plus Gigabit Ethernet / Cat5 cables is all you need for the Interconnect, and it is known to work well with most types of RAC setups.
    With Dynamic Resource Remastering, mastership of blocks across the active nodes gets distributed based on access patterns, so with your kind of setup almost all the blocks would be mastered on Node 1 after a period of sustained activity. Further, instance recovery happens using information from the redo log thread of the crashed node, so it should not affect Interconnect traffic.
    Vishwa
