InterConnect Scenario
I have to send requests to a SOAP server through InterConnect and the HTTP adapter. Is this possible?
Is it possible to process the SOAP responses with InterConnect?
regards
It would be better to do the backups locally, i.e. all through node1. The reason I say this is that during the backup, you are going to cause I/O contention for the applications using the file system. If you do this remotely, then not only are you impacting the application but you are pulling data across the interconnect and chewing up CPU cycles, etc.
Given that you can move the primary path of a global file system without needing to stop the application, it would be easier to do that, get the backup done more quickly and then migrate the path back than to do it the way you are considering. At least that's my view - though I've not done a bake-off to try out the two options.
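For reference, moving the primary path of a Sun Cluster global device group (and moving it back after the backup) can be done online with scswitch; a sketch, where the device-group and node names are hypothetical:

```shell
# Make node1 the primary path for the device group backing the file system
scswitch -z -D webdata-dg -h node1
# ... run the backup locally on node1 ...
# Then hand primaryship back to the original node
scswitch -z -D webdata-dg -h node2
```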
Tim
---
Similar Messages
-
Gig Ethernet V/S SCI as Cluster Private Interconnect for Oracle RAC
Hello Gurus
Can anyone please confirm whether it's possible to configure 2 or more Gigabit Ethernet interconnects (Sun Cluster 3.1 private interconnects) on an E6900 cluster?
It's for a high-availability requirement of Oracle 9i RAC. I need to know:
1) Can I use Gigabit Ethernet as the private cluster interconnect for deploying Oracle RAC on an E6900?
2) What is the recommended private cluster interconnect for Oracle RAC: Gigabit Ethernet, or SCI with RSM?
3) How about scenarios where one has, say, 3 x Gigabit Ethernet vs. 2 x SCI as the cluster's private interconnects?
4) How does interconnect traffic get distributed amongst multiple Gigabit Ethernet interconnects (for Oracle RAC), and is anything required at the Oracle RAC level for Oracle to recognise that there are multiple interconnect cards and start utilizing all of the Gigabit Ethernet interfaces for transferring packets?
5) What would happen to Oracle RAC if one of the Gigabit Ethernet private interconnects fails?
I have tried searching for this info but could not locate any doc that precisely clarifies these doubts.
thanks for the patience
Regards,
Nilesh
Answers inline...
Tim
Can any one pls confirm if it's possible to configure 2 or more Gigabit Ethernet interconnects ( Sun Cluster 3.1 Private Interconnects) on a E6900 cluster ?
Yes, absolutely. You can configure up to 6 NICs for the private networks. Traffic is automatically striped across them if you specify clprivnet0 to Oracle RAC (9i or 10g). That applies to both TCP connections and UDP messages.
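As an illustration (addresses hypothetical), the only Oracle-side knob involved is the clprivnet0 address; with Sun Cluster it is normally picked up automatically, and the cluster_interconnects parameter is only needed to force the choice:

```shell
# On each node, see the per-node address Sun Cluster presents on clprivnet0
ifconfig clprivnet0

# Optionally pin the instance to it (init.ora / spfile):
#   cluster_interconnects=172.16.193.1   # node1's clprivnet0 address
```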
It's for a High Availability requirement of Oracle
9i RAC. i need to know ,
1) can i use gigabit ethernet as Private cluster interconnect for Deploying Oracle RAC on E6900 ?
Yes, definitely.
2) What is the recommended Private Cluster Interconnect for Oracle RAC ? GiG ethernet or SCI with RSM ?
SCI is EOL'ed, or in the process of being EOL'ed. Gigabit is usually sufficient. Longer term you may want to consider InfiniBand or 10 Gigabit Ethernet with RDS.
3) How about the scenarios where one can have say 3 X Gig Ethernet V/S 2 X SCI , as their cluster's Private Interconnects ?
I would still go for 3 x GbE because it is usually cheaper and will probably work just as well. The latency and bandwidth differences are often masked by the performance of the software higher up the stack. In short, unless you have tuned the heck out of your application and just about everything else, don't worry too much about the difference between GbE and SCI.
4) How the Interconnect traffic gets distributed amongest the multiple GigaBit ethernet Interconnects ( For oracle RAC) , & is anything required to be done at oracle Rac Level to enable Oracle to recognise that there are multiple interconnect cards it needs to start utilizing all of the GigaBit ethernet Interfaces for transfering packets ?
You don't need to do anything at the Oracle level. That's the beauty of using Oracle RAC with Sun Cluster as opposed to RAC on its own. The striping takes place automatically and transparently behind the scenes.
5) what would happen to Oracle RAC if one of the Gigabit ethernet private interconnects fails
It's completely transparent. Oracle will never see the failure.
Have tried searching for this info but could not locate any doc that can precisely clarify these doubts that i have .........
This is all covered in a paper that I have just completed, which should be published after Christmas. Unfortunately, I cannot give out the paper yet.
thanks for the patience
Regards,
Nilesh -
How to enqueue a PIP from InterConnect to B2B IP_OUT_QUEUE??
Hi all,
I'm trying to enqueue a 3A4 PIP to the B2B IP_OUT_QUEUE using the InterConnect AQ adapter.
The scenario is like this:
Our own application will enqueue an XML message into PO_QUEUE, InterConnect will then do a simple transformation, and only then enqueue the PIP into the B2B IP_OUT_QUEUE.
Can anyone show me how to do it?
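A rough sketch of the producing side - the application enqueue into PO_QUEUE with DBMS_AQ (the user, queue name and payload type here are hypothetical and must match how the queue feeding the AQ adapter was actually created):

```shell
sqlplus app_user/app_pw <<'EOF'
DECLARE
  enq_opts  DBMS_AQ.ENQUEUE_OPTIONS_T;
  props     DBMS_AQ.MESSAGE_PROPERTIES_T;
  msg_id    RAW(16);
  payload   SYS.XMLType;   -- must match the queue table's payload type
BEGIN
  payload := SYS.XMLType.createXML('<PurchaseOrder>...</PurchaseOrder>');
  DBMS_AQ.ENQUEUE(queue_name         => 'PO_QUEUE',
                  enqueue_options    => enq_opts,
                  message_properties => props,
                  payload            => payload,
                  msgid              => msg_id);
  COMMIT;
END;
/
EOF
```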
-
RAC connection problem with interconnect NIC failure
We have an 11g 2-node test RAC setup on RHEL 4 that is configured to have no load balancing (client or server), with Node2 existing as a failover node only. Connection and vip failover works fine in most situations (public interface fail, node fail, cable pull, node 2 interconnect fail, interconnect switch fail etc etc).
When the node1 interconnect card failure is emulated (ifdown eth1):
node2 gets evicted and reboots
failover of existing connections occurs
VIP from node2 is relocated to node1
However, new connection attempts from clients and the server receive an ORA-12541: TNS:no listener message.
The basis of this is that in the event of an interconnect failure, the lowest-numbered node is supposed to survive - it looks like this includes the situation where the lowest-numbered node has a failed interconnect NIC, i.e. it has a hardware fault.
I checked this with Oracle via an iTAR quite some time ago (under 10g) and they eventually confirmed that this eviction of the healthy 2nd node is correct behaviour. In 10g, this situation would result in the remaining instance failing due to the unavailable NIC, however I did not get the chance to fully test and resolve this with Oracle.
In 11g, the alert log continuously reports the NIC's unavailability. The instance remains up, but new connections cannot be established. If the NIC is re-enabled then new connections can be established again. At all times, srvctl status nodeapps on the surviving node and lsnrctl show that the listener is functional.
The alert log reports the following, regarding a failed W000 or M000 process:
ospid 13165: network interface with IP address 192.168.1.1 no longer operational
requested interface 192.168.1.1 not found. Check output from ifconfig command
ORA-603 : opidrv aborting process W000 ospid (16474_2083223480)
Process W000 died, see its trace file
The W000 trace file refers to an "Invalid IP Address 192.168.1.1" (the interconnect IP address), obviously the source of the process dying.
Finally, if I restart the remaining instance via srvctl stop/start instance with the NIC still unavailable, the instance will allow new connections and does not report the failures of the W000/M000 process or appear to care about the failed NIC.
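For reference, the restart workaround described above is just (database and instance names hypothetical):

```shell
srvctl stop instance -d RACDB -i RACDB1
srvctl start instance -d RACDB -i RACDB1
```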
Before I go down the iTAR path or start posting details of the configuration, has anyone else experienced/resolved this, or can anyone else test it out?
Thanks for any input,
Gavin
Listener.ora is:
SID_LIST_LISTENER_NODE1 =
  (SID_LIST =
    (SID_DESC =
      (ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1)
      (SID_NAME=RAC_INST)
    )
    (SID_DESC =
      (ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1)
      (SID_NAME=RAC_INST1)
    )
    (SID_DESC =
      (ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1)
      (SID_NAME=RAC_INST2)
    )
  )
SID_LIST_LISTENER_NODE2 =
  (SID_LIST =
    (SID_DESC =
      (ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1)
      (SID_NAME=RAC_INST)
    )
    (SID_DESC =
      (ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1)
      (SID_NAME=RAC_INST2)
    )
    (SID_DESC =
      (ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1)
      (SID_NAME=RAC_INST1)
    )
  )
LISTENER_NODE1 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL=TCP)(HOST=vip-NODE1)(PORT=1521)(IP=FIRST))
      (ADDRESS = (PROTOCOL=TCP)(HOST=NODE1)(PORT=1521)(IP=FIRST))
    )
  )
LISTENER_NODE2 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL=TCP)(HOST=vip-NODE2)(PORT=1521)(IP=FIRST))
      (ADDRESS = (PROTOCOL=TCP)(HOST=NODE2)(PORT=1521)(IP=FIRST))
    )
  )
Thanks for your reply.
There is no NIC bonding - the interconnect is a single, dedicated Gigabit link connected via a dedicated switch: plenty of bandwidth.
I know that providing interconnect NIC redundancy would provide a fallback position on this (although how far do you go: redundant interconnect switches as well?), and that remains an option.
However that's not the point. RAC does not require a redundant interconnect - as a high-availability solution it should inherently provide a failover position that continues to provide an available instance, as it does for all other component failures.
Unless I've made a mistake in the configuration (which is very possible, but all the other successful failover scenarios suggest I haven't), then this could be a scenario that renders a 2-node cluster unavailable to new connections.
Gavin -
Oracle RAC new feature for interconnect configuration HAIP vs BONDING
Hi All:
I would like to get some opinions about using Oracle HAIP (High Availability IP) for configuring the RAC interconnect vs. using network interface bonding.
This seems to be a new feature of Oracle Grid Infrastructure from 11.2.0.2 onward. Has anyone had any experience with HAIP, and any issues?
Thanks
Hi
Multiple private network adapters can be defined either during the installation phase or afterwards using oifcfg. Grid Infrastructure can activate a maximum of four private network adapters at a time. With HAIP, interconnect traffic is by default load-balanced across all active interconnect interfaces, and the corresponding HAIP address fails over transparently to another adapter if one fails or becomes non-communicative.
It is quite helpful in scenarios such as the following:
1) if one private network adapter fails, the virtual private IP on that adapter is relocated to a healthy adapter.
There is a very good document on Metalink (Doc ID 1210883.1).
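To add to this, registering extra private adapters for HAIP is done with oifcfg; a sketch (interface name and subnet hypothetical):

```shell
# Show the interfaces the clusterware currently knows about
oifcfg getif
# Register a second NIC as a cluster interconnect; HAIP will then
# load-balance and fail over across both private interfaces
oifcfg setif -global eth2/192.168.2.0:cluster_interconnect
```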
Rgds
Harvinder -
Hello,
We have heard all the good points about RAC and many of them are true, but I just want real experience of when even a well-configured RAC failed and had unplanned downtime.
Can anyone tell the failure scenarios also? I understand the very basic ones, for example interconnect failure, SAN failure, etc., but please share some real-life experience where even Oracle Customer Service took not just hours but days to resolve the problem and simply termed the problem a bug.
Thanks,
S.Mann
I agree with Andreas, and I think it's important to point out that the issues he mentioned (networking issues as well as other communication problems) are typically more common when RAC is deployed on a platform that isn't completely familiar to the implementor. That is, if you run Oracle on Windows servers, then deploying RAC on Linux successfully will probably be difficult.
My standard answer for "what's the best platform for RAC?" is to run RAC on the platform that you know the most about. When you're building a system to house your most critical applications, wouldn't you want to build it on the platform that you know the most about? -
N7K Interconnection between Multiple VDCs
Hi
I have 2 N7Ks
N7K-1 has 2 VDCs: D1 and D2.
N7K-2 has 2 VDCs: S1 and S2.
D1 & D2 have a vPC configured between them, and S1 & S2 also have a vPC between them.
What is the best practice to interconnect D1&D2 to S1&S2 with redundancy ? I am yet to see a Cisco Doc that discusses this design, Please let me know your suggestions.
TIA.
Hello. So then you could do this:
Physically:
D1 to S1
D1 to S2
D2 to S1
D2 to S2
(not including the VPC peer or keepalives)
1)
From N7K1 and 2 Core VDC's
Have 1 VPC to N7K1-Access VDC
Have 1 VPC to N7K2-Access VDC
This means that your core will have vPCs, but your access will have port-channels to your core, not vPCs.
And then have your N7K2's / FEX attached to this access layer.
D1 and 2 to S1 - VPC 1 on CORE
D1 and 2 to S2 - VPC 2 on CORE
or
2)
The other way, which I haven't quite tried before, but no reason why it shouldn't work...
You have D1 and D2 provide one VPC (a) to S1 and S2
You have D1 and D2 provide one VPC (b) to S1 and S2
In this scenario you would have x2 VPC's on the core, and x2 VPC's on the access.
D1 to S2 - VPC1 - both sides
D2 to S1 - VPC1 - both sides
D1 to S1 - VPC2 - both sides
D2 to S2 - VPC2 - both sides
Just be sure to enable peer-switch, peer-gateway, and ip arp synchronize on the vPC domain for efficiency.
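Those domain settings look roughly like this on each VDC (the domain ID and keepalive addresses are hypothetical):

```
vpc domain 10
  peer-keepalive destination 10.1.1.2 source 10.1.1.1
  peer-switch
  peer-gateway
  ip arp synchronize
```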
hth. -
Uplink failover scenarios - The correct behavior
Hello Dears,
I'm somewhat confused about the failover scenarios related to the uplinks and the Fabric Interconnect (FI) switches, as we have a lot of failover points: the vNIC, the FEX, the FI, and the uplinks.
I have some questions and I hope that someone can clear this confusion:
A- Fabric Interconnect failover
1- As I understand when I create a vNIC , it can be configured to use FI failover , which means if FI A is down , or the uplink from the FEX to the FI is down , so using the same vNIC it will failover to the other FI via the second FEX ( is that correct , and is that the first stage of the failover ?).
2- This vNIC will be seen by the OS as 1 NIC and it will not feel or detect anything about the failover done , is that correct ?
3- Assume that I have 2 vNICs for the same server (metal blade with no ESX or vmware), and I have configured 2 vNICs to work as team (by the OS), does that mean that if primary FI or FEX is down , so using the vNIC1 it will failover to the 2nd FI, and for any reason the 2nd vNIC is down (for example if uplink is down), so it will go to the 2nd vNIC using the teaming ?
B- FEX failover
1- As I understand the blade server uses the uplink from the FEX to the FI based on their location in the chassis, so what if this link is down, does that mean FI failover will trigger, or it will be assigned to another uplink ( from the FEX to the FI)
C- Fabric Interconnect Uplink failover
1- Using static pin LAN group, the vNIC is associated with an uplink, what is the action if this uplink is down ? will the vNIC:
a. Brought down , as per the Network Control policy applied , and in this case the OS will go for the second vNIC
b. FI failover to the second FI , the OS will not detect anything.
c. The FI A will re-pin the vNIC to another uplink on the same FI with no failover
I found all these 3 scenarios in different documents and posts. I have not had the chance to test it yet, so it would be great if anyone has tested it and can explain.
Finally I need to know if the correct scenarios from the above will be applied to the vHBA or it has another methodology.
Thanks in advance for your support.
Moamen
Moamen,
A few things about Fabric Failover (FF) to keep in mind before I try to address your questions.
FF is only supported on the M71KR and the M81KR.
FF is only applicable/supported in End Host Mode of operation and applies only to Ethernet traffic. For FC traffic one has to use multipathing software (the way FC failover has always worked). In End Host mode, if anything along the path (adapter port, FEX-IOM link, uplinks) fails, FF is initiated for Ethernet traffic *by the adapter*.
FF is an event triggered by a vNIC going down: the adapter initiates the failover, i.e. it sends a message to the other fabric to activate the backup vEth (switchport), and the FI sends out gARPs for the MAC as part of it. As it is adapter-driven, FF is only available on a few adapters, i.e. for now, those whose firmware is done by Cisco.
For the M71KR (Menlo), the firmware on the Menlo chip is made by Cisco; the Oplin and FC parts of the card are controlled by Intel/Emulex/Qlogic.
The M81KR is made by Cisco exclusively for UCS and hence the firmware on that is done by us.
Now to your questions -
>1- As I understand when I create a vNIC , it can be configured to use FI failover , which means if FI A is down , or the uplink from the FEX to the >FI is down , so using the same vNIC it will failover to the other FI via the second FEX ( is that correct , and is that the first stage of the failover ?).
Yes
> 2- This vNIC will be seen by the OS as 1 NIC and it will not feel or detect anything about the failover done , is that correct ?
Yes
>3- Assume that I have 2 vNICs for the same server (metal blade with no ESX or vmware), and I have configured 2 vNICs to work as team (by the >OS), does that mean that if primary FI or FEX is down , so using the vNIC1 it will failover to the 2nd FI, and for any reason the 2nd vNIC is down (for >example if uplink is down), so it will go to the 2nd vNIC using the teaming ?
Instead of FF vNICs you can use NIC teaming. You bond the two vNICs, which creates a bond interface, and you specify an IP on it.
With NIC teaming you would not configure the vNICs (in the Service Profile) with FF. FF then does not kick in; on a fabric failure the vNIC goes down, which the teaming software sees, and the teaming driver comes into effect.
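On Linux the teaming side is typically the bonding driver; a minimal active-backup sketch (device names and addressing hypothetical):

```
# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
IPADDR=10.0.0.10
NETMASK=255.255.255.0
ONBOOT=yes
BONDING_OPTS="mode=active-backup miimon=100"

# ifcfg-eth0 and ifcfg-eth1 then each carry:
#   MASTER=bond0
#   SLAVE=yes
#   ONBOOT=yes
```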
> B- FEX failover
> 1- As I understand the blade server uses the uplink from the FEX to the FI based on their location in the chassis, so what if this link is down, > >does that mean FI failover will trigger, or it will be assigned to another uplink ( from the FEX to the FI)
Yes, we use static pinning between the adapters and the IOM uplinks which depends on the number of links.
For example, if you have 2 links between IOM-FI.
Link 1 - Blades 1,3,5,7
Link 2 - Blades 2,4,6,8
If Link 1 fails, Blade 1,3,5,7 move to the other IOM.
i.e. it will not fail over to the other links on the same IOM-FI; it is not a port-channel.
The vNIC down event will be triggered. If FF is initiated depends on the setting (above explanation).
> C- Fabric Interconnect Uplink failover
> 1- Using static pin LAN group, the vNIC is associated with an uplink, what is the action if this uplink is down ? will the vNIC:
> a. Brought down , as per the Network Control policy applied , and in this case the OS will go for the second vNIC
If you are using static pin group, Yes.
If you are not using static pin groups, the same FI will map it to another available uplink.
Why? Because by defining static pinning you are purposely defining the uplink/subscription ratio etc and you don't want that vNIC to go to any other uplink. Both fabrics are active at any given time.
> b. FI failover to the second FI , the OS will not detect anything.
Yes.
> c. The FI A will re-pin the vNIC to another uplink on the same FI with no failover
For dynamic pinning yes. For static pinning NO as above.
>I found all theses 3 scenarios in a different documents and posts, I did not have the chance it to test it yet, so it will be great if anyone tested it and >can explain.
I would still highly recommend testing it. Maybe its me but I don't believe anything till I have tried it.
> Finally I need to know if the correct scenarios from the above will be applied to the vHBA or it has another methodology.
Multipathing driver as I mentioned before.
FF *only* applies to ethernet.
Thanks
--Manish -
Data Center InterConnect with Dark Fibre
Dear all,
We are designing a Data Center InterConnection for our two Data Centers on top of a 10G Dark Fibre.
Our primary goal is to:-
extend a few vlans between the two DCs;
support VMware vMotion between the two DCs;
asymmetric SAN synchronization;
FCoE for SAN connectivity between the two DCs;
So may I ask if we could run both LAN and SAN connections over this DF link? We have an NX5K in one DC and an NX7K in the other; are there specific devices required to enable both LAN and SAN connections?
It would be really appreciated if anyone could shed any light on this. Any suggestions are welcome!
Best Regards,
James Ren
Hello.
If you are running an Active/Backup DC scenario, I would suggest making the network design and configuration exactly the same. This includes platforms, interconnectivity types, etc.
Do you know what is the latency on the fiber between these two DCs?
Another question: why do you run 6880 in VSS, do you really need this?
Q about the diagram: are you going to utilize 4 fibers for DC interconnection?
PS: did you think about OTV+LISP instead of MPLS? -
Data Center Interconnect using MPLS/VPLS
We are deploying a backup data center and need to extend a couple of VLANs to it. The two DCs are interconnected by a fibre link which we manage, terminating on ODC2MAN and ODCXMAN. We run MPLS on these devices, ODC2MAN and ODCXMAN (Cisco 6880), as PE routers. I configured OSPF between them and advertised their loopbacks.
I need configuration assistance on my PEs (odcxman and odc2man) to run the VFI and the VPLS instances. The VLANs on ODCXAGG need to extend to ODC2AGG.
Also, I am looking for configuration assistance such that each core device has 3 EIGRP neighbors.
For example:
ODC2COR1 should have EIGRP neighbors with ODCXCOR1, ODCXCOR2 and ODC2COR2, and my VPLS cloud should be emulated as a transparent bridge to my core devices, such that ODC2COR1 appears directly connected to ODCXCOR1 and ODCXCOR2 and has a CDP neighbor relationship with them. I have attached the diagram. Please let me know your inputs.
Hello.
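As a starting point, a manual VPLS VFI on each 6880 PE would look roughly like this (the VFI name, VPN id, VLAN and neighbor loopback are hypothetical; the neighbor is the other PE's loopback advertised via OSPF):

```
l2 vfi DC_EXTEND manual
 vpn id 100
 neighbor 10.255.255.2 encapsulation mpls
!
interface Vlan100
 xconnect vfi DC_EXTEND
```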
If you are running an Active/Backup DC scenario, I would suggest making the network design and configuration exactly the same. This includes platforms, interconnectivity types, etc.
Do you know what is the latency on the fiber between these two DCs?
Another question: why do you run 6880 in VSS, do you really need this?
Q about the diagram: are you going to utilize 4 fibers for DC interconnection?
PS: did you think about OTV+LISP instead of MPLS? -
Any InterConnect white papers..?
Other than the small (and confusing) Appendix C in the AS Installation Guide, are there any papers discussing different installation "scenarios" for InterConnect?
Pls. take a look at http://otn.oracle.com/tech/integration/content.html or
http://www.oracle.com/ip/deploy/ias/integration/
Thanks,
Sudi Narasimhan
Oracle9iAS Portal Partner Services -
If RAC interconnect can handle this?
Could anybody tell me if the RAC cluster still persists if we take it off the network? I mean, will the nodes still be able to talk to each other via the interconnect? Does CRS handle all this?
Thanks to all in advance
gtcol
Hello,
Consider a scenario: a two-node RAC cluster, with each node having one public, one private and one virtual IP configured.
If your public interface is down on one of the nodes, then CRS decides to move the VIP to another node of the cluster, the listener configured on that node goes to the offline state, and some of the highly available services running on that node with it as their preferred node move to another node. However, if your private interconnect fails on either of the nodes, then NODE EVICTION is started by CRS in order to avoid a split-brain situation: CRS evicts that node from the cluster and reboots it.
In case of a public network failure on all the nodes, all the services go offline.
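You can check which interface the clusterware treats as the private interconnect (and whose failure therefore triggers eviction) with oifcfg; the sample output shown is illustrative:

```shell
oifcfg getif
# eth0  10.0.0.0     global  public
# eth1  192.168.1.0  global  cluster_interconnect
```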
Lalit Verma
http://sites.google.com/site/racinsights1 -
Fabric Interconnect 6248 & 5548 Connectivity on 4G SFP with FC
Hi,
Recently I came across a scenario where I connected a 4G SFP to the expansion module of a 6248 Fabric Interconnect at one end and a 4G SFP to a 5548UP at the other end. I was unable to establish FC connectivity between the two devices, and the moment I connected the 4G SFP to the fixed module of the 6248, connectivity was established between them.
I would like to know whether I have to make any changes on the FI's expansion module to get the connectivity working, or whether this kind of behaviour is the expected behaviour.
Do let me know if you need any other information on this
Regards,
Amit Vyas
Yes. On FI-B, ports 15-16 should be in VSAN 101 instead of 100; I have made that correction.
Q. are you migrating the fc ports from the fixed to the expansion module ?
A: As off now I am not migrating FC port but in near future I have to migrate FC ports to Expansion module and I don't want to waste my time for troubleshooting at that time.
Is my understanding correct, that you have 2 links from each FI to a 5548, no port fc port-channel ?
A: Yes, your understanding is correct we have 2 links from each FI to 5548 and no FC port-channel is configured
I will do the FC port-channel later on once I am able to fix the connectivity issue
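When you do get to the FC port-channel, the 5548 side would be along these lines (the channel number is hypothetical; a matching port-channel must also be created on the FI in UCSM):

```
interface san-port-channel 100
 channel mode active
interface fc1/29
 channel-group 100 force
interface fc1/30
 channel-group 100 force
```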
I will try to put the 4G SFP on the expansion module and will provide you the output of "show interface brief".
Following is the output of "show interface brief" from both 5548UP switches:
Primary5548_SW# show interface brief
Interface Vsan Admin Admin Status SFP Oper Oper Port
Mode Trunk Mode Speed Channel
Mode (Gbps)
fc1/29 100 auto on up swl F 4 --
fc1/30 100 auto on up swl F 4 --
fc1/31 100 auto on up swl F 4 --
fc1/32 100 auto on up swl F 4 --
Ethernet VLAN Type Mode Status Reason Speed Port
Interface Ch #
Eth1/1 1 eth access down Link not connected 10G(D) --
Eth1/2 1 eth access down Link not connected 10G(D) --
Eth1/3 1 eth access down SFP not inserted 10G(D) --
Eth1/4 1 eth access down SFP not inserted 10G(D) --
Eth1/5 1 eth access down SFP not inserted 10G(D) --
Eth1/6 1 eth access down SFP not inserted 10G(D) --
Eth1/7 1 eth access down SFP not inserted 10G(D) --
Eth1/8 1 eth access down SFP not inserted 10G(D) --
Eth1/9 1 eth access down SFP not inserted 10G(D) --
Eth1/10 1 eth access down SFP not inserted 10G(D) --
Eth1/11 1 eth access down SFP not inserted 10G(D) --
Eth1/12 1 eth access down SFP not inserted 10G(D) --
Eth1/13 1 eth access down SFP not inserted 10G(D) --
Eth1/14 1 eth access down SFP not inserted 10G(D) --
Eth1/15 1 eth access down SFP not inserted 10G(D) --
Eth1/16 1 eth access down SFP not inserted 10G(D) --
Eth1/17 1 eth access down SFP not inserted 10G(D) --
Eth1/18 1 eth access down SFP not inserted 10G(D) --
Eth1/19 1 eth access down SFP not inserted 10G(D) --
Eth1/20 1 eth access down SFP not inserted 10G(D) --
Eth1/21 1 eth access down SFP not inserted 10G(D) --
Eth1/22 1 eth access down SFP not inserted 10G(D) --
Eth1/23 1 eth access down SFP not inserted 10G(D) --
Eth1/24 1 eth access down SFP not inserted 10G(D) --
Eth1/25 1 eth access down SFP not inserted 10G(D) --
Eth1/26 1 eth access down SFP not inserted 10G(D) --
Eth1/27 1 eth access down SFP not inserted 10G(D) --
Eth1/28 1 eth access down SFP not inserted 10G(D) --
Eth2/1 1 eth access down SFP not inserted 10G(D) --
Eth2/2 1 eth access down SFP not inserted 10G(D) --
Eth2/3 1 eth access down SFP not inserted 10G(D) --
Eth2/4 1 eth access down SFP not inserted 10G(D) --
Eth2/5 1 eth access down SFP not inserted 10G(D) --
Eth2/6 1 eth access down SFP not inserted 10G(D) --
Eth2/7 1 eth access down SFP not inserted 10G(D) --
Eth2/8 1 eth access down SFP not inserted 10G(D) --
Eth2/9 1 eth access down SFP not inserted 10G(D) --
Eth2/10 1 eth access down SFP not inserted 10G(D) --
Eth2/11 1 eth access down SFP not inserted 10G(D) --
Eth2/12 1 eth access down SFP not inserted 10G(D) --
Eth2/13 1 eth access down SFP not inserted 10G(D) --
Eth2/14 1 eth access down SFP not inserted 10G(D) --
Eth2/15 1 eth access down SFP not inserted 10G(D) --
Eth2/16 1 eth access down SFP not inserted 10G(D) --
Port VRF Status IP Address Speed MTU
mgmt0 -- up 172.20.10.82 1000 1500
Interface Vsan Admin Admin Status Bind Oper Oper
Mode Trunk Info Mode Speed
Mode (Gbps)
vfc1 100 F on errDisabled Ethernet1/1 --
Primary5548_SW#
Secondary5548_SW# show interface brief
Interface Vsan Admin Admin Status SFP Oper Oper Port
Mode Trunk Mode Speed Channel
Mode (Gbps)
fc1/29 101 auto on up swl F 4 --
fc1/30 101 auto on up swl F 4 --
fc1/31 101 auto on up swl F 4 --
fc1/32 101 auto on up swl F 4 --
Ethernet VLAN Type Mode Status Reason Speed Port
Interface Ch #
Eth1/1 1 eth access down Link not connected 10G(D) --
Eth1/2 1 eth access down Link not connected 10G(D) --
Eth1/3 1 eth access down SFP not inserted 10G(D) --
Eth1/4 1 eth access down SFP not inserted 10G(D) --
Eth1/5 1 eth access down SFP not inserted 10G(D) --
Eth1/6 1 eth access down SFP not inserted 10G(D) --
Eth1/7 1 eth access down SFP not inserted 10G(D) --
Eth1/8 1 eth access down SFP not inserted 10G(D) --
Eth1/9 1 eth access down SFP not inserted 10G(D) --
Eth1/10 1 eth access down SFP not inserted 10G(D) --
Eth1/11 1 eth access down SFP not inserted 10G(D) --
Eth1/12 1 eth access down SFP not inserted 10G(D) --
Eth1/13 1 eth access down SFP not inserted 10G(D) --
Eth1/14 1 eth access down SFP not inserted 10G(D) --
Eth1/15 1 eth access down SFP not inserted 10G(D) --
Eth1/16 1 eth access down SFP not inserted 10G(D) --
Eth1/17 1 eth access down SFP not inserted 10G(D) --
Eth1/18 1 eth access down SFP not inserted 10G(D) --
Eth1/19 1 eth access down SFP not inserted 10G(D) --
Eth1/20 1 eth access down SFP not inserted 10G(D) --
Eth1/21 1 eth access down SFP not inserted 10G(D) --
Eth1/22 1 eth access down SFP not inserted 10G(D) --
Eth1/23 1 eth access down SFP not inserted 10G(D) --
Eth1/24 1 eth access down SFP not inserted 10G(D) --
Eth1/25 1 eth access down SFP not inserted 10G(D) --
Eth1/26 1 eth access down SFP not inserted 10G(D) --
Eth1/27 1 eth access down SFP not inserted 10G(D) --
Eth1/28 1 eth access down SFP not inserted 10G(D) --
Eth2/1 1 eth access down SFP not inserted 10G(D) --
Eth2/2 1 eth access down SFP not inserted 10G(D) --
Eth2/3 1 eth access down SFP not inserted 10G(D) --
Eth2/4 1 eth access down SFP not inserted 10G(D) --
Eth2/5 1 eth access down SFP not inserted 10G(D) --
Eth2/6 1 eth access down SFP not inserted 10G(D) --
Eth2/7 1 eth access down SFP not inserted 10G(D) --
Eth2/8 1 eth access down SFP not inserted 10G(D) --
Eth2/9 1 eth access down SFP not inserted 10G(D) --
Eth2/10 1 eth access down SFP not inserted 10G(D) --
Eth2/11 1 eth access down SFP not inserted 10G(D) --
Eth2/12 1 eth access down SFP not inserted 10G(D) --
Eth2/13 1 eth access down SFP not inserted 10G(D) --
Eth2/14 1 eth access down SFP not inserted 10G(D) --
Eth2/15 1 eth access down SFP not inserted 10G(D) --
Eth2/16 1 eth access down SFP not inserted 10G(D) --
Port VRF Status IP Address Speed MTU
mgmt0 -- up 172.20.10.84 1000 1500
Interface Vsan Admin Admin Status Bind Oper Oper
Mode Trunk Info Mode Speed
Mode (Gbps)
vfc1 1 F on errDisabled Ethernet1/1 --
Secondary5548_SW# -
SC3.2, S10, V40z - failover scenarios/troubles
Hi,
Few days ago I finished the above setup (2 nodes, sc3.2, sol10x86 - all updates).
There are two shared storage arrays connected to the cluster: a T3 (fibre) and a 3310 (SCSI RAID).
Configured atm is a single MySQL instance in active-standby mode.
Later, an NFS service might be added.
There are three global mounts.
The system needs to go into production ASAP, but there were some problems while testing the failover scenarios.
= Disconnecting any combination of interconnect cables - WORKS
= Disconnecting any combination of network cables - WORKS
= Shutting down any node - WORKS
= Powering off any node - WORKS
= RGs and resources are switched as expected and any node is able to take ownership
The one test which failed was when the SCSI and FC cables were unplugged.
In both cases, both nodes were rebooted almost instantly.
Is this behavior configurable or expected?
Any suggestions for another test scenario?
I found a single forum thread describing a similar problem, which was tracked down to bad grounding... anyone else have experience with that?
I can send more details if anyone is able to help.
Thanks in advance !!!
Paul.
OK, I repeated the scenario yesterday.
For some reason, only the node mastering the RG rebooted, after panicking about state database records.
I'm not sure whether this is normal behavior or a misconfiguration.
Please, see the logs and cluster info below:
========
Cluster Info
========
-- Cluster Nodes --
Node name Status
Cluster node: CLNODE2 Online
Cluster node: CLNODE1 Online
-- Cluster Transport Paths --
Endpoint Endpoint Status
Transport path: CLNODE2:ce1 CLNODE1:ce1 Path online
Transport path: CLNODE2:bge1 CLNODE1:bge1 Path online
-- Quorum Summary --
Quorum votes possible: 3
Quorum votes needed: 2
Quorum votes present: 3
-- Quorum Votes by Node --
Node Name Present Possible Status
Node votes: CLNODE2 1 1 Online
Node votes: CLNODE1 1 1 Online
-- Quorum Votes by Device --
Device Name Present Possible Status
Device votes: /dev/did/rdsk/d7s2 1 1 Online
-- Device Group Servers --
Device Group Primary Secondary
Device group servers: new_mysql CLNODE1 CLNODE2
Device group servers: new_ibdata CLNODE1 CLNODE2
Device group servers: new_binlog CLNODE1 CLNODE2
-- Device Group Status --
Device Group Status
Device group status: new_mysql Online
Device group status: new_ibdata Online
Device group status: new_binlog Online
-- Multi-owner Device Groups --
Device Group Online Status
-- Resource Groups and Resources --
Group Name Resources
Resources: mysql-failover-rg mysql-has mysql-lh mysql-res
-- Resource Groups --
Group Name Node Name State Suspended
Group: mysql-failover-rg CLNODE2 Offline No
Group: mysql-failover-rg CLNODE1 Online No
-- Resources --
Resource Name Node Name State Status Message
Resource: mysql-has CLNODE2 Offline Offline
Resource: mysql-has CLNODE1 Online Online
Resource: mysql-lh CLNODE2 Offline Offline - LogicalHostname offline.
Resource: mysql-lh CLNODE1 Online Online - LogicalHostname online.
Resource: mysql-res CLNODE2 Offline Offline
Resource: mysql-res CLNODE1 Online Online - Service is online.
-- IPMP Groups --
Node Name Group Status Adapter Status
IPMP Group: CLNODE2 sc_ipmp0 Online ce0 Online
IPMP Group: CLNODE2 sc_ipmp0 Online bge0 Online
IPMP Group: CLNODE1 sc_ipmp0 Online ce0 Online
IPMP Group: CLNODE1 sc_ipmp0 Online bge0 Online
=========
Devices
=========
===
DIDs
===
CLNODE1:root[]didadm -L
1 CLNODE2:/dev/rdsk/c0t0d0 /dev/did/rdsk/d1 INTERNAL DISKS/HARDWARE RAID
2 CLNODE2:/dev/rdsk/c1t0d0 /dev/did/rdsk/d2 INTERNAL DISKS/HARDWARE RAID
3 CLNODE2:/dev/rdsk/c4t60020F200000C51E48874D0D000DA3EEd0 /dev/did/rdsk/d3 FC VOLUMES
3 CLNODE1:/dev/rdsk/c4t60020F200000C51E48874D0D000DA3EEd0 /dev/did/rdsk/d3 FC VOLUMES
4 CLNODE2:/dev/rdsk/c4t60020F200000C51E48874D49000686F0d0 /dev/did/rdsk/d4 FC VOLUMES
4 CLNODE1:/dev/rdsk/c4t60020F200000C51E48874D49000686F0d0 /dev/did/rdsk/d4 FC VOLUMES
5 CLNODE1:/dev/rdsk/c5t1d0 /dev/did/rdsk/d5 SCSI RAID
5 CLNODE2:/dev/rdsk/c5t1d0 /dev/did/rdsk/d5 SCSI RAID
6 CLNODE1:/dev/rdsk/c5t0d0 /dev/did/rdsk/d6 SCSI RAID
6 CLNODE2:/dev/rdsk/c5t0d0 /dev/did/rdsk/d6 SCSI RAID
7 CLNODE2:/dev/rdsk/c4t60020F200000C51E48874DA900088862d0 /dev/did/rdsk/d7 FC VOLUMES
7 CLNODE1:/dev/rdsk/c4t60020F200000C51E48874DA900088862d0 /dev/did/rdsk/d7 FC VOLUMES
8 CLNODE2:/dev/rdsk/c4t60020F200000C51E48874DDD000CE109d0 /dev/did/rdsk/d8 FC VOLUMES
8 CLNODE1:/dev/rdsk/c4t60020F200000C51E48874DDD000CE109d0 /dev/did/rdsk/d8 FC VOLUMES
11 CLNODE1:/dev/rdsk/c0t0d0 /dev/did/rdsk/d11 INTERNAL DISKS/HARDWARE RAID
12 CLNODE1:/dev/rdsk/c1t0d0 /dev/did/rdsk/d12 INTERNAL DISKS/HARDWARE RAID
===
metasets
===
CLNODE1:root[]metaset -s new_binlog
Set name = new_binlog, Set number = 8
Host Owner
CLNODE1 Yes
CLNODE2
Driv Dbase
d5 Yes
d6 Yes
CLNODE1:root[]metaset -s new_mysql
Set name = new_mysql, Set number = 5
Host Owner
CLNODE1 Yes
CLNODE2
Driv Dbase
d3 Yes
d4 Yes
CLNODE1:root[]metaset -s new_ibdata
Set name = new_ibdata, Set number = 7
Host Owner
CLNODE1 Yes
CLNODE2
Driv Dbase
d7 Yes
d8 Yes
===
metadb info
===
CLNODE1:root[]metadb -s new_binlog
flags first blk block count
a m luo r 16 8192 /dev/did/dsk/d5s7
a luo r 16 8192 /dev/did/dsk/d6s7
CLNODE1:root[]metadb -s new_mysql
flags first blk block count
a m luo r 16 8192 /dev/did/dsk/d3s7
a luo r 16 8192 /dev/did/dsk/d4s7
CLNODE1:root[]metadb -s new_ibdata
flags first blk block count
a m luo r 16 8192 /dev/did/dsk/d7s7
a luo r 16 8192 /dev/did/dsk/d8s7
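The listing above shows exactly two metadb replicas per diskset (one on each array). Solaris Volume Manager only keeps a node running while strictly more than half of a diskset's state database replicas are available, so losing one array leaves exactly 50% and fails that test. A minimal sketch of the rule (the `survives` helper is illustrative, not an SVM command):

```shell
# SVM replica quorum rule (sketch): a node panics unless strictly
# more than half of the metadb replicas remain available.
survives() {
  avail=$1
  total=$2
  if [ $((avail * 2)) -gt "$total" ]; then
    echo "stays up"
  else
    echo "panics"
  fi
}
survives 1 2   # one of two replicas left -> panics
survives 2 3   # two of three left -> stays up
```

With two replicas, any single-array failure drops you to exactly half, which is why a third vote (a mediator) is needed to break the tie.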
===
md.tab - 3 configured mirrors mounted as global
===
d110 -m d103 d104
new_mysql/d110 -m new_mysql/d103 new_mysql/d104
d120 -m d115 d116
new_binlog/d120 -m new_binlog/d115 new_binlog/d116
d130 -m d127 d128
new_ibdata/d130 -m new_ibdata/d127 new_ibdata/d128
=========
Log at time of failure - CLNODE1 is the master - SCSI cable disconnected - CLNODE2 takes over RG after CLNODE1's panic
=========
Aug 7 14:16:27 CLNODE1 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,0/pci1022,7450@1/pci1000,1010@1/sd@0,0 (sd81):
Aug 7 14:16:27 CLNODE1 disk not responding to selection
Aug 7 14:16:28 CLNODE1 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,0/pci1022,7450@1/pci1000,1010@1/sd@1,0 (sd82):
Aug 7 14:16:28 CLNODE1 disk not responding to selection
Aug 7 14:16:33 CLNODE1 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,0/pci1022,7450@1/pci1000,1010@1/sd@0,0 (sd81):
Aug 7 14:16:33 CLNODE1 disk not responding to selection
Aug 7 14:16:35 CLNODE1 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,0/pci1022,7450@1/pci1000,1010@1/sd@1,0 (sd82):
Aug 7 14:16:35 CLNODE1 disk not responding to selection
Aug 7 14:16:38 CLNODE1 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,0/pci1022,7450@1/pci1000,1010@1/sd@0,0 (sd81):
Aug 7 14:16:38 CLNODE1 disk not responding to selection
Aug 7 14:16:38 CLNODE1 md: [ID 312844 kern.warning] WARNING: md: state database commit failed
Aug 7 14:16:39 CLNODE1 cl_dlpitrans: [ID 624622 kern.notice] Notifying cluster that this node is panicking
Aug 7 14:16:39 CLNODE1 unix: [ID 836849 kern.notice]
Aug 7 14:16:39 CLNODE1 ^Mpanic[cpu1]/thread=fffffe800030bc80:
Aug 7 14:16:39 CLNODE1 genunix: [ID 268973 kern.notice] md: Panic due to lack of DiskSuite state
Aug 7 14:16:39 CLNODE1 database replicas. Fewer than 50% of the total were available,
Aug 7 14:16:39 CLNODE1 so panic to ensure data integrity.
Aug 7 14:16:39 CLNODE1 unix: [ID 100000 kern.notice]
Aug 7 14:16:39 CLNODE1 genunix: [ID 655072 kern.notice] fffffe800030bb80 md:mddb_commitrec_wrapper+8c ()
Aug 7 14:16:39 CLNODE1 genunix: [ID 655072 kern.notice] fffffe800030bbc0 md_mirror:process_resync_regions+16a ()
Aug 7 14:16:39 CLNODE1 genunix: [ID 655072 kern.notice] fffffe800030bbf0 md_mirror:check_resync_regions+df ()
Aug 7 14:16:39 CLNODE1 genunix: [ID 655072 kern.notice] fffffe800030bc50 md:md_daemon+10b ()
Aug 7 14:16:39 CLNODE1 genunix: [ID 655072 kern.notice] fffffe800030bc60 md:start_daemon+e ()
Aug 7 14:16:39 CLNODE1 genunix: [ID 655072 kern.notice] fffffe800030bc70 unix:thread_start+8 ()
Aug 7 14:16:39 CLNODE1 unix: [ID 100000 kern.notice]
Aug 7 14:16:39 CLNODE1 genunix: [ID 672855 kern.notice] syncing file systems...
Aug 7 14:16:39 CLNODE1 genunix: [ID 733762 kern.notice] 1
Aug 7 14:16:40 CLNODE1 genunix: [ID 904073 kern.notice] done
Aug 7 14:16:41 CLNODE1 genunix: [ID 111219 kern.notice] dumping to /dev/dsk/c1t0d0s1, offset 429391872, content: kernel
Aug 7 14:16:52 CLNODE1 genunix: [ID 409368 kern.notice] ^M100% done: 148178 pages dumped, compression ratio 4.10,
Aug 7 14:16:52 CLNODE1 genunix: [ID 851671 kern.notice] dump succeeded
Aug 7 14:19:39 CLNODE1 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.10 Version Generic_118855-36 64-bit
Aug 7 14:19:39 CLNODE1 genunix: [ID 172907 kern.notice] Copyright 1983-2006 Sun Microsystems, Inc. All rights reserved.
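Given the replica layout shown earlier (two replicas per diskset, one per array), this panic is the documented SVM response to losing exactly half of the state database replicas. For dual-string (two-array) disksets, Sun's recommended mitigation is to configure mediator hosts, which contribute a tie-breaking vote. A hedged sketch, using the set and host names from this thread (verify the syntax against metaset(1M) on your release):

```shell
# Sketch: add both cluster nodes as mediator hosts to each
# dual-string diskset, so losing one array does not drop the
# effective replica vote to 50%. Run on the diskset owner.
for ds in new_mysql new_ibdata new_binlog; do
  metaset -s "$ds" -a -m CLNODE1 CLNODE2
done

# Check mediator status for one set (look for "Ok"/golden state):
medstat -s new_mysql
```

These commands only make sense on the cluster itself, so treat the above as an outline to test in a maintenance window, not a drop-in fix.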
=========================================================
Anyone?
Thanks in advance! -
Using the interconnect for backups???
Hi all, I have a question regarding using the interconnect for backups. Here's the scenario that's been put to me. You have a two-node cluster (node1 and node2). There are 8 global filesystems: 4 owned by node1 and 4 owned by node2. Both nodes can see all 8 filesystems. You want to back up the data on all 8 global filesystems. Can you use node1 to back up all 8 filesystems? That is, node1 backs up its own 4 filesystems and uses the interconnect to back up node2's 4 filesystems.
Is this a bad idea? Does it not matter?
Thanks in advance,
Stewart
It would be better to do the backups locally, i.e. all through node1. The reason I say this is that during the backup you are going to cause I/O contention for the applications using the file system. If you do this remotely, then not only are you impacting the application, but you are also pulling data across the interconnect and chewing up CPU cycles, etc.
Given that you can move the primary path of a global file system without stopping the application, it would be easier to do that, get the backup done more quickly, and then migrate the path back, than to do it the way you are considering. At least that's my view, though I've not done a bake-off between the two options.
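Moving the primary path means switching the underlying device group's primary to the node doing the backup. A hedged sketch, assuming a device group named dg1 (hypothetical here) and Sun Cluster 3.1-style commands; check scswitch(1M) or cldevicegroup(1CL) for your release:

```shell
# Sketch: make node1 the primary for the device group before the
# backup, then hand it back. "dg1" is a placeholder name.
# Sun Cluster 3.1 style:
scswitch -z -D dg1 -h node1
# ... run the backup locally on node1 ...
scswitch -z -D dg1 -h node2

# Sun Cluster 3.2 equivalent:
# cldevicegroup switch -n node1 dg1
```

The switch is online, so the application keeps running; you only pay a brief pause while the primary path migrates.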
Tim
---