IPMP failover failure

I have a question about IPMP under Solaris 9 9/04 SPARC 64-bit.
OS: Solaris 9 9/04 with EIS 3.1.1 patches
Clusterware: Sun Cluster 3.1u4 with EIS 3.1.1 patches
My IPMP group contains two NICs: ce0 and ce3.
Both NICs are connected to a Cisco 4506.
The IPMP configuration files are as follows:
*/etc/hostname.ce0*
lamp-test2 netmask + broadcast + group ipmp1 deprecated -failover up
*/etc/hostname.ce3*
lamp netmask + broadcast + group ipmp1 up \
addif lamp-test1 netmask + broadcast + deprecated -failover up
I am using the default in.mpathd configuration file.
But as soon as I pull the cable from one of the ceN NICs, IPMP complains:
Mar 18 18:06:34 lamp in.mpathd[2770]: [ID 215189 daemon.error] The link has gone down on ce0
Mar 18 18:06:34 lamp in.mpathd[2770]: [ID 594170 daemon.error] NIC failure detected on ce0 of group ipmp1
Mar 18 18:06:34 lamp ip: [ID 903730 kern.warning] WARNING: IP: Hardware address '00:03:ba:b0:5d:54' trying to be our address 192.168.217.020!
Mar 18 18:06:34 lamp in.mpathd[2770]: [ID 832587 daemon.error] Successfully failed over from NIC ge0 to NIC ce0
Mar 18 18:06:34 lamp ip: [ID 903730 kern.warning] WARNING: IP: Hardware address '00:03:ba:b0:5d:54' trying to be our address 192.168.217.020!
Why does Solaris report a hardware address conflict?
I'm sure these same IPMP configuration files work fine with a Cisco 2950 and with a D-Link mini switch.
By the way, there are no duplicate MACs on the LAN.
Should I modify some Cisco parameters?
Your advice is much appreciated!

lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
     inet 127.0.0.1 netmask ff000000
ce0: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
     inet 192.168.217.6 netmask ffffff00 broadcast 192.168.217.255
     groupname ipmp1
     ether 0:3:ba:b0:5d:54
ce3: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
     inet 192.168.217.20 netmask ffffff00 broadcast 192.168.217.255
     groupname ipmp1
     ether 0:3:ba:95:5d:6e
ce3:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 4
     inet 192.168.217.4 netmask ffffff00 broadcast 192.168.217.255
Generally speaking:
When the floating IP moves from ce0 to ce3, IPMP says the ce0 MAC is "trying to be our address ...", then the test IP fails and the floating IP does not fail over.
When the floating IP moves from ce3 to ce0, IPMP says the ce3 MAC is "trying to be our address ...",
then the test IP fails and the floating IP does not fail over.
In my view, the failed NIC's MAC and address information may be cached in the Cisco device's RAM and not released in time.
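If stale switch state is the suspect, it can be checked from both ends. A quick sketch (not verified against the 4506; newer IOS spells the first switch command `show mac address-table`, and the MAC/IP values are taken from the log above):

```
# On the Solaris host: look for (and optionally delete) a stale ARP entry
arp -a | grep 192.168.217
arp -d 192.168.217.20

# On the Catalyst 4506: see which port the old MAC is still learned on
show mac-address-table address 0003.bab0.5d54
show arp | include 192.168.217.20
clear arp-cache
```

If the switch still maps the MAC to the dead port after the failover, the MAC/ARP ageing timers and any port-security configuration on the switch side are worth reviewing.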

Similar Messages

  • Creation of IPMP Group failure

    Hi All,
    I used the following commands to create the IPMP group for the 10 GbE interfaces, but it seems to fail:
    root@ebsprdb1 # ipadm create-ipmp ipmp1
    root@ebsprdb1 # ipadm create-ip net16
    root@ebsprdb1 # ipadm create-ip net33
    root@ebsprdb1 # ipadm add-ipmp -i net16 -i net33 ipmp1
    root@ebsprdb1 # ipadm create-addr -T static -a ebsprdb1-data/16 ipmp1/data1
    root@ebsprdb1 # ipadm create-addr -T static -a ebsprdb1-vsw2-test1/16 net16/test
    root@ebsprdb1 # ipadm create-addr -T static -a ebsprdb1-vsw3-test1/16 net33/test
    root@ebsprdb1 # cat /etc/hosts
    # Copyright 2009 Sun Microsystems, Inc. All rights reserved.
    # Use is subject to license terms.
    # Internet host table
    ::1 localhost
    127.0.0.1 localhost loghost
    172.16.4.51 ebsprdb1.oocep.com ebsprdb1
    172.16.4.52 ebsprdb1-vsw0-test1.oocep.com ebsprdb1-vsw0-test1
    172.16.4.53 ebsprdb1-vsw1-test1.oocep.com ebsprdb1-vsw1-test1
    10.0.0.130 ebsprdb1-data
    10.0.0.131 ebsprdb1-vsw2-test1
    10.0.0.132 ebsprdb1-vsw3-test1
    root@ebsprdb1 # ipadm
    NAME CLASS/TYPE STATE UNDER ADDR
    ipmp0 ipmp ok -- --
    ipmp0/data1 static ok -- 172.16.4.51/24
    ipmp1 ipmp ok -- --
    ipmp1/data1 static ok -- 10.0.0.130/16
    lo0 loopback ok -- --
    lo0/v4 static ok -- 127.0.0.1/8
    lo0/v6 static ok -- ::1/128
    net14 ip ok ipmp0 --
    net14/test static ok -- 172.16.4.52/24
    net16 ip failed ipmp1 --
    net16/test static ok -- 10.0.0.131/16
    net25 ip ok -- --
    net25/v4 static ok -- 169.254.182.77/24
    net29 ip ok ipmp0 --
    net29/test static ok -- 172.16.4.53/24
    net33 ip ok ipmp1 --
    net33/test static ok -- 10.0.0.132/16
    As soon as I add the net16 device to the IPMP group, there is a failure.
    Can anyone please help?
    Regards.
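    Since net16 shows `failed` while its test address is `ok`, the probe state is worth a look before anything else. A sketch, assuming the Solaris 11 `ipmpstat` utility:

```
# Per-interface state within each group
ipmpstat -i

# Probe targets chosen for each interface
ipmpstat -t

# Live probe results (Ctrl-C to stop)
ipmpstat -p
```

    If `ipmpstat -t` shows no targets for net16, probe-based detection has nothing to ping on that subnet, which by itself can leave the interface marked failed.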


  • ACE 4710 FT failover failure

    Hello,
    I am running redundant ACE 4710 appliances on A3(2.7). I have five FT groups configured along with FT tracking, and when the VLANs fail due to physical links being down, the contexts do not fail over. If one of the ACE boxes fails completely, failover works fine. I have included the FT config from one of the contexts below. I have a case open with TAC and the engineer is suggesting the use of a query interface in addition to FT tracking. We have had two incidents on separate contexts where we lost a physical interface on the primary ACE, one during maintenance of the core switch, the other a cable disconnect, and we are unable to understand why the individual context didn't fail over. Any ideas would be much appreciated. Let me know if more info/configs are needed.
    Dave
    ft interface vlan 900
      ip address 10.10.10.1 255.255.255.0
      peer ip address 10.10.10.2 255.255.255.0
      no shutdown
    ft peer 1
      heartbeat interval 300
      heartbeat count 20
      ft-interface vlan 900
    ft group 3
      peer 1
      no preempt
      priority 210
      peer priority 120
      associate-context XYZ
      inservice
    FT Group                     : 3
    No. of Contexts             : 1
    Context Name                 : XYZ
    Context Id                   : 2
    Configured Status           : in-service
    Maintenance mode             : MAINT_MODE_OFF
    My State                   : FSM_FT_STATE_ACTIVE
    My Config Priority           : 210
    My Net Priority             : 210
    My Preempt                   : Disabled
    Peer State                   : FSM_FT_STATE_STANDBY_HOT
    Peer Config Priority         : 120
    Peer Net Priority           : 120
    Peer Preempt                 : Disabled
    Peer Id                     : 1
    Last State Change time       : Wed Jan 11 13:14:16 2012
    Running cfg sync enabled     : Enabled
    Running cfg sync status     : Running configuration sync has completed
    Startup cfg sync enabled     : Enabled
    Startup cfg sync status     : Startup configuration sync has completed
    Bulk sync done for ARP: 0
    Bulk sync done for LB: 0
    Bulk sync done for ICM: 0
    show int
    vlan424 is up, VLAN up on the physical port
    Hardware type is VLAN
    MAC address is 00:1e:68:1e:ba:b7
    Virtual MAC address is 00:0b:fc:fe:1b:03
    Mode : routed
    IP address is 10.104.224.6 netmask is 255.255.255.0
    FT status is active
    Description:"New Server VIP and real"
    MTU: 1500 bytes
    Last cleared: never
    Last Changed: Sun Mar 11 01:13:12 2012
    No of transitions: 3
    Alias IP address is 10.104.224.5 netmask is 255.255.255.0
    Peer IP address is 10.104.224.7 Peer IP netmask is 255.255.255.0
    Assigned on the physical port, up on the physical port
    Previous State: Sun Mar 11 00:04:57 2012, VLAN not up on the physical port
    Previous State: Sun Sep 18 10:21:15 2011, administratively up
         3991888419 unicast packets input, 23734607976687 bytes
         20246934 multicast, 174801 broadcast
         0 input errors, 0 unknown, 0 ignored, 0 unicast RPF drops
         1609345958 unicast packets output, 23690663385228 bytes
         7 multicast, 55807 broadcast
         0 output errors, 0 ignored

    Dave,
    For tracking to work you need to have preempt enabled. Can you try enabling preempt under the ft group and test your tracking again? Another potential issue you may run into is your tracking not lowering the priority enough when it fails. The difference between the active and standby device is 90 (210 vs 120). If you are not decrementing the priority by more than this value, then even with preempt enabled it will not lower the active's priority enough to force the failover. If after enabling preempt on this group the tracking still does not work as expected, send your whole config for us to look at.
    Regarding the query interface: this is not a bad idea. It will help prevent an active-active situation if there is a problem with the FT link between the two modules.
    Thanks
    Jim
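    Jim's decrement point can be sanity-checked with quick arithmetic: with the thread's configured priority of 210 and peer priority 120, a tracking decrement must exceed 90 before the active's net priority drops below the standby's. A throwaway sketch (the decrement value is a placeholder):

```shell
#!/bin/sh
# Priorities from the thread; DECREMENT is whatever the ft track rule subtracts.
ACTIVE=210
PEER=120
DECREMENT=50

NET=$((ACTIVE - DECREMENT))
if [ "$NET" -lt "$PEER" ]; then
    echo "failover possible: net priority $NET < peer $PEER"
else
    echo "no failover: net priority $NET >= peer $PEER"
fi
```

    With DECREMENT=50 this prints the "no failover" branch; only a decrement above 90 flips it, and preempt must still be enabled for the standby to act on it.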

  • Failover Failure.

    First, let me apologize for my ignorance and please assume I know nothing! I am managing a Hyper-V environment that someone else configured. We are experiencing an issue that I will try to describe to the best of my ability.
    We have two physical locations, each containing a SAN and three Hyper-V hosts. We have 3 CSVs (high, medium, and low priority) as well as a witness disk. One location is our primary location (PL) and the other our disaster recovery location (DRL).
    If the hosts at the PL have the VMs residing on them (as well as the CSVs) and all the hosts at the PL go down, everything fails over to our DRL and comes back up with no problems. However, if our DRL goes down, EVEN IF EVERYTHING IS RUNNING
    AT THE primary location, everything goes down! All of the VMs attempt to fail over to the DRL that went offline! I am very confused by this! It has caused major problems a couple of times and I have no idea what to do!
    Any help would be greatly appreciated.
    Thanks!

    Although you aren't mentioning it, I assume that there's some storage replication involved here?
    If yes, then you should engage with the vendor in order to pinpoint the configuration that is causing this.
    Using the Microsoft stack in the same scenario, you would use Hyper-V Replica with Azure Site Recovery and System Center, which would control and orchestrate the disaster recovery scenarios for you.
    Also note that I am referring to disaster recovery, not high availability, in this case, as the MS solutions are DR and not HA across sites.
    -kn
    Kristian (Virtualization and some coffee: http://kristiannese.blogspot.com )

  • Logicalhostname IP won't fail over when one member of the cluster dies

    Hi There,
    I've set up a failover cluster with 2 servers. The cluster IP is set up as a logicalhostname, and each server has two network cards configured as an IPMP group.
    I can test the IPMP failover on each server by failing a network card and checking that the IP address fails over.
    I can test that the logical hostname fails over by switching the resource group over from one node to the other.
    BUT
    If I drop one member of the cluster, the failover fails:
    Nov 4 15:09:06 nova cl_runtime: NOTICE: clcomm: Path nova:qfe2 - gambit:qfe2 errors during initiation
    Nov 4 15:09:06 nova cl_runtime: WARNING: Path nova:ce1 - gambit:bge1 initiation encountered errors, errno = 62. Remote node may be down or unreachable through this path.
    Nov 4 15:09:06 nova cl_runtime: WARNING: Path nova:qfe2 - gambit:qfe2 initiation encountered errors, errno = 62. Remote node may be down or unreachable through this path.
    Nov 4 15:09:08 nova Cluster.PNM: PNM daemon system error: SIOCLIFADDIF failed.: Network is down
    Nov 4 15:09:08 nova Cluster.PNM: production can't plumb 130.159.17.1.
    Nov 4 15:09:08 nova SC[SUNW.LogicalHostname,test-vle,vle1,hafoip_prenet_start]: IPMP logical interface configuration operation failed with <-1>.
    Nov 4 15:09:08 nova Cluster.RGM.rgmd: Method <hafoip_prenet_start> failed on resource <vle1> in resource group <test-vle>, exit code <1>, time used: 0% of timeout <300 seconds>
    Nov 4 15:09:08 nova ip: TCP_IOC_ABORT_CONN: local = 130.159.017.001:0, remote = 000.000.000.000:0, start = -2, end = 6
    Nov 4 15:09:08 nova ip: TCP_IOC_ABORT_CONN: aborted 0 connection
    scswitch: Resource group test-vle failed to start on chosen node and may fail over to other node(s)
    Any ideas would be appreciated, as I don't understand how it all fails over correctly when the cluster is up but fails when one member is down.

    Hi,
    looking at the messages, the problem seems to be with the network setup on nova. I would suggest trying to configure the logical IP on nova manually to see if that works. If it does not, it should tell you where the problem is.
    Or are you saying that manually switching the RG works, but when a node dies and the cluster switches the RG it doesn't? That would be strange.
    You should also post the status of your network on nova in the failure case. There might be something wrong with your IPMP setup. Or had the public net failed completely when you killed the other node?
    Regards
    Hartmut
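    Hartmut's suggestion to configure the logical IP manually can be tried like this on the surviving node (a sketch; `ce0` stands in for whichever adapter is active in nova's public IPMP group):

```
# Plumb the logical host IP by hand
ifconfig ce0 addif 130.159.17.1 netmask + broadcast + up

# ...test reachability, then clean up
ifconfig ce0 removeif 130.159.17.1
```

    If the addif itself fails with "Network is down", the whole IPMP group on nova is unhealthy, which matches the SIOCLIFADDIF error in the log.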

  • Replacing network adapter from IPMP group (Sun cluster 3.3)

    Hello!
    I need to replace the network devices in an IPMP group, moving from ge0/ge1/ge2 to ce5/ce6/ce7.
    Can I do this procedure online? Something like:
    Creating files to add the new devices to the IPMP groups: /etc/hostname.ce5, .ce6, .ce7
    Unmonitoring the resource groups
    Unplumbing the old devices and plumbing up the new ones
    # scstat -i
    -- IPMP Groups --
    Node Name Group Status Adapter Status
    IPMP Group: node0 ipmp0 Online ge1 Online
    IPMP Group: node0 ipmp0 Online ge0 Online
    IPMP Group: node0 ipmp1 Online ce2 Online
    IPMP Group: node0 ipmp1 Online ce0 Online
    IPMP Group: node1 ipmp0 Online ge1 Online
    IPMP Group: node1 ipmp0 Online ge0 Online
    IPMP Group: node1 ipmp1 Online ce2 Online
    IPMP Group: node1 ipmp1 Online ce0 Online
    /etc/hostname.ge0
    n0-testge0 netmask + broadcast + group ipmp0 deprecated -failover up
    addif node0 netmask + broadcast + up
    /etc/hostname.ge1
    n0-testge1 netmask + broadcast + group ipmp0 deprecated -failover up
    /etc/hostname.ge2
    backupn0 mtu 1500
    # ifconfig -a
    lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
    inet 127.0.0.1 netmask ff000000
    ce0: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
    inet 172.19.1.25 netmask ffffff00 broadcast 172.19.1.255
    groupname ipmp1
    ether 0:14:4f:23:1d:9
    ce0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
    inet 172.19.1.10 netmask ffffff00 broadcast 172.19.1.255
    ce1: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500 index 9
    inet 172.16.0.129 netmask ffffff80 broadcast 172.16.0.255
    ether 0:14:4f:23:1d:a
    ce2: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 3
    inet 172.19.1.26 netmask ffffff00 broadcast 172.19.1.255
    groupname ipmp1
    ether 0:14:4f:26:a4:83
    ce2:1: flags=1001040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,FIXEDMTU> mtu 1500 index 3
    inet 172.19.1.23 netmask ffffff00 broadcast 172.19.1.255
    ce4: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500 index 8
    inet 172.16.1.1 netmask ffffff80 broadcast 172.16.1.127
    ether 0:14:4f:42:7f:28
    dman0: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500 index 4
    inet 192.168.103.6 netmask ffffffe0 broadcast 192.168.103.31
    ether 0:0:be:aa:1c:58
    ge0: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 5
    inet 10.1.0.25 netmask ffffff00 broadcast 10.1.0.255
    groupname ipmp0
    ether 8:0:20:e6:61:a7
    ge0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 5
    inet 10.1.0.10 netmask ffffff00 broadcast 10.1.0.255
    ge1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 6
    inet 10.1.0.26 netmask ffffff00 broadcast 10.1.0.255
    groupname ipmp0
    ether 0:3:ba:c:74:62
    ge1:1: flags=1001040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,FIXEDMTU> mtu 1500 index 6
    inet 10.1.0.23 netmask ffffff00 broadcast 10.1.0.255
    ge2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 7
    inet 10.1.2.10 netmask ffffff00 broadcast 10.1.2.255
    ether 8:0:20:b5:25:88
    clprivnet0: flags=1009843<UP,BROADCAST,RUNNING,MULTICAST,MULTI_BCAST,PRIVATE,IPv4> mtu 1500 index 10
    inet 172.16.4.1 netmask fffffe00 broadcast 172.16.5.255
    ether 0:0:0:0:0:1
    Thanks in advance!

    You should be able to replace adapters in an IPMP group one-by-one without affecting the cluster operation.
    BUT: You must make sure that the status of the new adapter in the IPMP group gets back to normal, before you start replacing the next adapter.
    Solaris Cluster only reacts to IPMP group failures, not to failures of individual NICs.
    Note that IPMP is only used for the public network. Cluster interconnects are not configured using IPMP. Nevertheless, the same technique can be applied to replace adapters in the cluster interconnect; you need to use the clintr command (IIRC) to replace individual NICs. Again, make sure that all the NICs of the interconnect are healthy before you continue replacing the next adapter.
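    A minimal sketch of the one-at-a-time swap, assuming `if_mpadm` is available and ce5 is taking over ge0's role (the test-address hostname is reused from the files above):

```
# Detach ge0; its failover addresses move to a surviving group member
if_mpadm -d ge0
ifconfig ge0 unplumb

# Plumb the replacement and join it to the same group
ifconfig ce5 plumb
ifconfig ce5 n0-testge0 netmask + broadcast + group ipmp0 deprecated -failover up

# Wait for the group to report healthy before the next adapter
scstat -i
```

    Remember to create /etc/hostname.ce5 (and remove /etc/hostname.ge0) so the change survives a reboot.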

  • "has no ifIndex" Errors while failing a IPMP group

    Hi,
    I have a Solaris 10 x86 server with an IPMP failover group configured; the IPs are dummies:
    [root@vm2:/]# cat /etc/hostname.e1000g2
    vm2 netmask + broadcast + group sc_ipmp0 up \
    addif 11.0.0.110 deprecated -failover netmask + broadcast + up
    [root@vm2:/]# cat /etc/hostname.e1000g3
    11.0.0.111 deprecated group sc_ipmp0 -failover standby up
    [root@vm2:/]# uname -a
    SunOS vm2 5.10 Generic_142910-17 i86pc i386 i86pc
    [root@vm2:/]#
    e1000g2: flags=9000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER> mtu 1500 index 2
         inet 11.0.0.102 netmask ffffff00 broadcast 11.0.0.255
         groupname sc_ipmp0
         ether 8:0:27:1d:69:a9
    e1000g2:1: flags=1040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4> mtu 1500 index 2
         inet 11.0.0.105 netmask ffffff00 broadcast 11.0.0.255
    e1000g2:2: flags=1040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4> mtu 1500 index 2
         inet 11.0.0.104 netmask ffffff00 broadcast 11.0.0.255
    e1000g2:3: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
         inet 11.0.0.110 netmask ffffff00 broadcast 11.0.0.255
    e1000g3: flags=69040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,STANDBY,INACTIVE> mtu 1500 index 6
         inet 11.0.0.111 netmask ffffff00 broadcast 11.0.0.255
         groupname sc_ipmp0
         ether 8:0:27:9e:57:93
    To test it out I unplugged the e1000g2 card. The IP failover works OK for the logical interfaces, but not for the main IP (11.0.0.102). I get the following messages in dmesg:
    Sep 29 15:51:37 vm2 e1000g: [ID 801725 kern.info] NOTICE: pci8086,100e - e1000g[2] : link down
    Sep 29 15:51:37 vm2 in.mpathd[269]: [ID 215189 daemon.error] The link has gone down on e1000g2
    Sep 29 15:51:37 vm2 in.routed[390]: [ID 238047 daemon.warning] interface e1000g2 to 11.0.0.102 turned off
    Sep 29 15:51:37 vm2 in.mpathd[269]: [ID 594170 daemon.error] NIC failure detected on e1000g2 of group sc_ipmp0
    Sep 29 15:51:37 vm2 in.mpathd[269]: [ID 832587 daemon.error] Successfully failed over from NIC e1000g2 to NIC e1000g3
    Sep 29 15:51:37 vm2 in.routed[390]: [ID 970160 daemon.notice] unable to get interface flags for e1000g2:1: No such device or address
    Sep 29 15:51:37 vm2 in.routed[390]: [ID 472501 daemon.notice] e1000g2:1 has no ifIndex: No such device or address
    Sep 29 15:51:37 vm2 in.routed[390]: [ID 970160 daemon.notice] unable to get interface flags for e1000g2:2: No such device or address
    Sep 29 15:51:37 vm2 in.routed[390]: [ID 472501 daemon.notice] e1000g2:2 has no ifIndex: No such device or address
    ifconfig -a output of the failed device:
    e1000g2: flags=19000803<UP,BROADCAST,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 1500 index 2
         inet 11.0.0.102 netmask ffffff00 broadcast 11.0.0.255
         groupname sc_ipmp0
         ether 8:0:27:1d:69:a9
    e1000g2:3: flags=19040803<UP,BROADCAST,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,FAILED> mtu 1500 index 2
         inet 11.0.0.110 netmask ffffff00 broadcast 11.0.0.255
    e1000g3: flags=29040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,STANDBY> mtu 1500 index 6
         inet 11.0.0.111 netmask ffffff00 broadcast 11.0.0.255
         groupname sc_ipmp0
         ether 8:0:27:9e:57:93
    e1000g3:1: flags=21040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,STANDBY> mtu 1500 index 6
         inet 11.0.0.105 netmask ffffff00 broadcast 11.0.0.255
    e1000g3:2: flags=21040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,STANDBY> mtu 1500 index 6
         inet 11.0.0.104 netmask ffffff00 broadcast 11.0.0.255
    What am I doing wrong? I thought the 11.0.0.102 IP should fail over to the e1000g3 interface as well.

    OK, sorry, I found out it was the -failover parameter on e1000g2. If only I had read a bit more...
    Thanks!
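    For anyone landing here with the same symptom: the stray flag can also be cleared live, without editing the file (a sketch; the `failover` modifier simply undoes `-failover` on the data address):

```
# Allow the data address on e1000g2 to fail over again
ifconfig e1000g2 failover

# Confirm NOFAILOVER is gone from the flags line
ifconfig e1000g2
```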

  • Unexpected behavior: Solaris10 , vlan , ipmp, non-global zones

    I've configured a system with several non-global zones.
    Each of them has IP connectivity via a separate VLAN (one VLAN per non-global zone). The VLANs are established by the global zone. They are additionally brought under control of IPMP.
    I followed the instructions described at:
    http://forum.sun.com/thread.jspa?threadID=21225&messageID=59653#59653
    to create the defaultrouters for the non-global zones.
    In addition to that, I've created the default route for the 2nd IPMP interface (to keep the route in the non-global zone in case of an IPMP failover), i.e.:
    route add default 172.16.3.1 -ifp ce1222000
    route add default 172.16.3.1 -ifp ce1222002
    Furthermore, I've put 172.16.3.1 in the /etc/defaultrouter of the global zone, to ensure it will be the 1st entry in the routing table (because it's the default router for the global zone).
    Here is the unexpected part:
    I tried to reach an IP target outside the configured subnets, say 172.16.1.3, via ICMP. The router 172.16.3.1 knows the proper route to get there. The first tries (I can't remember the exact number) went out through ce1222000 and the associated ICMP replies travelled back through ce1222000. But suddenly the outgoing interface changed to ce1322000 or ce1122000! The default routers configured on those VLANs are not aware of 172.16.1.3 (172.16.1.0/24), so there was no answer. The default routes seemed to be cycled between the configured ones.
    Furthermore, the connection from the outside to the non-global zones (which have only 1 default router configured: the one of the VLAN the non-global zone belongs to) was broken intermittently.
    So, how do I get the combination of VLANs, IPMP, different default routers, and non-global zones running?
    Got the following config visible in the global zone:
    (the 172.13.x.y are sc3.1u4 priv. interconnect)
    netstat -rn
    Routing Table: IPv4
      Destination           Gateway           Flags  Ref   Use   Interface
    172.31.193.1         127.0.0.1            UH        1      0  lo0
    172.16.19.0          172.16.19.6          U         1   4474  ce1322000
    172.16.19.0          172.16.19.6          U         1      0  ce1322000:1
    172.16.19.0          172.16.19.6          U         1   1791  ce1322002
    172.31.1.0           172.31.1.2           U         1 271194  ce5
    172.31.0.128         172.31.0.130         U         1 271158  ce1
    172.16.11.0          172.16.11.6          U         1   8715  ce1122000
    172.16.11.0          172.16.11.6          U         1      0  ce1122000:1
    172.16.11.0          172.16.11.6          U         1   7398  ce1122002
    172.16.3.0           172.16.3.6           U         1   4888  ce1222000
    172.16.3.0           172.16.3.6           U         1      0  ce1222000:1
    172.16.3.0           172.16.3.6           U         1   4236  ce1222002
    172.16.27.0          172.16.27.6          U         1      0  ce1411000
    172.16.27.0          172.16.27.6          U         1      0  ce1411000:1
    172.16.27.0          172.16.27.6          U         1      0  ce1411002
    192.168.0.0          192.168.0.62         U         1  24469  ce3
    172.31.193.0         172.31.193.2         U         1    651  clprivnet0
    172.16.11.0          172.16.11.6          U         1      0  ce1122002:1
    224.0.0.0            192.168.0.62         U         1      0  ce3
    default              172.16.3.1           UG        1   1454
    default              172.16.19.1          UG        1      0  ce1322000
    default              172.16.19.1          UG        1      0  ce1322002
    default              172.16.11.1          UG        1      0  ce1122000
    default              172.16.11.1          UG        1      0  ce1122002
    default              172.16.3.1           UG        1      0  ce1222000
    default              172.16.3.1           UG        1      0  ce1222002
    127.0.0.1            127.0.0.1            UH        41048047  lo
    #ifconfig -a
    lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232
    index 1
            inet 127.0.0.1 netmask ff000000
    lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232
    index 1
            zone Z-BTO1-1
            inet 127.0.0.1 netmask ff000000
    lo0:2: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232
    index 1
            zone Z-BTO1-2
            inet 127.0.0.1 netmask ff000000
    lo0:3: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232
    index 1
            zone Z-ITR1-1
            inet 127.0.0.1 netmask ff000000
    lo0:4: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232
    index 1
            zone Z-TDN1-1
            inet 127.0.0.1 netmask ff000000
    lo0:5: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232
    index 1
            zone Z-DRB1-1
            inet 127.0.0.1 netmask ff000000
    ce1: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500
    index 10
            inet 172.31.0.130 netmask ffffff00 broadcast 172.31.0.255
            ether 0:3:ba:f:63:95
    ce3: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 8
            inet 192.168.0.62 netmask ffffff00 broadcast 192.168.0.255
            groupname ipmp0
            ether 0:3:ba:f:68:1
    ce5: flags=1008843<UP,BROADCAST,RUNNING,MULTICAST,PRIVATE,IPv4> mtu 1500
    index 9
            inet 172.31.1.2 netmask ffffff00 broadcast 172.31.1.127
            ether 0:3:ba:d5:b1:44
    ce1122000: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500
    index 2
            inet 172.16.11.6 netmask ffffff00 broadcast 172.16.11.127
            groupname ipmp2
            ether 0:3:ba:f:63:94
    ce1122000:1:
    flags=209040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,CoS>
    mtu 1500 index 2
            inet 172.16.11.7 netmask ffffff00 broadcast 172.16.11.127
    ce1122002:
    flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu
    1500 index 3
            inet 172.16.11.8 netmask ffffff00 broadcast 172.16.11.127
            groupname ipmp2
            ether 0:3:ba:f:68:0
    ce1122002:1: flags=1040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4>
    mtu 1500 index 3
            inet 172.16.11.10 netmask ffffff00 broadcast 172.16.11.255
    ce1122002:2: flags=1040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4>
    mtu 1500 index 3
            zone Z-ITR1-1
            inet 172.16.11.9 netmask ffffff00 broadcast 172.16.11.255
    ce1222000: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500
    index 4
            inet 172.16.3.6 netmask ffffff00 broadcast 172.16.3.127
            groupname ipmp3
            ether 0:3:ba:f:63:94
    ce1222000:1:
    flags=209040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,CoS>
    mtu 1500 index 4
            inet 172.16.3.7 netmask ffffff00 broadcast 172.16.3.127
    ce1222002:
    flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu
    1500 index 5
            inet 172.16.3.8 netmask ffffff00 broadcast 172.16.3.127
            groupname ipmp3
            ether 0:3:ba:f:68:0
    ce1222002:1: flags=1040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4>
    mtu 1500 index 5
            zone Z-BTO1-1
            inet 172.16.3.9 netmask ffffff00 broadcast 172.16.3.255
    ce1222002:2: flags=1040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4>
    mtu 1500 index 5
            zone Z-BTO1-2
            inet 172.16.3.10 netmask ffffff00 broadcast 172.16.3.255
    ce1322000: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500
    index 6
            inet 172.16.19.6 netmask ffffff00 broadcast 172.16.19.127
            groupname ipmp1
            ether 0:3:ba:f:63:94
    ce1322000:1:
    flags=209040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,CoS>
    mtu 1500 index 6
            inet 172.16.19.7 netmask ffffff00 broadcast 172.16.19.127
    ce1322002:
    flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu
    1500 index 7
            inet 172.16.19.8 netmask ffffff00 broadcast 172.16.19.127
            groupname ipmp1
            ether 0:3:ba:f:68:0
    ce1322002:1: flags=1040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4>
    mtu 1500 index 7
            zone Z-TDN1-1
            inet 172.16.19.9 netmask ffffff00 broadcast 172.16.19.255
    ce1411000: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500
    index 12
            inet 172.16.27.6 netmask ffffff00 broadcast 172.16.27.255
            groupname ipmp4
            ether 0:3:ba:f:63:94
    ce1411000:1:
    flags=209040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,CoS>
    mtu 1500 index 12
            inet 172.16.27.7 netmask ffffff00 broadcast 172.16.27.255
    ce1411002:
    flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu
    1500 index 13
            inet 172.16.27.8 netmask ffffff00 broadcast 172.16.27.255
            groupname ipmp4
            ether 0:3:ba:f:68:0
    ce1411002:1: flags=1040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4>
    mtu 1500 index 13
            zone Z-DRB1-1
            inet 172.16.27.9 netmask ffffff00 broadcast 172.16.27.255
    clprivnet0:
    flags=1009843<UP,BROADCAST,RUNNING,MULTICAST,MULTI_BCAST,PRIVATE,IPv4> mtu
    1500 index 11
            inet 172.31.193.2 netmask ffffff00 broadcast 172.31.193.255
            ether 0:0:0:0:0:2
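    As a diagnostic step for the cycling: Solaris load-balances across equal default routes, so temporarily removing the defaults via routers that cannot reach 172.16.1.0/24 pins outbound traffic to the intended gateway. A sketch (gateway addresses from the routing table above; `-ifp` may be needed to target one specific entry, and this sacrifices the per-VLAN defaults while it is in place):

```
# Drop the default routes via gateways that don't know the target net
route delete default 172.16.19.1
route delete default 172.16.11.1

# Verify what's left
netstat -rn | grep default
```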

  • IPMP with VLANs

    Hi All
    Am trying to get my head around the new network config on Solaris 11.
    I have 3 interfaces which access 2 VLANs (let's say 1 & 2).
    For each VLAN, I wish to setup an IPMP failover connection, so that's 2 IPMP groups per NIC.
    Do I have to create a Virtual NIC for each VLAN and add them to the IPMP groups respectively?
    I'm attempting to understand the order of things here, since it also seems possible to add a 'blank' NIC to an IPMP group and still assign an IP address to it.
    If anyone has any advice regarding the correct order (not a hand holding exercise just a point in the right direction) to set this up I'd be most grateful. (:
    Thanks
    John
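    On Solaris 11 the usual order is: VLAN links first, then IP interfaces over them, then the IPMP group on top. A sketch under assumed names (net0/net1 as the physical links, VLAN ID 2, and the address are placeholders; repeat the same pattern for the other VLAN):

```
# One VLAN link per physical NIC
dladm create-vlan -l net0 -v 2 vlan2_0
dladm create-vlan -l net1 -v 2 vlan2_1

# IP interfaces over the VLAN links
ipadm create-ip vlan2_0
ipadm create-ip vlan2_1

# Group them, then put the data address on the group
ipadm create-ipmp -i vlan2_0 -i vlan2_1 ipmp_v2
ipadm create-addr -T static -a 192.168.2.10/24 ipmp_v2/v4
```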


  • IPMP svc command

    Hi, I am thinking of changing the config of a live system which has 2 IPs configured. I'll leave only one and create an IPMP failover group, so that when it fails over the standby takes the same IP as the primary interface.
    This is the config:
    bge0
    <hostname> netmask + broadcast + group <name> up
    bge1
    group <name> failover up
    I would like to know whether running this command to restart the networking service would drop the connection at all:
    svcadm restart svc:/network/physical:default
    PS: Solaris 10 x86 11/06. I could do the same on a SPARC machine running Solaris 10.
    Edited by: wb on Apr 24, 2008 4:24 PM
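    In my experience, restarting svc:/network/physical re-plumbs the interfaces, so it will briefly interrupt traffic. If the goal is only to move addresses between group members without a restart, the standard Solaris 10 IPMP tools can do it online; a sketch, assuming the group is already configured (interface names from the post):

    ```shell
    # Offline bge0: its addresses fail over to bge1 with no service restart
    if_mpadm -d bge0

    # ... perform maintenance ...

    # Bring bge0 back; addresses fail back according to in.mpathd's
    # FAILBACK setting in /etc/default/mpathd
    if_mpadm -r bge0
    ```

    This avoids touching svc:/network/physical at all on the live system.
    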

    I have a router 2901/K9 with IOS 15.2(3)T, but I do not have the command:
    BPT-DE-01#sh ver
    Cisco IOS Software, C2900 Software (C2900-UNIVERSALK9-M), Version 15.2(3)T, RELEASE SOFTWARE (fc1)
    Technical Support: http://www.cisco.com/techsupport
    Copyright (c) 1986-2012 by Cisco Systems, Inc.
    Compiled Fri 23-Mar-12 16:57 by prod_rel_team
    ROM: System Bootstrap, Version 15.0(1r)M15, RELEASE SOFTWARE (fc1)
    BPT-DE-01 uptime is 4 minutes
    System returned to ROM by reload at 16:42:54 UTC Tue Sep 11 2012
    System image file is "flash:c2900-universalk9-mz.SPA.152-3.T.bin"
    Last reload type: Normal Reload
    Last reload reason: Reload Command
    This product contains cryptographic features and is subject to United
    States and local country laws governing import, export, transfer and
    use. Delivery of Cisco cryptographic products does not imply
    third-party authority to import, export, distribute or use encryption.
    Importers, exporters, distributors and users are responsible for
    compliance with U.S. and local country laws. By using this product you
    agree to comply with applicable laws and regulations. If you are unable
    to comply with U.S. and local laws, return this product immediately.
    A summary of U.S. laws governing Cisco cryptographic products may be found at:
    http://www.cisco.com/wwl/export/crypto/tool/stqrg.html
    If you require further assistance please contact us by sending email to
    [email protected]
    BPT-DE-01#
    BPT-DE-01#
    BPT-DE-01#
    BPT-DE-01#conf t
    Enter configuration commands, one per line.  End with CNTL/Z.
    BPT-DE-01(config)#web
    BPT-DE-01(config)#webvpn in
    BPT-DE-01(config)#webvpn in?
    % Unrecognized command
    BPT-DE-01(config)#webvpn ?
      cef         Enable data process in CEF path
      context     Specify webvpn context
      gateway     Virtual Gateway configuration
      sslvpn-vif  SSLVPN Virtual Interface commands

  • Storage issues during live clone in Server 2012 R2

    We just set up a new 2012 R2 cluster and we are having trouble with live cloning. I have seen it work in our environment, but it usually fails.
    We get this error in our cluster events:
    Cluster Shared Volume 'Volume8' ('Volume8') has entered a paused state because of '(c0000435)'. All I/O will temporarily be queued until a path to the volume is reestablished.
    and the clone fails with:
    Error (2916)
    VMM is unable to complete the request. The connection to the agent ServerName was lost.
    WinRM: URL: [http://ServerPath], Verb: [INVOKE], Method: [GetError], Resource: [http://schemas.microsoft.com/wbem/wsman/1/wmi/root/microsoft/bits/BitsClientJob?JobId={DC6D9530-26F1-4F95-A1AB-0197B1406F98}]
    Not found (404) (0x80190194)
    The storage is a Dell EqualLogic PS6500 and it's connected to two S55 Force10 48-port switches. The client side of the network is hooked to two Cisco 2960G 48-port switches.
    I can't find any information on error c0000435. Has anyone heard of this issue?

    Hi,
    I am Chetan Savade from the Symantec Technical Support Team.
    A few issues have been reported with older versions of SEP. I would recommend testing the connection using the latest version of SEP; SEP 12.1 RU4 MP1b is the latest version.
    Reported issues:
    Cluster environment does not fail over
    Fix ID: 2731793
    Symptom: A cluster environment does not fail over when Symantec Endpoint Protection client is installed due to inability to unload drivers.
    Solution: Modified a driver to properly detach from a volume when the volume dismounts
    Reference:
    http://www.symantec.com/docs/TECH199676
    Cluster is unable to fail over with AutoProtect enabled
    Fix ID: 3246552
    Symptom:  With AutoProtect enabled, an active cluster node cannot fail over and hangs.
    Solution: Corrected a delay in the AutoProtect volume dismount that resulted in cluster failover failures
    http://www.symantec.com/docs/TECH211972
    Best Regards,
    Chetan

  • Oracle 10g CRS autorecovery from network failures - Solaris with IPMP

    Hi all,
    Just wondering if anyone has experience with a setup similar to mine. Let me first apologise for the lengthy introduction that follows >.<
    A quick run-down of my implementation: Sun SPARC Solaris 10, Oracle CRS, ASM and RAC database patched to version 10.2.0.4 respectively, no third-party cluster software used for a 2-node cluster. Additionally, the SAN storage is attached directly with fiber cable to both servers, and the CRS files (OCR, voting disks) are always visible to the servers, there is no switch/hub between the server and the storage. There is IPMP configured for both the public and interconnect network devices. When performing the usual failover tests for IPMP, both the OS logs and the CRS logs show a failure detected, and a failover to the surviving network interface (on both the public and the private network devices).
    For the private interconnect, when both of the network devices are disabled (by manually disconnecting the network cables), this results in the 2nd node rebooting, and the CRS process starting, but unable to synchronize with the 1st node (which is running fine the whole time). Further, when I look at the CRS logs, it is able to correctly identify all the OCR files and voting disks. When the network connectivity is restored, both the OS and CRS logs reflect this connection has been repaired. However, the CRS logs at this point still state that node 1 (which is running fine) is down, and the 2nd node attempts to join the cluster as the master node. When I manually run the 'crsctl stop crs' and 'crsctl start crs' commands, this results in a message stating that the node is going to be rebooted to ensure cluster integrity, and the 2nd node reboots, starts the CRS daemons again at startup, and joins the cluster normally.
    For the public network, when the 2nd node is manually disconnected, the VIP is seen to not failover, and any attempts to connect to this node via the VIP result in a timeout. When connectivity is restored, as expected the OS and CRS logs acknowledge the recovery, and the VIP for node 2 automatically fails over, but the listener goes down as well. Using the 'srvctl start listener' command brings it up again, and everything is fine. During this whole process, the database instance runs fine on both nodes.
    From the case studies above, I can see that the network failures are detected by Oracle Clusterware, and a simple command run once the failure is repaired restores full functionality to the RAC database. However, is there any way to automate this recovery for the two cases stated above, so that there is no need for manual intervention by the DBAs? I was able to address case 2 (public network) with Oracle document 805969.1 (VIP does not relocate back to the original node after public network problem is resolved); is there a similar workaround for the interconnect?
    Any and all pointers would be appreciated, and again, sorry for the lengthy post.
    Edited by: NS Selvam on 16-Dec-2009 20:36
    changed some minor typos
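    I'm not aware of a supported knob that makes CRS rejoin automatically after a total interconnect loss, but the manual stop/start sequence described above can be wrapped in a watchdog run from cron. A rough, hypothetical sketch; the paths, peer name and health check are illustrative and not from any Oracle document:

    ```shell
    #!/bin/sh
    # Hypothetical watchdog: if the interconnect is reachable again but
    # clusterware still looks unhealthy, bounce CRS on this node (which,
    # as described above, may reboot the node to preserve integrity).
    CRS_HOME=/u01/app/crs          # illustrative path
    PEER=node1-priv                # illustrative peer interconnect hostname

    # Solaris ping syntax: ping host [timeout]; exit 0 when alive
    if ping "$PEER" 2 >/dev/null 2>&1; then
        if ! $CRS_HOME/bin/crsctl check crs >/dev/null 2>&1; then
            $CRS_HOME/bin/crsctl stop crs
            $CRS_HOME/bin/crsctl start crs
        fi
    fi
    ```

    Whether this is acceptable depends on your site; an unattended `crsctl stop/start` can trigger exactly the reboot you observed, so most sites prefer to alert a DBA instead.
    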

    Hi,
    I've included the shell script. I just need to run it, and I usually get output like:
    [root@rac-1 Desktop]# sh iscsi-corntab.sh
    Logging in to [iface: default, target: iqn.2010-02-23.de.sayantan-chakraborty:storage.disk1.amiens.sys1.xyz, portal: 192.168.181.10,3260]
    Login to [iface: default, target: iqn.2010-02-23.de.sayantan-chakraborty:storage.disk1.amiens.sys1.xyz, portal: 192.168.181.10,3260]: successful
    The script contains:
    iscsiadm -m node -T iqn.2010-02-23.de.sayantan-chakraborty:storage.disk1.amiens.sys1.xyz -p 192.168.181.10 -l
    iscsiadm -m node -T iqn.2010-02-23.de.sayantan-chakraborty:storage.disk1.amiens.sys1.xyz -p 192.168.181.10 --op update -n node.startup -v automatic
    (cd /dev/disk/by-path; ls -l *sayantan-chakraborty* | awk '{FS=" "; print $9 " " $10 " " $11}')
    [root@rac-1 Desktop]# (cd /dev/disk/by-path; ls -l *sayantan-chakraborty* | awk '{FS=" "; print $9 " " $10 " " $11}')
    ip-192.168.181.10:3260-iscsi-iqn.2010-02-23.de.sayantan-chakraborty:storage.disk1.amiens.sys1.xyz-lun-1 -> ../../sdc
    [root@rac-1 Desktop]#
    Can you post the output of ls /dev/iscsi? You may get something like this:
    [root@rac-1 Desktop]# ls /dev/iscsi
    xyz
    [root@rac-1 Desktop]#

  • IPMP Failure

    Hello buddies,
    We are facing a severe issue with the IPMP configuration, which was working fine until the last patch installation (the Feb patch). We have a virtual IP 192.168.1.1 and two physical IPs, 192.168.1.2 & 192.168.1.3, bound to ce0 and ce1 respectively. Let me come to the issue... All interfaces in the IPMP group are going down (three times a day), complaining that the default router is not pingable by the server through either of the two physical links. I am pretty sure that the switch where both links are connected is working fine without any outage. The following are the logs:
    Mar 15 19:54:13 serv1 in.mpathd[2095]: [ID 594170 daemon.error] NIC failure detected on ce1 of group ipmp-pub
    Mar 15 19:54:13 serv1 in.mpathd[2095]: [ID 594170 daemon.error] NIC failure detected on ce1 of group ipmp-pub
    Mar 15 19:54:13 serv1 in.mpathd[2095]: [ID 594170 daemon.error] NIC failure detected on ce1 of group ipmp-pub
    Mar 15 19:54:13 serv1 in.mpathd[2095]: [ID 832587 daemon.error] Successfully failed over from NIC ce1 to NIC ce0
    Mar 15 19:54:13 serv1 in.mpathd[2095]: [ID 832587 daemon.error] Successfully failed over from NIC ce1 to NIC ce0
    Mar 15 19:54:13 serv1 in.mpathd[2095]: [ID 832587 daemon.error] Successfully failed over from NIC ce1 to NIC ce0
    Mar 15 19:54:39 serv1 in.mpathd[2095]: [ID 168056 daemon.error] All Interfaces in group ipmp-pub have failed
    Mar 15 19:54:39 serv1 in.mpathd[2095]: [ID 168056 daemon.error] All Interfaces in group ipmp-pub have failed
    Mar 15 19:54:39 serv1 in.mpathd[2095]: [ID 168056 daemon.error] All Interfaces in group ipmp-pub have failed
    Mar 15 20:33:32 serv1 in.mpathd[2095]: [ID 620804 daemon.error] Successfully failed back to NIC ce1
    Mar 15 20:33:32 serv1 in.mpathd[2095]: [ID 620804 daemon.error] Successfully failed back to NIC ce1
    Mar 15 20:33:32 serv1 in.mpathd[2095]: [ID 620804 daemon.error] Successfully failed back to NIC ce1
    Mar 15 20:33:32 serv1 in.mpathd[2095]: [ID 299542 daemon.error] NIC repair detected on ce1 of group ipmp-pub
    Mar 15 20:33:32 serv1 in.mpathd[2095]: [ID 299542 daemon.error] NIC repair detected on ce1 of group ipmp-pub
    Mar 15 20:33:32 serv1 in.mpathd[2095]: [ID 299542 daemon.error] NIC repair detected on ce1 of group ipmp-pub
    Mar 15 20:33:32 serv1 in.mpathd[2095]: [ID 237757 daemon.error] At least 1 interface (ce1) of group ipmp-pub has repaired
    Any help would be great appreciable.
    Thanks,
    Muhammed Afsal K.S

    lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
         inet 127.0.0.1 netmask ff000000
    ce0: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
         inet 192.168.217.6 netmask ffffff00 broadcast 192.168.217.255
         groupname ipmp1
         ether 0:3:ba:b0:5d:54
    ce3: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
         inet 192.168.217.20 netmask ffffff00 broadcast 192.168.217.255
         groupname ipmp1
         ether 0:3:ba:95:5d:6e
    ce3:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 4
         inet 192.168.217.4 netmask ffffff00 broadcast 192.168.217.255
    Generally speaking,
    When I switch the floating IP from ce0 to ce3, IPMP says the ce0 MAC is "trying to be our address ....", then the ce0 test IP fails and the floating IP doesn't fail over.
    When I switch the floating IP from ce3 to ce0, IPMP says the ce3 MAC is "trying to be our address ....", then the ce0 test IP fails and the floating IP doesn't fail over.
    In my view, the floating NIC's MAC and address information may be cached in the Cisco device's RAM and not released in time.
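    If stale ARP entries on the Cisco side are the suspicion, two things are worth trying on the 4506. These are standard IOS commands; the VLAN interface name and timeout value below are illustrative, not from the thread:

    ```
    ! Shorten the ARP cache timeout on the SVI for the test
    ! (the IOS default is 14400 seconds)
    interface Vlan217
     arp timeout 60

    ! Or, as a one-off test, flush the cache right after a failover:
    ! clear arp-cache
    ```

    If the failover starts working with a short ARP timeout, that would support the cached-MAC theory; the Solaris side should also be sending gratuitous ARPs on failover, so a switch that honors them normally makes this unnecessary.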

  • IPMP failures on bge Interface

    We've been testing IPMP on Solaris SPARC hosts that also have the Apani IPSec Agent installed. It works fine on older hosts that have 'qfe' and 'le' interfaces, but our V210s and T1000s with 'bge' interfaces have a problem. If we configure an IPMP group to use, say, bge0 and bge1 (with bge0 as the primary interface), it works fine. Disconnecting bge0 causes a failover to bge1, also fine. Disconnecting bge1 causes the following errors:
    Nov 2 10:32:29 cs22 in.mpathd[146]: NIC failure detected on bge1 of group test
    Nov 2 10:32:29 cs22 in.mpathd[146]: Successfully failed over from NIC bge1 to NIC bge0
    Nov 2 10:32:37 cs2 in.mpathd[146]: All Interfaces in group test have failed
    All interfaces fail, even though bge0 is still connected and was active before disconnecting bge1. The system recovers once bge0 is reconnected. The two interfaces are physically connected to the same switch, and the hostname.bgeX files are:
    -------- hostname.bge0
    cs22 netmask + broadcast + group test up \
    addif cs21 deprecated -failover netmask + broadcast + up
    -------- hostname.bge1
    sp12 netmask + broadcast + group test up \
    addif sp16 deprecated -failover netmask + broadcast + up
    Any help would be appreciated, thanks in advance.
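    One thing worth checking: with the default configuration, in.mpathd picks its probe targets dynamically (the default router first, otherwise arbitrary on-link hosts learned from ARP), and unreliable targets can make a whole group appear failed. Probe targets can be pinned with static host routes to neighbors known to be always up; a sketch, with illustrative addresses (a host route whose gateway is the target itself marks it on-link):

    ```shell
    # Pin in.mpathd probe targets to known-good neighbors
    route add -host 63.192.77.9 63.192.77.9 -static
    route add -host 63.192.77.1 63.192.77.1 -static
    ```

    If only pinned targets exist, in.mpathd probes only those, which makes "all interfaces failed" events much easier to interpret.
    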

    Hello again,
    When gathering data for the previous reply, I also noticed that the default route had not been set. We usually do specify that, so I added it to the configuration. But the host had found the correct router (63.192.77.9) previously anyway, and specifying it did not change the problem symptoms. Here's the other requested info:
    -> netstat -rn
    Routing Table: IPv4
      Destination           Gateway           Flags  Ref   Use   Interface
    63.192.77.0          63.192.77.12         U         1      5  bge1
    63.192.77.0          63.192.77.22         U         1      1  bge0
    63.192.77.0          63.192.77.22         U         1      0  bge0:1
    63.192.77.0          63.192.77.12         U         1      0  bge1:1
    224.0.0.0            63.192.77.22         U         1      0  bge0
    default              63.192.77.9          UG        1      0 
    127.0.0.1            127.0.0.1            UH        7     93  lo0
    -> routeadm
                  Configuration   Current              Current
                         Option   Configuration        System State
                IPv4 forwarding   disabled             disabled
                   IPv4 routing   default (disabled)   disabled
                IPv6 forwarding   disabled             disabled
                   IPv6 routing   disabled             disabled
            IPv4 routing daemon   "/usr/sbin/in.routed"
       IPv4 routing daemon args   ""
       IPv4 routing daemon stop   "kill -TERM `cat /var/tmp/in.routed.pid`"
            IPv6 routing daemon   "/usr/lib/inet/in.ripngd"
       IPv6 routing daemon args   "-s"
       IPv6 routing daemon stop   "kill -TERM `cat /var/tmp/in.ripngd.pid`"
    -> arp -an
    Net to Media Table: IPv4
    Device   IP Address               Mask      Flags   Phys Addr
    bge1   63.192.77.1          255.255.255.255       00:03:ba:c0:77:75
    bge0   63.192.77.9          255.255.255.255       00:16:46:f1:b5:c2
    bge1   63.192.77.9          255.255.255.255       00:16:46:f1:b5:c2
    bge1   63.192.77.186        255.255.255.255       00:c0:4f:60:6a:ab
    bge0   63.192.77.186        255.255.255.255       00:c0:4f:60:6a:ab
    bge1   63.192.77.191        255.255.255.255       00:0c:f1:bf:1d:01
    bge0   63.192.77.191        255.255.255.255       00:0c:f1:bf:1d:01
    bge1   63.192.77.169        255.255.255.255       00:0c:f1:bf:1c:92
    bge0   63.192.77.169        255.255.255.255       00:0c:f1:bf:1c:92
    bge1   63.192.77.175        255.255.255.255       00:c0:4f:60:68:64
    bge0   63.192.77.175        255.255.255.255       00:c0:4f:60:68:64
    bge1   63.192.77.144        255.255.255.255       00:c0:4f:60:68:94
    bge0   63.192.77.144        255.255.255.255       00:c0:4f:60:68:94
    bge1   63.192.77.150        255.255.255.255       00:c0:4f:60:6a:70
    bge0   63.192.77.150        255.255.255.255       00:c0:4f:60:6a:70
    bge0   63.192.77.130        255.255.255.255       00:0c:f1:bf:1d:1f
    bge1   63.192.77.130        255.255.255.255       00:0c:f1:bf:1d:1f
    bge1   63.192.77.128        255.255.255.255       00:0c:f1:bf:1c:65
    bge0   63.192.77.128        255.255.255.255       00:0c:f1:bf:1c:65
    bge1   63.192.77.242        255.255.255.255       00:0d:56:0b:eb:2a
    bge0   63.192.77.242        255.255.255.255       00:0d:56:0b:eb:2a
    bge1   63.192.77.243        255.255.255.255       00:0f:1f:91:c1:9b
    bge0   63.192.77.243        255.255.255.255       00:0f:1f:91:c1:9b
    bge1   63.192.77.240        255.255.255.255       00:13:72:17:cb:13
    bge0   63.192.77.240        255.255.255.255       00:13:72:17:cb:13
    bge1   63.192.77.247        255.255.255.255       00:c0:4f:60:6a:e6
    bge0   63.192.77.247        255.255.255.255       00:c0:4f:60:6a:e6
    bge1   63.192.77.224        255.255.255.255       00:09:6b:2e:61:dd
    bge0   63.192.77.224        255.255.255.255       00:09:6b:2e:61:dd
    bge1   63.192.77.225        255.255.255.255       00:11:11:c4:9c:eb
    bge0   63.192.77.225        255.255.255.255       00:11:11:c4:9c:eb
    bge1   63.192.77.236        255.255.255.255       00:03:ba:eb:17:6d
    bge0   63.192.77.236        255.255.255.255       00:03:ba:eb:17:6d
    bge1   63.192.77.210        255.255.255.255       00:11:11:b1:2b:6e
    bge0   63.192.77.210        255.255.255.255       00:11:11:b1:2b:6e
    bge1   63.192.77.222        255.255.255.255       00:30:6e:08:ed:3a
    bge0   63.192.77.222        255.255.255.255       00:30:6e:08:ed:3a
    bge1   63.192.77.193        255.255.255.255       00:13:72:23:32:aa
    bge0   63.192.77.193        255.255.255.255       00:13:72:23:32:aa
    bge1   63.192.77.207        255.255.255.255       00:0c:f1:b6:26:aa
    bge0   63.192.77.207        255.255.255.255       00:0c:f1:b6:26:aa
    bge1   63.192.77.204        255.255.255.255       00:c0:4f:60:68:5b
    bge0   63.192.77.204        255.255.255.255       00:c0:4f:60:68:5b
    bge1   63.192.77.48         255.255.255.255       00:0a:95:99:e4:40
    bge0   63.192.77.48         255.255.255.255       00:0a:95:99:e4:40
    bge0   63.192.77.49         255.255.255.255       00:03:93:90:52:f6
    bge1   63.192.77.61         255.255.255.255       00:c0:4f:60:6a:75
    bge0   63.192.77.61         255.255.255.255       00:c0:4f:60:6a:75
    bge1   63.192.77.35         255.255.255.255       00:30:6e:49:41:50
    bge0   63.192.77.35         255.255.255.255       00:30:6e:49:41:50
    bge1   63.192.77.36         255.255.255.255       00:16:35:3e:7d:0a
    bge0   63.192.77.36         255.255.255.255       00:16:35:3e:7d:0a
    bge0   63.192.77.42         255.255.255.255       00:11:11:c4:9d:05
    bge1   63.192.77.42         255.255.255.255       00:11:11:c4:9d:05
    bge1   63.192.77.40         255.255.255.255       00:0c:f1:bf:1f:8d
    bge0   63.192.77.40         255.255.255.255       00:0c:f1:bf:1f:8d
    bge1   63.192.77.41         255.255.255.255       00:0c:f1:bf:1d:10
    bge0   63.192.77.41         255.255.255.255       00:0c:f1:bf:1d:10
    bge0   63.192.77.19         255.255.255.255       08:00:20:f0:ea:e4
    bge1   63.192.77.19         255.255.255.255       08:00:20:f0:ea:e4
    bge1   63.192.77.16         255.255.255.255 SP    00:14:4f:2a:9b:83
    bge0   63.192.77.22         255.255.255.255 SP    00:14:4f:2a:9b:82
    bge0   63.192.77.23         255.255.255.255       00:09:6b:3e:2b:82
    bge1   63.192.77.23         255.255.255.255       00:09:6b:3e:2b:82
    bge0   63.192.77.21         255.255.255.255 SP    00:14:4f:2a:9b:82
    bge1   63.192.77.29         255.255.255.255       00:09:6b:2e:46:51
    bge0   63.192.77.29         255.255.255.255       00:09:6b:2e:46:51
    bge0   63.192.77.1          255.255.255.255       00:03:ba:c0:77:75
    bge1   63.192.77.12         255.255.255.255 SP    00:14:4f:2a:9b:83
    bge0   63.192.77.115        255.255.255.255       00:0c:f1:bf:1c:e6
    bge1   63.192.77.115        255.255.255.255       00:0c:f1:bf:1c:e6
    bge1   63.192.77.122        255.255.255.255       00:10:83:f9:34:d4
    bge0   63.192.77.122        255.255.255.255       00:10:83:f9:34:d4
    bge1   63.192.77.125        255.255.255.255       00:0f:1f:91:bf:7d
    bge0   63.192.77.125        255.255.255.255       00:0f:1f:91:bf:7d
    bge1   63.192.77.99         255.255.255.255       00:0c:f1:bf:1a:52
    bge0   63.192.77.99         255.255.255.255       00:0c:f1:bf:1a:52
    bge1   63.192.77.100        255.255.255.255       00:0c:f1:b6:26:b4
    bge0   63.192.77.100        255.255.255.255       00:0c:f1:b6:26:b4
    bge1   63.192.77.101        255.255.255.255       00:0c:f1:bf:1c:fe
    bge0   63.192.77.101        255.255.255.255       00:0c:f1:bf:1c:fe
    bge1   63.192.77.107        255.255.255.255       00:0d:56:14:48:4d
    bge0   63.192.77.107        255.255.255.255       00:0d:56:14:48:4d
    bge1   63.192.77.110        255.255.255.255       00:c0:4f:60:6a:44
    bge0   63.192.77.110        255.255.255.255       00:c0:4f:60:6a:44
    bge1   63.192.77.108        255.255.255.255       00:14:bf:31:ec:e2
    bge0   63.192.77.108        255.255.255.255       00:14:bf:31:ec:e2
    bge0   63.192.77.80         255.255.255.255       00:16:cb:a6:5e:3d
    bge1   63.192.77.80         255.255.255.255       00:16:cb:a6:5e:3d
    bge1   63.192.77.92         255.255.255.255       00:40:63:d3:8c:46
    bge0   63.192.77.92         255.255.255.255       00:40:63:d3:8c:46
    bge1   63.192.77.68         255.255.255.255       00:0c:f1:b6:27:10
    bge0   63.192.77.68         255.255.255.255       00:0c:f1:b6:27:10
    bge1   63.192.77.69         255.255.255.255       00:13:72:17:ca:4a
    bge0   63.192.77.69         255.255.255.255       00:13:72:17:ca:4a
    bge1   63.192.77.73         255.255.255.255       00:03:93:d1:db:cc
    bge0   63.192.77.73         255.255.255.255       00:03:93:d1:db:cc
    bge1   63.192.77.77         255.255.255.255       00:30:65:a8:22:bc
    bge0   63.192.77.77         255.255.255.255       00:30:65:a8:22:bc
    bge1   224.0.0.0            240.0.0.0       SM    01:00:5e:00:00:00
    bge0   224.0.0.0            240.0.0.0       SM    01:00:5e:00:00:00
    -> ps -aef
         UID   PID  PPID   C    STIME TTY         TIME CMD
        root     0     0   0 15:11:12 ?           0:11 sched
        root     1     0   0 15:11:13 ?           0:00 /sbin/init
        root     2     0   0 15:11:13 ?           0:00 pageout
        root     3     0   0 15:11:13 ?           0:00 fsflush
      daemon   196     1   0 15:11:37 ?           0:00 /usr/sbin/rpcbind
        root     7     1   0 15:11:15 ?           0:10 /lib/svc/bin/svc.startd
        root     9     1   0 15:11:16 ?           0:16 /lib/svc/bin/svc.configd
        root   256     1   0 15:11:40 ?           0:00 /usr/sbin/cron
        root   335     1   0 15:11:49 ?           0:00 /usr/sbin/syslogd
        root   113     1   0 15:11:33 ?           0:00 /usr/sbin/nscd -S passwd,yes
        root   726   691   0 15:16:16 pts/1       0:00 ps -aef
      daemon   201     1   0 15:11:37 ?           0:00 /usr/lib/nfs/statd
        root   200     1   0 15:11:37 ?           0:00 /usr/sbin/keyserv
        root   192     1   0 15:11:36 ?           0:01 /opt/apani/uagent/nlagent
      daemon    86     1   0 15:11:26 ?           0:00 /usr/lib/crypto/kcfd
        root   152     1   0 15:11:35 ?           0:00 /usr/lib/inet/in.mpathd -a
        root   212     7   0 15:11:38 ?           0:00 /usr/lib/saf/sac -t 300
        root    89     1   0 15:11:26 ?           0:00 /usr/lib/picl/picld
      daemon   247     1   0 15:11:40 ?           0:00 /usr/lib/nfs/nfs4cbd
        root   102     1   0 15:11:28 ?           0:00 /usr/lib/power/powerd
        root    98     1   0 15:11:27 ?           0:00 /usr/lib/sysevent/syseventd
        root   215     1   0 15:11:38 ?           0:00 /usr/sbin/nis_cachemgr
      daemon   214     1   0 15:11:38 ?           0:00 /usr/lib/nfs/lockd
        root   213     1   0 15:11:38 ?           0:00 /usr/lib/utmpd
        root   217     7   0 15:11:38 console     0:00 -sh
        root   223   192   0 15:11:39 ?           0:00 inm -p9165
        root   222   212   0 15:11:39 ?           0:00 /usr/lib/saf/ttymon
      daemon   255     1   0 15:11:40 ?           0:00 /usr/lib/nfs/nfsmapid
        root   399   397   0 15:11:52 ?           0:00 /usr/sadm/lib/smc/bin/smcboot
        root   252     1   0 15:11:40 ?           0:04 /usr/lib/inet/inetd start
        root   398   397   0 15:11:52 ?           0:00 /usr/sadm/lib/smc/bin/smcboot
        root   317     1   0 15:11:48 ?           0:00 /usr/lib/autofs/automountd
        root   359     1   0 15:11:50 ?           0:00 /usr/lib/sendmail -bd -q15m
        root   448   447   0 15:11:53 ?           0:00 /usr/lib/locale/ja/wnn/jserver_m
        root   351     1   0 15:11:50 ?           0:02 /usr/lib/fm/fmd/fmd
        root   674   252   0 15:12:14 ?           0:00 /usr/sbin/in.telnetd
        root   347     1   0 15:11:50 ?           0:00 /usr/lib/ssh/sshd
       smmsp   360     1   0 15:11:50 ?           0:00 /usr/lib/sendmail -Ac -q15m
        root   461     1   0 15:11:53 ?           0:00 /usr/lib/locale/ja/atokserver/atokmngdaemon
        root   397     1   0 15:11:52 ?           0:00 /usr/sadm/lib/smc/bin/smcboot
        root   468   459   0 15:11:53 ?           0:00 htt_server -port 9010 -syslog -message_locale C
        root   441     1   0 15:11:53 ?           0:00 /usr/lib/locale/ja/wnn/dpkeyserv
        root   447     1   0 15:11:53 ?           0:00 /usr/lib/locale/ja/wnn/jserver
        root   459     1   0 15:11:53 ?           0:00 /usr/lib/im/htt -port 9010 -syslog -message_locale C
        root   512     1   0 15:11:55 ?           0:00 /usr/lib/snmp/snmpdx -y -c /etc/snmp/conf
        root   520     1   0 15:11:56 ?           0:00 /usr/lib/dmi/dmispd
        root   528     1   0 15:11:56 ?           0:00 /usr/sbin/vold
        root   521     1   0 15:11:56 ?           0:00 /usr/lib/dmi/snmpXdmid -s cstoc77022
        root   511     1   0 15:11:55 ?           0:00 /usr/dt/bin/dtlogin -daemon
        root   691   677   0 15:12:18 pts/1       0:00 bash
        root   677   674   0 15:12:14 pts/1       0:00 -sh
        root   585     1   0 15:11:57 ?           0:00 /usr/sfw/sbin/snmpd

  • RE: Hard Failures, KeepAlive, and Failover --Follow-up

    Hi,
    It's a really challenging question. However, what do you want to do after
    the network crash? Fail over, or just stop the service? Should we assume
    that when the network is down, your name service is down too?
    One idea is to use an externalconnection to "listen" for your external non-Forte
    alarm, and then do "whatever" once you receive the alarm instead of letting the
    "logical connection" time out or hang.
    Regards,
    Peter Sham.
    -----Original Message-----
    From: Michael Lee [SMTP:[email protected]]
    Sent: Wednesday, June 16, 1999 12:44 AM
    To: [email protected]
    Subject: Hard Failures, KeepAlive, and Failover -- Follow-up
    I've gotten a handful of responses to my original post, and the suggested
    solutions are all variations on the same theme -- periodically ping remote
    nodes/partitions and then react when the node/partition goes down. In
    other circumstance this would work, but unless I'm missing something this
    solution doesn't solve the problem I'm running into.
    Some background...
    When a connection is set up between partitions on two different nodes,
    Forte is effectively establishing two connections: a "physical
    connection"
    over TCP/IP between two ports and a "logical connection" between the two
    partitions (running on top of the physical connection). Once a connection
    is established between two partitions Forte assumes the logical connection
    is valid until one of two things happen:
    1) The logical connection is broken (by shutting down a partition from
    Econsole/Escript, by killing a node manager, by terminating the ftexec,
    etc.)
    2) Forte detects that the physical connection is broken (via its KeepAlive
    functionality).
    If a physical connection is broken (via a cut cable or power-off
    condition), and Forte has not yet detected the situation (via a KeepAlive
    failure), the logical connection is still valid and Forte will still allow
    method calls on the remote partition. In effect, Forte thinks the remote
    partition is still up and running. In this situation, any method calls
    made after the physical connection has been broken will simply hang. No
    exceptions are generated and failover does not occur.
    However, once a KeepAlive failure is detected all is made right.
    Unfortunately, the lowest-bound latency of KeepAlive is greater than one
    second, and we need to detect and react to hard failures in the 250-500ms
    range. Using technology outside of Forte we are able to detect the hard
    failures within the required times, but we haven't been able to get Forte
    to react to this "outside" knowledge. Here's why:
    Since Forte has not yet detected a KeepAlive failure, the logical
    connection to the remote partition is still "valid". Although there are a
    number of mechanisms that would allow a logical connection to be broken,
    they all assume a valid physical connection -- which, of course, we don't
    have!
    It appears I'm in a "Catch-22" situation: In order to break a logical
    connection between partitions, I need a valid physical connection. But
    the
    reason I'm trying to break the logical connection in the first place is
    that I know (but Forte doesn't yet know) that the physical connection has
    been broken.
    If anyone knows a way around this Catch-22, please let me know.
    Mike
    To unsubscribe, email '[email protected]' with
    'unsubscribe forte-users' as the body of the message.
    Searchable thread archive <URL:http://pinehurst.sageit.com/listarchive/>

    Make sure you choose the right format, and as far as partitioning is concerned, you have to select at least one partition, which will be the entire drive.
