Retaining state on failover of SO
We have a ServiceObject that we need to mark for failover. The data that
resides on this SO needs to be re-populated on the backup copy of the SO
when it becomes active. My question is how are others handling the
duplication of the data on the primary SO so that it is available to
populate the backup SO once that backup SO takes over? Writing to a file
and then reading from that file? Maintaining a mirror object containing
duplicate data? If this is the case then how often is this data 'synched
up'?
Your help would be greatly appreciated!!!
George Vallas
Systems Engineer
EDS Medi-Cal - Systems
3215 Prospect Park Dr.
Rancho Cordova, CA 95670
Phone: (916)636-1183
Mail: [email protected]
To unsubscribe, email '[email protected]' with
'unsubscribe forte-users' as the body of the message.
Searchable thread archive <URL:http://pinehurst.sageit.com/listarchive/>
Similar Messages
-
Resources in UNKNOWN state after failover
Hi,
I'm new to RAC and trying to understand how to bring a node back online. See crsctl output below.
I've noted that the listener, ons and the database on the db01 node are all in an UNKNOWN state. The vip state is INTERMEDIATE.
The cause was activating a virtual NIC on the db01 node. The vNIC was deactivated but not before a failover.
I see oc4j is also offline, which I believe is contributing to OEM dbconsole being unavailable.
Please advise/suggest the best way forward to remedy this.
Mr C
[oracle@db01 ~]$ $ORACLE_HOME/bin/crsctl status resource -t
NAME TARGET STATE SERVER STATE_DETAILS
Local Resources
ora.DATA.dg
ONLINE ONLINE db01
ONLINE ONLINE db02
ora.FRA.dg
ONLINE ONLINE db01
ONLINE ONLINE db02
ora.LISTENER.lsnr
ONLINE UNKNOWN db01
ONLINE ONLINE db02
ora.VOTE.dg
ONLINE ONLINE db01
ONLINE ONLINE db02
ora.asm
ONLINE ONLINE db01 Started
ONLINE ONLINE db02 Started
ora.eons
ONLINE ONLINE db01
ONLINE ONLINE db02
ora.gsd
OFFLINE OFFLINE db01
OFFLINE OFFLINE db02
ora.net1.network
ONLINE ONLINE db01
ONLINE ONLINE db02
ora.ons
ONLINE UNKNOWN db01
ONLINE ONLINE db02
ora.registry.acfs
ONLINE ONLINE db01
ONLINE ONLINE db02
Cluster Resources
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE db02
ora.LISTENER_SCAN2.lsnr
1 ONLINE UNKNOWN db01
ora.LISTENER_SCAN3.lsnr
1 ONLINE UNKNOWN db01
ora.e1oradb.db
*1 OFFLINE UNKNOWN db01*
2 ONLINE ONLINE db02 Open
ora.db01.vip
*1 ONLINE INTERMEDIATE db02 FAILED OVER*
ora.db02.vip
1 ONLINE ONLINE db02
ora.oc4j
*1 OFFLINE OFFLINE*
ora.scan1.vip
1 ONLINE ONLINE db02
ora.scan2.vip
1 ONLINE ONLINE db02
ora.scan3.vip
1 ONLINE ONLINE db02
[oracle@db01 ~]$
Thanks Sebastian,
I was a little busy so only now do I have the time to reply.
Another DBA was able to troubleshoot the root cause of the issue, which I will put here for the benefit of all.
Essentially the hosts file (see snippet below) contained multiple entries for the node name, both on the "localhost" line and also in the "Public" section. A quick edit solved the problem.
# Do not remove the following line, or various programs
# that require network functionality will fail.
#127.0.0.1 db02.placeholder.com.au db02 localhost.localdomain localhost
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
# Public
ww.xx.yy.zzz db01.placeholder.com.au db01
ww.xx.yy.zzz db02.placeholder.com.au db02
# Private
ww.xx.yy.z db01-priv.placeholder.com.au db01-priv
ww.xx.yy.z db02-priv.placeholder.com.au db02-priv
# Virtual
ww.xx.yy.zzz db01-vip.placeholder.com.au db01-vip
ww.xx.yy.zzz db02-vip.placeholder.com.au db02-vip
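A quick way to spot this kind of problem is to scan the hosts file for names that are mapped on more than one line. The following is only a minimal sketch: the function name, the sample addresses and the example.com names are made up for illustration, and the sample shows the faulty layout described above (node name on the loopback line as well as in the Public section).

```python
from collections import defaultdict

def find_duplicate_hostnames(hosts_text):
    """Return {hostname: [line numbers]} for names mapped on more than one line."""
    seen = defaultdict(list)
    for lineno, raw in enumerate(hosts_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # ignore comments and blank lines
        if not line:
            continue
        fields = line.split()
        for name in fields[1:]:  # fields[0] is the IP address
            key = name.lower()
            if lineno not in seen[key]:
                seen[key].append(lineno)
    return {name: lines for name, lines in seen.items() if len(lines) > 1}

# Hypothetical sample: db02 is listed on the loopback line and under Public.
SAMPLE_HOSTS = """\
127.0.0.1 db02.example.com db02 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.0.2.11 db01.example.com db01
192.0.2.12 db02.example.com db02
"""

duplicates = find_duplicate_hostnames(SAMPLE_HOSTS)
for name, lines in sorted(duplicates.items()):
    print(f"{name} is mapped on lines {lines}")
```

To check a real node, pass the contents of /etc/hosts instead of the sample text.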
Mr C -
Retain state when browser is refreshed
Hi I have 2 states - Login and Viewer
When app is launched, it goes to Login state.
When logged in, it switches to Viewer state
In Viewer state, when refreshing browser whole page is reloaded and Login form appears again.
I have tried to use Deeplinking, but this doesn't work.
Has anyone got an example how this can be achieved.
Thanks
You can store information in a SharedObject, and read this information in when the application loads. Use this data to determine the state your application should be in. A simplified form of the memento pattern would work well for this case.
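The idea can be sketched in a language-neutral way. This is not Flex code: it is a small Python illustration in which a JSON file stands in for the SharedObject, and the class and function names (AppState, save_state, load_state) are invented for the example. The application saves a memento whenever the state changes and restores it at startup, falling back to Login when nothing has been saved.

```python
import json
import os
import tempfile

class AppState:
    """Holds the view the application should show ("Login" or "Viewer")."""

    def __init__(self, view="Login", user=None):
        self.view = view
        self.user = user

    def create_memento(self):
        # Only plain data goes into the memento so it can be serialized.
        return {"view": self.view, "user": self.user}

    def restore(self, memento):
        self.view = memento.get("view", "Login")
        self.user = memento.get("user")

def save_state(state, path):
    """Persist the memento (stand-in for writing to a SharedObject)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(state.create_memento(), fh)

def load_state(path):
    """Rebuild the state at startup; default to Login if nothing was saved."""
    state = AppState()
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            state.restore(json.load(fh))
    return state

store = os.path.join(tempfile.mkdtemp(), "app_state.json")
first_launch = load_state(store)                          # nothing saved yet
save_state(AppState(view="Viewer", user="demo"), store)   # user logged in
after_refresh = load_state(store)                         # simulated browser refresh
print(first_launch.view, after_refresh.view)
```

In a real application the saved data should be treated as a hint only; whether the login is still valid has to be checked separately.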
-
Hi, All Forte Experts
I have 2 questions:
1. Where can I put the event loop block in a Service Object?
I tried putting it in Init to receive a timer.tick event, but the SO
hung waiting there.
How can I make the SO do something at a certain time?
2. I have an SO doing failover. Can a sleeping SO get any kind of
notification when it becomes alive after the running SO dies?
Thanks a lot for your help.
Alex
Carpe Diem, Seize the Day !
Alex Lee (Li Zhongling)
Forte, Java/CORBA Group
International Business Corporation
Bangalore 560010, India
Hi,
Thank you all a lot for your help.
For the second question, let me explain the situation. I have a failover
LockManager SO which contains a LockList. When the first running SO dies, I
want the second one to become alive and restore the state of the first one (I
mean the LockList should not be lost). I tried to do it this way:
1. As Daniel Nguyen suggested in Re:Retaining state on failover of SO
(Jan.19,1999), I put one shared object holding a LockList on the Router
partition and let the running SO always refresh its contents.
2. I made the SO transactional with Transaction dialog duration. On the
client, every time I start a transaction to add or remove a lock through the SO, I
try to catch the exception "AbortException" (which means the SO has died). To
handle the exception, I use Releaseconnection, then force the backup SO to get the
LockList from the shared object on the Router partition. In this way, the backup SO
becomes alive and also restores the state.
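The two steps above can be sketched roughly as follows. This is not Forte TOOL code: it is a Python illustration of the same idea, and every name in it (SharedLockStore, LockManager, ServiceUnavailable, Client) is made up for the example. ServiceUnavailable plays the role of the AbortException, and the in-memory store plays the role of the shared object on the Router partition.

```python
class ServiceUnavailable(Exception):
    """Stand-in for the exception the client sees when the SO has died."""

class SharedLockStore:
    """Stand-in for the shared object that mirrors the LockList."""

    def __init__(self):
        self._snapshot = []

    def refresh(self, locks):
        self._snapshot = list(locks)   # keep a copy, not a reference

    def fetch(self):
        return list(self._snapshot)

class LockManager:
    def __init__(self, store):
        self.store = store
        self.locks = []
        self.alive = True

    def add_lock(self, name):
        if not self.alive:
            raise ServiceUnavailable(name)
        self.locks.append(name)
        self.store.refresh(self.locks)  # step 1: mirror every change

    def restore_from_store(self):
        self.locks = self.store.fetch()

class Client:
    def __init__(self, primary, backup):
        self.active = primary
        self.backup = backup

    def add_lock(self, name):
        try:
            self.active.add_lock(name)
        except ServiceUnavailable:
            if self.backup is None:
                raise                      # no backup left to fail over to
            # step 2: switch to the backup and reload the mirrored LockList
            self.backup.restore_from_store()
            self.active, self.backup = self.backup, None
            self.active.add_lock(name)     # retry the failed request

store = SharedLockStore()
primary, backup = LockManager(store), LockManager(store)
client = Client(primary, backup)
client.add_lock("A")
client.add_lock("B")
primary.alive = False          # simulate the primary SO dying
client.add_lock("C")
print(backup.locks)
```

Because the mirror is refreshed on every change, the backup loses nothing that was acknowledged before the failure; the cost is one extra remote update per lock operation.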
I think this works. I will also try your suggestion ASAP. However, would you
please tell me how to keep the secondary SO from starting and then start it
later? Every time I run the app, all the partitions start. And I guess
I cannot create it with new.
Thanks.
Rds
Alex
Alex,
1) As Arpad mentioned in his posting, you can use start task, from the Init
method, on a method that contains your event loop.
As for
2), what I've done in the past is to start both the primary and secondary SOs,
where the primary SO's event loop begins at startup but the secondary SO
is dormant. I created a "monitoring" SO which listens for remote access
exceptions (or distributed access exceptions), and on the death of the first
SO, the second SO's event loop is started. The other option is to not start
the secondary SO until the primary one fails, but then there is a lag while
the SO comes up.
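Both points can be illustrated with a rough Python sketch (not Forte code; the Worker and Monitor classes are invented for the example). The event loop runs in its own thread so that the initializer returns immediately, which is the analogue of start task, and a monitor starts the dormant secondary's loop once the primary's loop is gone. Note one substitution: the monitor here polls the primary's status, whereas the approach described above reacts to remote access exceptions.

```python
import threading
import time

class Worker:
    """A service whose event loop runs in a separate thread."""

    def __init__(self, name):
        self.name = name
        self.ticks = 0
        self._stop = threading.Event()
        self._thread = None            # dormant until the loop is started

    def start_event_loop(self, interval=0.01):
        # Starting a thread returns at once, so the caller does not hang.
        self._thread = threading.Thread(
            target=self._loop, args=(interval,), daemon=True)
        self._thread.start()

    def _loop(self, interval):
        # wait() returns False on timeout (one timer tick), True once stopped.
        while not self._stop.wait(interval):
            self.ticks += 1

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()

    def is_running(self):
        return self._thread is not None and self._thread.is_alive()

class Monitor:
    """Starts the secondary's event loop when the primary is no longer running."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def check(self):
        if not self.primary.is_running() and not self.secondary.is_running():
            self.secondary.start_event_loop()
            return True
        return False

primary, secondary = Worker("primary"), Worker("secondary")
monitor = Monitor(primary, secondary)
primary.start_event_loop()
time.sleep(0.2)
idle_check = monitor.check()    # primary is healthy, nothing to do
primary.stop()                  # simulate the primary dying
activated = monitor.check()     # secondary's loop is started now
time.sleep(0.2)
secondary.stop()
print(idle_check, activated, secondary.ticks > 0)
```

Keeping the secondary loaded but dormant avoids the startup lag mentioned above, at the price of a second partition that is always running.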
j
-
Retaining path selection state!
Hi,
I am writing a tool that modifies a path using GetPathSegments and SetPathSegments. Afterwards, the path is always fully selected! I want the path to retain its selection state (perhaps with some direction handles showing), as happens when it is modified with the white arrow.
Tried calling
sAIArt->SetArtUserAttr( targetPath, kArtSelected, 0 );
which deselects path OK. But even if I also afterwards use
sAIPath->SetPathSegmentSelected( targetPath, segNumber, SegmentInAndOutSelected );
path is always fully selected after tool mouseUp. Why???
Thank you,
Chunky
Illustrator's plug-in API undo mechanism doesn't remember the selection state for individual points on paths. So on undo/redo paths become fully selected. The white arrow tool is not implemented as a plug-in so it gets to control the details of undo/redo for itself. I don't think there's much, if anything, you can do to work around this.
-
Link outage in EtherChannel causes interface down and failover Secondary Failed
Hi,
I have configured a port-channel between an ASA5515-X firewall and a stacked WS-3750X switch. The firewalls are also configured for failover. The problem is that the switch ports connected to the active firewall show green and are working, but the switch ports connected to the standby firewall show orange. When I run the show failover command on the firewall, the secondary is reported as failed. Please assist. The show command output is below.
mdbl-int-fw-01# sho port-channel 10
Ports: 2 Maxports = 16
Port-channels: 1 Max Port-channels = 48
Protocol: LACP/ active
Minimum Links: 1
Maximum Bundle: 8
Load balance: src-dst-ip
mdbl-int-fw-01# sho interface port-channel 10
Interface Port-channel10 "inside", is up, line protocol is up
Hardware is EtherChannel/LACP, BW 2000 Mbps, DLY 10 usec
Auto-Duplex(Full-duplex), Auto-Speed(1000 Mbps)
Input flow control is unsupported, output flow control is off
Description: *** Connected to CORE-SW ***
MAC address 4c00.821d.511f, MTU 1500
IP address 10.98.8.97, subnet mask 255.255.255.248
Traffic Statistics for "inside":
56859 packets input, 3419130 bytes
148709 packets output, 16063580 bytes
56858 packets dropped
1 minute input rate 0 pkts/sec, 46 bytes/sec
1 minute output rate 2 pkts/sec, 216 bytes/sec
1 minute drop rate, 0 pkts/sec
5 minute input rate 0 pkts/sec, 46 bytes/sec
5 minute output rate 2 pkts/sec, 216 bytes/sec
5 minute drop rate, 0 pkts/sec
Members in this channel:
Active: Gi0/1 Gi0/2
mdbl-int-fw-01# sho port
mdbl-int-fw-01# sho port-channel sum
mdbl-int-fw-01# sho port-channel summary
Flags: D - down P - bundled in port-channel
I - stand-alone s - suspended
H - Hot-standby (LACP only)
U - in use N - not in use, no aggregation/nameif
M - not in use, no aggregation due to minimum links not met
w - waiting to be aggregated
Number of channel-groups in use: 1
Group Port-channel Protocol Ports
------+-------------+-----------+-----------------------------------------------
10 Po10(U) LACP Gi0/1(P) Gi0/2(P)
mdbl-int-fw-01#
mdbl-int-fw-01# sho port-channel ?
<1-48> Channel group number
brief Brief information
detail Detail information
port Port information
protocol protocol enabled
summary One-line summary per channel-group
| Output modifiers
<cr>
mdbl-int-fw-01# sho port-channel bri
mdbl-int-fw-01# sho port-channel brief
Channel-group listing:
Group: 10
Ports: 2 Maxports = 16
Port-channels: 1 Max Port-channels = 48
Protocol: LACP/ active
Minimum Links: 1
Maximum Bundle: 8
Load balance: src-dst-ip
mdbl-int-fw-01# sho port-channel ?
<1-48> Channel group number
brief Brief information
detail Detail information
port Port information
protocol protocol enabled
summary One-line summary per channel-group
| Output modifiers
<cr>
mdbl-int-fw-01# sho port-channel pro
mdbl-int-fw-01# sho port-channel protocol
Channel-group listing:
Group: 10
Protocol: LACP
mdbl-int-fw-01# sho port-channel ?
<1-48> Channel group number
brief Brief information
detail Detail information
port Port information
protocol protocol enabled
summary One-line summary per channel-group
| Output modifiers
<cr>
mdbl-int-fw-01# sho port-channel det
mdbl-int-fw-01# sho port-channel detail
Channel-group listing:
Group: 10
Ports: 2 Maxports = 16
Port-channels: 1 Max Port-channels = 48
Protocol: LACP/ active
Minimum Links: 1
Maximum Bundle: 8
Load balance: src-dst-ip
Ports in the group:
Port: Gi0/1
Port state = bndl
Channel group = 10 Mode = LACP/ active
Port-channel = Po10
Flags: S - Device is sending Slow LACPDUs F - Device is sending fast LACPDUs.
A - Device is in active mode. P - Device is in passive mode.
Local information:
LACP port Admin Oper Port Port
Port Flags State Priority Key Key Number State
Gi0/1 SA bndl 32768 0xa 0xa 0x2 0x3d
Partner's information:
Partner Partner LACP Partner Partner Partner Partner Partner
Port Flags State Port Priority Admin Key Oper Key Port Number Port State
Gi0/1 SA bndl 32768 0x0 0xa 0x118 0x3d
Port: Gi0/2
Port state = bndl
Channel group = 10 Mode = LACP/ active
Port-channel = Po10
Flags: S - Device is sending Slow LACPDUs F - Device is sending fast LACPDUs.
A - Device is in active mode. P - Device is in passive mode.
Local information:
LACP port Admin Oper Port Port
Port Flags State Priority Key Key Number State
Gi0/2 SA bndl 32768 0xa 0xa 0x3 0x3d
Partner's information:
Partner Partner LACP Partner Partner Partner Partner Partner
Port Flags State Port Priority Admin Key Oper Key Port Number Port State
Gi0/2 SA bndl 32768 0x0 0xa 0x119 0x3d
mdbl-int-fw-01#
mdbl-int-fw-01#
mdbl-int-fw-01#
mdbl-int-fw-01#
mdbl-int-fw-01# sho port-channel ?
<1-48> Channel group number
brief Brief information
detail Detail information
port Port information
protocol protocol enabled
summary One-line summary per channel-group
| Output modifiers
<cr>
mdbl-int-fw-01# sho fail
mdbl-int-fw-01# sho failover st
mdbl-int-fw-01# sho failover state
State Last Failure Reason Date/Time
This host - Primary
Active None
Other host - Secondary
Failed Ifc Failure 22:03:03 UTC Jan 8 2014
outside: No Link
dmz: No Link
mgt: No Link
inside: No Link
====Configuration State===
Sync Done
====Communication State===
Mac set
mdbl-int-fw-01#
mdbl-int-fw-01#
mdbl-int-fw-01#
mdbl-int-fw-01# sho failover
Failover On
Failover unit Primary
Failover LAN Interface: failover GigabitEthernet0/3 (up)
Unit Poll frequency 200 milliseconds, holdtime 800 milliseconds
Interface Poll frequency 500 milliseconds, holdtime 5 seconds
Interface Policy 1
Monitored Interfaces 4 of 114 maximum
failover replication http
Version: Ours 8.6(1)2, Mate 8.6(1)2
Last Failover at: 02:16:48 UTC Jan 8 2014
This host: Primary - Active
Active time: 74479 (sec)
slot 0: ASA5515 hw/sw rev (1.0/8.6(1)2) status (Up Sys)
Interface outside (118.179.139.4): No Link (Waiting)
Interface dmz (10.98.56.3): No Link (Waiting)
Interface mgt (10.10.11.1): Unknown (Waiting)
Interface inside (10.98.8.97): Normal (Waiting)
slot 1: IPS5515 hw/sw rev (N/A/7.1(4)E4) status (Up/Up)
IPS, 7.1(4)E4, Up
Other host: Secondary - Failed
Active time: 0 (sec)
slot 0: ASA5515 hw/sw rev (1.0/8.6(1)2) status (Up Sys)
Interface outside (118.179.139.6): No Link (Waiting)
Interface dmz (10.98.56.2): No Link (Waiting)
Interface mgt (0.0.0.0): No Link (Waiting)
Interface inside (10.98.8.98): No Link (Waiting)
slot 1: IPS5515 hw/sw rev (N/A/7.1(4)E4) status (Up/Up)
IPS, 7.1(4)E4, Up
Stateful Failover Logical Update Statistics
Link : failover GigabitEthernet0/3 (up)
Stateful Obj xmit xerr rcv rerr
General 12665 0 9929 0
sys cmd 9929 0 9929 0
up time 0 0 0 0
RPC services 0 0 0 0
TCP conn 0 0 0 0
UDP conn 0 0 0 0
ARP tbl 2735 0 0 0
Xlate_Timeout 0 0 0 0
IPv6 ND tbl 0 0 0 0
VPN IKEv1 SA 0 0 0 0
VPN IKEv1 P2 0 0 0 0
VPN IKEv2 SA 0 0 0 0
VPN IKEv2 P2 0 0 0 0
VPN CTCP upd 0 0 0 0
VPN SDI upd 0 0 0 0
VPN DHCP upd 0 0 0 0
SIP Session 0 0 0 0
Route Session 0 0 0 0
User-Identity 1 0 0 0
Logical Update Queue Information
Cur Max Total
Recv Q: 0 7 9930
Xmit Q: 0 30 99581
mdbl-int-fw-01#
mdbl-int-fw-01#
mdbl-int-fw-01# sho failover state
State Last Failure Reason Date/Time
This host - Primary
Active None
Other host - Secondary
Failed Ifc Failure 22:03:03 UTC Jan 8 2014
outside: No Link
dmz: No Link
mgt: No Link
inside: No Link
====Configuration State===
Sync Done
====Communication State===
Mac set
mdbl-int-fw-01# sho failover ?
descriptor Show failover interface descriptors. Two numbers are shown for
each interface. When exchanging information regarding a
particular interface, this unit uses the first number in messages
it sends to its peer. And it expects the second number in
messages it receives from its peer. For trouble shooting, collect
the show output from both units and verify that the numbers
match.
exec Show failover command execution information
history Show failover switching history
interface Show failover command interface information
state Show failover internal state information
statistics Show failover command interface statistics information
| Output modifiers
<cr>
mdbl-int-fw-01# sho failover inter
mdbl-int-fw-01# sho failover interface
interface failover GigabitEthernet0/3
System IP Address: 10.98.8.89 255.255.255.248
My IP Address : 10.98.8.89
Other IP Address : 10.98.8.90
mdbl-int-fw-01# sho failover stati
mdbl-int-fw-01# sho failover statistics
tx:995725
rx:980617
mdbl-int-fw-01# sho failover hi
mdbl-int-fw-01# sho failover history
==========================================================================
From State To State Reason
==========================================================================
02:16:40 UTC Jan 8 2014
Not Detected Negotiation No Error
02:16:48 UTC Jan 8 2014
Negotiation Just Active No Active unit found
02:16:48 UTC Jan 8 2014
Just Active Active Drain No Active unit found
02:16:48 UTC Jan 8 2014
Active Drain Active Applying Config No Active unit found
02:16:48 UTC Jan 8 2014
Active Applying Config Active Config Applied No Active unit found
02:16:48 UTC Jan 8 2014
Active Config Applied Active No Active unit found
==========================================================================
mdbl-int-fw-01# sho failover
Failover On
Failover unit Primary
Failover LAN Interface: failover GigabitEthernet0/3 (up)
Unit Poll frequency 200 milliseconds, holdtime 800 milliseconds
Interface Poll frequency 500 milliseconds, holdtime 5 seconds
Interface Policy 1
Monitored Interfaces 4 of 114 maximum
failover replication http
Version: Ours 8.6(1)2, Mate 8.6(1)2
Last Failover at: 02:16:48 UTC Jan 8 2014
This host: Primary - Active
Active time: 74554 (sec)
slot 0: ASA5515 hw/sw rev (1.0/8.6(1)2) status (Up Sys)
Interface outside (118.179.139.4): No Link (Waiting)
Interface dmz (10.98.56.3): No Link (Waiting)
Interface mgt (10.10.11.1): Unknown (Waiting)
Interface inside (10.98.8.97): Normal (Waiting)
slot 1: IPS5515 hw/sw rev (N/A/7.1(4)E4) status (Up/Up)
IPS, 7.1(4)E4, Up
Other host: Secondary - Failed
Active time: 0 (sec)
slot 0: ASA5515 hw/sw rev (1.0/8.6(1)2) status (Up Sys)
Interface outside (118.179.139.6): No Link (Waiting)
Interface dmz (10.98.56.2): No Link (Waiting)
Interface mgt (0.0.0.0): No Link (Waiting)
Interface inside (10.98.8.98): No Link (Waiting)
slot 1: IPS5515 hw/sw rev (N/A/7.1(4)E4) status (Up/Up)
IPS, 7.1(4)E4, Up
Stateful Failover Logical Update Statistics
Link : failover GigabitEthernet0/3 (up)
Stateful Obj xmit xerr rcv rerr
General 12676 0 9938 0
sys cmd 9938 0 9938 0
up time 0 0 0 0
RPC services 0 0 0 0
TCP conn 0 0 0 0
UDP conn 0 0 0 0
ARP tbl 2737 0 0 0
Xlate_Timeout 0 0 0 0
IPv6 ND tbl 0 0 0 0
VPN IKEv1 SA 0 0 0 0
VPN IKEv1 P2 0 0 0 0
VPN IKEv2 SA 0 0 0 0
VPN IKEv2 P2 0 0 0 0
VPN CTCP upd 0 0 0 0
VPN SDI upd 0 0 0 0
VPN DHCP upd 0 0 0 0
SIP Session 0 0 0 0
Route Session 0 0 0 0
User-Identity 1 0 0 0
Logical Update Queue Information
Cur Max Total
Recv Q: 0 7 9940
Xmit Q: 0 30 99677
Hi Ganesan,
I am proposing a design like this. Run STP in PVST mode and set the bridge priority so that Core A becomes the root bridge. There is nothing wrong with your design: you have placed the core switch physically below the firewall, but logically it also sits above the firewall. The spanning-tree configuration has to be done properly to achieve this. My proposed design is simple and easy to troubleshoot.
Connect the firewalls to the core switches on the inside and directly to the routers on the outside. Core A --> primary FW --> Router A will always be the primary path; if anything goes wrong, the secondary path comes into the picture.
Make sure HSRP has the higher priority on Core A for the access-switch VLANs.
Please rate helpful posts.
By
Karthik -
This topic has been beaten to death, but I did not see a real answer. Here is the configuration:
1) 2 x ASA 5520, running 8.2
2) Both ASA are in same outside and inside interface broadcast domains – common Ethernet on interfaces
3) Both ASA are running single context but are active/standby failovers of each other. There are no more ASA’s in the equation. Just these 2. NOTE: this is not a Active/Active failover configuration. This is simply a 1-context active/standby configuration.
4) I want to share VPN load among two devices and retain active/standby failover functionality. Can I use VPN load balancing feature?
This sounds trivial, but I cannot find a clear answer (without testing this); and many people are confusing the issue. Here are some examples of confusion. These do not apply to my scenario.
Active/Active failover is understood to mean only two ASAs running multiple contexts. Context 1 is active on ASA1 and Context 2 is active on ASA2. They are sharing failover information. Active/Active does not mean two independently configured ASA devices which do not share failover communication but do VPN load balancing. It is clear that this latter scenario will work and that both ASAs are active, but they are not in the Active/Active configuration definition. Some people are calling VPN load balancing on two unique ASAs “active/active”, but it is not.
The other confusing thing I have seen is that the VPN config guide for VPN load balancing mentions configuring separate IP address pools on the VPN devices, so that clients on ASA1 do not have IP address overlap with clients on ASA2. When you configure an IP address pool on active ASA1, it gets replicated to standby ASA2. In other words, you cannot have two unique IP address pools on an ASA Active/Standby cluster. I guess I could draw addresses from an external DHCP server, and then do some kind of routing. Perhaps this will work?
In any case, can any experts out there answer this question? TIA!
Wow, some good info posted here (both questions and some answers). I'm in a similar situation with a couple of VPN load-balanced pairs... my goal was to get active-standby failover up and running in each pair, then I ran into this thread and saw the first post about the unique IP addr pools (and obviously we can't have unique pools in an active-standby failover rig where the complete config is replicated). So it would seem that these two features are indeed mutually exclusive. Real nice initial post to call this out.
Now I'm wondering if the ASA could actually handle a single addr pool in an active-standby fo rig- *if* the code supported the exchange of addr pool status between the fo members (so they each would know what addrs have been farmed out from this single pool)? Can I get some feedback from folks on this? If this is viable, then I suppose we could submit a feature request to Cisco... not that this would necessarily be supported anytime soon, but it might be worth a try. And I'm also assuming we might need a vip on the inside int as well (not just on the outside), to properly flip the traffic on both sides if the failover occurs (note we're not currently doing this).
Finally, if a member fails in a std load-balanced vpn pair (w/o fo disabled), the remaining member must take over traffic hitting the vip addr (full time)... can someone tell me how this works? And when this pair is working normally (with both members up), do the two systems coordinate who owns the vip at any time to load-balance the traffic? Is this basically how their load-balancing scheme works?
Anyway, pretty cool thread... would really appreciate it if folks could give some feedback on some of the above.
Thanks much,
Mike -
Clustered role 'Cluster Group' has exceeded its failover threshold.
Hello.
I’m hoping to get some help with a cluster issue I’m having using Windows Storage Server 2012.
When the cluster is created my Cluster Core Resources are all happy and online.
I can move the Cluster Name using “move Core Cluster Resources” between the two nodes without any problems.
If I select ‘Simulate Failure’ on the IP Address resource, it works the first time.
If I do it again shortly afterwards, it fails and I get Event IDs 1254, 1205 and 1069.
Event ID 1254
Clustered role 'Cluster Group' has exceeded its failover threshold.
It has exhausted the configured number of failover attempts within the failover period of time allotted to it and will be left in a failed state.
No additional attempts will be made to bring the role online or fail it over to another node in the cluster.
Please check the events associated with the failure. After the issues causing the failure are resolved the role can be brought online manually or the cluster may attempt to bring it online again after the restart delay period.
Event ID 1205
The Cluster service failed to bring clustered service or application 'Cluster Group' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.
Event ID 1069
Cluster resource 'Cluster IP Address' of type 'IP Address' in clustered role 'Cluster Group' failed.
Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.
Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.
Basically I’m trying to simulate a network failure to make sure the failover kicks in.
If I click on it and ‘Bring Online’ it comes up fine.
Where do I find this Threshold Policy and set it to initiate failover if the IP Address resources fails?
Thank you in advance for your help.
Hi,
The failover threshold is the number of times the group can fail over within the number of hours specified by the failover period. For example, if a group's failover threshold is set to "5" and its failover period to "3", the clustering software stops attempting to bring the group online once it has failed over five times within three hours, and leaves the resources within the group in their current state. So if the IP Address resource is brought online but the Network Name resource fails, the group is left offline, but the IP Address resource is left online.
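The threshold and period can be pictured as a sliding-window counter. The sketch below is only an illustration of that reading, not the cluster service's actual algorithm; the FailoverPolicy class is invented, and whether the limit is hit on the fifth or the sixth failure is an assumption of the example.

```python
from collections import deque

class FailoverPolicy:
    """Allows at most `threshold` failovers within `period_hours` hours."""

    def __init__(self, threshold, period_hours):
        self.threshold = threshold
        self.period = period_hours * 3600   # window length in seconds
        self._failovers = deque()           # timestamps of recent failovers

    def allow_failover(self, now):
        """Record a failover at time `now` (seconds) if the limit permits it."""
        # Forget failovers that have aged out of the window.
        while self._failovers and now - self._failovers[0] >= self.period:
            self._failovers.popleft()
        if len(self._failovers) >= self.threshold:
            return False                    # group is left in a failed state
        self._failovers.append(now)
        return True

policy = FailoverPolicy(threshold=5, period_hours=3)
# Six failures ten minutes apart: the first five fail over, the sixth does not.
decisions = [policy.allow_failover(minute * 600) for minute in range(6)]
# Three hours after the first failure the window has moved on.
later = policy.allow_failover(3 * 3600)
print(decisions, later)
```

This also matches the behaviour seen when repeating 'Simulate Failure' in quick succession: once the limit for the window is used up, further failures leave the group failed until it is brought online manually or the window expires.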
To configure thresholds for a resource:
Right-click the cluster resource and then select 'Properties'
Click 'Advanced'
Select 'Do not restart' if the cluster service should not attempt to restart. Restart is the default
If 'Restart' is selected:
Affect the Group: uncheck to prevent a failure of the selected resource from causing the Server group to failover
Threshold: number of times the cluster service will attempt to restart the resource, and period is the amount of time in seconds between retries
Do not modify the 'LooksAlive' and 'IsAlive' settings
Unless necessary, do not alter the 'Pending Timeout'. This is the amount of time a resource can remain in the Online Pending or Offline Pending state before the cluster service puts it in the Offline or Failed state
For more information please refer to following MS articles:
Windows Failover Clustering Overview
http://blogs.technet.com/b/rob/archive/2008/05/07/failover-clustering.aspx
Tuning Failover Cluster Network Thresholds
http://blogs.msdn.com/b/clustering/archive/2012/11/21/10370765.aspx
Failover cluster (group) maximum failures limit
http://blogs.msdn.com/b/arvindsh/archive/2012/03/09/failover-cluster-group-maximum-failures-limit.aspx
Lawrence
TechNet Community Support -
Windows Server 2012R2 Failover Cluster error with mounted volumes
Hi all,
I've a problem with mounted volume on a WSFC build on top of Windows Server 2012R2, the situation is:
M: is the volume hosting mounting points
disk-1, disk-2, disk-3 are volume mounted on M:\SomeFolder
These volumes are used by a SQL Server Failover Cluster Instance, but my problem is related to WSFC. I've set dependencies so disk-1, disk-2, disk-3 depend upon H:
If I try a failover of the role "SQL Server", I observe that when the disks come online on the other node they fail with this error:
Cluster resource 'disk-1' of type 'Physical Disk' in clustered role 'SQL Server (ISQL2014A)' failed. The error code was '0xaa' ('The requested resource is in use.').
Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.
If I manually take H: offline, bring it back online, and then manually bring disks 1 to 3 online, they come online with no error.
I'm going crazy!
I've found the root of the problem: the servers are virtual machines on a VMware ESX 5.5 infrastructure. VMware claims that multipathing is supported for raw device mapping disks on 5.5, but after disabling multipathing (I set a fixed path) the Windows Server Failover Cluster stopped having problems.
Now we have opened a support call with VMware. -
How to Ensure Symbol State Persists?
Okay, so I have a symbol that I want to show/hide; it's a named MovieClip with two labelled frames ("show" and "hide" in that order), the default is to be shown, and both frames have a stop() command.
This seems fine however at one point in my animation I trigger the symbol to hide via foo.gotoAndStop("hide"), but when the animation loops the symbol reappears (or rather, returns to the "show" frame).
Now, obviously this is annoying, but other symbols that function in exactly the same way appear to be working as expected, so I'm at a complete loss as to why this one symbol in particular seems to be resetting. Is there some characteristic of the animation that might be causing Flash to see the symbol as if it were a new instance? I've checked that the symbol's name is constant throughout the animation, and I'm really only using classic tweens; certainly nothing that stands out as different in any way compared to the other, working symbols.
Aside from completely re-creating the animation for that symbol I'm not sure what else I can try, so any suggestions are welcome!
Okay, I'm going to try to highlight what I'm seeing by taking a snapshot of the timeline where the problem occurs:
The selected layer has the problem symbol; the layer immediately above is the mask layer. The blue circle indicates the frame at which the symbol is hidden (or shown) by a script further up in the timeline. The symbol displays as expected for the duration of the blue tween. At the green circle, a new tween begins but the symbol remains unchanged, continuing to be shown or hidden as expected for the duration of that tween. The red circle, however, is where the symbol apparently resets to the default (showing) frame and loses any other state assigned to it (e.g. if I were to set a variable on it such as foo.bar = "foobar").
Why would the symbol retain state through the green circle, but not the red one? It seems that at the green circle the symbol is still the same instance, but at the red circle it is not, even though nothing seems to be happening any differently. -
I was looking for documentation that explains the failover reasons, but the only document I found (the command reference) lists the reasons without explaining them; it only explains the states.
http://www.cisco.com/en/US/docs/security/asa/asa82/command/reference/s3.html#wp1473355
•No Error
•Set by the CI config cmd
•Failover state check
•Failover interface become OK
•HELLO not heard from mate
•Other unit has different software version
•Other unit operating mode is different
•Other unit license is different
•Other unit chassis configuration is different
•Other unit card configuration is different
•Other unit want me Active
•Other unit want me Standby
•Other unit reports that I am failed
•Other unit reports that it is failed
•Configuration mismatch
•Detected an Active mate
•No Active unit found
•Configuration synchronization done
•Recovered from communication failure
•Other unit has different set of vlans configured
•Unable to verify vlan configuration
•Incomplete configuration synchronization
•Configuration synchronization failed
•Interface check
•My communication failed
•ACK not received for failover message
•Other unit got stuck in learn state after sync
•No power detected from peer
•No failover cable
•HA state progression failed
•Detect service card failure
•Service card in other unit has failed
•My service card is as good as peer
•LAN Interface become un-configured
•Peer unit just reloaded
•Switch from Serial Cable to LAN-Based fover
•Unable to verify state of config sync
•Auto-update request
•Unknown reason
Re "interface check" - it's pretty straightforward. The active unit queries the monitored interfaces on the standby for state (line up, protocol up) and, when a standby IP is configured, reachability.
If it fails any of those, the standby unit is marked as not ready due to interface check failing. -
Interpret failover command on my asa
Hi Everyone, thank you very much for your help in advance...
Could you help me interpret what each line of the following commands means, and how failover works with the settings below?
failover
failover lan unit primary
failover lan interface failover Management0/0
failover polltime unit 3 holdtime 9
failover replication http
failover link failover Management0/0
failover interface ip failover 192.168.0.1 255.255.255.240 standby 192.168.0.2
icmp unreachable rate-limit 1 burst-size 1
Thank you very much for your help
Takami chiro
failover : Enables failover.
failover lan unit primary : Designates the unit on which this command is entered as the primary firewall.
failover lan interface failover Management0/0 : Specifies which interface will be used for exchanging failover hellos and other messages.
failover polltime unit 3 holdtime 9 : Sets how often failover hellos are sent to the other firewall (every 3 seconds) and how long to wait without hearing one before the peer is considered failed (9 seconds).
failover replication http : Enter this if you want HTTP sessions to be replicated to the standby firewall. With this entered, users will not have to refresh their browsers after a failover.
failover link failover Management0/0 : Specifies the interface for exchanging state information.
failover interface ip failover 192.168.0.1 255.255.255.240 standby 192.168.0.2 : Specifies the failover interface IP addresses (active and standby).
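As a quick sanity check on the numbers in that configuration, here is a minimal Python sketch. It only uses the addresses, mask, and timers quoted in the question; the helper name `failover_link_summary` and the returned keys are made up for illustration and are not part of any Cisco tool.

```python
import ipaddress


def failover_link_summary(active_ip, standby_ip, netmask, polltime_s, holdtime_s):
    """Summarize a failover interface definition (hypothetical helper)."""
    # Derive the subnet from the active address and the dotted netmask.
    network = ipaddress.ip_interface(f"{active_ip}/{netmask}").network
    return {
        "network": str(network),
        # The standby address should sit in the same subnet as the active one.
        "standby_in_subnet": ipaddress.ip_address(standby_ip) in network,
        # How many hello intervals fit into the hold time.
        "hello_intervals_per_holdtime": holdtime_s // polltime_s,
    }


# Values copied from the configuration in the question.
summary = failover_link_summary(
    "192.168.0.1", "192.168.0.2", "255.255.255.240", polltime_s=3, holdtime_s=9
)
print(summary)
```

With these settings both addresses fall in the same /28, and the hold time spans three hello intervals.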
HTH
Zubair -
ASAs failover pair which design is the best
Guys
I am designing the firewall solution. I have 2 ASAs with 2 switches. Please see the diagrams design1 and design2 and let me know your thoughts. Design 1 uses 2 switches joined with a stacking cable, but in the diagram they are represented as one switch because the diagram tool could not show the stack. Design 2 uses 2 switches connected separately. What are the advantages of one over the other?
Thanks in advance.
By all means you can use a switch to interconnect both ASAs, but for the purpose of deploying stateful failover it does not achieve anything different from using a cross-over cable.
I have deployed at least 15 stateful failover ASA pairs over the course of a 14-year networking career just by using a cross-over cable. If you weigh the pros and cons of using a switch versus a cross-over cable, I would say the cross-over cable has more pros than cons; that is my take.
Nothing against Cisco, but sometimes Cisco's recommendations also come with a sales and marketing strategy.
"Each interface should connect to a switch port so that the link status is always up"
A cross-over cable does that too, and putting a switch between the two ASAs adds another point of failure on the path that carries the stateful sync data to the standby ASA.
Thanks -
SQL Cluster unexpected failover
So we had one of our SQL clusters unexpectedly failover recently. Second time in a few months. Two node active/passive SQL 2012 cluster running on Windows 2012 Standard.
Here's what we could cull from the application/system logs:
1. "
Cluster resource 'SQLServer' of type 'SQL Server' in clustered role 'SQLServerRole' failed.
Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster
Manager or the Get-ClusterResource Windows PowerShell cmdlet."
2. "
Cluster resource 'SQLServer' (resource type 'SQL Server', DLL 'sqsrvres.dll') did not respond to a request in a timely fashion. Cluster health detection will attempt to automatically recover by terminating the Resource Hosting Subsystem (RHS) process running
this resource. This may affect other resources hosted in the same RHS process. The resources will then be restarted.
The suspect resource 'SQLServer' will be marked to run in an isolated RHS process to avoid impacting multiple resources in the event that this resource failure occurs again. Please ensure services, applications, or underlying infrastructure (such as storage
or networking) associated with the suspect resource is functioning properly."
3. "The cluster Resource Hosting Subsystem (RHS) stopped unexpectedly. An attempt will be made to restart it. This is usually associated with recovery of a crashed or deadlocked resource. Please determine which resource and resource DLL is causing
the issue and verify it is functioning properly."
4. "A timeout (30000 milliseconds) was reached while waiting for a transaction response from the MSSQLSERVER service."
Cluster.log wasn't much more helpful on the root cause either:
00000f28.00001c78::2014/12/04-21:25:54.662 INFO [RES] Network Name <Cluster Name>: Netbios: Slow Operation, FinishWithReply: 0
00000f28.00001c78::2014/12/04-21:25:54.662 INFO [RES] Network Name: [NN] got sync reply: 0
00000f28.00001c78::2014/12/04-21:25:54.662 INFO [RES] Network Name <Cluster Name>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle
00000f20.00000e94::2014/12/04-21:25:55.240 INFO [RES] SQL Server Agent <SQL Server Agent>: [sqagtres] IsAlive request.
00000f20.00000e94::2014/12/04-21:25:55.240 INFO [RES] SQL Server Agent <SQL Server Agent>: [sqagtres] CheckServiceAlive: returning TRUE (success)
00001134.000001d8::2014/12/04-21:25:57.287 ERR [RES] SQL Server <SQLServer>: [sqsrvres] Failure detected, diagnostics heartbeat is lost
00001134.000001d8::2014/12/04-21:25:57.287 INFO [RES] SQL Server <SQLServer>: [sqsrvres] IsAlive returns FALSE
00001134.000001d8::2014/12/04-21:25:57.287 WARN [RHS] Resource SQLServer IsAlive has indicated failure.
00000880.0000161c::2014/12/04-21:25:57.303 INFO [NM] Received request from client address HOST-XXX-SQL02.
00000880.0000161c::2014/12/04-21:25:57.303 INFO [RCM] HandleMonitorReply: FAILURENOTIFICATION for 'SQLServer', gen(3) result 1/0.
00000880.000023a4::2014/12/04-21:25:57.303 INFO [GEM] Sending 1 messages as a batched GEM message
00000880.0000161c::2014/12/04-21:25:57.303 INFO [RCM] Res SQLServer: Online -> ProcessingFailure( StateUnknown )
00000880.0000161c::2014/12/04-21:25:57.303 INFO [RCM] TransitionToState(SQLServer) Online-->ProcessingFailure.
00000880.0000161c::2014/12/04-21:25:57.318 INFO [RCM] rcm::RcmGroup::UpdateStateIfChanged: (SQLServerRole, Online --> Pending)
00000880.00001db8::2014/12/04-21:25:57.334 INFO [GEM] Sending 1 messages as a batched GEM message
00000880.0000161c::2014/12/04-21:25:57.334 ERR [RCM] rcm::RcmResource::HandleFailure: (SQLServer)
00000880.00001db8::2014/12/04-21:25:57.334 INFO [GEM] Sending 1 messages as a batched GEM message
00000880.00000bac::2014/12/04-21:25:57.334 INFO [RCM] ignored non-local state Pending for group SQLServerRole
00000880.0000161c::2014/12/04-21:25:57.350 INFO [RCM] resource SQLServer: failure count: 1, restartAction: 2 persistentState: 1.
00000880.0000161c::2014/12/04-21:25:57.350 INFO [RCM] Greater than restartPeriod time has elapsed since first failure of SQLServer, resetting failureTime and failureCount.
00000880.0000161c::2014/12/04-21:25:57.350 INFO [RCM] Will queue immediate restart (500 milliseconds) of SQLServer after terminate is complete.
Any ideas? Anywhere we could look for more specific info? Any preventative measures we could take?
Thanks,
Ryan
Hello,
Since you are using SQL Server 2012, there is an extended events trace running on the cluster that holds all of the return values from sp_server_diagnostics. Check that out (the .xel files) to see if there is anything in there.
The error is pretty straightforward: there wasn't a timely response to the sp_server_diagnostics return set. Look for schedulers that are overwhelmed, SQL Server paging a bunch of memory (outside OS pressure), someone pausing a service, etc.
Is this happening during a peak traffic or load time?
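When scanning a longer cluster.log for the moment of failure, a small filter can help. The sketch below is only an illustration: it assumes every line follows the layout of the excerpt quoted in the question (process.thread IDs, `::`, timestamp, level, `[component]`, message), and the regex and the function names `parse_line` and `problems` are my own, not part of any Microsoft tooling.

```python
import re
from datetime import datetime

# Assumed layout, taken from the excerpt in the question:
# <pid>.<tid>::<yyyy/mm/dd-hh:mm:ss.mmm> <LEVEL> [<COMPONENT>] <message>
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-f]{8})\.(?P<tid>[0-9a-f]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>INFO|WARN|ERR)\s+"
    r"\[(?P<component>[A-Z]+)\]\s+(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict for one log line, or None if it does not match the layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["ts"] = datetime.strptime(record["ts"], "%Y/%m/%d-%H:%M:%S.%f")
    return record


def problems(lines):
    """Keep only the WARN and ERR records, in their original order."""
    parsed = (parse_line(line) for line in lines)
    return [r for r in parsed if r is not None and r["level"] in ("WARN", "ERR")]


# Three lines copied from the excerpt in the question.
SAMPLE = [
    "00000f20.00000e94::2014/12/04-21:25:55.240 INFO [RES] SQL Server Agent <SQL Server Agent>: [sqagtres] CheckServiceAlive: returning TRUE (success)",
    "00001134.000001d8::2014/12/04-21:25:57.287 ERR [RES] SQL Server <SQLServer>: [sqsrvres] Failure detected, diagnostics heartbeat is lost",
    "00001134.000001d8::2014/12/04-21:25:57.287 WARN [RHS] Resource SQLServer IsAlive has indicated failure.",
]

for record in problems(SAMPLE):
    print(record["ts"], record["level"], record["component"], record["message"])
```

The timestamp of the first ERR line gives a point in time to line up against the sp_server_diagnostics data in the .xel files.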
-Sean
The views, opinions, and posts do not reflect those of my company and are solely my own. No warranty, service, or results are expressed or implied. -
New LUN Takes A long Time To Format and Errors Out
Good afternoon,
I have a Hyper-V cluster composed of 4 nodes, and these nodes are able to access multiple CSVs (14 in total). I recently requested a new 500GB LUN (LUN 15) to be provisioned to my Hyper-V cluster. Here is my problem:
1. Formatting a 500GB LUN (with quick format selected) should not take more than a few seconds. Instead, the quick format takes about 2 hours, if not longer. I have actually seen it go for half the day.
2. Once the formatting has completed (no errors), taking the formatted LUN offline freezes the Computer Management screen, which shows the status (Not Responding). This lasts up to 30 minutes, after which the LUN is shown as offline.
3. In the Failover Cluster Manager, detecting the disks takes about 15 minutes. Once the available LUNs have been detected I can add the 500GB LUN to the Disks screen without any problems.
4. While in the Disks screen, adding the LUN to the Cluster Shared Volumes takes about 5 minutes (too long).
Already seeing that there is a problem, I went ahead and used Hyper-V Manager to create a 200GB vhd on the new LUN, which had been added to the CSV. The bar indicating the progress of the vhd creation does not display any progress (no green progress bar appears, not even a tiny bit of it), and after 3 hours (more or less) I receive an error stating that the creation of the vhd failed.
NOTE: The vhd shows up in Volume 9 (LUN 15), but I can only bet that it will not work; plus, I would not want to work with a vhd file which failed during the creation process.
Long story short, I repeated the above steps to see if that was a temporary problem, but it is not. The same problem occurs no matter which Hyper-V cluster node the operations are performed on. I would like to add that I tested the creation of a vhd on an already configured LUN, and the creation completed successfully and within an expected time frame.
NOTE: When LUN 15 errors out, its status shows "Failed" in the Failover Cluster Manager. This in turn causes the re-scanning of available disks to take forever (in Computer Management), and it keeps searching. Pretty much, the failure of one LUN affects the functionality of the entire Hyper-V cluster.
Errors Listed In Event Details For LUN 15:
1. Cluster Shared Volume 'Volume9' ('Cluster Disk 5') is no longer accessible from this cluster node because of error 'ERROR_TIMEOUT(1460)'. Please troubleshoot this node's connectivity to the storage device and
network connectivity.
Event ID: 5142; Source: Microsoft-Windows Failover Clustering;Task Category: Cluster Shared Volume
2. Cluster Shared Volume 'Volume9' ('Cluster Disk 5') is no longer available on this node because of 'STATUS_IO_TIMEOUT(c00000b5)'. All I/O will temporarily be queued until a path to the volume is reestablished.
Event ID: 5120; Source: Microsoft-Windows Failover Clustering;Task Category: Cluster Shared Volume
3. Cluster resource 'Cluster Disk 5' of type 'Physical Disk' in clustered role '4530acc9-8552-4696-b6c3-636ff8d58c46' failed.
Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover
Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.
Event ID: 1069; Source: Microsoft-Windows Failover Clustering;Task Category: Resource Control Manager
4.
Cluster resource 'Cluster Disk 5' (resource type 'Physical Disk', DLL 'clusres.dll') did not respond to a request in a timely fashion. Cluster health detection will attempt to automatically recover by terminating the Resource Hosting Subsystem (RHS)
process running this resource. This may affect other resources hosted in the same RHS process. The resources will then be restarted.
The suspect resource 'Cluster Disk 5' will be marked to run in an isolated RHS process to avoid impacting multiple resources in the event that this resource failure occurs again. Please ensure services, applications, or underlying infrastructure
(such as storage or networking) associated with the suspect resource is functioning properly.
Event ID: 1230; Source: Microsoft-Windows FailoverClustering;Task Category: Resource Control Manager
Any and all help will be appreciated!
Hi AquilaXXIII,
Which server edition are you using? If your cluster nodes run 2012 R2, please first install the updates listed in the following article:
Recommended hotfixes and updates for Windows Server 2012 R2-based failover clusters
http://social.technet.microsoft.com/Forums/en-US/f9c1a5f7-4fcf-409a-8d7e-388b85512bfe/new-lun-takes-a-long-time-to-format-and-errors-out?forum=winserv
Before you add the new shared storage, please validate it first; you can refer to the following article to validate the new LUN.
Understanding Cluster Validation Tests: Storage
http://technet.microsoft.com/en-us/library/cc771259.aspx
I’m glad to be of help to you!
Maybe you are looking for
-
Does BB GPS work on T-Mobile Pearl without a data plan?
I do not have a data plan with my BB. The GPS entry under Advanced Options shows: GPS Data Source: None, BPS Services: Location ON. I'd like to use the GPSLogger application that says it needs no data access to operate, but it sits on "Waiting for GP
-
Runtime error SAPSQL_INVALID_TABLENAME while activatin Infocube
Hi, I modified an Infocube. I saved it in a transport request. While activating the Infocube I get the short dump:\ Runtime Errors SAPSQL_INVALID_TABLENAME Except. CX_SY_DYNAMIC_OSQL_SEMANTICS Short text A table name, speci
-
How can I record a QuickTime video clip of in-game action?
I've seen video reviews of games from time to time that show actual video clips of a game being played. The ability to record video clips of a game would come in very handy for me, because I'm experiencing a bug in the graphics of Call of Duty 2 that
-
How to execute the steps synchronously in Teststand?
Hello All, I want to execute the some steps synchronously,Such asne step for setting some input conditions to DUT,and one step for measuring some test points under the conditions.and the measurement is real-time. So i want to know if have t
-
JPA's EntityManagerFactory jndi lookup problem
Hi, I need to do jndi lookup of EntityManagerFactory from DAO layer. I don't want to inject entitymanagerfactory in a session bean. I am using weblogic 10.0 . I could do a jndi lookup for jboss and oracle servers but not for weblogic I have tried bel