Surprises on Replacing failed primary unit
Dear friends,
I have done failover for firewalls umpteen times, but yesterday it failed for some reason.
I had replaced the failed primary unit with a fresh one, and I expected it to detect the secondary unit as active and begin config replication from it. Instead, it wiped out the secondary unit's config. I don't think I got the sequence wrong, but let me share what I did:
1. Entered the four or five lines of failover configuration (everything except the failover command) and did a no shut on the failover interface (management0/0)
2. Ran the failover command
Instead of getting the config from the active unit, the new unit started pushing its config to the other unit. To recover, I had to reload the active unit to restore its config. After that I reloaded the fresh unit, and this time failover came up as expected.
I think I should have forced a reload of the new unit before trying to establish failover.
Has anyone done this in a fail-proof way during production hours? If yes, can you please share the steps with me?
I did not ask for downtime because I was confident, but I ended up bringing down the ASA for 5 minutes because of the unexpected failover action.
Thanks a lot
Gautam
Dear kureli,
Thanks a lot for the efforts you took. I really appreciate it.
Here's the exact sequence of steps that happened:
1. When the primary unit failed, the secondary became active. I don't remember whether sh fail showed "secondary - not detected" or "secondary - failed".
2. I replaced the faulty primary unit with another primary unit, did a no shut on the m0/0 failover interface, and entered all the failover commands except the "failover" command itself.
3. I made sure the new primary unit ran the same code (I checked only the main software version, not whether the ASDM versions matched). The ASDM versions were in fact different on the two boxes.
4. After powering up the box and connecting the cables, I entered failover. It then reported that the SSL license was not the same on both units and that it was disabling failover.
5. I applied for an activation key from [email protected] and got the SSL license from them.
6. The next day I went back to the customer and installed the license key. After installing it, I entered failover. It gave me the message "No response from mate".
7. I then entered no failover to disable failover on the new primary unit.
8. I then went to the secondary (active) unit and entered failover, because failover was disabled there.
9. I then went back to the primary unit and entered failover.
10. This is where the blank config replication started!
11. Reloaded the secondary unit to undo the blank running config.
12. Went to the primary unit and disconnected the failover cable. Rebooted the primary unit and reconnected the failover cable.
13. The secondary came up as active, the primary then came up, and this time the primary honored the secondary as active and replicated the config from it.
14. All was well then!
I am still not sure why this happened, and it was a bit embarrassing to see it after 3.5 years of firewalling experience.
Anyway, I am willing to learn and improve from now on.
Next time I will probably apply the failover config, reload, and connect the failover cable while the unit is reloading.
I think the lesson is that when a unit reloads, it always honors the currently active unit and does not try to override its role.
This is what worked for me.
Thanks a lot
Gautam
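Step 4 above shows that failover can refuse to come up when the licenses differ, so it may be worth diffing the two units before typing failover. Below is a minimal Python sketch, assuming the `show version` output of each unit has been saved as text. The function names and the two sample strings are hypothetical, and the sketch compares only the `name : value` lines under "Licensed features" plus the software version line; it is not a list of everything the ASA itself checks.

```python
import re

# Matches lines such as "SSL VPN Peers : 10" (feature name, colon, value).
FEATURE_LINE = re.compile(r"^(.+?)\s*:\s*(\S.*)$")


def licensed_features(show_version):
    """Return {feature: value} from the 'Licensed features' part of 'show version'."""
    features, in_block = {}, False
    for raw in show_version.splitlines():
        line = raw.strip()
        if line.startswith("Licensed features"):
            in_block = True
            continue
        if not in_block or not line:
            continue
        match = FEATURE_LINE.match(line)
        if match is None:  # first line that is not "name : value" ends the block
            break
        features[match.group(1)] = match.group(2).strip()
    return features


def software_version(show_version):
    """Return the string after 'Software Version', or None if it is absent."""
    match = re.search(r"Software Version (\S+)", show_version)
    return match.group(1) if match else None


def license_mismatches(active_text, new_text):
    """Return {feature: (value_on_active, value_on_new)} for every differing feature."""
    active, new = licensed_features(active_text), licensed_features(new_text)
    return {
        name: (active.get(name), new.get(name))
        for name in sorted(set(active) | set(new))
        if active.get(name) != new.get(name)
    }


# Hypothetical, shortened sample outputs for illustration only.
ACTIVE = """Cisco Adaptive Security Appliance Software Version 8.2(2)
Licensed features for this platform:
Failover : Active/Active
SSL VPN Peers : 10
This platform has an ASA 5510 Security Plus license.
"""
NEW = ACTIVE.replace("SSL VPN Peers : 10", "SSL VPN Peers : 2")

print(software_version(ACTIVE), software_version(NEW))
print(license_mismatches(ACTIVE, NEW))
```

An empty result from `license_mismatches` only means the parsed lines match; it does not guarantee that failover will negotiate.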
Similar Messages
-
Replacement of primary unit failed! (ASA5510 active/standby)
Hi all,
I have an issue bringing up my RMA'd primary ASA unit.
So what happened so far:
1. primary unit failed
2. secondary took over and is now secondary - active (as per sh fail)
3. requested RMA at Cisco
4. got ASA and checked that Lic (SSL), OS (8.2.2) and ASDM are at the same level as the secondary
5. issued wr erase and reloaded
6. copied the following commands to the new (RMA) primary unit:
failover lan unit primary
failover lan interface Failover Ethernet3
failover interface ip Failover 172.x.x.9 255.255.255.248 standby 172.x.x.10
int eth3
no shut
failover
wr mem
7. installed primary unit into rack
8. plugged-in all cables (network, failover, console and power)
9. fired up the primary unit
10. expected that the unit shows:
Detected an Active mate
Beginning configuration replication from mate.
End configuration replication from mate.
11. but nothing happened on primary unit
So can anyone tell me what a valid and viable approach is for replacing a failed primary unit? Is there a missing step that prevents me from replicating the secondary - active config to the primary - standby unit?
I was looking for help on the net but unfortunately I was not able to find anything related to ASA55xx primary unit replacement with a clear guideline or step by step instructions.
Any comments or suggestions are appreciated, and might help others who are in the same situation.
Thanks,
Nico
Hi Varun,
Thanks for picking up this thread.
Here you go:
sh run fail on secondary - active:
failover
failover lan unit secondary
failover lan interface Failover Ethernet0/3
failover key *****
failover link Failover Ethernet0/3
failover interface ip Failover 172.x.x.9 255.255.255.248 standby 172.x.x.10
sh fail hist on secondary - active:
asa1# sh fail hist
==========================================================================
From State                 To State                   Reason
==========================================================================
23:47:15 CEST Feb 19 2011
Not Detected               Negotiation                No Error
23:47:19 CEST Feb 19 2011
Negotiation                Cold Standby               Detected an Active mate
23:47:21 CEST Feb 19 2011
Cold Standby               Sync Config                Detected an Active mate
23:47:36 CEST Feb 19 2011
Sync Config                Sync File System           Detected an Active mate
23:47:36 CEST Feb 19 2011
Sync File System           Bulk Sync                  Detected an Active mate
23:47:50 CEST Feb 19 2011
Bulk Sync                  Standby Ready              Detected an Active mate
10:34:09 CEDT Sep 3 2011
Standby Ready              Just Active                HELLO not heard from mate
10:34:09 CEDT Sep 3 2011
Just Active                Active Drain               HELLO not heard from mate
10:34:09 CEDT Sep 3 2011
Active Drain               Active Applying Config     HELLO not heard from mate
10:34:09 CEDT Sep 3 2011
Active Applying Config     Active Config Applied      HELLO not heard from mate
10:34:09 CEDT Sep 3 2011
Active Config Applied      Active                     HELLO not heard from mate
==========================================================================
sh fail on secondary - active
asa1# show fail
Failover On
Failover unit Secondary
Failover LAN Interface: Failover Ethernet0/3 (up)
Unit Poll frequency 1 seconds, holdtime 15 seconds
Interface Poll frequency 5 seconds, holdtime 25 seconds
Interface Policy 1
Monitored Interfaces 2 of 110 maximum
Version: Ours 8.2(2), Mate 8.2(2)
Last Failover at: 10:34:09 CEDT Sep 3 2011
This host: Secondary - Active
Active time: 441832 (sec)
slot 0: ASA5510 hw/sw rev (2.0/8.2(2)) status (Up Sys)
Interface Outside (x.x.x.14): Normal (Waiting)
Interface Inside (x.x.x.11): Normal (Waiting)
slot 1: empty
Other host: Primary - Failed
Active time: 40497504 (sec)
slot 0: ASA5510 hw/sw rev (2.0/8.2(2)) status (Unknown/Unknown)
Interface Outside (x.x.x.15): Unknown
Interface Inside (x.x.x.12): Unknown
slot 1: empty
Stateful Failover Logical Update Statistics
Link : Failover Ethernet0/3 (up)
Stateful Obj           xmit       xerr        rcv       rerr
General             2250212          0   64800624        309
sys cmd             2250212          0    2249932          0
up time                   0          0          0          0
RPC services              0          0          0          0
TCP conn                  0          0   46402635        309
UDP conn                  0          0      21248          0
ARP tbl                   0          0   15921639          0
Xlate_Timeout             0          0          0          0
IPv6 ND tbl               0          0          0          0
VPN IKE upd               0          0      96977          0
VPN IPSEC upd             0          0     108174          0
VPN CTCP upd              0          0         19          0
VPN SDI upd               0          0          0          0
VPN DHCP upd              0          0          0          0
SIP Session               0          0          0          0
Logical Update Queue Information
                        Cur        Max      Total
Recv Q:                   0         17  203259096
Xmit Q:                   0          1    2250212
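The "This host" and "Other host" lines in the output above are the part that matters most before a replacement unit is allowed to join. Below is a minimal Python sketch that pulls those roles out of saved `show failover` text. The function names and the shortened sample are hypothetical, and a True result is only a sanity check on the surviving unit, not a guarantee of which way the config will replicate.

```python
import re

# Matches e.g. "This host: Secondary - Active" or "Other host: Primary - Failed".
HOST_LINE = re.compile(
    r"^\s*(This|Other) host:\s*(Primary|Secondary)\s*-\s*(.+?)\s*$",
    re.MULTILINE,
)


def failover_roles(show_failover):
    """Return {'this': (unit, state), 'other': (unit, state)} from 'show failover' text."""
    return {
        which.lower(): (unit, state)
        for which, unit, state in HOST_LINE.findall(show_failover)
    }


def survivor_looks_active(show_failover):
    """True if the text reports 'Failover On' and this host in the Active state."""
    failover_on = re.search(r"^\s*Failover On\s*$", show_failover, re.MULTILINE) is not None
    this_host = failover_roles(show_failover).get("this")
    return failover_on and this_host is not None and this_host[1] == "Active"


# Shortened sample based on the output above.
SAMPLE = """Failover On
Failover unit Secondary
This host: Secondary - Active
Other host: Primary - Failed
"""

print(failover_roles(SAMPLE))
print(survivor_looks_active(SAMPLE))
```

Running this against the surviving unit before enabling failover on the new one would at least catch the case where failover had been switched off on the active unit.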
show ver on secondary - active
asa1# sh ver
Cisco Adaptive Security Appliance Software Version 8.2(2)
Device Manager Version 6.2(5)53
Compiled on Mon 11-Jan-10 14:19 by builders
System image file is "disk0:/asa822-k8.bin"
Config file at boot was "startup-config"
asa1 up 200 days 12 hours
failover cluster up 1 year 108 days
Hardware: ASA5510, 256 MB RAM, CPU Pentium 4 Celeron 1600 MHz
Internal ATA Compact Flash, 256MB
Slot 1: ATA Compact Flash, 64MB
BIOS Flash M50FW080 @ 0xffe00000, 1024KB
Encryption hardware device : Cisco ASA-55x0 on-board accelerator (revision 0x0)
Boot microcode : CN1000-MC-BOOT-2.00
SSL/IKE microcode: CNLite-MC-SSLm-PLUS-2.03
IPSec microcode : CNlite-MC-IPSECm-MAIN-2.04
0: Ext: Ethernet0/0 : address is 0022.55cf.7420, irq 9
1: Ext: Ethernet0/1 : address is 0022.55cf.7421, irq 9
2: Ext: Ethernet0/2 : address is 0022.55cf.7422, irq 9
3: Ext: Ethernet0/3 : address is 0022.55cf.7423, irq 9
4: Ext: Management0/0 : address is 0022.55cf.741f, irq 11
5: Int: Not used : irq 11
6: Int: Not used : irq 5
Licensed features for this platform:
Maximum Physical Interfaces : Unlimited
Maximum VLANs : 100
Inside Hosts : Unlimited
Failover : Active/Active
VPN-DES : Enabled
VPN-3DES-AES : Enabled
Security Contexts : 2
GTP/GPRS : Disabled
SSL VPN Peers : 10
Total VPN Peers : 250
Shared License : Disabled
AnyConnect for Mobile : Disabled
AnyConnect for Cisco VPN Phone : Disabled
AnyConnect Essentials : Disabled
Advanced Endpoint Assessment : Disabled
UC Phone Proxy Sessions : 2
Total UC Proxy Sessions : 2
Botnet Traffic Filter : Disabled
This platform has an ASA 5510 Security Plus license.
Serial Number: xxx
Running Activation Key:xxxx
Configuration register is 0x1
Configuration last modified by enable_1 at 10:05:32.149 CEDT Fri Jul 15 2011 -
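In the thread above, the failover lines pasted onto the replacement primary and the `sh run fail` output of the active secondary are both available, so they can be compared directly. Below is a minimal Python sketch of such a comparison. The function names are hypothetical, the two inputs are copied from the posts, and `failover lan unit` is ignored because it is expected to differ. The thread does not establish whether any of the reported differences explain the missing replication.

```python
def failover_lines(config_text):
    """Return the set of 'failover ...' lines, whitespace-normalized, minus 'lan unit'."""
    lines = set()
    for raw in config_text.splitlines():
        line = " ".join(raw.split())
        if not line.startswith("failover"):
            continue  # skip interface commands, 'wr mem', and so on
        if line.startswith("failover lan unit"):
            continue  # primary/secondary is meant to differ between the units
        lines.add(line)
    return lines


def diff_failover(new_unit_cfg, active_unit_cfg):
    """Return the failover lines that exist on only one of the two units."""
    new, active = failover_lines(new_unit_cfg), failover_lines(active_unit_cfg)
    return {
        "only_on_new": sorted(new - active),
        "only_on_active": sorted(active - new),
    }


# Commands the poster copied to the new (RMA) primary unit.
NEW_PRIMARY = """failover lan unit primary
failover lan interface Failover Ethernet3
failover interface ip Failover 172.x.x.9 255.255.255.248 standby 172.x.x.10
int eth3
no shut
failover
wr mem"""

# 'sh run fail' on the secondary - active unit. The key is masked in this
# output, so a 'failover key' line will always show up as a difference.
SECONDARY = """failover
failover lan unit secondary
failover lan interface Failover Ethernet0/3
failover key *****
failover link Failover Ethernet0/3
failover interface ip Failover 172.x.x.9 255.255.255.248 standby 172.x.x.10"""

result = diff_failover(NEW_PRIMARY, SECONDARY)
for side, lines in result.items():
    print(side)
    for line in lines:
        print("  " + line)
```

For these two inputs the comparison flags the interface name (Ethernet3 versus Ethernet0/3) and the `failover key` and `failover link` lines that appear only on the secondary, which would be reasonable things to check first.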
Correct procedure to replace failed secondary ASA unit
Hello,
I just received an RMA replacement for a failed ASA 5520 that was acting as the secondary unit in a multi-context configuration. What is the correct procedure to put it back into production? Do I need to restore the backed-up config of the failed unit, or is it enough to enable multiple mode and connect it to the existing (primary) unit? Any good links to documentation that deals with this would also be appreciated.
Thanks in advance
Configure the ASA for failover communication and as the secondary unit. This is done from the system context, so yes, you need to switch it into multiple context routed mode first. Power the ASA on and connect only the failover communication interface. This makes sure the primary sees it as failed. Once failover communication is up and the configuration synchronization and connection replication are finished, connect the traffic interfaces.
This is pretty much it. Hope it helps.
Sent from Cisco Technical Support iPad App -
REPLACING THE PRIMARY INTERNAL BATTERY 601
My Pavilion dv6-2150ev tells me I must replace the primary internal battery (601).
I do not know which battery this is or how to change it.
Hi,
This is a VERY misleading message. It actually refers to the main battery. Before buying a new battery, please try the following test:
http://h10025.www1.hp.com/ewfrf/wc/document?cc=us&lc=en&docname=c00821536
I hope the test and calibration help; otherwise you will have to buy a new battery. You can buy one from HP or from:
http://www.top-laptop-battery.com.au/6cell-hp-pavilion-dv62150ev-battery-108v-5200mah-p-9983.html
Regards.
BH
-
Replacing the primary site server, SCCM 2012 R2
Hi,
Setup is currently SCCM 2012 R2 with the primary site server on Windows 2008 R2 on a Hyper-V VM with a separate physical SQL server on 2008 R2 (Windows and SQL). I'm not making any changes to the SQL server at this stage.
I want to replace the primary site server VM completely with a brand new VM running Windows 2012 R2.
I was planning to run the SCCM setup with the "Expand an existing stand-alone primary into a hierarchy" option, but it won't let me do that: "The site code you specified should not be the same as the primary site's site code".
So it seems this "Expand an existing" option isn't the right option for me?
Do I just add the new server in the SCCM console and add all roles and test and then delete the roles from the old server?
Thanks
EDIT: I have 7 secondary sites, so I don't want to just start over by doing some sort of backup and restore to the new server unless someone can tell me that won't break my hierarchy.
A backup and recovery is the correct way to do this, but there are a few things you should keep in mind.
Do a backup of your current environment and shut everything down.
Install the new VM and ensure the configuration is identical (DNS, hostname, etc.); install the same software and prerequisites.
Do a recovery.
Check the following links on how to do a backup and restore:
http://www.windows-noob.com/forums/index.php?/topic/7403-how-can-i-backup-system-center-2012-configuration-manager/
and
http://technet.microsoft.com/en-us/library/gg712697.aspx -
Why do we have to flashback the failed primary DB to make it the new standby ?
RDBMS Version: 11.2.0.3
Platform : RHEL 5.8
Standby Type : Physical Standby
Primary DB name : berlin
Standby DB Unique name : lisbon
Let's say the berlin DB goes down due to a hardware failure, so lisbon is made the primary DB. Now lisbon is running without a standby DB.
berlin crashed --------> at SCN 4000
lisbon takes over ------> from SCN 4000 and proceeds
In the below url, some steps involving flashback are mentioned on how to make the failed primary (berlin) the new physical standby.
http://docs.oracle.com/cd/E11882_01/server.112/e25608/scenarios.htm#i1050055
berlin DB crashed at SCN 4000. So, to make berlin the new physical standby, can't we just run
SQL> ALTER DATABASE CONVERT TO PHYSICAL STANDBY;
and mount the DB and start Redo Apply?
Why do we need to flash back the old primary DB to SCN 4000? Wasn't berlin already at SCN 4000 when it crashed?
Hello;
The key is that the old primary's SCN must be at exactly the point where the standby became primary.
Generally you use Flashback so that the old primary is at a point from which it can become the new standby. Once you are there, you can use a switchover to return to the original Data Guard setup.
To get the correct SCN I use :
SELECT TO_CHAR(STANDBY_BECAME_PRIMARY_SCN) FAILOVER_SCN FROM V$DATABASE;
Example
SQL> FLASHBACK DATABASE TO SCN 1011295;
Flashback complete.
SQL>ALTER DATABASE CONVERT TO PHYSICAL STANDBY;
Database altered
If you don't have flashback you can use RMAN to perform the same thing.
Best Regards
mseberg -
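The question above assumes berlin stopped at exactly the SCN from which lisbon took over. Below is a toy Python sketch of why that may not hold: if berlin generated redo that never reached lisbon before the crash, berlin's last SCN is ahead of STANDBY_BECAME_PRIMARY_SCN and berlin contains changes that lisbon never saw. The function name and the 4010 figure are invented for illustration; only 4000 comes from the question, and on a real system the failover SCN comes from the V$DATABASE query shown in the answer.

```python
def divergence(old_primary_last_scn, standby_became_primary_scn):
    """Toy model: report whether the old primary ran past the failover SCN."""
    gap = old_primary_last_scn - standby_became_primary_scn
    return {"diverged": gap > 0, "scn_gap": max(gap, 0)}


# Idealized case from the question: both databases meet at SCN 4000.
print(divergence(4000, 4000))

# Hypothetical case: berlin reached SCN 4010, but lisbon took over at 4000.
# Flashing berlin back to the failover SCN would remove that gap.
print(divergence(4010, 4000))
```

This is only a way to picture the answer; it says nothing about how Oracle tracks SCNs internally.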
Replacing a "primary" DC Win2K3 ?
Hello,
I've two DC servers (AD, DHCP, DNS and WINS), running Windows 2003R2 SP3, and the first of them (primary/master) died.
I'll replace it with a new Windows 2003 server (I can not use a newer version for the moment because we don't have 2008/2012 license).
What is the best practice to replace a "primary" DC?
Do I need to transfer FSMO roles to my second server, then add the new server and switch back the FSMO roles to the new server?
Thanks,
Chris
If the DC that previously held the FSMO roles really has died and you have no backup, then you will need to seize the roles on the surviving DC before you promote a new DC. The procedure for seizing the roles is described in the following article.
http://support2.microsoft.com/kb/255504/en-nz
Also bear in mind that Windows Server 2003 reaches end of extended support in July 2015.
Tony www.activedir.org Blog: www.open-a-socket.com -
How can I replace my primary address with another address?
How can I replace my primary address with another address? I have a problem: it seems someone has used my primary address for themselves, and I just want to delete this primary account. Thank you in advance!
Hi AnaMusic,
The problem is that the "Edit" button doesn't exist in the "Apple ID and Primary Email Address" section. Do you know how to make it appear? I absolutely need help!
Apple ID and Primary Email Address -
From journalctl -b, I get some warning messages like this:
Oct 26 12:52:48 myhost dbus-daemon[359]: dbus[359]: [system] Activating via systemd: service name='org.bluez' unit='dbus-org.bluez.service'
Oct 26 12:52:48 myhost dbus[359]: [system] Activating via systemd: service name='org.bluez' unit='dbus-org.bluez.service'
Oct 26 12:52:48 myhost dbus-daemon[359]: dbus[359]: [system] Activation via systemd failed for unit 'dbus-org.bluez.service': Unit dbus-org.
Oct 26 12:52:48 myhost dbus[359]: [system] Activation via systemd failed for unit 'dbus-org.bluez.service': Unit dbus-org.bluez.service fail
Oct 26 12:52:48 myhost pulseaudio[1234]: [pulseaudio] bluetooth-util.c: org.bluez.Manager.ListAdapters() failed: org.freedesktop.systemd1.Lo
Oct 26 12:52:49 myhost rtkit-daemon[1235]: Successfully made thread 1260 of process 1260 (/usr/bin/pulseaudio) owned by '1000' high priority
Oct 26 12:52:49 myhost rtkit-daemon[1235]: Supervising 2 threads of 2 processes of 1 users.
Oct 26 12:52:49 myhost pulseaudio[1260]: [pulseaudio] pid.c: Daemon already running.
I want to disable bluez.service, but it failed:
#systemctl disable dbus-org.bluez.service
Failed to issue method call: No such file or directory
I have no Bluetooth device in my laptop. How can I resolve this problem?
Thanks in advance.
Last edited by eastpeace (2012-10-26 06:12:53)
And some additional information:
# systemctl status dbus-org.bluez.service
dbus-org.bluez.service
Loaded: error (Reason: No such file or directory)
Active: inactive (dead) -
Hello
I'm trying to replace the first primary server.
I get error 22 when moving the Certificate Authority role to my new server with "zman cai ......."
(Error:Could not find the object "Role ApplianceServer is not valid, server type: Primary")
The old server is a physical 32-bit SLES 10 machine running version 10.3.3; my new server is a 64-bit SLES 10 virtual appliance on 10.3.0, upgraded to 10.3.3.
The documentation says it's not possible to move the CA between Windows and Linux, but what about moving between x86 and x64?
/msv
msv,
It appears that in the past few days you have not received a response to your
posting. That concerns us, and has triggered this automated reply.
Has your problem been resolved? If not, you might try one of the following options:
- Visit http://support.novell.com and search the knowledgebase and/or check all
the other self support options and support programs available.
- You could also try posting your message again. Make sure it is posted in the
correct newsgroup. (http://forums.novell.com)
Be sure to read the forum FAQ about what to expect in the way of responses:
http://forums.novell.com/faq.php
If this is a reply to a duplicate posting, please ignore and accept our apologies
and rest assured we will issue a stern reprimand to our posting bot.
Good luck!
Your Novell Product Support Forums Team
http://forums.novell.com/ -
REG : HOT CODE REPLACE FAILED
Hi,
I am able to build and deploy, and the IDE says the deployment is successful. But when I try to write some new code in my application or debug it, I get this error message:
"Java HotSpot(TM) Server VM{PSC-PC11741:500021](may be out of synch) was unable to replace the running code with the code in the work space.
Reason:
Hot code replace failed - VM may be inconsistent
Has anybody faced this problem? Please let me know if you have.
Regards,
Anu
Synchronized and rebuilt it. Created a new activity after reverting the tasks done before.
-
Hot code replace failed - Delete method not implemented
I debug my Java application with Eclipse (3.1). When I change my code during debugging, I get the following error message:
...MyApp at localhost:4540 (may be out of synch) was unable to replace the running code with the code in the workspace.
Reason:
Hot code replace failed - Delete method not implemented
What does that mean and how can I fix this?
This means you changed a class while it was debugging an application and it could not update the class for the application while it was running.
The error suggests you may be running an older JVM, i.e. pre-1.4.2 but this error can occur with any JVM if the change is incompatible with the previous version of the class.
The hot replace often does not work for non trivial changes to code. -
Hot Code replace failed- VM inconsistent
When I try to deploy the application on WAS through NWDS, it fails with "Hot Code replace failed- VM inconsistent".
Please give your input.
thanks,
Hi,
> It ask me for SDM password, than it throws the error out.
I do not really understand why the SDM password should be checked for debugging...
(Double-)Check the WD-Debugging infos here: http://help.sap.com/saphelp_nw04/helpdata/en/cc/9cb34d9d11f74c98644df2b96b90f1/frameset.htm
For standard hot code replacement issues, see for example: NetWeaver Portal Debugging - 2, HotSwap your classes
Maybe putting this question within the WD forum could bring out more...
Hope it helps
Detlev -
Hot code replace failed-VM may be inconsistent
Java HotSpot(TM)Server VM(micserver:50021)(may be out of synch)
was unable to replace the running code with the code in the
workspace
Reason:
Hot code replace failed-VM may be inconsistent
I get the message above, and the running code stays on the old version. This makes debugging a problem.
Hi Yzme,
This happens because you saved changes to code that is being debugged while your debugger session was still on.
To avoid this, close your debugger before making any changes to your code.
Regards,
Shubham -
Why doesn't Apple repair iPod screens instead of replacing the whole unit?
I dropped my iPod a while back, and while the damage was initially superficial, it has got to the point where 75% of the screen is unreadable and the touchscreen is completely broken.
Why does Apple charge £100+ to replace the whole unit when there are other services that can repair the screen for less?
All any of us here can say is "because that's how they decided to handle it". It's not uncommon for manufacturers of small electronic devices to just replace them for a flat fee regardless of the problem, though.
Regards.