Surprises on Replacing failed primary unit

Dear friends,
I have done failover for firewalls umpteen times, but yesterday it failed for some reason.
I had replaced the failed primary unit with a fresh one and expected it to detect the secondary unit as active and begin config replication from it. Instead, it wiped out the secondary unit's config. I don't think I got the sequence wrong, but let me share what I did:
1. Entered the four or five lines of failover configuration (everything except the failover command itself) and did a no shut on the failover interface (management0/0)
2. Ran the failover command
Instead of getting the config from the active unit, the new unit started forcing its own config onto the other unit. To recover, I had to reload the active unit so that it came back with its saved config. After that I reloaded the fresh unit, and this time failover happened as expected.
I think I should have forced a reload of the new unit before trying to establish failover.
Has anyone done this in a fail-proof way during production hours? If yes, can you please share the steps with me?
I did not ask for downtime because I was confident, but I ended up bringing down the ASA for 5 minutes because of the unexpected failover action.
Thanks a lot
Gautam

Dear kureli,
Thanks a lot for the efforts you took. I really appreciate it.
Here's the exact sequence of steps that happened:
1.  When the primary unit failed, the secondary became active. I don't remember whether sh fail showed "secondary- not detected" or "secondary - failed".
2.  I replaced the faulty primary unit with another primary unit, did a no shut on the m0/0 failover interface, and entered all the failover commands except the "failover" command itself.
3. I made sure that the new primary unit ran the same code. I checked only the main software version, not the ASDM version, and the ASDM versions did in fact differ between the two boxes.
4. After powering up the box and connecting the cables, I issued failover. It then told me that the SSL license was not the same on both units and that it was disabling failover.
5. I applied for an activation key from [email protected] and got the SSL license from them.
6. The next day I went back to the customer and installed the license key. After installing it, I issued failover. It gave me the message "No response from mate".
7. I then issued no failover to disable failover on the new primary unit.
8. I then went to the secondary (active) unit and issued failover, as failover was disabled there.
9. I then went back to the primary unit and issued failover.
10. This is where the blank config replication started!
11. Reloaded the secondary unit to undo the blank running config.
12. Went to the primary unit and disconnected the failover cable. Rebooted the primary unit and reconnected the failover cable.
13. The secondary came up as active, the primary then came up, and this time the primary honored the secondary as active and did config replication from it.
14. All was well then!!
I'm still not sure why this happened, and it was a bit embarrassing to see it after 3.5 years of firewalling experience.
Anyway, I am willing to learn and improve from now on.
Next time I will probably apply the failover configs, reload, and connect the failover cable while the unit is reloading.
I think the lesson is that when a unit reloads, it always honors the currently active unit and does not try to override its role.
This is what worked for me.
Thanks a lot
Gautam
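
For anyone who wants to turn the checks from this thread into a habit, here is a minimal sketch of a pre-flight comparison. It is only an illustration, not a Cisco tool: the `UnitFacts` structure and the `preflight` function are hypothetical names, and the values are assumed to be read off `show version` and `show failover` by hand before the failover command is issued on the replacement unit.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UnitFacts:
    """Facts copied by hand from the CLI output (hypothetical structure)."""
    software_version: str   # e.g. "8.2(2)"
    ssl_vpn_peers: int      # licensed SSL VPN peers
    failover_state: str     # e.g. "Active", "Failed", "Negotiation"


def preflight(new_unit: UnitFacts, surviving_unit: UnitFacts) -> List[str]:
    """Return a list of problems to fix before enabling failover on the new unit."""
    problems = []
    if new_unit.software_version != surviving_unit.software_version:
        problems.append("software versions differ")
    if new_unit.ssl_vpn_peers != surviving_unit.ssl_vpn_peers:
        problems.append("SSL VPN licenses differ")
    if surviving_unit.failover_state != "Active":
        problems.append("surviving unit is not Active")
    return problems


# Example values are made up for illustration.
replacement = UnitFacts("8.2(2)", 2, "Negotiation")
survivor = UnitFacts("8.2(2)", 10, "Active")
issues = preflight(replacement, survivor)
print(issues)
```

An empty list only means that the checks mentioned in this thread pass. It says nothing about the order of the failover and reload commands, which is what actually caused the blank replication above.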

Similar Messages

  • Replacement of primary unit failed! (ASA5510 active/standby)

    Hi all,
    I have an issue bringing up my RMA'd primary ASA unit.
    So what happened so far:
    1. primary unit failed
    2. secondary took over and is now secondary - active (as per sh fail)
    3. requested RMA at Cisco
    4. got ASA and checked that Lic (SSL), OS (8.2.2) and ASDM are at the same level as the secondary
    5. issued wr erase and reloaded
    6. copied the following commands to the new (RMA) primary unit:
    failover lan unit primary
    failover lan interface Failover Ethernet3
    failover interface ip Failover 172.x.x.9 255.255.255.248 standby 172.x.x.10
    int eth3
    no shut
    failover
    wr mem
    7. installed primary unit into rack
    8. plugged-in all cables (network, failover, console and power)
    9. fired up the primary unit
    10. expected that the unit shows:
    Detected an Active mate
    Beginning configuration replication from mate.
    End configuration replication from mate.
    11. but nothing happened on primary unit
    So can anyone tell me what a valid and viable approach is for replacing a failed primary unit? Is there a missing step that prevents me from replicating the secondary - active config to the primary - standby unit?
    I looked for help on the net, but unfortunately I was not able to find anything about ASA55xx primary unit replacement with a clear guideline or step-by-step instructions.
    Any comments or suggestions are appreciated, and might help others who are in the same situation.
    Thanks,
    Nico

    Hi Varun,
    Thanks for picking up this thread.
    Here you go:
    sh run fail on secondary - active:
    failover
    failover lan unit secondary
    failover lan interface Failover Ethernet0/3
    failover key *****
    failover link Failover Ethernet0/3
    failover interface ip Failover 172.x.x.9 255.255.255.248 standby 172.x.x.10
    sh fail hist on secondary - active:
    asa1# sh fail hist
    ==========================================================================
    From State                 To State                   Reason
    ==========================================================================
    23:47:15 CEST Feb 19 2011
    Not Detected               Negotiation                No Error
    23:47:19 CEST Feb 19 2011
    Negotiation                Cold Standby               Detected an Active mate
    23:47:21 CEST Feb 19 2011
    Cold Standby               Sync Config                Detected an Active mate
    23:47:36 CEST Feb 19 2011
    Sync Config                Sync File System           Detected an Active mate
    23:47:36 CEST Feb 19 2011
    Sync File System           Bulk Sync                  Detected an Active mate
    23:47:50 CEST Feb 19 2011
    Bulk Sync                  Standby Ready              Detected an Active mate
    10:34:09 CEDT Sep 3 2011
    Standby Ready              Just Active                HELLO not heard from mate
    10:34:09 CEDT Sep 3 2011
    Just Active                Active Drain               HELLO not heard from mate
    10:34:09 CEDT Sep 3 2011
    Active Drain               Active Applying Config     HELLO not heard from mate
    10:34:09 CEDT Sep 3 2011
    Active Applying Config     Active Config Applied      HELLO not heard from mate
    10:34:09 CEDT Sep 3 2011
    Active Config Applied      Active                     HELLO not heard from mate
    ==========================================================================
    sh fail on secondary - active
    asa1# show fail
    Failover On
    Failover unit Secondary
    Failover LAN Interface: Failover Ethernet0/3 (up)
    Unit Poll frequency 1 seconds, holdtime 15 seconds
    Interface Poll frequency 5 seconds, holdtime 25 seconds
    Interface Policy 1
    Monitored Interfaces 2 of 110 maximum
    Version: Ours 8.2(2), Mate 8.2(2)
    Last Failover at: 10:34:09 CEDT Sep 3 2011
            This host: Secondary - Active
                    Active time: 441832 (sec)
                    slot 0: ASA5510 hw/sw rev (2.0/8.2(2)) status (Up Sys)
                      Interface Outside (x.x.x.14): Normal (Waiting)
                      Interface Inside (x.x.x.11): Normal (Waiting)
                    slot 1: empty
            Other host: Primary - Failed
                    Active time: 40497504 (sec)
                    slot 0: ASA5510 hw/sw rev (2.0/8.2(2)) status (Unknown/Unknown)
                      Interface Outside (x.x.x.15): Unknown
                      Interface Inside (x.x.x.12): Unknown
                    slot 1: empty
    Stateful Failover Logical Update Statistics
            Link : Failover Ethernet0/3 (up)
            Stateful Obj    xmit       xerr       rcv        rerr
            General         2250212    0          64800624   309
            sys cmd         2250212    0          2249932    0
            up time         0          0          0          0
            RPC services    0          0          0          0
            TCP conn        0          0          46402635   309
            UDP conn        0          0          21248      0
            ARP tbl         0          0          15921639   0
            Xlate_Timeout   0          0          0          0
            IPv6 ND tbl     0          0          0          0
            VPN IKE upd     0          0          96977      0
            VPN IPSEC upd   0          0          108174     0
            VPN CTCP upd    0          0          19         0
            VPN SDI upd     0          0          0          0
            VPN DHCP upd    0          0          0          0
            SIP Session     0          0          0          0
            Logical Update Queue Information
                            Cur     Max     Total
            Recv Q:         0       17      203259096
            Xmit Q:         0       1       2250212
    show ver on secondary - active
    asa1# sh ver
    Cisco Adaptive Security Appliance Software Version 8.2(2)
    Device Manager Version 6.2(5)53
    Compiled on Mon 11-Jan-10 14:19 by builders
    System image file is "disk0:/asa822-k8.bin"
    Config file at boot was "startup-config"
    asa1 up 200 days 12 hours
    failover cluster up 1 year 108 days
    Hardware:   ASA5510, 256 MB RAM, CPU Pentium 4 Celeron 1600 MHz
    Internal ATA Compact Flash, 256MB
    Slot 1: ATA Compact Flash, 64MB
    BIOS Flash M50FW080 @ 0xffe00000, 1024KB
    Encryption hardware device : Cisco ASA-55x0 on-board accelerator (revision 0x0)
                                 Boot microcode   : CN1000-MC-BOOT-2.00
                                 SSL/IKE microcode: CNLite-MC-SSLm-PLUS-2.03
                                 IPSec microcode  : CNlite-MC-IPSECm-MAIN-2.04
    0: Ext: Ethernet0/0         : address is 0022.55cf.7420, irq 9
    1: Ext: Ethernet0/1         : address is 0022.55cf.7421, irq 9
    2: Ext: Ethernet0/2         : address is 0022.55cf.7422, irq 9
    3: Ext: Ethernet0/3         : address is 0022.55cf.7423, irq 9
    4: Ext: Management0/0       : address is 0022.55cf.741f, irq 11
    5: Int: Not used            : irq 11
    6: Int: Not used            : irq 5
    Licensed features for this platform:
    Maximum Physical Interfaces    : Unlimited
    Maximum VLANs                  : 100
    Inside Hosts                   : Unlimited
    Failover                       : Active/Active
    VPN-DES                        : Enabled
    VPN-3DES-AES                   : Enabled
    Security Contexts              : 2
    GTP/GPRS                       : Disabled
    SSL VPN Peers                  : 10
    Total VPN Peers                : 250
    Shared License                 : Disabled
    AnyConnect for Mobile          : Disabled
    AnyConnect for Cisco VPN Phone : Disabled
    AnyConnect Essentials          : Disabled
    Advanced Endpoint Assessment   : Disabled
    UC Phone Proxy Sessions        : 2
    Total UC Proxy Sessions        : 2
    Botnet Traffic Filter          : Disabled
    This platform has an ASA 5510 Security Plus license.
    Serial Number: xxx
    Running Activation Key:xxxx
    Configuration register is 0x1
    Configuration last modified by enable_1 at 10:05:32.149 CEDT Fri Jul 15 2011
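
Before enabling failover on a replacement unit, the one thing worth confirming in the output above is which unit reports itself as Active. A minimal sketch of pulling that out of saved `show failover` text follows. The function name `parse_failover_hosts` and the shortened `SAMPLE` text are assumptions for illustration, and the regular expression only covers the "This host" / "Other host" line format shown in this post.

```python
import re
from typing import Dict

# Matches lines such as "This host: Secondary - Active".
HOST_LINE = re.compile(
    r"^\s*(This host|Other host):\s*(Primary|Secondary)\s*-\s*(.+?)\s*$"
)


def parse_failover_hosts(show_failover_output: str) -> Dict[str, Dict[str, str]]:
    """Map 'This host' / 'Other host' to their unit role and failover state."""
    hosts = {}
    for line in show_failover_output.splitlines():
        match = HOST_LINE.match(line)
        if match:
            which, unit, state = match.groups()
            hosts[which] = {"unit": unit, "state": state}
    return hosts


# Shortened copy of the output posted above.
SAMPLE = """\
Failover On
Failover unit Secondary
        This host: Secondary - Active
                Active time: 441832 (sec)
        Other host: Primary - Failed
                Active time: 40497504 (sec)
"""

hosts = parse_failover_hosts(SAMPLE)
print(hosts["This host"]["state"], hosts["Other host"]["state"])
```

Run against output captured on the surviving unit, this confirms that "This host" is Active and that the mate is seen as Failed before anything is changed on the replacement.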

  • Correct procedure to replace failed secondary ASA unit

    Hello
    I just received an RMA for a failed ASA 5520 that was acting as the secondary unit in a multicontext configuration. What would be the correct procedure to install it back in production? Do I need to restore the backed-up config of the failed unit, or is it enough to enable multimode and connect it to the existing (primary) unit? Any good link to documentation that deals with these issues would also be appreciated.
    Thanks in advance

    Configure the ASA for failover communication and as a secondary unit. This is done from the system context, so yes, you need to switch it into multiple context routed mode. Power the ASA on and connect only the failover communication interface. This will make sure that it is seen by the primary as failed. Once the failover communication is up and the configuration synchronisation and connection replication are complete, connect the traffic interfaces.
    This is pretty much it. Hope it helps.

  • REPLACING THE PRIMARY INTERNAL BATTERY 601

    My Pavilion dv6-2150ev tells me I must replace the primary internal battery (601).
    I do not know which battery this is or how to change it.

    Hi,
    This is a VERY misleading message. It's actually the main battery. Before buying a new battery, please try the following test:
        http://h10025.www1.hp.com/ewfrf/wc/document?cc=us&lc=en&docname=c00821536
    Hope the test and calibrate help otherwise you have to buy new battery. You can buy from HP or from:
        http://www.top-laptop-battery.com.au/6cell-hp-pavilion-dv62150ev-battery-108v-5200mah-p-9983.html
    Regards.
    BH

  • Replacing the primary site server, SCCM 2012 R2

    Hi,
    Setup is currently SCCM 2012 R2 with the primary site server on Windows 2008 R2 on a Hyper-V VM with a separate physical SQL server on 2008 R2 (Windows and SQL). I'm not making any changes to the SQL server at this stage.
    I want to completely replace the primary site server VM with a brand new VM running Windows 2012 R2.
    I was looking to run the SCCM setup and use the "Expand an existing stand-alone primary into a hierarchy" option, but it won't let me do that because of "The site code you specified should not be the same as the primary site's site code".
    So seems this Expand an existing option isn't the right option for me?
    Do I just add the new server in the SCCM console and add all roles and test and then delete the roles from the old server?
    Thanks
    EDIT: I have 7 secondary sites, so I don't want to just start over with some sort of backup and restore to the new server unless someone can tell me that it won't break my hierarchy.

    A backup and recovery is the correct way to do this, but there are a few things you should keep in mind.
    Do a backup of your current environment and shut everything down.
    Install the new VM, make sure the configuration is identical (DNS, hostname, etc.), and install the same software and prerequisites.
    Do a recovery.
    Check the following links on how to do a backup and restore:
    http://www.windows-noob.com/forums/index.php?/topic/7403-how-can-i-backup-system-center-2012-configuration-manager/
    and
    http://technet.microsoft.com/en-us/library/gg712697.aspx

  • Why do we have to flashback the failed primary DB to make it the new standby ?

    RDBMS Version: 11.2.0.3
    Platform : RHEL 5.8
    Standby Type : Physical Standby
    Primary DB name        : berlin
    Standby DB Unique name : lisbon
    Let's say the berlin DB goes down due to a hardware failure, so lisbon is made the primary DB. Now lisbon is running without a standby DB.
    berlin crashed --------> at SCN 4000
    lisbon takes over ------> from SCN 4000 and proceeds
    In the below url, some steps involving flashback are mentioned on how to make the failed primary (berlin) the new physical standby.
    http://docs.oracle.com/cd/E11882_01/server.112/e25608/scenarios.htm#i1050055
    berlin DB crashed at SCN 4000. So, to make berlin the new physical standby, can't we just run
    SQL> ALTER DATABASE CONVERT TO PHYSICAL STANDBY;
    and mount the DB and start the Redo apply ?
    Why do we need to flashback the old primary DB to SCN 4000. Wasn't berlin DB already at SCN 4000 when it crashed?

    Hello;
    The key is the SCN must be at exactly the same point the Standby became Primary.
    Generally you want to use Flashback so the "Old primary" can be at a point where it can become the new Standby. Once you are there you can use Switchover to return the Data Guard setup.
    To get the correct SCN I use :
    SELECT TO_CHAR(STANDBY_BECAME_PRIMARY_SCN) FAILOVER_SCN FROM V$DATABASE;
    Example
    SQL> FLASHBACK DATABASE TO SCN 1011295;
    Flashback complete.
    SQL> ALTER DATABASE CONVERT TO PHYSICAL STANDBY;
    Database altered
    If you don't have flashback you can use RMAN to perform the same thing.
    Best Regards
    mseberg
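
To make the reasoning above concrete, here is a small sketch of the comparison behind the flashback step. It is a simplified model, not Oracle tooling: `flashback_target` is a hypothetical helper and the SCN values are made up. The assumption is that the old primary can have written changes past the point at which the standby became primary (for example, redo that never reached the standby), so the SCN where it crashed is not necessarily the SCN where the new primary took over.

```python
from typing import Optional


def flashback_target(old_primary_scn: int,
                     standby_became_primary_scn: int) -> Optional[int]:
    """Return the SCN the old primary must be flashed back to.

    Returns None when, in this simplified model, the old primary never
    advanced past the SCN at which the standby became primary.
    """
    if old_primary_scn > standby_became_primary_scn:
        return standby_became_primary_scn
    return None


# Hypothetical numbers: berlin crashed at SCN 4000, but lisbon was
# activated at SCN 3990 because the last redo was never received.
target = flashback_target(old_primary_scn=4000, standby_became_primary_scn=3990)
print(target)
```

In practice the second value comes from the STANDBY_BECAME_PRIMARY_SCN query shown above, run on the new primary.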

  • Replacing a "primary" DC Win2K3 ?

    Hello,
    I've two DC servers (AD, DHCP, DNS and WINS), running Windows 2003R2 SP3, and the first of them (primary/master) died.
    I'll replace it with a new Windows 2003 server (I can not use a newer version for the moment because we don't have 2008/2012 license).
    What is the best practice to replace a "primary" DC?
    Do I need to transfer FSMO roles to my second server, then add the new server and switch back the FSMO roles to the new server?
    Thanks,
    Chris

    If the DC that previously held the FSMO roles really has died and you have no backup, then you will need to seize the roles to the surviving DC before you promote a new DC.  The procedure for seizing the roles is described in the following article.
    http://support2.microsoft.com/kb/255504/en-nz
    Also bear in mind that Windows Server 2003 reaches end of extended support in July 2015.
    Tony www.activedir.org Blog: www.open-a-socket.com

  • How can I replace my primary address with another address?

    How can I replace my primary address with another address? I have a problem: it seems someone has used my primary address as their own identity, and I just want to delete this primary account. Thank you in advance!

    Hi AnaMusic,
    The problem is that the "Edit" button doesn't exist in the "Apple ID and Primary Email Address" section. Do you know how to make it appear? I really need help!

  • Dbus[359]: [system] Activation via systemd failed for unit 'dbus-org.b

    from  journalctl -b, I get some warning messages like this.
    Oct 26 12:52:48 myhost dbus-daemon[359]: dbus[359]: [system] Activating via systemd: service name='org.bluez' unit='dbus-org.bluez.service'
    Oct 26 12:52:48 myhost dbus[359]: [system] Activating via systemd: service name='org.bluez' unit='dbus-org.bluez.service'
    Oct 26 12:52:48 myhost dbus-daemon[359]: dbus[359]: [system] Activation via systemd failed for unit 'dbus-org.bluez.service': Unit dbus-org.
    Oct 26 12:52:48 myhost dbus[359]: [system] Activation via systemd failed for unit 'dbus-org.bluez.service': Unit dbus-org.bluez.service fail
    Oct 26 12:52:48 myhost pulseaudio[1234]: [pulseaudio] bluetooth-util.c: org.bluez.Manager.ListAdapters() failed: org.freedesktop.systemd1.Lo
    Oct 26 12:52:49 myhost rtkit-daemon[1235]: Successfully made thread 1260 of process 1260 (/usr/bin/pulseaudio) owned by '1000' high priority
    Oct 26 12:52:49 myhost rtkit-daemon[1235]: Supervising 2 threads of 2 processes of 1 users.
    Oct 26 12:52:49 myhost pulseaudio[1260]: [pulseaudio] pid.c: Daemon already running.
    I want to disable bluez.service, but it fails:
    #systemctl disable dbus-org.bluez.service
    Failed to issue method call: No such file or directory
    I have no Bluetooth device in my laptop. How do I resolve this problem?
    Thanks in advance.

    AND additional information
    # systemctl status dbus-org.bluez.service
    dbus-org.bluez.service
          Loaded: error (Reason: No such file or directory)
          Active: inactive (dead)

  • Replacing first primary

    Hello
    I'm trying to replace the first primary server.
    I get error 22 when moving the Certificate Authority role to my new server with "zman cai ......."
    (Error:Could not find the object "Role ApplianceServer is not valid, server type: Primary")
    The old server is a physical 32-bit SLES 10 machine at version 10.3.3; my new server is a virtual appliance (SLES 10 x64) at 10.3.0, upgraded to 10.3.3.
    The documentation says that it's not possible to move the CA between Windows and Linux, but how about moving between x86 and x64?
    /msv


  • REG : HOT CODE REPLACE FAILED

    Hi,
    I am able to build and deploy, and the IDE says the deployment is successful. But when I try to write new code in my application or debug it, I get the following error message:
    "Java HotSpot(TM) Server VM{PSC-PC11741:500021](may be out of synch) was unable to replace the running code with the code in the work space.
    Reason:
    Hot code replace failed - VM may be inconsistent
    Has anybody faced this problem? Please let me know if you have.
    Regards,
    Anu

    Synced and rebuilt it. Created a new activity after reverting the tasks done before.

  • Hot code replace failed - Delete method not implemented

    I debug my Java application with Eclipse (3.1). When I change my code during debugging, I get the following error message:
    ...MyApp at localhost:4540 (may be out of synch) was unable to replace the running code with the code in the workspace.
    Reason:
    Hot code replace failed - Delete method not implemented
    What does that mean and how can I fix this?

    This means you changed a class while it was debugging an application and it could not update the class for the application while it was running.
    The error suggests you may be running an older JVM, i.e. pre-1.4.2 but this error can occur with any JVM if the change is incompatible with the previous version of the class.
    Hot code replace often does not work for non-trivial changes to code.

  • Hot Code replace failed- VM inconsistent

    When I try to deploy the application on WAS through NWDS, it throws "Hot Code replace failed- VM inconsistent".
    Please give your input.
    thanks,

    Hi,
    > It ask me for SDM password, than it throws the error out.
    I do not really understand why the SDM password should be checked for debugging...
    (Double-)Check the WD-Debugging infos here: http://help.sap.com/saphelp_nw04/helpdata/en/cc/9cb34d9d11f74c98644df2b96b90f1/frameset.htm
    And check for standard HotCodeReplacement issues this for example: NetWeaver Portal Debugging - 2, HotSwap your classes
    Maybe putting this question within the WD forum could bring out more...
    Hope it helps
    Detlev

  • Hot code replace failed-VM may be inconsistent

    Java HotSpot(TM)Server VM(micserver:50021)(may be out of synch)
    was unable to replace the running code with the code in the
    workspace
    Reason:
    Hot code replace failed-VM may be inconsistent
    I get the message above, and the running code stays on the old version.
    This makes debugging a problem.

    Hi Yzme,
    This happens because you saved changes to the code being debugged while your debugger session was still on.
    To avoid this, close your debugger before making any changes to your code.
    Regards,
    Shubham

  • Why don't apple repair iPod screen instead of replacing the whole unit?

    I dropped my iPod a while back now and while it was initially superficial damage it's got to the point where 75% of the screen is unreadable and the touchscreen is completely broken.
    Why does Apple charge £100+ to replace the whole unit when there are other services that can repair the screen for less?

    All any of us here can say is "because that's how they decided to handle it". It's not uncommon for manufacturers of small electronic devices to just replace them for a flat fee regardless of the problem, though.
    Regards.
