Manual or Automatic failover?

For a SQL Server (2005/2008/R2) cluster, after a failover from one node to another, how do I tell whether it was an automatic failover or somebody failed it over manually? This would help further troubleshoot what caused the failover. Please consider both the Windows Server 2003 and Windows Server 2008 scenarios. Thanks.

Hi yarkandstar,
Based on your description, you can check the SQL Server error logs to find out whether it was an automatic or a manual failover in the SQL Server cluster. In the error logs, if SQL Server restarts multiple times on the same node and then comes online on the other node, it was most likely an automatic failover. If SQL Server is running on a different node after the first restart, it points to a manual failover.
Also, as Balmukund’s post notes, you can check the cluster log to find out whether the failover was automatic or manual.
For more details about checking the cluster log in a Windows Server 2003 cluster, please review this blog:
How to find whether SQL Server had an Automatic Failover or a Manually initiated Failover in cluster.
For more details about checking the cluster log in a Windows Server 2008 cluster, please review this similar thread:
Need to know if it was a manual or an automatic failover - Windows Server 2008 R2.
Thanks,
Lydia Zhang
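The error-log rule of thumb from the reply above can be written down as a small function. This is only a sketch of that heuristic, not a diagnostic tool: the function name, the node names, and the idea of feeding it an ordered list of nodes (collected by hand from the error logs) are assumptions made for illustration.

```python
from typing import List


def classify_failover(startup_nodes: List[str]) -> str:
    """Apply the error-log heuristic to an ordered list of node names.

    startup_nodes[0] is the node that hosted SQL Server before the incident;
    each later entry is the node on which the next startup was logged.
    """
    if not startup_nodes:
        raise ValueError("need at least the original node")

    origin = startup_nodes[0]
    for restart_count, node in enumerate(startup_nodes[1:], start=1):
        if node != origin:
            # Moved on the very first restart: points to a manual move.
            # Moved only after restarts on the original node: points to
            # the cluster giving up on that node (automatic).
            return "likely manual" if restart_count == 1 else "likely automatic"
    return "no failover"


if __name__ == "__main__":
    print(classify_failover(["NODE1", "NODE1", "NODE1", "NODE2"]))
    print(classify_failover(["NODE1", "NODE2"]))
```

The result is a hint only. The cluster log remains the place to confirm what actually initiated the move.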

Similar Messages

  • Failover occurred manually or automatically (SQL Server AlwaysOn with Availability Groups)

    Hello everyone,
    we had a failover today on a Windows Server 2012 failover cluster with SQL Server AlwaysOn, and now I'd like to know whether the failover was initiated by the cluster (or SQL Server?) or manually by a user.
    In the cluster log I can see the following entry: INFO  [RCM] rcm::RcmApi::MoveGroup: (CVGID2, 1, 0, MoveType::Manual )
    So I assume that the failover was done by a user. But I can only see move types of "Manual" in the cluster log of this AlwaysOn system.
    So my question is: since SQL Server (AlwaysOn) is in effect "driving" the cluster, is it possible that when it performs an automatic failover the cluster logs it as a "manual" failover, because it was not initiated by the Cluster service itself?
    Thanks for any help!
    Ville

    Hi Ville,
    This entry indicates that the failover was triggered manually, so it does not point to a problem.
    The related third party article:
    dbi services Blog
    http://www.dbi-services.com/index.php/blog/entry/wsfc-manual-or-automatic-failover-that-is-the-question
    I’m glad to be of help to you!
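To see at a glance which move types a cluster log contains, the MoveGroup entries can be tallied with a short script. This is a sketch that assumes every relevant entry has the same shape as the single line quoted in the question; the function name is made up for illustration. The thread leaves open whether an Availability Group failover is always logged as MoveType::Manual, so the counts show what the log says, not who initiated the move.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the argument list of an RCM MoveGroup entry such as:
#   INFO  [RCM] rcm::RcmApi::MoveGroup: (CVGID2, 1, 0, MoveType::Manual )
MOVE_GROUP = re.compile(r"rcm::RcmApi::MoveGroup:\s*\((?P<args>[^)]*)\)")


def count_move_types(log_lines: Iterable[str]) -> Counter:
    """Count the MoveType values found in MoveGroup entries."""
    counts: Counter = Counter()
    for line in log_lines:
        match = MOVE_GROUP.search(line)
        if not match:
            continue
        for arg in match.group("args").split(","):
            arg = arg.strip()
            if arg.startswith("MoveType::"):
                counts[arg.split("::", 1)[1]] += 1
    return counts


if __name__ == "__main__":
    sample = ["INFO  [RCM] rcm::RcmApi::MoveGroup: (CVGID2, 1, 0, MoveType::Manual )"]
    print(count_move_types(sample))
```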

  • HACMP Clustering Script for SAP ECC 6.0 (SR1) - Automatic Failover

    Hello,
    I have installed SAP ECC 6.0 (SR1) on AIX 5.3 / DB2 V8 FP12 in an HACMP clustering environment. Manual failover is working fine. The Central System has been installed on a shared drive with a virtual IP and virtual name on Node A. The Dialog Instance is installed locally on Node B. I am looking for an HACMP clustering script (automatic failover script) for automation. Please help if you have one. It is a single-package cluster: if Node A fails, Node B takes over (the Central System and the Dialog Instance will both run on Node B).
    Thanks
    Gautam Poddar

    this post is duplicated at Upgrade to ERP 2005/ECC 6.0 from  R/3 4.72/Basis 640 on Z/OS 1.4 DB2 8.1

  • HACMP Clustering Script for SAP ECC 6.0 (SR1) - Automatic Failover-Oracle10

    Hello,
    I have installed SAP ECC 6.0 (SR1) on AIX 5.3 / Oracle 10g in an HACMP clustering environment. Manual failover is working fine. The ASCS and database instances are installed on a shared drive with a virtual IP and virtual name. The Central Instance and the Dialog Instance are installed locally on Node A and Node B. I am looking for an HACMP clustering script (automatic failover script) for automation. Please help if you have one.
    Thanks
    Gautam Poddar

    Here are HA stop and start scripts that you should be able to adapt to your particular circumstances. They are based on earlier versions of SAP / Oracle, but I assume they are still a reasonable guide.
    Script to start SAP is start_sap_prd
    #!/bin/ksh
    # Script:         /usr/local/bin/cluster/start_sap_prd
    # Comments:       HACMP Application START script for PRD
    # Show me obvious information in hacmp.out
    banner "Starting"
    banner "PRD SAP"
    # Set the oracle and sap owner.
    ORASID="PRD"
    SAPADM="prdadm"
    ORAUSR="oraprd"
    VIRTUALHOST="vhost"
    DEVHOST="vhostdev"
    # Get the volume groups for this resource group
    RG=$( /usr/es/sbin/cluster/utilities/cllsgrp | grep -i $ )
    VG_LIST=$( /usr/es/sbin/cluster/utilities/cllsres -g $ | \
            grep "VOLUME_GROUP=" | \
            awk -F\" '{ print $2 }' )
    # Check the transport directory is mounted.
    if mount | grep -w "/usr/sap/trans"
      then
            print "Transport directory is already mounted."
      else
            cd /tmp
            print "Attempting a background mount of the transport directory."
            nohup mount -o intr,bg,soft :/usr/sap/trans1 /usr/sap/trans &
    fi
    # Start SAP and Oracle
    # Start listener (the command is quoted so that su passes it to the shell as one string)
    su - $ -c "/rprd/oracle/PRD/920_64/bin/lsnrctl start"
    rc=$?
    # Test the saved return code; $? was already overwritten by the assignment above
    if [ $rc -ne 0 ]
      then
            echo "ERROR: Listener failed to start\n"
    fi
    # Start Database
    su - $ -c "/rprd/oracle/PRD/bin/start_database_PRD.sh"
    sleep 20
    # Standard sapstart script
    su - $ -c "startsap $"
    rc=$?
    if [ $rc -ne 0 ]
    then
            echo "ERROR: Failed to start SAP\n"
    fi
    exit 0
    Script to stop SAP is stop_sap_prd
    #!/bin/ksh
    # Script:       /usr/local/bin/cluster/stop_sap_prd
    # Dated:        01/11/06
    # Application:  Oracle/SAP
    # Comments:     HACMP Application STOP script for SAP / Oracle PRD
    set -x
    # Show me obvious information in hacmp.out
    banner "stopping"
    banner "PRD SAP"
    # Set the oracle and sap owner.
    ORASID="PRD"
    SAPADM="prdadm"
    ORAUSR="oraprd"
    VIRTUALHOST="vhost"
    # Stop SAP/Oracle
    su - $ -c "stopsap $"
    rc=$?
    if [ $rc -ne 0 ]
    then
            echo "ERROR: Failed to stop SAP and Oracle\n"
    fi
    # Stop SAP collector and Oracle listener.
    su - $ -c "/usr/sap/PRD/SYS/exe/run/saposcol -k"
    rc=$?
    if [ $rc -ne 0 ]
    then
            echo "ERROR: Failed to stop SAPOSCOL \n"
    fi
    su - $ -c "/rprd/oracle/PRD/920_64/bin/lsnrctl stop"
    rc=$?
    if [ $rc -ne 0 ]
    then
            echo "ERROR: Listener failed to stop\n"
    fi
    # Release and unmount the transport directory if it is mounted.
    if mount | grep -w "/usr/sap/trans"
      then
            print "Transport directory is mounted."
            /usr/es/sbin/cluster/events/utils/cl_nfskill -k -u /usr/sap/trans
            sleep 1
            /usr/es/sbin/cluster/events/utils/cl_nfskill -k -u /usr/sap/trans
            sleep 1
            umount -f /usr/sap/trans &
      else
            print "Transport directory is not mounted."
    fi
    exit 0

  • Availability group Automatic failover

    Hi
    I set up a simple two-node AG, synchronous (SQL 2014 Enterprise on Windows 2012 R2 Standard).
    If I set it to manual failover, everything works as expected. However, when I switch to automatic failover and stop the SQL service on the primary node, the AG resource in the cluster goes offline and doesn't fail over to the secondary node.
    Both nodes are available to the cluster resource.
    I would appreciate your feedback as to what the reason might be.
    Regards
    Shaunt

    Hi,
    I would like to confirm that by availability group you mean an AlwaysOn Availability Group.
    How did you set the FailureConditionLevel?
    Whether the diagnostic data and health information returned by sp_server_diagnostics warrants an automatic failover depends on the failure-condition level of the availability group. The failure-condition level specifies which failure conditions trigger an automatic failover. There are five failure-condition levels, which range from the least restrictive (level one) to the most restrictive (level five). For details about the failure-condition levels, see:
    http://msdn.microsoft.com/en-us/library/hh710061.aspx#FClevel
    Two articles that may be helpful:
    SQL 2012 AlwaysOn Availability groups Automatic Failover doesn’t occur or does it – A look at the logs
    http://blogs.msdn.com/b/sql_pfe_blog/archive/2013/04/08/sql-2012-alwayson-availability-groups-automatic-failover-doesn-t-occur-or-does-it-a-look-at-the-logs.aspx
    SQL Server 2012 AlwaysOn – Part 7 – Details behind an AlwaysOn Availability Group
    http://blogs.msdn.com/b/saponsqlserver/archive/2012/04/24/sql-server-2012-alwayson-part-7-details-behind-an-alwayson-availability-group.aspx
    Thanks.
    Tracy Cai
    TechNet Community Support
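The failure-condition levels mentioned in this reply are cumulative: each level also covers the conditions of the levels below it. The sketch below models only that relationship. The one-line descriptions are paraphrased summaries of the documented levels, and the function is a made-up helper for reasoning about the setting, not something SQL Server exposes.

```python
# Paraphrased summaries of the five failure-condition levels.
# Level 3 is the documented default.
FAILURE_CONDITION_LEVELS = {
    1: "server down",
    2: "server unresponsive",
    3: "critical server errors",
    4: "moderate server errors",
    5: "any qualified failure condition",
}


def triggers_failover(configured_level: int, condition_level: int) -> bool:
    """Return True if a condition belonging to `condition_level` is covered
    by an availability group configured at `configured_level`.

    Levels are cumulative, so a group at level N reacts to conditions
    from levels 1 through N.
    """
    for level in (configured_level, condition_level):
        if level not in FAILURE_CONDITION_LEVELS:
            raise ValueError(f"unknown failure-condition level: {level}")
    return condition_level <= configured_level


if __name__ == "__main__":
    # A group left at the default level 3 reacts to a level-1 condition,
    # while a group set to level 1 ignores a level-3 condition.
    print(triggers_failover(3, 1), triggers_failover(1, 3))
```

A qualifying condition is necessary but not sufficient. As the follow-up below shows, the cluster quorum configuration also affects whether an automatic failover goes through.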
    Hi,
    Thanks for the reply.
    It's an AlwaysOn Availability Group.
    In my test lab, I have changed the quorum configuration to a file share witness and that has allowed an automatic failover when I turn the primary replica server off (rather than power it off).
    I'll take a look at the links you provided.
    Regards,
    Bob

  • Oracle 9i Dataguard automatic failover

    Hi Gurus,
    I want to set up automatic failover from the primary to a physical standby database in Oracle 9i. Could you please advise me what is required for this setup? Right now we perform the failover manually.
    thanks in advance
    regards,
    Shaan

    Hi,
    there is a thread with the same question, and I posted a link there.
    Re: DGMGRL switchover automatically without manual intervention
    Regards,
    Tom
    http://oracledba.cz

  • DataGuard automatic failover

    Say an automatic failover has happened.
    The former primary DB is down.
    The former standby DB is now the primary DB and is up and running.
    I assume the clients have to reconnect to the second DB.
    Now, how can we make the automatic failover for the clients?
    (At least that you do not have to manually change the clients' tnsnames.ora to the new DB.)

    Take a look at the NET8 manual. The simple way is to define multiple addresses in the address list. That way you get connect-time failover: if the first instance isn't reachable, the second one is used, and so on.
    DB =
      (DESCRIPTION =
        (ADDRESS_LIST =
          (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.1)(PORT = 1521))
          (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.2)(PORT = 1521))
        )
        (CONNECT_DATA =
          (SERVICE_NAME = ORCL)
        )
      )
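The idea behind connect-time failover is simply to try each address in order and use the first one that answers. The sketch below illustrates that logic with plain TCP probes. It is not how the Oracle client is implemented, and the function names and the timeout value are assumptions made for illustration.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]


def tcp_reachable(address: Address, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection(address, timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(
    addresses: Iterable[Address],
    probe: Callable[[Address], bool] = tcp_reachable,
) -> Optional[Address]:
    """Walk the address list in order and return the first address that
    responds, or None if none of them does."""
    for address in addresses:
        if probe(address):
            return address
    return None


if __name__ == "__main__":
    listeners = [("192.168.1.1", 1521), ("192.168.1.2", 1521)]
    print(first_reachable(listeners))
```

Note that this only covers new connections. Sessions that were open on the failed instance still have to reconnect.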

  • Configuring Automatic Failover for EPM Planning Cluster

    We are trying to test automatic failover using a Planning(11.1.2.2)/weblogic cluster containing 2 physical servers and a Weblogic proxy plug-in for OHS.
    I understand that to enable this we must configure in-memory replication of HTTP session state, and that to do this (according to various sources, including ID 779350.1) the weblogic.xml file must include a descriptor set up as follows:
    <session-descriptor>
        <session-param>
            <param-name>PersistentStoreType</param-name>
            <param-value>replicated</param-value>
        </session-param>
    </session-descriptor>
    Where should weblogic.xml be created, or amended if it already exists, for a Planning cluster in a standard scaled-out EPM deployment in order to enable failover between the two servers?
    Thanks

    Yes, I believe it can be load balanced through the Hyperion registry; I have seen it once. The only drawback is that if a JVM goes down while processing a request, it needs to be started manually. The URL, however, will switch automatically.

  • Do phones automatically failover when a Subscriber is down?

    We are running CUCM 8.6 with one publisher and two subscribers. One of the subscribers has failed and we are now running off only one subscriber. In this mode, we are experiencing a number of minor problems with the phones including:
    Time on handsets using incorrect format. (shows as 24hour when it should be 12 hour)
    Phones keep ringing after the extension has been answered.
    One-Way voice.
    Calls dropping out when put on hold
    Unable to pickup call when it's put on hold.
    Are the phones meant to automatically failover to the other subscriber when one fails? Or do we need to manually reboot the handsets for this to take place?
    Also, is all functionality meant to be available when running in the degraded mode?

    Hi Robert,
    The phones should automatically fail over to the backup subscriber if the backup subscriber is included in the CallManager group that is assigned to the device pool of the IP phone. For issues like one-way audio and dropped calls, you also need to check that media resources such as transcoders, CFBs and MTPs have backup servers defined. Otherwise you will see problems whenever these resources are requested but are unregistered because the only subscriber defined for them is down.
    HTH
    Manish

  • Dual internet connection automatic failover?

    Hello,
    we have two internet connections at home, and neither is very reliable, so I thought about configuring them for automatic failover.
    I'm using them on my W520. The connections come from separate routers, which are always running.
    One is connected to the W520 via Ethernet cable, the other via Wi-Fi.
    They are on different subnets, and normally both routers' IPs are set as default gateways.
    The problem is that the routers are always on, so the W520 usually still has connectivity to both routers (i.e. ping to both 192.168.100.252 and 192.168.1.1 always works).
    If one internet connection drops, this happens on the router side, and the W520 and Windows don't notice because the connection from the W520 to the router is still OK.
    So if the active connection fails (on the router side), I have to disconnect it manually on the W520. Otherwise I get no internet, because traffic is still routed via the same network interface, which only reaches the router that has no internet.
    I read about using different metrics for the interfaces, but this is already configured "automatically" and seems to have no effect as long as the connections to the routers still work.
    C:\>route print
    IPv4 Route Table
    ===========================================================================
    Active Routes:
    Network Destination        Netmask          Gateway       Interface  Metric
              0.0.0.0          0.0.0.0  192.168.100.252    192.168.100.9     20
              0.0.0.0          0.0.0.0      192.168.1.1      192.168.1.5     30
    I also tried configuring "Access Connections" for both connections simultaneously, hoping it would choose the second one in case of failure, but this doesn't work either
    (it may work when one connection fails on the W520 side, e.g. the router is down, but that is not my case).
    Is there any way to configure this failover within the OS (Windows 7) only, without a dual-WAN router?
    Thanks

    As long as Windows has a working connection (interface link), either LAN or WLAN, it doesn't know that one of the routers has no working internet connection. As far as I know, there is no functionality in Windows to make this work automatically. There might be software available that can help you out, but I'm not sure how it would detect the missing internet access behind one of the routers. Such software would have to change the metric in the routing table (route print) on your Windows machine when one link loses internet access.
    With regard to the metric and default gateway: you can have several interfaces all configured with a default gateway, but only the one with the lowest metric will be used. The default gateway is the route of last resort, and normally it is the only route used on a client computer, even though it is possible to add more specific static routes. Two interfaces with a default gateway and the same metric won't work, and I don't think it's even possible to set that in Windows; how would Windows know which default gateway to use? So setting more than one default gateway in Windows is basically pointless unless you want Windows to switch when the link to the PC goes down.
    I think the best option, and maybe the only working one, is to do this on the router and buy the equipment required for such functionality. You can also create two batch files, one that changes the metric and default gateway to use connection 1 and the other to use connection 2. That way you can switch manually by double-clicking an icon. You use "route add...", "route delete..." and "route change..." to add, delete and change routes such as the default gateway. It's still not automatic, though.
    -gan
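The batch-file idea from this reply can be taken one step further: check which gateway actually has internet access, then generate the matching route commands. The sketch below only builds the command strings and leaves both the connectivity check and the execution to the caller. The function names and metric values are assumptions. Windows adds the interface metric to the route metric, so the effective values shown by route print may differ.

```python
from typing import Callable, List, Optional


def pick_gateway(gateways: List[str], has_internet: Callable[[str], bool]) -> Optional[str]:
    """Return the first gateway, in order of preference, whose uplink works.

    `has_internet` is supplied by the caller, for example a check that
    reaches an external host through that particular router.
    """
    for gateway in gateways:
        if has_internet(gateway):
            return gateway
    return None


def route_commands(gateways: List[str], preferred: str,
                   low_metric: int = 10, high_metric: int = 50) -> List[str]:
    """Build Windows route commands that re-add every default route so that
    the preferred gateway gets the lowest metric. Nothing is executed here."""
    commands = []
    for gateway in gateways:
        metric = low_metric if gateway == preferred else high_metric
        commands.append(f"route delete 0.0.0.0 mask 0.0.0.0 {gateway}")
        commands.append(f"route add 0.0.0.0 mask 0.0.0.0 {gateway} metric {metric}")
    return commands


if __name__ == "__main__":
    routers = ["192.168.100.252", "192.168.1.1"]
    # Pretend the first router has lost its uplink.
    chosen = pick_gateway(routers, has_internet=lambda gw: gw == "192.168.1.1")
    if chosen is not None:
        for command in route_commands(routers, chosen):
            print(command)
```

Running the generated commands requires an elevated prompt, and a scheduled task would have to repeat the check periodically for the switch to be automatic.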

  • Do manually installed Agents failover?

    Hi, a very quick question that I can't find an answer for.
    Does automatic failover work (once configured) for a SCOM agent that is manually installed?
    I think they can't but would like confirmation.
    Cheers

    For SCOM 2012, all agents fail over automatically, whether they were installed manually or not. You can also use the following PowerShell script to verify the failover configuration of the agents:
    # Verify failover for agents reporting to MS1
    $Agents = Get-SCOMAgent | where {$_.PrimaryManagementServerName -eq 'MS1.DOMAIN.COM'}
    $Agents | sort | foreach {
        Write-Host "";
        "Agent :: " + $_.Name;
        "--Primary MS :: " + ($_.GetPrimaryManagementServer()).ComputerName;
        # List every failover management server configured for this agent
        $failoverServers = $_.GetFailoverManagementServers();
        foreach ($managementServer in $failoverServers) {
            "--Failover MS :: " + ($managementServer.ComputerName);
        }
    }
    Write-Host "";
    Roger

  • The time on my iphone 5C is constantly changing, regardless of me setting the correct time manually or automatically, all iOS is up to date and phone is not damaged, any ideas?

    The time will stay correct for an hour or so, then jump backwards for no reason. I have tried setting the time manually and automatically and still get the same problem, and have tried with wifi on and off and it still does it. All my software is up to date and tried contacting EE who said to ask Apple. Phone is about 11 months old and have had no previous problems.

    Hi jennehbear,
    Thank you for using Apple Support Communities.
    Your time jumping backwards an hour sounds like an issue with the time zone. Take a look at this article to troubleshoot the issue.
    iOS: Troubleshooting issues with date and time - Apple Support
    Regards,
    Jeff D. 

  • Automatic failover doesn't failback to the first server if the second server is lost.

    Hi Everybody,
    We use database mirroring a lot in our product solutions, and we have recently seen strange behaviour in our failover tests with SQL 2008 R2.
    We have two servers running Windows 2008 R2 Standard and SQL 2008 R2 Standard SP2 (let's call them DB1 and DB2).
    We also have a witness workstation running SQL 2008 Express on Windows 7.
    A database on DB1 is mirrored to DB2 in "safety full" mode, with a witness. At this stage, the database is principal on DB1 and mirror on DB2.
    To test automatic failover, we first restart the DB1 server, which holds the database in the principal role.
    After a few seconds, the database on DB2 becomes principal, which is normal and exactly what we want.
    After a few minutes, DB1 comes back online and its database takes the mirror role (still OK). At this stage, the database is principal on DB2 and mirror on DB1.
    When the monitoring application shows that the mirror is synchronized and that both servers are connected to the witness, we restart DB2 to trigger an automatic failover to DB1.
    What we see is that DB1 never takes the principal role and the database stays in the mirror role.
    In the DB1 errorlog, I only see these two lines when DB2 disappears, and no other message related to the mirroring session.
    2014-01-22 08:57:26.91 spid43s     Starting up database 'Test123'.
    2014-01-22 08:57:26.95 spid43s     Bypassing recovery for database 'Test123' because it is marked as a mirror database, which cannot be recovered. This is an informational message only. No user action is required.
    When DB2 comes back online, the database on DB2 keeps its principal status and the database on DB1 stays mirror.
    What is really strange is that if I restart DB2 once again directly after that, DB1 fails over normally and the database on DB1 takes the principal role after a few seconds, without any configuration change between the two restarts.
    DB1 errorlog shows then :
    2014-01-22 09:00:37.53 spid29s     Error: 1474, Severity: 16, State: 1.
    2014-01-22 09:00:37.53 spid29s     Database mirroring connection error 4 'An error occurred while receiving data: '64(The specified network name is no longer available.)'.' for 'TCP://DB2:5022'.
    2014-01-22 09:00:37.53 spid18s     Database mirroring is inactive for database 'Test123'. This is an informational message only. No user action is required.
    2014-01-22 09:00:42.37 spid32s     The mirrored database "Test123" is changing roles from "MIRROR" to "PRINCIPAL" due to Auto Failover.
    2014-01-22 09:00:42.39 spid32s     Recovery is writing a checkpoint in database 'Test123' (7). This is an informational message only. No user action is required.
    2014-01-22 09:00:42.39 spid32s     Recovery completed for database Test123 (database ID 7) in 78 second(s) (analysis 0 ms, redo 0 ms, undo 7 ms.) This is an informational message only. No user action is required.
    To summarize:
    - a first failover from DB1 to DB2 always works
    - then, a restart of DB2 never fails over to DB1
    - a second restart of DB2 always fails over to DB1
    This happens almost every time on one of our server pairs.
    Is there any explanation for this, or any idea where I can look to find the reason for this strange behaviour?
    Thanks a lot for your help
    Seb
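The "changing roles ... due to Auto Failover" message quoted from the DB1 errorlog is the clearest sign that a role change was automatic, so it can be worth extracting those lines when reviewing a long log. The sketch below assumes the line layout shown in the excerpt (timestamp, spid, message). The function name is made up, and no other reason strings are assumed beyond the one that appears in this post.

```python
import re
from typing import Dict, Iterable, List

# Layout taken from the excerpt above, e.g.
# 2014-01-22 09:00:42.37 spid32s     The mirrored database "Test123" is changing
# roles from "MIRROR" to "PRINCIPAL" due to Auto Failover.
ROLE_CHANGE = re.compile(
    r'^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{2})\s+\S+\s+'
    r'The mirrored database "(?P<database>[^"]+)" is changing roles '
    r'from "(?P<old_role>[^"]+)" to "(?P<new_role>[^"]+)" '
    r'due to (?P<reason>.+?)\.\s*$'
)


def role_changes(errorlog_lines: Iterable[str]) -> List[Dict[str, str]]:
    """Return one dict per mirroring role-change message in the log."""
    changes = []
    for line in errorlog_lines:
        match = ROLE_CHANGE.match(line.strip())
        if match:
            changes.append(match.groupdict())
    return changes


if __name__ == "__main__":
    sample = [
        '2014-01-22 09:00:42.37 spid32s     The mirrored database "Test123" is '
        'changing roles from "MIRROR" to "PRINCIPAL" due to Auto Failover.',
    ]
    for change in role_changes(sample):
        print(change["timestamp"], change["database"], change["reason"])
```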

    Thank you Tom
    But I have already checked that and reported the Errorlog abstracts in my original post.
    When DB01 disappears for the first time, there is nothing in the DB01 ERRORLOG (it is restarting :-) )
    AND no particular error message in the DB02 ERRORLOG (nothing related to the fact that DB01 is no longer reachable!).
    Only these two lines:
    2014-01-22 08:57:26.91 spid43s     Starting up database 'Test123'.
    2014-01-22 08:57:26.95 spid43s     Bypassing recovery for database 'Test123' because it is marked as a mirror database, which cannot be recovered. This is an informational message only. No user action is required.
    So my main question remains: why doesn't DB02 detect that DB01 has disappeared (the first time only), and why doesn't the failover mechanism trigger the failover?
    Thank you
    Seb

  • The iTunes update will not install either manually or automatically.  I have Windows XP

    The iTunes update will not install either manually or automatically.  I have Windows XP pro

    Follow the directions of tt2 in https://discussions.apple.com/thread/5822086 to get the install to work.

  • SQL 2005 mirroring : Abrupt Automatic failover

    hi All, 
    We have a SQL 2005 SP4 mirroring setup of 15 DBs with Principal (P), Mirror (M) and Witness (W).
    We have now seen abrupt failovers from P to M for some of the databases (yesterday it was 4 out of 15).
    Errors were seen on Witness server as follows for all Dbs that failed over:
    Date 07/01/2015 11:07:48 PM
    Log SQL Server (Current - 08/01/2015 12:00:00 AM)
    Source spid19s
    Message
    The mirroring connection to "TCP://<server.domain.com>:5022" has timed out for database "<DBName>" after 10 seconds without a response.  Check the service and network connections.
    Actions taken:
    1. The network and firewall team reported that no errors were detected and that there was no network traffic between the witness server and the DB server during the automatic failover period.
    2. On the system side, we have verified that there were no hardware errors on either the VM or the SAN storage, and that no Symantec SQL backup jobs or antivirus scans were running during the automatic failover period.
    3. We did see a high amount of I/O activity on the P server around the failover time. Some I/O errors similar to the one below were seen. Note, however, that these errors occurred not only for the DBs that failed over but also for others, including TEMPDB:
    Date 07/01/2015 11:07:38 PM
    Log SQL Server (Current - 08/01/2015 4:06:00 AM)
    Source spid2s
    Message
    SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [R:\SQLDATA\MSSQL.1\MSSQL\Data\<DBName>.mdf] in database [DBName] (5).  The OS file handle is 0x000000000000095C.  The offset of the latest long I/O is: 0x0000054ff22000
    Questions:
    1. I assumed that the Witness keeps polling P & M on DB mirroring endpoints (in our case 5022) to check that the DBs are online, but Network team says there is no activity on that port, is my understanding correct?
    2. Is there any other reason for DB failover ? 
    Link referred:  
    http://dba.stackexchange.com/questions/22402/what-can-cause-a-mirroring-session-to-timeout-then-failover-sql-server-2005
    http://msdn.microsoft.com/en-us/library/ms179344(v=sql.90).aspx
    Any help is highly appreciated!!!
    Regards,
    Mandar

    This is common with mirroring; it is not as resilient to changes as log shipping. Are you aware of the fact below? It is not directly related to your question, though.
    If you plan to use high-safety mode with automatic failover, the normal load on each failover partner should be less than 50 percent of the CPU. If your workload overloads the CPU, a failover partner might be unable to ping the other server instances in the mirroring session. This causes an unnecessary failover. If you cannot keep the CPU usage under 50 percent, we recommend that you use either high-safety mode without automatic failover or high-performance mode.
    Now to your problem:
    The mirroring connection to "TCP://<server.domain.com>:5022" has timed out for database "<DBName>" after 10 seconds without a response.  Check the service and network connections.
    I would say there was a network dip of more than 10 seconds. Since the default timeout is 10 seconds, the witness concluded for a few databases that the principal could not be reached, and a failover was initiated.
    The network team is incorrect to say there was no dip (it's common for a NOC team not to take responsibility).
    This Support Article is worth reading specially the network part
