AlwaysOn Failover Scenarios

Hi,
I have implemented the AlwaysOn feature between two standalone SQL Server instances hosted on two clustered nodes in two different subnets (multisite clustering with a Node and File Share Majority quorum). I have configured AlwaysOn for automatic failover between the primary and the only secondary replica. There are two databases in the availability group. The implementation was successful.
Now, before going live, I wanted to test the failover scenarios. The first test was a manual failover between the nodes, from the SQL Server side as well as from the Failover Cluster Manager console. Both worked perfectly.
The second test was to stop the SQL Server service and see the result. When I stopped the primary, the resource group in Cluster Manager failed, and the databases could not be connected to. I expected the availability group to fail over to the secondary node, with the databases up and running.
Did I miss something in the implementation above, or is this the expected behavior of AlwaysOn? If so, what does automatic failover imply?
Thanks & Regards

Upon re-reading the question, I think we need some clarity. What do you mean by failing over the node? You do not fail over the node; you fail over the services running on the node. In your case you said you have standalone SQL instances, which is what AlwaysOn requires. So by failing over the node, do you mean you took it offline, either the SQL service and/or the entire node? What did the AlwaysOn status show when you connected to the SQL instance running on the other node? It should now be the primary.
When you take the first node or its SQL service offline, the SQL service itself will not fail over, because this is not a SQL cluster. Only the databases that are set up for AlwaysOn fail over, not all databases on the instance.
In other words, it fails over the "AlwaysOn" service, which looks after the AG databases on the node and determines which replica is primary and which is secondary.
The AG dashboard on the secondary instance should tell you the status.
Hope it helps!

Similar Messages

  • Failover scenarios for AlwaysOn

    What are the failover scenarios/situations in which a failover happens when the databases are configured with AlwaysOn?
    In our situation, the Windows failover cluster fails over from node 1 to node 2, but the databases are still pointing to node 1.
    thanks

    Hi,
    When failure occurs, whether an availability group will failover immediately depends on both
    the failover mode and the availability mode of the replica.
    Please check below articles for more information.
    http://msdn.microsoft.com/en-us/library/hh213151.aspx#Overview
    After a failover, client applications that need to access the primary databases must connect to the new primary replica. Also, if the new secondary replica is configured to allow read-only access, read-only client applications can connect to it. For information
    about how clients connect to an availability group, see
    Availability Group Listeners, Client Connectivity, and Application Failover (SQL Server).
    Hope the information helps.
    Tracy Cai
    TechNet Community Support
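
    To make the point about failover mode and availability mode concrete, here is a minimal sketch of the precondition check. It assumes the usual rule that automatic failover needs both replicas in synchronous-commit mode with automatic failover mode, and a synchronized secondary. The `Replica` class and the function name are made up for illustration and are not part of any SQL Server API.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    availability_mode: str      # "SYNCHRONOUS_COMMIT" or "ASYNCHRONOUS_COMMIT"
    failover_mode: str          # "AUTOMATIC" or "MANUAL"
    synchronized: bool = False  # only meaningful for a secondary replica


def automatic_failover_possible(primary: Replica, secondary: Replica) -> bool:
    """True when the secondary is a valid automatic failover target."""
    both_sync = (primary.availability_mode == "SYNCHRONOUS_COMMIT"
                 and secondary.availability_mode == "SYNCHRONOUS_COMMIT")
    both_auto = (primary.failover_mode == "AUTOMATIC"
                 and secondary.failover_mode == "AUTOMATIC")
    # An unsynchronized secondary could lose data, so it is not a target.
    return both_sync and both_auto and secondary.synchronized


node1 = Replica("node1", "SYNCHRONOUS_COMMIT", "AUTOMATIC")
node2 = Replica("node2", "SYNCHRONOUS_COMMIT", "AUTOMATIC", synchronized=True)
print(automatic_failover_possible(node1, node2))
```

    If the check is false for your pair of replicas, a failure of the primary leaves the availability group waiting for a manual (or forced) failover.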

  • Exploring AlwaysOn Failover Cluster Instances in SQL Server 2014 TechNet Labs

    I like the Labs and have found them very useful. I have a problem with "Exploring AlwaysOn Failover Cluster Instances in SQL Server 2014": the lab starts but hangs on opening SQLONE, and the manual never loads. I have tried exiting and starting again, but I get returned to the same failing session. I thought I would mention it here, as there is no way to send feedback: the appraisal only appears once the lab is completed, and there is no apparent feedback link on the lab site. The session ends when the HOL is reported as not responding.

    Hi,
    I'd try this first:
    https://vlabs.holsystems.com/vlabs/SystemRequirements.aspx
    If that's no help, try the 'Help and Support' link on this page:
    https://vlabs.holsystems.com/vlabs/technet?eng=VLabs&auth=none&src=vlabs&altadd=true&labid=12693#
    Don't retire TechNet! -
    (Don't give up yet - 13,085+ strong and growing)

  • Solaris 10 Cluster 3.2 with  2 zones in a failover scenario

    Hi
    Looking for the best way to set up things for the following scenario
    I have 2 M5000 servers with internal storage and a 6140 array for shared storage
    I need to create 2 zones on each in a failover scenario (active /standby)
    On Server1 3 out of 4 cpus for Oracle Database Server 11g and 1 out of 4 cpus for Oracle Application Server
    On Server2 3 out of 4 cpus for Oracle Application Server and 1 out of 4 cpus for Oracle Database Server 11g
    Database files will be placed on the shared storage. In case of failure of Server1 Oracle Database will fail over to Server2 and in case Server2 is down Oracle Application Server will fail to Server1.
    Would a zone cluster using clzonecluster be better? If yes, how can I achieve the difference in CPU power in case of failure?
    Where is it best to keep the zone root path: on the internal storage or on the shared storage?
    What about the swap space for both zones?
    Is it better to use exclusive IPs, or will shared be fine?
    Is it better to use a sparse zone installation, or a full (whole-root) installation?
    What is the best way to achieve the CPU assignments needed, and how much should be left for the global zone?
    Thanks in advance
    vangelis

    Hi Vangelis,
    Building a cluster requires some planning and an understanding of the concepts.
    A good start would be reading some of the documents linked to in this url: http://docs.sun.com/app/docs/doc/819-2969/gcbkf?a=view
    Regards,
    Davy

  • Cluster Adobe Document Services for Failover scenario

    Hello All,
    We have ADS installed and working in our Dev/QA environment deployed on its own standalone Web AS Java. Going forward, we want to set up our production systems in a Clustered environment for HA/Failover scenario.
    From the notes I have read on marketplace, it is possible to cluster a standalone Web AS java by manually clustering SCS and installing the java CI on one physical node(using local disk) and a java dialog instance on other physical node.
    I would like to know: once we have clustered the standalone Java instance manually, will there be any difference in deploying ADS on it, given that only SCS is clustered and the CI would be installed on a physical host?
    Has anyone already implemented this kind of scenario?
    ECC5.0, WebAS 6.40
    Database: SQL Server 2005
    OS: Win 2003
    Please help. Your replies will be greatly appreciated, with lots of points.
    Thanks,
    Fahad

    Hello Samrat,
    thank you for your quick reply. You really helped me out.
    Unfortunately, I haven't marked this thread as a question. So I cannot give you any points. Is there any possibility to change this thread into a question?
    Cheers,
    Matthias

  • Alter IP address of SQL Server AlwaysOn Failover clustering Listener

    Hi,
    Is there a way to change the IP address of a SQL Server AlwaysOn Failover Clustering listener without deleting the existing one, while retaining the same DNS name?
    NOTE: Only the IP address needs to be altered; everything else (DNS name, port) should remain the same.
    Regards
    Vijay

    Hi Karthikkk,
    As I understand from your description, you want to modify the listener IP address of an AlwaysOn availability group without changing other settings, including the DNS name and port. To achieve this, you could modify the listener IP address on the primary replica using the following statement:
    ALTER AVAILABILITY GROUP group_name
    MODIFY LISTENER 'dns_name'
    ( ADD IP { ('four_part_ipv4_address', 'four_part_ipv4_mask') | ('ipv6_address') } )
    For more information about the process, please refer to the article:
    http://msdn.microsoft.com/en-us/library/ff878601.aspx
    Regards,
    Michelle Li

  • Uplink failover scenarios - The correct behavior

    Hello Dears,
    I’m somewhat confused about the failover scenarios related to the uplinks and the Fabric Interconnect (FI) switches, as there are many failover points: the vNIC, FEX, FI, or uplinks.
    I have some questions and I hope that someone can clear this confusion:
    A-     Fabric Interconnect failover
    1-      As I understand when I create a vNIC , it can be configured to use FI failover , which means if FI A is down , or the uplink from the FEX to the FI is down , so using the same vNIC it will failover to the other FI via the second FEX ( is that correct , and is that the first stage of the failover ?).
    2-      This vNIC will be seen by the OS as 1 NIC and it will not feel or detect anything about the failover done , is that correct ?
    3-      Assume that I have 2 vNICs for the same server (metal blade with no ESX or vmware), and I have configured 2 vNICs to work as team (by the OS), does that mean that if primary FI or FEX is down , so using the vNIC1 it will failover to the 2nd FI, and for any reason the 2nd vNIC is down (for example if uplink is down), so it will go to the 2nd vNIC using the teaming ?
    B-      FEX failover
    1-      As I understand the blade server uses the uplink from the FEX to the FI based on their location in the chassis, so what if this link is down, does that mean FI failover will trigger, or it will be assigned to another uplink ( from the FEX to the FI)
    C-      Fabric Interconnect Uplink failover
    1-      Using static pin LAN group, the vNIC is associated with an uplink, what is the action if this uplink is down ? will the vNIC:
    a.       Brought down , as per the Network Control policy applied , and in this case the OS will go for the second vNIC
    b.      FI failover to the second FI , the OS will not detect anything.
    c.       The FI A will re-pin the vNIC to another uplink on the same FI with no failover
    I found all three of these scenarios in different documents and posts. I have not had the chance to test them yet, so it would be great if anyone who has tested them could explain.
    Finally, I need to know whether the correct scenarios above also apply to the vHBA, or whether it uses another method.
    Thanks in advance for your support.
    Moamen

    Moamen
    A few things about Fabric Failover (FF)  to keep in mind before I try to address your questions.
    FF is only supported on the M71KR and the M81KR.
    FF is only applicable/supported in End Host Mode of operation and applies only to Ethernet traffic. For FC traffic one has to use multipathing software (the way FC failover has always worked). In End Host mode, if anything along the path (adapter port, FEX-IOM link, uplinks) fails, FF is initiated for Ethernet traffic *by the adapter*.
    FF is an event triggered by vNIC down: a vNIC is brought down and the adapter initiates the failover, i.e. it sends a message to the other fabric to activate the backup veth (switchport), and the FI sends out gARPs for the MAC as part of it. Because it is adapter driven, FF is only available on a few adapters, for now those whose firmware is done by Cisco.
    For the M71KR (Menlo), the firmware on the Menlo chip is made by Cisco. The Oplin and FC parts of the card are controlled by Intel/Emulex/QLogic.
    The M81KR is made by Cisco exclusively for UCS, and hence the firmware on it is done by us.
    Now to your questions -
    > 1-      As I understand when I create a vNIC , it can be configured to use FI failover , which means if FI A is down , or the uplink from the FEX to the FI is down , so using the same vNIC it will failover to the other FI via the second FEX ( is that correct , and is that the first stage of the failover ?).
    Yes
    > 2-      This vNIC will be seen by the OS as 1 NIC and it will not feel or detect anything about the failover done , is that correct ?
    Yes
    > 3-      Assume that I have 2 vNICs for the same server (metal blade with no ESX or vmware), and I have configured 2 vNICs to work as team (by the OS), does that mean that if primary FI or FEX is down , so using the vNIC1 it will failover to the 2nd FI, and for any reason the 2nd vNIC is down (for example if uplink is down), so it will go to the 2nd vNIC using the teaming ?
    Instead of FF vNICs you can use NIC teaming. You bond the two vNICs, which creates a bond interface, and you specify an IP on it.
    With NIC teaming you do not configure the vNICs (in the Service Profile) for FF. FF therefore does not kick in: on a fabric failure the vNIC goes down, the teaming software sees it, and the teaming driver takes over.
    > B-      FEX failover
    > 1-      As I understand the blade server uses the uplink from the FEX to the FI based on their location in the chassis, so what if this link is down, does that mean FI failover will trigger, or it will be assigned to another uplink ( from the FEX to the FI)
    Yes, we use static pinning between the adapters and the IOM uplinks which depends on the number of links.
    For example, if you have 2 links between IOM-FI.
    Link 1 - Blades 1,3,5,7
    Link 2 - Blades 2,4,6,8
    If Link 1 fails, Blade 1,3,5,7 move to the other IOM.
    That is, they will not fail over to the other links on the same IOM-FI connection, because it is not a port-channel.
    The vNIC down event will be triggered. Whether FF is initiated depends on the setting (see the explanation above).
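
    The pinning example above can be sketched in a few lines. Only the 2-link mapping (blades 1,3,5,7 and 2,4,6,8) comes from this thread; the round-robin rule used here for other link counts is an assumption, and the function names are made up for illustration.

```python
def pinned_link(blade_slot: int, num_links: int) -> int:
    """IOM-FI link (1-based) that a blade slot is statically pinned to.

    Assumption: slots are spread round-robin over the links, which
    reproduces the 2-link example (odd slots -> link 1, even -> link 2).
    """
    if num_links not in (1, 2, 4):
        raise ValueError("sketch only covers 1, 2 or 4 IOM-FI links")
    return (blade_slot - 1) % num_links + 1


def blades_on_link(link: int, num_links: int, slots: int = 8) -> list[int]:
    """Blade slots that see a vNIC down event if this IOM-FI link fails."""
    return [s for s in range(1, slots + 1) if pinned_link(s, num_links) == link]


print(blades_on_link(1, 2))  # [1, 3, 5, 7]
print(blades_on_link(2, 2))  # [2, 4, 6, 8]
```

    The blades returned for a failed link are not moved to a surviving link on the same IOM; they either fail over to the other fabric (FF) or rely on the OS, as described above.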
    > C-      Fabric Interconnect Uplink failover
    > 1-      Using static pin LAN group, the vNIC is associated with an uplink, what is the action if this uplink is down ? will the vNIC:
    > a.       Brought down , as per the Network Control policy applied , and in this case the OS will go for the second vNIC
    If you are using static pin group, Yes.
    If you are not using static pin groups, the same FI will map it to another available uplink.
    Why? Because by defining static pinning you are purposely defining the uplink/subscription ratio etc and you don't want that vNIC to go to any other uplink. Both fabrics are active at any given time.
    > b.      FI failover to the second FI , the OS will not detect anything.
    Yes.
    > c.       The FI A will re-pin the vNIC to another uplink on the same FI with no failover
    For dynamic pinning yes. For static pinning NO as above.
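
    Putting the answers to a, b and c together, the outcome of an FI uplink failure can be written as a small decision function. This is only my reading of the replies above, not UCS Manager logic: it assumes the Network Control policy brings the vNIC down on uplink failure, and the branch for "dynamic pinning but no uplink left" is an assumption. All names are hypothetical.

```python
def uplink_failure_outcome(static_pin_group: bool,
                           other_uplink_available: bool,
                           fabric_failover: bool) -> str:
    """What happens to a vNIC when its FI uplink goes down (sketch)."""
    if not static_pin_group and other_uplink_available:
        # Dynamic pinning: the same FI maps the vNIC to another uplink.
        return "re-pinned to another uplink on the same FI"
    # Static pin group (or, by assumption, nothing left to re-pin to):
    # the vNIC is brought down.
    if fabric_failover:
        return "vNIC down event; adapter fails over to the other fabric, OS unaware"
    return "vNIC down; OS NIC teaming switches to the second vNIC"


print(uplink_failure_outcome(static_pin_group=True,
                             other_uplink_available=True,
                             fabric_failover=False))
```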
    > I found all theses 3 scenarios in a different documents and posts, I did not have the chance it to test it yet, so it will be great if anyone tested it and can explain.
    I would still highly recommend testing it. Maybe it's me, but I don't believe anything until I have tried it.
    > Finally I need to know if the correct scenarios from the above will be applied to the vHBA or it has another methodology.
    Multipathing driver as I mentioned before.
    FF *only* applies to ethernet.
    Thanks
    --Manish

  • SC3.2, S10, V40z - failover scenarios/troubles

    Hi,
    Few days ago I finished the above setup (2 nodes, sc3.2, sol10x86 - all updates).
    There are two shared storages connected to the cluster: T3(fiber) and 3310(scsi raid)
    Configured atm is a single MySQL instance in active-standby mode.
    Later, an NFS service might be added.
    There are three global mounts.
    The system needs to go into production asap, but there were some problems while testing the failover scenarios.
    = Disconnecting any combination of interconnect cables - WORKS
    = Disconnecting any combination of network cables - WORKS
    = Shutting down any node - WORKS
    = Powering off any node - WORKS
    = RGs and resources are switched as expected and any node is able to take ownership
    The one test which failed was when the SCSI and FC cables were unplugged.
    In both cases, both nodes were rebooted almost instantly.
    Is this behavior configurable or expected?
    Any suggestions for another test scenario?
    I found a single forum thread with similar problem described which was tracked down to bad grounding ... anyone else having experience with that?
    I can send more details if anyone is able to help.
    Thanks in advance !!!
    Paul.

    OK, I repeated the scenario yesterday.
    For some reason only the node which was mastering the RG was rebooted after panicking about state database records.
    I'm not sure if this is a natural behavior or if there is some misconfiguration.
    Please, see the logs and cluster info below:
    ========
    Cluster Info
    ========
    -- Cluster Nodes --
    Node name Status
    Cluster node: CLNODE2 Online
    Cluster node: CLNODE1 Online
    -- Cluster Transport Paths --
    Endpoint Endpoint Status
    Transport path: CLNODE2:ce1 CLNODE1:ce1 Path online
    Transport path: CLNODE2:bge1 CLNODE1:bge1 Path online
    -- Quorum Summary --
    Quorum votes possible: 3
    Quorum votes needed: 2
    Quorum votes present: 3
    -- Quorum Votes by Node --
    Node Name Present Possible Status
    Node votes: CLNODE2 1 1 Online
    Node votes: CLNODE1 1 1 Online
    -- Quorum Votes by Device --
    Device Name Present Possible Status
    Device votes: /dev/did/rdsk/d7s2 1 1 Online
    -- Device Group Servers --
    Device Group Primary Secondary
    Device group servers: new_mysql CLNODE1 CLNODE2
    Device group servers: new_ibdata CLNODE1 CLNODE2
    Device group servers: new_binlog CLNODE1 CLNODE2
    -- Device Group Status --
    Device Group Status
    Device group status: new_mysql Online
    Device group status: new_ibdata Online
    Device group status: new_binlog Online
    -- Multi-owner Device Groups --
    Device Group Online Status
    -- Resource Groups and Resources --
    Group Name Resources
    Resources: mysql-failover-rg mysql-has mysql-lh mysql-res
    -- Resource Groups --
    Group Name Node Name State Suspended
    Group: mysql-failover-rg CLNODE2 Offline No
    Group: mysql-failover-rg CLNODE1 Online No
    -- Resources --
    Resource Name Node Name State Status Message
    Resource: mysql-has CLNODE2 Offline Offline
    Resource: mysql-has CLNODE1 Online Online
    Resource: mysql-lh CLNODE2 Offline Offline - LogicalHostname offline.
    Resource: mysql-lh CLNODE1 Online Online - LogicalHostname online.
    Resource: mysql-res CLNODE2 Offline Offline
    Resource: mysql-res CLNODE1 Online Online - Service is online.
    -- IPMP Groups --
    Node Name Group Status Adapter Status
    IPMP Group: CLNODE2 sc_ipmp0 Online ce0 Online
    IPMP Group: CLNODE2 sc_ipmp0 Online bge0 Online
    IPMP Group: CLNODE1 sc_ipmp0 Online ce0 Online
    IPMP Group: CLNODE1 sc_ipmp0 Online bge0 Online
    =========
    Devices
    =========
    ===
    DIDs
    ===
    CLNODE1:root[]didadm -L
    1 CLNODE2:/dev/rdsk/c0t0d0 /dev/did/rdsk/d1 INTERNAL DISKS/HARDWARE RAID
    2 CLNODE2:/dev/rdsk/c1t0d0 /dev/did/rdsk/d2 INTERNAL DISKS/HARDWARE RAID
    3 CLNODE2:/dev/rdsk/c4t60020F200000C51E48874D0D000DA3EEd0 /dev/did/rdsk/d3 FC VOLUMES
    3 CLNODE1:/dev/rdsk/c4t60020F200000C51E48874D0D000DA3EEd0 /dev/did/rdsk/d3 FC VOLUMES
    4 CLNODE2:/dev/rdsk/c4t60020F200000C51E48874D49000686F0d0 /dev/did/rdsk/d4 FC VOLUMES
    4 CLNODE1:/dev/rdsk/c4t60020F200000C51E48874D49000686F0d0 /dev/did/rdsk/d4 FC VOLUMES
    5 CLNODE1:/dev/rdsk/c5t1d0 /dev/did/rdsk/d5 SCSI RAID
    5 CLNODE2:/dev/rdsk/c5t1d0 /dev/did/rdsk/d5 SCSI RAID
    6 CLNODE1:/dev/rdsk/c5t0d0 /dev/did/rdsk/d6 SCSI RAID
    6 CLNODE2:/dev/rdsk/c5t0d0 /dev/did/rdsk/d6 SCSI RAID
    7 CLNODE2:/dev/rdsk/c4t60020F200000C51E48874DA900088862d0 /dev/did/rdsk/d7 FC VOLUMES
    7 CLNODE1:/dev/rdsk/c4t60020F200000C51E48874DA900088862d0 /dev/did/rdsk/d7 FC VOLUMES
    8 CLNODE2:/dev/rdsk/c4t60020F200000C51E48874DDD000CE109d0 /dev/did/rdsk/d8 FC VOLUMES
    8 CLNODE1:/dev/rdsk/c4t60020F200000C51E48874DDD000CE109d0 /dev/did/rdsk/d8 FC VOLUMES
    11 CLNODE1:/dev/rdsk/c0t0d0 /dev/did/rdsk/d11 INTERNAL DISKS/HARDWARE RAID
    12 CLNODE1:/dev/rdsk/c1t0d0 /dev/did/rdsk/d12 INTERNAL DISKS/HARDWARE RAID
    ===
    metasets
    ===
    CLNODE1:root[]metaset -s new_binlog
    Set name = new_binlog, Set number = 8
    Host Owner
    CLNODE1 Yes
    CLNODE2
    Driv Dbase
    d5 Yes
    d6 Yes
    CLNODE1:root[]metaset -s new_mysql
    Set name = new_mysql, Set number = 5
    Host Owner
    CLNODE1 Yes
    CLNODE2
    Driv Dbase
    d3 Yes
    d4 Yes
    CLNODE1:root[]metaset -s new_ibdata
    Set name = new_ibdata, Set number = 7
    Host Owner
    CLNODE1 Yes
    CLNODE2
    Driv Dbase
    d7 Yes
    d8 Yes
    ===
    metadb info
    ===
    CLNODE1:root[]metadb -s new_binlog
    flags first blk block count
    a m luo r 16 8192 /dev/did/dsk/d5s7
    a luo r 16 8192 /dev/did/dsk/d6s7
    CLNODE1:root[]metadb -s new_mysql
    flags first blk block count
    a m luo r 16 8192 /dev/did/dsk/d3s7
    a luo r 16 8192 /dev/did/dsk/d4s7
    CLNODE1:root[]metadb -s new_ibdata
    flags first blk block count
    a m luo r 16 8192 /dev/did/dsk/d7s7
    a luo r 16 8192 /dev/did/dsk/d8s7
    ===
    md.tab - 3 configured mirrors mounted as global
    ===
    d110 -m d103 d104
    new_mysql/d110 -m new_mysql/d103 new_mysql/d104
    d120 -m d115 d116
    new_binlog/d120 -m new_binlog/d115 new_binlog/d116
    d130 -m d127 d128
    new_ibdata/d130 -m new_ibdata/d127 new_ibdata/d128
    =========
    Log at time of failure - CLNODE1 is the master - SCSI cable disconnected - CLNODE2 takes over RG after CLNODE1's panic
    =========
    Aug 7 14:16:27 CLNODE1 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,0/pci1022,7450@1/pci1000,1010@1/sd@0,0 (sd81):
    Aug 7 14:16:27 CLNODE1 disk not responding to selection
    Aug 7 14:16:28 CLNODE1 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,0/pci1022,7450@1/pci1000,1010@1/sd@1,0 (sd82):
    Aug 7 14:16:28 CLNODE1 disk not responding to selection
    Aug 7 14:16:33 CLNODE1 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,0/pci1022,7450@1/pci1000,1010@1/sd@0,0 (sd81):
    Aug 7 14:16:33 CLNODE1 disk not responding to selection
    Aug 7 14:16:35 CLNODE1 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,0/pci1022,7450@1/pci1000,1010@1/sd@1,0 (sd82):
    Aug 7 14:16:35 CLNODE1 disk not responding to selection
    Aug 7 14:16:38 CLNODE1 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,0/pci1022,7450@1/pci1000,1010@1/sd@0,0 (sd81):
    Aug 7 14:16:38 CLNODE1 disk not responding to selection
    Aug 7 14:16:38 CLNODE1 md: [ID 312844 kern.warning] WARNING: md: state database commit failed
    Aug 7 14:16:39 CLNODE1 cl_dlpitrans: [ID 624622 kern.notice] Notifying cluster that this node is panicking
    Aug 7 14:16:39 CLNODE1 unix: [ID 836849 kern.notice]
    Aug 7 14:16:39 CLNODE1 ^Mpanic[cpu1]/thread=fffffe800030bc80:
    Aug 7 14:16:39 CLNODE1 genunix: [ID 268973 kern.notice] md: Panic due to lack of DiskSuite state
    Aug 7 14:16:39 CLNODE1 database replicas. Fewer than 50% of the total were available,
    Aug 7 14:16:39 CLNODE1 so panic to ensure data integrity.
    Aug 7 14:16:39 CLNODE1 unix: [ID 100000 kern.notice]
    Aug 7 14:16:39 CLNODE1 genunix: [ID 655072 kern.notice] fffffe800030bb80 md:mddb_commitrec_wrapper+8c ()
    Aug 7 14:16:39 CLNODE1 genunix: [ID 655072 kern.notice] fffffe800030bbc0 md_mirror:process_resync_regions+16a ()
    Aug 7 14:16:39 CLNODE1 genunix: [ID 655072 kern.notice] fffffe800030bbf0 md_mirror:check_resync_regions+df ()
    Aug 7 14:16:39 CLNODE1 genunix: [ID 655072 kern.notice] fffffe800030bc50 md:md_daemon+10b ()
    Aug 7 14:16:39 CLNODE1 genunix: [ID 655072 kern.notice] fffffe800030bc60 md:start_daemon+e ()
    Aug 7 14:16:39 CLNODE1 genunix: [ID 655072 kern.notice] fffffe800030bc70 unix:thread_start+8 ()
    Aug 7 14:16:39 CLNODE1 unix: [ID 100000 kern.notice]
    Aug 7 14:16:39 CLNODE1 genunix: [ID 672855 kern.notice] syncing file systems...
    Aug 7 14:16:39 CLNODE1 genunix: [ID 733762 kern.notice] 1
    Aug 7 14:16:40 CLNODE1 genunix: [ID 904073 kern.notice] done
    Aug 7 14:16:41 CLNODE1 genunix: [ID 111219 kern.notice] dumping to /dev/dsk/c1t0d0s1, offset 429391872, content: kernel
    Aug 7 14:16:52 CLNODE1 genunix: [ID 409368 kern.notice] ^M100% done: 148178 pages dumped, compression ratio 4.10,
    Aug 7 14:16:52 CLNODE1 genunix: [ID 851671 kern.notice] dump succeeded
    Aug 7 14:19:39 CLNODE1 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.10 Version Generic_118855-36 64-bit
    Aug 7 14:19:39 CLNODE1 genunix: [ID 172907 kern.notice] Copyright 1983-2006 Sun Microsystems, Inc. All rights reserved.
    =========================================================
    Anyone?
    Thanks in advance!
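
    For what it is worth, the numbers in the output above can be checked with a short sketch. It takes the replica locations from the metadb output, the device types from the didadm listing, and the "fewer than 50%" rule from the panic message. The majority formula for quorum votes is an assumption, though it matches the "3 possible, 2 needed" summary. Whether this fully explains the panic is only a reading of the log.

```python
def quorum_votes_needed(votes_possible: int) -> int:
    """Strict majority of the configured quorum votes (assumed formula)."""
    return votes_possible // 2 + 1


def md_panics(replicas_available: int, replicas_total: int) -> bool:
    """Mirror the log message: panic when fewer than 50% of replicas remain."""
    return replicas_available * 2 < replicas_total


# State database replica locations per diskset (from the metadb output).
replicas = {
    "new_binlog": ["d5", "d6"],
    "new_mysql": ["d3", "d4"],
    "new_ibdata": ["d7", "d8"],
}
# Transport behind each DID device (from the didadm listing).
transport = {"d3": "FC", "d4": "FC", "d5": "SCSI",
             "d6": "SCSI", "d7": "FC", "d8": "FC"}


def sets_that_panic(unplugged: str) -> list[str]:
    """Disksets that lose too many replicas when one transport is unplugged."""
    out = []
    for name, devs in replicas.items():
        alive = sum(1 for d in devs if transport[d] != unplugged)
        if md_panics(alive, len(devs)):
            out.append(name)
    return out


print(quorum_votes_needed(3))   # 2
print(sets_that_panic("SCSI"))  # ['new_binlog']
```

    Both replicas of new_binlog sit behind the SCSI cable, so unplugging it leaves 0 of 2 available, which is consistent with the node that owned the set panicking.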

  • Need test documents for RAC failover Scenarios

    Hello friends,
    By the end of this week I have to produce some test documents for RAC and the database server, including Sun Cluster failover scenarios.
    Can someone guide me to a link where I can get enough help?
    I have already managed to gather a fair amount of information, but I want to make sure that I cover most of the topics.
    Thanks, Regards
    Monu Koshy

    Please check the following links.
    http://download-uk.oracle.com/docs/cd/B19306_01/rac.102/b14197/toc.htm
    http://download-uk.oracle.com/docs/cd/B19306_01/install.102/b14205/toc.htm
    -aijaz

  • SQL Server AlwaysOn Failover threshold and Lease Timeout

    Hi experts, 
    I found that each time I was building/restoring another log shipping (standby) server, it would cause "ERR [RES] SQL Server Availability Group: [hadrag] Failure detected, diagnostics heartbeat is lost" (in the cluster log) and "A connection timeout has occurred on a previously established connection to availability replica 'DL980-4' with id" (in the errorlog). I googled and found a document (http://download.microsoft.com/download/0/F/B/0FBFAA46-2BFD-478F-8E56-7BF3C672DF9D/Troubleshooting%20SQL%20Server%20AlwaysOn.pdf ) indicating that “This may be a performance issue”. I run the database restore and the AlwaysOn synchronization over the same 10GbE link at the same time.
    Should I increase leaseTimeout from 20000 to 100000 and HealthCheckTimeout from 30000 to 300000?
    Would that prevent unnecessary failovers?
    Please refer to
    (http://blogs.msdn.com/b/psssql/archive/2012/09/07/how-it-works-sql-server-alwayson-lease-timeout.aspx )
    parag
    10-18-2013 3:19 AM
    Hi Denzil
    We seem to see the lease expire very frequently when the server is under very high CPU pressure. Our failure condition level is 1.
    Is it possible to prevent this situation? The problem is that when the lease expires, all current connections seem to be dropped, and I am wondering if there is a way to prevent this.
    Also, is it possible to affinitize the AlwaysOn health check process to a particular core?
    Thanks for your help!

    Hi Dennis
      Should I increase
    leaseTimeout from 20000 to 100000 and
    HealthCheckTimeout from 30000 to 300000?
    Does it work to prevent unnecessary failover.
    1. You can increase the timeout parameters, but failover will then be slightly delayed when a real failure occurs (this is a workaround, not a fix).
    2. You may need to check the network connections (cluster heartbeat and public network).
    3. Have you applied the latest OS and DB patches?
    4. If possible, raise a ticket with Microsoft. They may provide an update for the cluster resource based on your issue.
    Regards
    Sriram

  • Dataguard site failover scenario

    Dear Gurus,
    I am preparing an operations runbook with scenarios for a production database DR event. I have sets of workable switchover and failover procedures, but I am confused about when to trigger the failover steps. I can think of some scenarios:
    1. Web/App server tier failure triggered site failover - this means database layer is healthy so no doubt we should do switchover
    2. Database layer problem triggered site failover
    2.a Primary site database problem
    2.b Secondary site database problem
    For scenario 2, I believe it can be drilled down further into detailed categories. I intend to write steps to fix the problem on whichever site is affected, primary or secondary, and then perform a switchover. What do you think?
    we are using Oracle EE 11gR2
    Best

    Hi rac100g.
    I really don't understand your question clearly.
    Do you want the steps for automatic client failover?
    I think my video tutorial will be helpful for you.
    Please watch : http://www.mahir-quluzade.com/2012/05/oracle-data-guard-11g-overview-client.html
    And you can watch my all videos about Data Guard from : http://www.mahir-quluzade.com/p/oracle-videos.html
    Regards
    Mahir M. Quluzade
    Edited by: Mahir M. Quluzade on Jun 6, 2012 12:13 PM

  • Understanding Flexconnect - Local vs Central Switching, and WLC failover scenario ??

    Hello Experts
    We have one WLC 5508 in Building1, few 2700 Series AP in Building1, and one 1252AG in Building2. The LAN subnet is same for both Buildings connected via a dark fiber.
    My requirement is to have Central Switching in Building1, since the WLC is located there, and Local Switching in Building2 to avoid inter-building traffic. For both Buildings we already have one VLAN/IP Subnet. (Both Buildings access resources from a central Datacenter which hosts all the servers.)
    Questions:
    1. Is the above scenario possible using single SSID ? My understanding is that one WLAN+SSID can't have both Local and Central switching enabled.
    2. In Flexconnect Central Switching mode, during WLC failure, does the switching change to Local switching automatically ?
    3. When I choose Local Switching for a specific WLAN, does it Locally switch always , or does it Locally switch only when WLC is down ?
    4. We want to use Microsoft PEAP using AD User Authentication. When Local Authentication is enabled on WLC, I understand that when WLC fails (and RADIUS Server is still reachable), can we still have the AP directly contact RADIUS server as a direct client and provide 802.1X Microsoft PEAP authentication. Guess this is Primary Backup Radius Server configuration. Is this understanding correct ?
    Thanks.

    Hi
    The LAN subnet is same for both Buildings connected via a dark fiber.
    If this is the case there is no need for FlexConnect, as you have enough bandwidth and the same L2 domain extended across the two buildings. Typically FlexConnect is for branch deployments where WAN link bandwidth is a concern.
    Anyway, if you want to do this, here are the answers to your specific queries.
    1. Is the above scenario possible using a single SSID? My understanding is that one WLAN+SSID can't have both Local and Central switching enabled.
    You can have both local switching and central switching available for a given SSID. Only FlexConnect mode APs will do local switching, and all Local mode APs will do central switching, even though both use the same SSID.
    2. In FlexConnect Central Switching mode, during a WLC failure, does the switching change to Local switching automatically?
    No. If it is a centrally switched SSID, clients won't be able to join it when the WLC is not available. It does not fall back to local switching.
    3. When I choose Local Switching for a specific WLAN, does it always switch locally, or does it switch locally only when the WLC is down?
    This applies only to FlexConnect mode APs, which always do local switching if it is configured. If the WLC is not reachable, the AP goes into "standalone mode" and still does local switching.
    4. We want to use Microsoft PEAP with AD user authentication. When Local Authentication is enabled on the WLC, I understand that if the WLC fails (and the RADIUS server is still reachable), the AP can contact the RADIUS server directly as a client and provide 802.1X Microsoft PEAP authentication. I guess this is the Primary/Backup RADIUS Server configuration. Is this understanding correct?
    Yes, when this option is configured and the WLC is not reachable (but RADIUS is reachable), the AP will act as the authenticator and pass RADIUS messages to the auth server directly.
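    As a rough illustration of answers 1 to 3, the switching behaviour described above can be written as a small decision function. This is only a sketch of the rules as stated in this reply, not Cisco code or configuration; the names ApMode and data_path are made up for the example.

```python
from enum import Enum


class ApMode(Enum):
    """AP operating modes discussed in this thread (names are illustrative)."""
    LOCAL = "local"
    FLEXCONNECT = "flexconnect"


def data_path(ap_mode, wlan_local_switching, wlc_reachable):
    """Return where client traffic is switched, per the rules in this reply.

    "local"       -> switched at the AP
    "central"     -> tunnelled to the WLC
    "unavailable" -> clients cannot join the SSID
    """
    # Only FlexConnect APs honour the WLAN's local-switching setting, and they
    # keep switching locally in standalone mode when the WLC is unreachable.
    if ap_mode is ApMode.FLEXCONNECT and wlan_local_switching:
        return "local"
    # Everything else is centrally switched and therefore needs the WLC.
    if wlc_reachable:
        return "central"
    # No fallback from central to local switching.
    return "unavailable"


if __name__ == "__main__":
    for mode in ApMode:
        for local_sw in (True, False):
            for wlc_up in (True, False):
                print(mode.value, local_sw, wlc_up, "->",
                      data_path(mode, local_sw, wlc_up))
```

    With the same SSID, a Local mode AP in Building1 and a FlexConnect AP in Building2 simply take different branches of this function.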
    This is a very good Cisco Live presentation that you should see, as it describes many of these features and the WLC code versions in which they were introduced.
    BRKEWN-2016 - Architecting Network for Branch Offices with Cisco Unified Wireless
    HTH
    Rasika

  • Controller Failover Scenarios - 5508

    I am putting a design together for a resilient wireless network.
    I have 2 main data center sites
    Site 1 I will have either:
    1 x 5508
    1 x 5508 + HA
    2 x 5508 in N+1 failover
    Site 2 will have just one 5508 controller.
    What failover models are available to me?
    Can I have an option of N+1 with also a failover to Site 2 if Site 1 is down?
    From my initial research I think I can only configure APs to have a primary and a secondary controller.
    So I think the best model is an HA pair in Site 1 and the 5508 in Site 2.
    What I don't understand yet is the controller-to-controller failover.
    I will be running a guest network out of Site 1 and require the controllers there to be the anchor.
    Any advice is appreciated.
    Thanks
    Roger

    You can have various designs:
    Option 1:
    You can have APs on both WLCs to offload traffic
    Site 1:
    5508 with license
    Site 2:
    5508 with license
    Option 2:
    You have APs on one WLC and the other is backup
    Site 1:
    5508 with license
    Site 2:
    5508 HA sku
    Option 3:
    You run N+1
    Site 1:
    5508 with license
    5508 HA sku N+1 (Secondary)
    Site 2:
    5508 HA sku N+1 (Tertiary)
    Option 4:
    You run AP and Client SSO
    Site 1:
    5508 license
    5508 HA sku AP SSO
    Site 2:
    5508 HA sku N+1
    Option 5:
    Run both sites with AP SSO
    Site 1:
    5508 license (Primary)
    5508 HA sku AP SSO
    Site 2:
    5508 license (Secondary)
    5508 HA sku AP SSO
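    To make the primary/secondary/tertiary idea in these options concrete, here is a minimal sketch of an AP walking an ordered controller list and joining the first one it can reach. It is only a model of the ordering, not the actual AP join algorithm; the function name and the controller names are hypothetical.

```python
def select_controller(priority_list, reachable):
    """Return the first controller in priority order that is reachable.

    priority_list: controller names ordered primary, secondary, tertiary.
    reachable: set of controller names that currently respond.
    Returns None when no configured controller is reachable.
    """
    for name in priority_list:
        if name in reachable:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical names loosely following Option 3 (N+1 across two sites).
    order = ["site1-wlc", "site1-ha-n1", "site2-ha-n1"]
    print(select_controller(order, {"site1-wlc", "site1-ha-n1", "site2-ha-n1"}))
    print(select_controller(order, {"site1-ha-n1", "site2-ha-n1"}))
    print(select_controller(order, {"site2-ha-n1"}))
```

    If all of Site 1 is down, the AP in this model ends up on the Site 2 controller, which is the cross-site failover asked about above.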
    Scott

  • Why is flashback needed in Failover scenario?

    Hello. Why do we need to turn flashback ON on primary and standby databases for fail-over to work?

    IMHO, failover doesn't need flashback to be enabled.
    This link lists the prerequisites for failover: Role Transitions.
    In addition to that, you should check whether the standby is running in maximum protection mode; if it is, you have to switch it to maximum performance until the transition completes.
    Could you post the error messages you came across?
    CSM

  • UCCX 8.5.1 Failover scenario

    Hi All,
    Would someone please be kind enough to explain how failover works in the latest version of UCCX? We found that when the primary server fails, everything fails over to the standby server correctly. The confusing bit is what happens when the primary server comes back online. When testing in our lab and on the day of the switchover (which was out of hours), the agents, supervisors and all calls moved back to the primary. I know this was not the norm in previous versions. The issue I'm facing now is that the customer had a power cut during working hours and everything failed over correctly, but when the primary came back up nothing moved back to it. Can someone please confirm whether the primary will only assume control if there are no active calls in the system?
    Thank you
    Brett

    With the new version it seems there is no fixed master: whichever server comes up first becomes the master, and it will fail over when there is an issue with the master. If there's an error and the standby becomes master, then it will not go back to the other server unless it too suffers an error.
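    The behaviour described here (first server up becomes master, no fail-back) can be modelled in a few lines. This is a toy sketch of that description only, not the documented UCCX election mechanism; the class and node names are invented for the example.

```python
class Cluster:
    """Toy two-node cluster with non-preemptive mastership."""

    def __init__(self):
        self.master = None
        self.up = set()

    def node_up(self, node):
        # A node that comes up only takes mastership if nobody holds it.
        self.up.add(node)
        if self.master is None:
            self.master = node

    def node_down(self, node):
        # Mastership moves only when the current master fails.
        self.up.discard(node)
        if self.master == node:
            self.master = next(iter(sorted(self.up)), None)


if __name__ == "__main__":
    c = Cluster()
    c.node_up("primary")    # first server up becomes master
    c.node_up("standby")
    c.node_down("primary")  # failure: standby takes over
    c.node_up("primary")    # primary returns but does not take mastership back
    print(c.master)
```

    In this model the returning primary stays passive until the current master suffers a failure of its own.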
    david
