Cluster fails over but Availability Group does not
We recently had an issue where a file share was unavailable. It caused the cluster to fail over. However, the Availability Group did not fail over. The application was trying to connect to the secondary database, which it could not do because the failover had not happened. We are set up using quorum with a File Share Witness. We have manually failed this over successfully. Any ideas on why the Availability Group did not fail over would be most appreciated. We do not have any experts in this area, so please help us figure out where to look.
Availability group listeners are a good way of providing fault tolerance for the client and, in addition, of providing read-intent or read-only access to the secondary replicas. If you need one, you can go through this documentation. It shows why and how to set them up. If you have any additional questions, let me know.
http://msdn.microsoft.com/en-us/library/hh213417.aspx
Always make sure to configure the AG-relevant parts through SSMS, not through the cluster manager itself. There are some settings that you might want to tweak in the cluster manager, but most parts should be configured in the SQL Server tools (SSMS), which are sufficient in most cases.
-Jens
Jens K. Suessmeyer http://blogs.msdn.com/Jenss
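For what it is worth, pointing the application at the listener instead of at an individual replica is mostly a connection-string matter. Here is a minimal Python sketch, assuming the Microsoft ODBC driver is used; the listener name, database name and driver version are hypothetical placeholders, not values from this thread:

```python
def listener_connection_string(listener, database, read_only=False, port=1433):
    """Build an ODBC connection string that targets an AG listener (illustrative helper)."""
    parts = {
        "Driver": "{ODBC Driver 17 for SQL Server}",  # assumed driver version
        "Server": f"tcp:{listener},{port}",
        "Database": database,
        "Trusted_Connection": "yes",
        # ReadOnly asks for a readable secondary; ReadWrite targets the primary.
        "ApplicationIntent": "ReadOnly" if read_only else "ReadWrite",
        # Lets the driver try all listener IP addresses in parallel.
        "MultiSubnetFailover": "Yes",
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


# Hypothetical listener and database names.
print(listener_connection_string("ag-listener.example.com", "AppDb", read_only=True))
```

Read-intent connections only reach a secondary if read-only routing has been configured on the availability group; otherwise they land on the primary.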
Similar Messages
-
I have a scenario with three nodes running Windows Server 2012 Standard, each running an instance of SQL Server 2012 Enterprise, that participate in a single Windows Server Failover Cluster (WSFC) spanning two data centers.
If the nodes in the primary data center become unavailable due to a data center outage, how can I automatically access the node in the secondary disaster recovery data center with a script?
I want to write a script that checks the primary data center by pinging an IP address every 5 or 10 minutes.
If that IP does not respond, the script should perform a forced manual failover of the Availability Group (SQL Server) and the WSFC.
Can you please guide me in writing a script for automatic failover in case of a primary data center outage?
You are trying to implement manually what should be happening automatically in the cluster. If the primary SQL Server becomes unavailable in the data center, it should fail over to the secondary SQL Server automatically. Is that not working?
You also might want to run this configuration by some SQL experts. I am not a SQL expert, but if you have both hosts in the data center in a cluster, there is no need for replication between those two nodes as they would be accessing
the database from some form of shared storage. Then it looks like you are trying to implement Always On to the DR site. I'm not sure you can mix both types of failover in a single configuration.
FYI, it would make more sense to establish a file share witness in your DR site instead of placing a third node in the data center for Node Majority quorum.
. : | : . : | : . tim -
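To see why the placement of votes matters here, the majority rule can be sketched in a few lines of Python. This is a simplification that assumes one vote per node or witness and ignores dynamic quorum and vote-weight adjustments; the two-plus-one layout is an assumption, since the question does not say how the three nodes are split across the data centers.

```python
def has_quorum(votes_online: int, total_votes: int) -> bool:
    """Simplified majority rule: strictly more than half of all votes must be reachable."""
    return votes_online > total_votes // 2


# Assumed layout: 2 nodes in the primary site, 1 node in the DR site (3 votes in total).
TOTAL_VOTES = 3
print("primary site alone:", has_quorum(2, TOTAL_VOTES))
print("DR site alone:", has_quorum(1, TOTAL_VOTES))
```

Under these assumptions the DR node holds only one of three votes, so after losing the primary site it cannot keep the cluster running by itself.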
I have a scenario with three nodes running Windows Server 2012 Standard, each running an instance of SQL Server 2012 Enterprise, that participate in a single Windows Server Failover Cluster (WSFC) spanning two data centers.
If the nodes in the primary data center become unavailable due to a data center outage, how can I automatically access the node in the secondary disaster recovery data center with a script?
I want to write a script that checks the primary data center by pinging an IP address every 5 or 10 minutes.
If that IP does not respond, the script should perform a forced manual failover of the Availability Group (SQL Server) and the WSFC.
Can you please guide me in writing a script for automatic failover in case of a primary data center outage?
Please post your question on failover clusters in the cluster forum. They will explain how this works and point you at scripts.
You should also look in the Gallery for cluster management scripts.
¯\_(ツ)_/¯ -
Hi,
We are starting our monthly patching and have a situation with SQL2012 SP2 CU1 Availability Groups databases not synchronizing after patching the secondary replica. This seems a little like
http://support2.microsoft.com/kb/3033492/en-us which was for CU3 and CU4 builds. The dashboard on the Primary shows the secondary as not synchronizing. The errorlogs on both nodes show the connection
has recovered but the dashboard shows critical errors. This is running on Windows 2012.
Thanks
Chris
Lydia,
I am referring to MS15-009 to MS15-017 that came out on Tuesday. These are the only patches we have applied. On our two node Availability Group setup we had the patches applied to the secondary node and then it was restarted. It came up fine and SQL is running
as expected.
This is from the primary node errorlog:-
AlwaysOn Availability Groups connection with secondary database terminated for primary database 'XXX_DEVL' on the availability replica with Replica ID: {e6468c4d-9431-4052-88c0-07d3b3eb428c}. This is an informational message only. No user action is required.
A connection for availability group 'YYYYY50_AG' from availability replica 'YYYYY50AV' with id [0E53235A-0FBB-4E18-8C40-A0D72F30C36A] to 'YYYYY50BV' with id [E6468C4D-9431-4052-88C0-07D3B3EB428C] has been successfully established. This is an
informational message only. No user action is required.
AlwaysOn Availability Groups connection with secondary database established for primary database 'XXX_DEVL' on the availability replica with Replica ID: {e6468c4d-9431-4052-88c0-07d3b3eb428c}. This is an informational message only. No user action is required.
On the secondary node errorlog:-
Skipping the default startup of database 'XXX_DEVL' because the database belongs to an availability group (Group ID: 65541). The database will be started by the availability group. This is an informational message only. No user action is required.
The state of the local availability replica in availability group 'YYYYY50_AG' has changed from 'RESOLVING_NORMAL' to 'SECONDARY_NORMAL'. The replica state changed because of either a startup, a failover, a communication issue, or a cluster error. For more
information, see the availability group dashboard, SQL Server error log, Windows Server Failover Cluster management console or Windows Server Failover Cluster log.
AlwaysOn Availability Groups data movement for database 'XXX_DEVL' has been suspended for the following reason: "system" (Source ID 5; Source string: 'SUSPEND_FROM_RESTART'). To resume data movement on the database, you will need to resume the
database manually. For information about how to resume an availability database, see SQL Server Books Online.
Nonqualified transactions are being rolled back in database XXX_DEVL for an AlwaysOn Availability Groups state change. Estimated rollback completion: 100%. This is an informational message only. No user action is required.
AlwaysOn Availability Groups connection with primary database terminated for secondary database 'XXX_DEVL' on the availability replica with Replica ID: {0e53235a-0fbb-4e18-8c40-a0d72f30c36a}. This is an informational message only. No user action is required.
Now I see the suspend message. Running a select on sys.dm_exec_requests shows no DB_STARTUP requests on either node, and no blocking.
The next action is probably to have the patches removed from the secondary node, see if all is well, and then apply the patches one by one to see which one causes the issue.
Chris -
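The secondary's log above says data movement was suspended (SUSPEND_FROM_RESTART) and has to be resumed manually. A small Python sketch that generates the corresponding T-SQL for a list of databases follows; the helper name is made up for illustration, and whether resuming actually clears the dashboard errors in this case is not confirmed in the thread:

```python
def resume_statement(database: str) -> str:
    """Build the T-SQL that resumes data movement for one availability database."""
    # Double any closing bracket so the name stays a valid delimited identifier.
    quoted = "[" + database.replace("]", "]]") + "]"
    return f"ALTER DATABASE {quoted} SET HADR RESUME;"


# Database name taken from the error log excerpt above.
for name in ["XXX_DEVL"]:
    print(resume_statement(name))
```

The generated statement is meant to be run on the replica that hosts the suspended database, here the patched secondary.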
I have a scenario with three nodes running Windows Server 2012 Standard, each running an instance of SQL Server 2012 Enterprise, that participate in a single Windows Server Failover Cluster (WSFC) spanning two data centers.
If the nodes in the primary data center become unavailable due to a data center outage, how can I automatically access the node in the secondary disaster recovery data center with a script?
I want to write a script that checks the primary data center by pinging an IP address every 5 or 10 minutes.
If that IP does not respond, the script should perform a forced manual failover of the Availability Group (SQL Server) and the WSFC.
Can you please guide me in writing a script for automatic failover in case of a primary data center outage?
+1 to David's comment. I would not suggest running a script automatically. During such a failover you might have data loss, and the decision has to be made with the business owners during the disaster.
In such a situation, you need to start the cluster service in force quorum mode (/fq switch) and then perform a manual failover of the AG to the DR site.
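If some automation is still wanted, one cautious option is to let the script only detect the outage and notify a person, leaving the forced failover as a human decision. Here is a minimal Python sketch of that idea; the threshold of three consecutive failed probes is an arbitrary assumption, and the probe itself (ping or otherwise) is left out:

```python
def should_alert(probe_results, threshold=3):
    """Return True when the most recent `threshold` probes all failed.

    probe_results is an ordered sequence of booleans, oldest first,
    where True means the primary site answered the probe.
    """
    recent = list(probe_results)[-threshold:]
    return len(recent) == threshold and not any(recent)


# Hypothetical probe history: one success followed by three failures.
history = [True, False, False, False]
if should_alert(history):
    print("Primary site unreachable: page the on-call DBA, do not fail over automatically")
```

Requiring several consecutive failures avoids paging on a single dropped ping, and the script never touches the cluster or the availability group itself.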
Balmukund Lakhani
-
Scripting cluster failover for patching (2008 server not R2)
We have a number of Windows 2008 (not R2) servers running SQL. We currently patch using a third-party tool (Shavlik) and would like to automate the process, e.g. failover, patch, failback, check status. I started looking at PowerShell, but it appears the cluster modules are for 2008 R2 and not plain 2008. Anyone have a solution? Thanks.
As noted above, your best option is to upgrade to the latest OS; there are lots of good reasons to do that anyway. Otherwise, with 2008 you will need to script things with cluster.exe and WMI commands executed via WinRM (if you want to perform them remotely).
You would most likely put as much time, effort, and money into creating something that works in 2008 as it would take to upgrade to the latest Windows Server 2012 R2. And then, having 2012 R2 in place prepares you for v.Next, which will enable a rolling upgrade capability within a cluster.
. : | : . : | : . tim -
Solaris 10 cluster:failover project or zone can not have same name?
Oracle on Solaris 10 cluster: for a two-node Sun Cluster failover setup, the SA advised using different accounts (oracle01 for node01, oracle02 for node02) to fail over the cluster. Why can't I create the same 'oracle' account on both nodes?
Can different failover projects or zones not have the same user or group account name?
Thanks.
Hi Vangelis,
Building a cluster requires some planning and an understanding of the concepts.
A good start would be reading some of the documents linked to in this URL: http://docs.sun.com/app/docs/doc/819-2969/gcbkf?a=view
Regards,
Davy -
I have an iPhone 4 16GB. I was recording a video the other day and got this message: "Warning: You are running out of disk space. Please delete some photos or videos". I had 5,000+ images in my camera roll and no other folders. A lot were from the Internet, saved by holding down on an image and picking "save image". So, using common sense, I went into Camera, opened the camera roll, hit the arrow icon in the bottom left corner, and proceeded to check-mark images. Once I had a group, I'd hit delete. I did this several times, until I was left with only about half the images, roughly 3,000. I rebooted my iPhone. I then went into Settings > General > About and looked under Available, which still showed only about 100MB available out of 14GB. How is this possible? Now, when I go into Camera, the lens won't even open, and I keep getting the same error message. Maybe 3,000 is still a lot, but remember, I had almost 6,000 images before, with no problems until just a few days ago. So how do I get my iPhone's available memory back if I have already deleted images as instructed?
You need to connect the iPhone to the computer and import those photos to the computer. Photos are not really supposed to be stored in the camera roll as it can cause problems, as you have already seen. Also, should the iPhone crash, you could lose all of the photos without having a backup of them. Once you have downloaded all the pictures, you should be able to go into Explorer and delete everything out of that folder. Then reset the iPhone and it should clear the memory. The iPhone is probably holding the memory because the Library hasn't updated. A sync would help as it tends to reset things. I've read of other issues with keeping too many photos in the camera roll and issues with saving from the Internet as well, however much of that was in earlier versions of iOS. I believe that has been corrected.
Once you have imported the photos and deleted them from the camera roll, you can sync them back through iTunes in the Photos tab if you want them on the phone. -
WHY: Scan VIP does failover, but Scan Listener does not!
Hi,
I have a two-node cluster based on 11.2.0.4 (no patches yet). When one node is down, the SCAN VIP does fail over to the other node, but the SCAN listener does not:
All up and running:
[oracle@rzsolv236 ~]$ crsctl stat res -t
NAME TARGET STATE SERVER STATE_DETAILS
Local Resources
ora.DG_GRID.dg
ONLINE ONLINE rzsolv236
ONLINE ONLINE rzsolv237
ora.LISTENER.lsnr
ONLINE ONLINE rzsolv236
ONLINE ONLINE rzsolv237
ora.asm
ONLINE ONLINE rzsolv236 Started
ONLINE ONLINE rzsolv237 Started
ora.gsd
OFFLINE OFFLINE rzsolv236
OFFLINE OFFLINE rzsolv237
ora.net1.network
ONLINE ONLINE rzsolv236
ONLINE ONLINE rzsolv237
ora.ons
ONLINE ONLINE rzsolv236
ONLINE ONLINE rzsolv237
ora.registry.acfs
ONLINE ONLINE rzsolv236
ONLINE ONLINE rzsolv237
Cluster Resources
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rzsolv236
ora.LISTENER_SCAN2.lsnr
1 ONLINE ONLINE rzsolv237
ora.cvu
1 ONLINE ONLINE rzsolv236
ora.oc4j
1 ONLINE ONLINE rzsolv236
ora.rzsolv236.vip
1 ONLINE ONLINE rzsolv236
ora.rzsolv237.vip
1 ONLINE ONLINE rzsolv237
ora.scan1.vip
1 ONLINE ONLINE rzsolv236
ora.scan2.vip
1 ONLINE ONLINE rzsolv237
One node is down:
[oracle@rzsolv236 bin]# ./crsctl stat res -t
NAME TARGET STATE SERVER STATE_DETAILS
Local Resources
ora.DG_GRID.dg
ONLINE ONLINE rzsolv236
ora.LISTENER.lsnr
ONLINE ONLINE rzsolv236
ora.asm
ONLINE ONLINE rzsolv236 Started
ora.gsd
OFFLINE OFFLINE rzsolv236
ora.net1.network
ONLINE ONLINE rzsolv236
ora.ons
ONLINE ONLINE rzsolv236
ora.registry.acfs
ONLINE ONLINE rzsolv236
Cluster Resources
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rzsolv236
ora.LISTENER_SCAN2.lsnr
1 ONLINE OFFLINE
ora.cvu
1 ONLINE ONLINE rzsolv236
ora.oc4j
1 ONLINE ONLINE rzsolv236
ora.rzsolv236.vip
1 ONLINE ONLINE rzsolv236
ora.rzsolv237.vip
1 ONLINE INTERMEDIATE rzsolv236 FAILED OVER
ora.scan1.vip
1 ONLINE ONLINE rzsolv236
ora.scan2.vip
1 ONLINE INTERMEDIATE rzsolv236 FAILED OVER
It is the same with manual relocate, for the scan vip only it works:
[oracle@rzsolv236 ~]$ srvctl relocate scan -i 2 -n rzsolv236
[oracle@rzsolv236 ~]$ crsctl stat res -t
NAME TARGET STATE SERVER STATE_DETAILS
Local Resources
ora.DG_GRID.dg
ONLINE ONLINE rzsolv236
ONLINE ONLINE rzsolv237
ora.LISTENER.lsnr
ONLINE ONLINE rzsolv236
ONLINE ONLINE rzsolv237
ora.asm
ONLINE ONLINE rzsolv236 Started
ONLINE ONLINE rzsolv237 Started
ora.gsd
OFFLINE OFFLINE rzsolv236
OFFLINE OFFLINE rzsolv237
ora.net1.network
ONLINE ONLINE rzsolv236
ONLINE ONLINE rzsolv237
ora.ons
ONLINE ONLINE rzsolv236
ONLINE ONLINE rzsolv237
ora.registry.acfs
ONLINE ONLINE rzsolv236
ONLINE ONLINE rzsolv237
Cluster Resources
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rzsolv236
ora.LISTENER_SCAN2.lsnr
1 ONLINE OFFLINE
ora.cvu
1 ONLINE ONLINE rzsolv236
ora.oc4j
1 ONLINE ONLINE rzsolv236
ora.rzsolv236.vip
1 ONLINE ONLINE rzsolv236
ora.rzsolv237.vip
1 ONLINE ONLINE rzsolv237
ora.scan1.vip
1 ONLINE ONLINE rzsolv236
ora.scan2.vip
1 ONLINE INTERMEDIATE rzsolv236 FAILED OVER
[oracle@rzsolv236 ~]$ srvctl relocate scan -i 2 -n rzsolv237
[oracle@rzsolv236 ~]$ crsctl stat res -t
NAME TARGET STATE SERVER STATE_DETAILS
Local Resources
ora.DG_GRID.dg
ONLINE ONLINE rzsolv236
ONLINE ONLINE rzsolv237
ora.LISTENER.lsnr
ONLINE ONLINE rzsolv236
ONLINE ONLINE rzsolv237
ora.asm
ONLINE ONLINE rzsolv236 Started
ONLINE ONLINE rzsolv237 Started
ora.gsd
OFFLINE OFFLINE rzsolv236
OFFLINE OFFLINE rzsolv237
ora.net1.network
ONLINE ONLINE rzsolv236
ONLINE ONLINE rzsolv237
ora.ons
ONLINE ONLINE rzsolv236
ONLINE ONLINE rzsolv237
ora.registry.acfs
ONLINE ONLINE rzsolv236
ONLINE ONLINE rzsolv237
Cluster Resources
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rzsolv236
ora.LISTENER_SCAN2.lsnr
1 ONLINE ONLINE rzsolv237
ora.cvu
1 ONLINE ONLINE rzsolv236
ora.oc4j
1 ONLINE ONLINE rzsolv236
ora.rzsolv236.vip
1 ONLINE ONLINE rzsolv236
ora.rzsolv237.vip
1 ONLINE ONLINE rzsolv237
ora.scan1.vip
1 ONLINE ONLINE rzsolv236
ora.scan2.vip
1 ONLINE ONLINE rzsolv237
[oracle@rzsolv236 ~]$
But it does not for the scan listener:
[oracle@rzsolv236 ~]$ srvctl relocate scan_listener -i 2 -n rzsolv236
PRCR-1105 : Failed to relocate resource ora.LISTENER_SCAN2.lsnr to node rzsolv236
PRCR-1089 : Failed to relocate resource ora.LISTENER_SCAN2.lsnr.
CRS-2674: Start of 'ora.scan2.vip' on 'rzsolv236' failed
[oracle@rzsolv236 ~]$ srvctl relocate scan -i 2 -n rzsolv236
[oracle@rzsolv236 ~]$ srvctl relocate scan_listener -i 2 -n rzsolv236
PRCR-1090 : Failed to relocate resource ora.LISTENER_SCAN2.lsnr. It is not running.
[oracle@rzsolv236 ~]$
Any ideas?
Robert
Hi,
Thank you, but this does not match my problem:
- those socket files are owned by oracle:oinstall on both nodes; I deleted them to give it a try, but nothing changed
- I can start LISTENER_SCAN2 on node 1 using lsnrctl when node 2 is down, but I cannot using crsctl, and GI does not detect the listener when I start it manually
- GI seems to be kind of confused about the state of the LISTENER_SCAN2 while node 2 is down (is already in the INTERMEDIATE state on server 'rzsolv236' vs. OFFLINE state in crsctl stat res -t)
see output below ...
[oracle@rzsolv236 ~]$ crsctl stat res -t
NAME TARGET STATE SERVER STATE_DETAILS
Local Resources
ora.DG_GRID.dg
ONLINE ONLINE rzsolv236
ora.LISTENER.lsnr
ONLINE ONLINE rzsolv236
ora.asm
ONLINE ONLINE rzsolv236 Started
ora.gsd
OFFLINE OFFLINE rzsolv236
ora.net1.network
ONLINE ONLINE rzsolv236
ora.ons
ONLINE ONLINE rzsolv236
ora.registry.acfs
ONLINE ONLINE rzsolv236
Cluster Resources
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rzsolv236
ora.LISTENER_SCAN2.lsnr
1 ONLINE OFFLINE
ora.cvu
1 ONLINE ONLINE rzsolv236
ora.oc4j
1 ONLINE ONLINE rzsolv236
ora.rzsolv236.vip
1 ONLINE ONLINE rzsolv236
ora.rzsolv237.vip
1 ONLINE INTERMEDIATE rzsolv236 FAILED OVER
ora.scan1.vip
1 ONLINE ONLINE rzsolv236
ora.scan2.vip
1 ONLINE INTERMEDIATE rzsolv236 FAILED OVER
[oracle@rzsolv236 ~]$
[oracle@rzsolv236 ~]$ crsctl start res ora.LISTENER_SCAN2.lsnr -n rzsolv23
CRS-2800: Cannot start resource 'ora.scan2.vip' as it is already in the INTERMEDIATE state on server 'rzsolv236'
CRS-4000: Command Start failed, or completed with errors.
[oracle@rzsolv236 ~]$
[oracle@rzsolv236 ~]$ lsnrctl status LISTENER_SCAN2
LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 28-NOV-2013 16:12:15
Copyright (c) 1991, 2013, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN2)))
TNS-12541: TNS:no listener
TNS-12560: TNS:protocol adapter error
TNS-00511: No listener
Linux Error: 2: No such file or directory
[oracle@rzsolv236 ~]$
[oracle@rzsolv236 ~]$ lsnrctl start LISTENER_SCAN2
LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 28-NOV-2013 16:12:32
Copyright (c) 1991, 2013, Oracle. All rights reserved.
Starting /ora01/app/grid/product/GRID_11.2.0.4/bin/tnslsnr: please wait...
TNSLSNR for Linux: Version 11.2.0.4.0 - Production
System parameter file is /ora01/app/oracle/network/admin/listener.ora
Log messages written to /ora01/app/oracle/diag/tnslsnr/rzsolv236/listener_scan2/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN2)))
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN2)))
STATUS of the LISTENER
Alias LISTENER_SCAN2
Version TNSLSNR for Linux: Version 11.2.0.4.0 - Production
Start Date 28-NOV-2013 16:12:32
Uptime 0 days 0 hr. 0 min. 1 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /ora01/app/oracle/network/admin/listener.ora
Listener Log File /ora01/app/oracle/diag/tnslsnr/rzsolv236/listener_scan2/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN2)))
The listener supports no services
The command completed successfully
[oracle@rzsolv236 ~]$
[oracle@rzsolv236 ~]$ lsnrctl status LISTENER_SCAN2
LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 28-NOV-2013 16:12:37
Copyright (c) 1991, 2013, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN2)))
STATUS of the LISTENER
Alias LISTENER_SCAN2
Version TNSLSNR for Linux: Version 11.2.0.4.0 - Production
Start Date 28-NOV-2013 16:12:32
Uptime 0 days 0 hr. 0 min. 4 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /ora01/app/oracle/network/admin/listener.ora
Listener Log File /ora01/app/oracle/diag/tnslsnr/rzsolv236/listener_scan2/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN2)))
The listener supports no services
The command completed successfully
[oracle@rzsolv236 ~]$
[oracle@rzsolv236 ~]$
[oracle@rzsolv236 ~]$ crsctl stat res -t
NAME TARGET STATE SERVER STATE_DETAILS
Local Resources
ora.DG_GRID.dg
ONLINE ONLINE rzsolv236
ora.LISTENER.lsnr
ONLINE ONLINE rzsolv236
ora.asm
ONLINE ONLINE rzsolv236 Started
ora.gsd
OFFLINE OFFLINE rzsolv236
ora.net1.network
ONLINE ONLINE rzsolv236
ora.ons
ONLINE ONLINE rzsolv236
ora.registry.acfs
ONLINE ONLINE rzsolv236
Cluster Resources
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rzsolv236
ora.LISTENER_SCAN2.lsnr
1 ONLINE OFFLINE
ora.cvu
1 ONLINE ONLINE rzsolv236
ora.oc4j
1 ONLINE ONLINE rzsolv236
ora.rzsolv236.vip
1 ONLINE ONLINE rzsolv236
ora.rzsolv237.vip
1 ONLINE INTERMEDIATE rzsolv236 FAILED OVER
ora.scan1.vip
1 ONLINE ONLINE rzsolv236
ora.scan2.vip
1 ONLINE INTERMEDIATE rzsolv236 FAILED OVER
[oracle@rzsolv236 ~]$
Regards,
Robert -
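The mismatch described in this thread (TARGET is ONLINE while STATE is OFFLINE or INTERMEDIATE) can be picked out of the `crsctl stat res -t` output mechanically. Below is a rough Python sketch that assumes the flattened layout shown in the listings above, with a resource name on one line followed by its status rows; the function name is illustrative and the real column layout may differ between versions:

```python
STATES = {"ONLINE", "OFFLINE", "INTERMEDIATE", "UNKNOWN"}


def find_degraded(output: str):
    """Return (resource, target, state, server) for rows whose STATE differs from TARGET."""
    degraded = []
    resource = None
    for raw in output.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        if tokens[0].startswith("ora."):
            resource = tokens[0]  # a new resource block starts here
            continue
        # Cluster resources carry a leading instance number; local resources do not.
        if tokens[0].isdigit():
            tokens = tokens[1:]
        if resource and len(tokens) >= 2 and tokens[0] in STATES and tokens[1] in STATES:
            target, state = tokens[0], tokens[1]
            server = tokens[2] if len(tokens) > 2 else ""
            if state != target:
                degraded.append((resource, target, state, server))
    return degraded


# A few rows copied from the "one node is down" listing above.
sample = """ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rzsolv236
ora.LISTENER_SCAN2.lsnr
1 ONLINE OFFLINE
ora.scan2.vip
1 ONLINE INTERMEDIATE rzsolv236 FAILED OVER
"""
for row in find_degraded(sample):
    print(row)
```

On the sample rows this reports ora.LISTENER_SCAN2.lsnr and ora.scan2.vip, the two resources the thread is about, while rows such as ora.gsd (OFFLINE by design) are ignored because target and state agree.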
Available Group Wikis not update after group is removed
I removed a group from Workgroup Manager and the directory service, but the group wiki is still shown on the Groups home page, and I can even log in with an old member's ID and password. How can I remove it?
Yep, I'm having a similar issue.
I've deleted the wiki site, created a new one (same domain) and added new groups. When I view the site the old groups and all the old data is all I can see. Obviously Groups bear no resemblance to the data - maybe they just provide authentication?
Anyone know how to delete a Wiki and have it permanently removed?
Any help would be much appreciated.
Cheers -
WSFC Cluster and Availability Group Alert Variables
Is there a way to include the SQL instance name and/or the server name(s) in the alert for Cluster offline and Availability Group offline alerts?
Usually the Full Path Name contains the server name, but for these alerts the server name is nowhere to be found. The WSFC Cluster offline and Availability Group offline alerts (there could be others) only show the Cluster or Availability Group name
(For Full Path Name and Source). There's no info in the alert about the SQL instance name or the server(s) associated with the Cluster/Availability Group. This isn't very helpful and doesn't make the alert very useful if you have a lot of clusters and availability
groups that you manage. Sure, you can get the info from the Diagram View, but that means you have to log in to the console, find the alert and open another view...not very useful if you need relevant data quickly to try to resolve the issue.
Hi Yan,
Thanks for the link, but none of those actually supply the name of the server of the Cluster or Availability group that generated the error. I thought the Full Name might work, but it doesn't. I also tried the Windows Computer host under the General section
($Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$), and that just generates an error when you
try to save the Channel (it says it failed to verify module reference). The alerts are kind of useless unless we can identify the servers. Why? Well, we have some staging servers and some production servers that have Clusters and Availability groups that have
the same name, so when an alert gets generated there's no way to tell which Cluster or Availability group it relates to.
Anyone have a way to include the server info in the Cluster and Availability group alerts? -
We have a requirement to install SSAS as a clustered instance on servers where the SQL Server Database Engine is installed with an Always On availability group. Please help me configure it.
Currently we have the following configuration in the current setup.
Node1 and Node2 are in a Windows cluster
Node1 has SQL Server Database Engine Instance1 as standalone
Node2 has SQL Server Database Engine Instance2 as standalone
Instance1 and Instance2 are configured for an Always On availability group with a listener.
Now we have to set up an SSAS instance with high availability. I know the only option is to install a clustered SSAS instance.
Can someone provide the information below?
1. How do we set up a clustered SSAS instance on these servers?
2. Will this have any dependency on the existing listener name?
3. Will this affect the availability groups if the SSAS instance fails over to another node?
Thanks in advance
Sriram
You will need to have SSAS installed as a clustered instance with shared storage. Refer to the whitepaper from this MSDN article
How To Cluster SQL Server Analysis Services
The Availability Group is in its own Role/Resource Group. When you create the clustered SSAS, it will create its own Role/Resource Group. This means that it will require its own virtual network name and virtual IP address and will not affect the existing Availability Group. You can also have it on the existing Availability Group if you want to. However, you need to decide if you want SSAS to fail over with the Availability Group or not. Your design choices will depend on that decision.
Edwin Sarmiento SQL Server MVP | Microsoft Certified Master
-
RDP Services do not accept connection after cluster failover
Hi guys,
I am seeing weird behaviour on my Windows Server 2012 R2 servers.
server 1 - 10.100.1.201
server 2 - 10.100.1.203
VIP - 10.100.1.202
When I open remote desktop sessions to server 1 and server 2 after both servers have been rebooted, they work perfectly fine. During the remote desktop session, I perform a cluster node failover, switching the active node to server 2. As soon as I perform the task, my server 1 connection hangs and I am not able to log in anymore.
Strangely, when I connect from the same server zone and use remote desktop, it works perfectly fine and I am not disconnected from either server 1 or server 2.
I suspect the network routing gets messed up during the cluster failover, but the route print outputs are identical and show no problem.
Has anyone here experienced the same problem?
zhiyuan
Hi,
Sorry if I lost you... here is the story.
node 1 - 10.100.1.201
node 2 - 10.100.1.203
vip - 10.100.1.202
When both servers are restarted, I can remote to both servers with no problem.
1. RDP to both node 1 and node 2 together on their physical IPs. Connected successfully.
2. Checked that the active node is node 1. Performed a failover from node 1 to node 2; the node 1 RDP session lost its connection immediately. Checked on node 2: the cluster is active on node 2, no errors.
3. Performed a failover from node 2 to node 1. The node 2 RDP session lost its connection immediately, and the node 1 session came back. Checked that the cluster is active on node 1, no errors.
4. To be able to RDP to both again, restarted node 2 (the node that could not reconnect); after the reboot, RDP was back to normal.
5. The firewall team confirmed that the connection reaches the server; apparently the server is not responding to RDP. -
SQL 2012 Always On Availability Groups
Has anyone configured FIM Sync, FIM Service and MSF with SQL 2012 Always On Availability Groups?
I do not believe we can configure the SQL connection string for FIM Sync or FIM Service to include "multisubnetfailover".
TIA
Nigel
I commented on this here: https://social.technet.microsoft.com/Forums/en-US/64f55628-3b5d-4d16-9044-dcbe7053581d/lack-of-support-for-fim-database-mirroring?forum=ilm2
Additional comments:
SQL Server Always On Failover Cluster Instances should be supported just as SQL Clustering is supported.
SQL Server Always On Availability Groups are not supported, just as Database Mirroring in synchronous mode (which is required for automatic failover) is not supported. I can't find authoritative statements of this on the web. But I know this to be the
case.
Ultimately, the reason for lack of support for mirroring (for automatic failover or in synchronous mode) is that the product group has said so (in conferences and webinars). Meaning the product group has not tested it or has and decided that it doesn't
work or adds risk.
Possible underlying reasons:
1) To do automatic failover with mirroring or Availability Groups you must edit the connection string. The way FIM builds the connection string out of the components in the registry doesn't permit this.
2) Running mirroring in synchronous mode slows down performance in two ways: first is the additional traffic to send it to the mirror partner (or replica), second and most important is that synchronous requires that the primary not truly commit the transaction
until it has been committed to the secondary which means transactions take longer. For some products this can result in an unacceptable performance degradation.
Hopefully, MIM will support Availability groups.
David Lundell, Get your copy of FIM Best Practices Volume 1 http://blog.ilmbestpractices.com/2010/08/book-is-here-fim-best-practices-volume.html -
If you use the WCF-SQL adapter, it is recommended that you set UseAmbientTransaction to true if you are changing data. I think this requires MSDTC to be enabled on the SQL Server that you are changing the data on. (http://msdn.microsoft.com/en-us/library/dd787981.aspx)
I think that Availability Groups do not support MSDTC. (http://msdn.microsoft.com/en-us/library/ms366279.aspx).
How can you change data on a SQL 2012 application database that uses availability groups from BizTalk Server?
Hi,
Yes, Availability Groups don't support MSDTC. Please refer to these similar discussions, which may be helpful:
http://dba.stackexchange.com/questions/47108/alwayson-ag-dtc-with-failover
http://stackoverflow.com/questions/17179221/msdtc-in-always-on-availability-groups