Cluster Service Monitoring - Is There An Alert When a Volume is Available?
We've seen some alerts showing that a shared volume is no longer available. They look something like this:
Alert: Shared Volume IO is paused
Source: Cluster Service
Path: Host.domain.com
Last modified by: System
Last modified time: 2/14/2011 7:16:10 AM
Alert description: Cluster Shared Volume 'Volume4' ('Exchange Mail Data') is no longer available on this node because of 'STATUS_CONNECTION_DISCONNECTED(c000020c)'. All I/O will temporarily be queued until a path to the volume is reestablished.
We're wondering if there is a way to generate an alert that tells us the volume is available again.
Orange County District Attorney
Hi,
Based on my research, this monitor is based on the Cluster Shared Volume related Events:
Event Log Rules
http://technet.microsoft.com/en-us/library/dd491018.aspx
Please also see the Events listed:
Cluster Shared Volume Functionality
http://technet.microsoft.com/en-us/library/ee830309(WS.10).aspx
However, I could not find an event that means the “cluster shared volume is available again”; therefore, I suspect this cannot be monitored from the event log.
In addition, I just noticed the status of a cluster shared volume can be queried by PowerShell script. Hope this can give you some hints:
Get-ClusterSharedVolume
http://technet.microsoft.com/en-us/library/ee460981.aspx
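One way to act on this hint is to poll the volume state on a schedule and notify when a volume goes from an unavailable state back to online. The following is only a minimal sketch of that transition logic in Python. The state labels ("Online", "Paused"), the shape of the `samples` input and the function name `recovered_volumes` are assumptions made for illustration; collecting the real state from Get-ClusterSharedVolume is not shown.

```python
from typing import Dict, Iterable, List, Tuple

ONLINE = "Online"  # assumed label for a healthy volume state


def recovered_volumes(samples: Iterable[Dict[str, str]]) -> List[Tuple[int, str]]:
    """Return (poll_index, volume_name) for every not-online -> online transition.

    Each sample is one poll result mapping volume name to a state string.
    """
    last_state: Dict[str, str] = {}
    events: List[Tuple[int, str]] = []
    for index, sample in enumerate(samples):
        for name, state in sample.items():
            previous = last_state.get(name)
            # Only report when we saw the volume unavailable on an earlier poll.
            if previous is not None and previous != ONLINE and state == ONLINE:
                events.append((index, name))
            last_state[name] = state
    return events


if __name__ == "__main__":
    polls = [
        {"Volume4": "Online"},
        {"Volume4": "Paused"},
        {"Volume4": "Paused"},
        {"Volume4": "Online"},
    ]
    for index, name in recovered_volumes(polls):
        print(f"poll {index}: {name} is available again")
```

A scheduled script could feed each poll result into a function like this and raise its own "available again" alert, since no matching event seems to exist.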
Thanks.
Nicholas Li - MSFT
Similar Messages
-
Hi All,
I am using SCOM 2007 R2 CU4 in my environment. I want to set up service monitoring on specific agents. When I create the service monitor from the Management Pack template and select the service I want to monitor, the drop-down does not show all the services.
For example, the Windows Audio service is present on the machine, but it does not appear in the service list.
Of the services starting with "W", I see only 5 in SCOM, but more than 5 in services.msc on the agent.
Below is the screen shot.
Can anyone please help?
Gautam.75801
Hi Yan Li,
So based on your suggestion above, if the services are already managed or monitored by a specific management pack, they will not appear in the wizard when creating this type of management pack object alert, right?
If that is the case, why is the same not reflected in the Services monitor tab in the Operations console?
Gautam.75801 -
CUCMS Service Monitor 8.0 and 1040
Hi there
We have installed CUCMS 8.0, and we also have Service Monitor 8.0 installed.
When trying to configure a 1040 sensor, we are unable to add the IP address of the sensor because there is no option to do so. When we go to edit the configuration, the IP address of the sensor is shown as the MAC address, and we are unable to edit this information either. We therefore suspect that this is a bug.
We are unable to pick up the 1040 sensor.
Attached is the screenshot of our problem.
Please advise whether this is a bug.
Many thanks
Shabeer
The sensors send a DHCP request to get an IP address and the TFTP server address.
So you need to set up a server that will respond with an IP address for this MAC address and give the IP address of a TFTP server (option 150).
Your Service Monitor or CallManager can be such a TFTP server.
On the TFTP server you must place a file called QOV.CFG.
In Service Monitor -> Configuration -> Sensors you can create this config file, then copy it to the TFTP server.
It tells the sensor which images to download and which Service Monitor it should register with.
Cheers,
Michel -
Team,
I got a typical query from my app owner.
We have a service monitor that is very critical. When the service is down, how much time does SCOM actually take to acknowledge it and send a notification?
In my case it takes 2-3 sec. Is there a way to reduce this?
RajKumar
Hi Raj,
Can you confirm whether you get notified in 2-3 seconds or 2-3 minutes?
As far as I know, 2-3 seconds is not bad. Even a PowerShell script that checks for the service failure on the server where the service is located would also take 2-3 seconds.
I would suggest creating an event-based alerting rule: use the System log as the log location, then search for the event ID and description that appear when the service crashes.
Event id: XXX
Event source: XX
EventDescription: The Critical service crashed / or what your Event body contains for that.
Try it and see whether the event-based rule is faster than the service monitor.
Gautam.75801 -
Why virtual interfaces added to ManagementOS not visible to Cluster service?
Hello All,
I'm starting this new thread since the previous one was answered by our friend Udo. My problem in short is the following. A diagram will be enough to explain what I'm trying to achieve. I've set up this lab to learn Hyper-V clustering with 2 nodes running Hyper-V Server 2012. Both nodes have 3x physical NICs; 1 in each node is dedicated to managing the node, and the other two are used to create a NIC team. On top of that NIC team, a virtual switch is created with -AllowManagementOS $False. Next I created the following virtual interfaces, added them to the host partition, and plugged them into the virtual switch created on top of the teamed interface. These virtual interfaces should serve the purpose of the various networks available.
For the SAN I'm running a Linux VM with an iSCSI target server, and the clustering service has no problem with that. All tests pass OK.
The problem is that those virtual interfaces, when added to the hosts, do not appear as available networks to the cluster service; instead it only shows the management NIC as the available network to leverage.
This is making it difficult to understand how to setup a cluster of 2x Hyper-V Server nodes. Can someone help please?
Regards,
Shahzad.
Shahzad,
I've read this thread a couple of times and I don't think I'm clear on the exact question you're asking.
When the clustering service goes out to look for "Networks", what it does is scan the IP addresses on each node. Every time it finds an IP in a unique subnet, that subnet is listed as a network. It can't see virtual switches and doesn't care about
virtual vs. teamed vs. physical adapters or anything like that. It's just looking at IP addresses. This is why I'm confused when you say, "it won't show virtual interfaces available as networks". "Networks" in this context are IP subnets.
I'm not aware of any context where a singular interface would be treated like a network.
If you've got virtual adapters attached to the management operating system and have assigned IPs to them, the cluster should have discovered those networks. If you have multiple adapters on the same node using IPs in the same subnet, that network will only appear once and the cluster service will only use one adapter from that subnet on that node. The one it picked will be visible on the "Network Connections" tab at the bottom of Failover Cluster Manager when you're on the Networks section.
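The discovery rule described above (one network per unique IP subnet, one adapter per subnet per node) can be sketched roughly as follows. This is an illustration in Python, not the cluster service's actual code: the function name `cluster_networks`, the tuple layout, the example addresses, and the "first adapter seen wins" choice are all assumptions, since the description does not say how the service picks among adapters in the same subnet.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, List, Tuple


def cluster_networks(adapters: List[Tuple[str, str, str]]) -> Dict[str, Dict[str, str]]:
    """Group adapters into networks keyed by IP subnet.

    adapters: (node, adapter_name, "ip/prefix") tuples.
    Returns {subnet: {node: adapter_name}}, keeping one adapter per node
    per subnet (the first one encountered, which is an assumption).
    """
    networks: Dict[str, Dict[str, str]] = defaultdict(dict)
    for node, adapter, cidr in adapters:
        subnet = str(ipaddress.ip_interface(cidr).network)
        networks[subnet].setdefault(node, adapter)
    return dict(networks)


if __name__ == "__main__":
    # Hypothetical addresses from documentation ranges, not from the thread.
    example = [
        ("Node1", "Mgmt", "192.0.2.10/24"),
        ("Node1", "Extra", "192.0.2.11/24"),
        ("Node1", "Cluster", "198.51.100.10/24"),
        ("Node2", "Mgmt", "192.0.2.20/24"),
    ]
    for subnet, members in cluster_networks(example).items():
        print(subnet, members)
```

In this sketch the two Node1 adapters in 192.0.2.0/24 collapse into a single network entry, which mirrors the "that network will only appear once" behaviour described above.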
Eric Siron Altaro Hyper-V Blog
I am an independent blog contributor, not an Altaro employee. I am solely responsible for the content of my posts.
"Every relationship you have is in worse shape than you think."
Hello Eric and friends,
Eric, I much appreciate your interest in the issue, and yes, I agree with you when you said: "When the clustering service goes out to look for "Networks", what it does is scan the IP addresses on each node. Every time it finds an IP in a unique subnet, that subnet is listed as a network. It can't see virtual switches and doesn't care about virtual vs. teamed vs. physical adapters or anything like that. It's just looking at IP addresses. This is why I'm confused when you say, "it won't show virtual interfaces available as networks". "Networks" in this context are IP subnets. I'm not aware of any context where a singular interface would be treated like a network."
By networks I meant to say subnets. Let me explain what I've configured so far:
Node 1 & Node 2 installed with 3x NICs. All 3 NICs/node plugged into same switch.
Node1: 131.107.0.50/24
Node2: 131.107.0.150/24
A Core Domain controller VM running on Node 1: 131.107.0.200/24
A JUMPBOX (WS 2012 R2 Std.) VM running on Node 1: 131.107.0.100/24
A Linux SAN VM running on Node 2: 10.1.1.100/8
I planned to configure the following networks:
(1) Cluster traffic: 10.0.0.50/24 (IP given to virtual interface for Cluster traffic in Node1)
Cluster traffic: 10.0.0.150/24 (IP given to virtual interface for Cluster traffic in Node2)
(2) SAN traffic: 10.1.1.50/8 (IP given to virtual interface for SAN traffic in Node1)
SAN traffic: 10.1.1.150/8 (IP given to virtual interface for SAN traffic in Node2)
Note: Cluster service has no problem accessing the SAN VM (10.1.1.100) over this network; it validates the SAN settings and comes back OK. This is an indication that the virtual interface is working fine.
(3) Migration traffic: 172.168.0.50/8 (IP given to virtual interface for Migration traffic in Node1)
Migration traffic: 172.168.0.150/8 (IP given to virtual interface for Migration traffic in Node2)
All these networks (virtual interfaces) are made available through two virtual switches which are configured EXACTLY identical on both Node1/Node2.
Now, after finishing the cluster validation steps (which all come back OK), when the Create Cluster wizard starts, it shows only one network: the network of the physical Layer 2 switch, i.e. 131.107.0.0/24.
I wonder why it won't show the IPs of the other networks (10.0.0.0/8, 10.1.1.0/8 and 172.168.0.0/8).
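Since the cluster service is described as working purely from IP subnets, it can help to compute which subnet each of the addresses listed above actually falls into with the masks as given. The sketch below uses Python's standard `ipaddress` module; the dictionary labels and helper names are assumptions for illustration, and whether any overlap it reports explains the missing networks is not established in this thread.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List, Tuple

# Node1 addresses exactly as listed in the post; labels are illustrative.
planned = {
    "Management": "131.107.0.50/24",
    "Cluster": "10.0.0.50/24",
    "SAN": "10.1.1.50/8",
    "Migration": "172.168.0.50/8",
}


def subnet_of(cidr: str) -> ipaddress.IPv4Network:
    """Return the network that an 'ip/prefix' address belongs to."""
    return ipaddress.ip_interface(cidr).network


def overlapping_pairs(addresses: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return pairs of labels whose subnets overlap."""
    nets = {name: subnet_of(cidr) for name, cidr in addresses.items()}
    return [(a, b) for a, b in combinations(nets, 2) if nets[a].overlaps(nets[b])]


if __name__ == "__main__":
    for name, cidr in planned.items():
        print(f"{name}: {cidr} is in {subnet_of(cidr)}")
    print("Overlapping:", overlapping_pairs(planned))
```

With a /8 mask, 10.1.1.50 belongs to 10.0.0.0/8, which contains the cluster subnet 10.0.0.0/24, and 172.168.0.50 belongs to 172.0.0.0/8 rather than 172.168.0.0/8. Checking the masks this way is a cheap first step before looking elsewhere.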
Regards,
Shahzad -
Prime Infrastructure 2.0 - Email alert when device is unreachable
Is there a way for PI to send email alerts when a device becomes unreachable? I know there are alerts when an interface goes down, but I can't find how to get prime to alert me ONLY when a device becomes unreachable.
Does anyone know how to do this? I can't believe such a simple feature would be so difficult to configure...
PI also generates an event when it finds an unreachable device. An event is an occurrence or detection of some condition in or around the network. An event is a distinct incident that occurs at a specific point in time. Examples of events include:
Port status change
Device reset
Device becomes unreachable by the management station
You can view the list of events using the Event Browser.
Choose Operate > Alarms & Events, then click Events to access the Events Browser page.
Prime Infrastructure discovers events by automatically polling devices and discovering changes; for example, device unreachable.
For more details on Events, please check here :
http://www.cisco.com/c/en/us/td/docs/net_mgmt/prime/infrastructure/2-0/user/guide/prime_infra_ug/alarms.html#pgfId-1054357
-Thanks
Vinod
-
Hey All -
I need to monitor one service that exists on 3 servers, but the service only runs on one of the servers at a time, and the servers are not set up as a cluster. Any ideas on how to accomplish this?
Hi,
Create a service monitor as previously described and then create a distributed application and add the new service monitor for all servers to a component group. Then set the health rollup to Best state of any member - I think that should work...
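The effect of a "best state of any member" rollup can be illustrated with a small Python sketch. The `Health` enum, its ordering and the function names are assumptions for illustration and are not SCOM's API; the point is only that the group stays healthy as long as one member is healthy, which fits a service that runs on one of three servers at a time.

```python
from enum import IntEnum
from typing import Iterable


class Health(IntEnum):
    """Illustrative health states, ordered from best to worst."""
    HEALTHY = 0
    WARNING = 1
    CRITICAL = 2


def best_of(states: Iterable[Health]) -> Health:
    """Roll up to the best (lowest) member state."""
    states = list(states)
    if not states:
        raise ValueError("no member states to roll up")
    return min(states)


def worst_of(states: Iterable[Health]) -> Health:
    """Roll up to the worst (highest) member state, for comparison."""
    states = list(states)
    if not states:
        raise ValueError("no member states to roll up")
    return max(states)


if __name__ == "__main__":
    # Service running on one of three servers, stopped on the other two.
    members = [Health.CRITICAL, Health.HEALTHY, Health.CRITICAL]
    print("best-of rollup:", best_of(members).name)
    print("worst-of rollup:", worst_of(members).name)
```

With a worst-of rollup the same three members would always show as critical, which is why the best-of policy is the one suggested here.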
Further details on DAs:
http://thoughtsonopsmgr.blogspot.co.uk/2011/06/distributed-applications-das-part-ii.html
http://social.technet.microsoft.com/Forums/systemcenter/en-US/9e932038-32c6-4403-8489-1535de4214be/distributed-applications-health-rollup?forum=operationsmanagerauthoring
Hope this helps.. -
Having Issue with Service Monitor Report when using Oracle DB
Hi,
I set up Oracle XE to store monitoring and message data for Service Monitor. I configured the Database Connection in Policy Studio, and deployed the configuration. When starting the Service Monitor, I got "There are no reporting nodes configured." error.
Here is the detailed message:
INFO 27/Jan/2012:13:13:54.155 [ed46fa00] thread set netsvc threadpool drained
INFO 27/Jan/2012:13:13:55.074 [ed46fa00] rolling file logs/ConfigurationManagementAuditTrail.xml stopped
INFO 27/Jan/2012:13:13:55.078 [ed46fa00] Shutting down Policy Director Manager
INFO 27/Jan/2012:13:13:55.093 [ed46fa00] rolling file trace/ServiceMonitor.trc stopped
[oracle@EDCPR16P0 bin]$ ./oegservicemonitor
INFO 27/Jan/2012:13:13:59.896 [911baa00] Attempting to connect to entity store at federated:file:////u01/app/oracle/oeg11g/oegservicemonitor/conf/fed/configs.xml
INFO 27/Jan/2012:13:14:03.014 [911baa00] Realtime monitoring disabled
INFO 27/Jan/2012:13:14:03.016 [911baa00] Storing metrics in database disabled
INFO 27/Jan/2012:13:14:03.781 [911baa00] cert store configured
INFO 27/Jan/2012:13:14:03.785 [911baa00] keypairs configured
INFO 27/Jan/2012:13:14:03.791 [911baa00] Initializing server
INFO 27/Jan/2012:13:14:04.423 [911baa00] Attempting to connect to repConfig entity store at file:/u01/app/oracle/oeg11g/oegservicemonitor/conf/repConfig.xml
INFO 27/Jan/2012:13:14:04.455 [911baa00] There are no reporting nodes configured.
INFO 27/Jan/2012:13:14:04.456 [911baa00] Initializing report file format transformer
INFO 27/Jan/2012:13:14:13.005 [911baa00] Server initialized
INFO 27/Jan/2012:13:14:13.206 [911baa00] TCP interface
INFO 27/Jan/2012:13:14:13.206 [911baa00] checking invariants for interface *:8040
INFO 27/Jan/2012:13:14:13.206 [911baa00] listen on address 0.0.0.0/8040
It looks like the repConfig.xml file needs to be updated. If so, which elements? Anything else I am missing?
You need to add a Gateway node to the Service Monitor.
To do this, browse to http://HOST:8040/
Using the default port, you can connect to the Service Monitor interface in a browser, where HOST is the IP address or hostname of the machine on which Service Monitor is installed.
Right-click and add the Gateway node. -
Displaying hostname in Basic Service Monitor alerts
Hi all,
I've created a new Unit Monitor (specifically, a Basic Service Monitor) to alert when a particular service is stopped on any server in a group that I've defined. When I stop the service to test the alert, I don't see the affected server's hostname anywhere in the alert. I see the source is set to the name of the group. This isn't all that useful, as I have to check every server until I find the one that the alert was generated for.
I've tried changing values in the Alert Description on the monitor, but haven't had any success here. It seems to me like the answer should be pretty obvious, but I'm stumped.
Thanks!
First, targeting monitors or rules at groups is a bad practice.
http://blog.scomskills.com/best-practices-targeting-intro/
Also, getting the hostname in the alert is a discovery issue.
http://blogs.technet.com/b/jonathanalmquist/archive/2010/06/18/why-does-source-not-show-me-the-computer-name-that-generated-the-alert.aspx
Jonathan Almquist | SCOMskills, LLC (http://scomskills.com) -
Windows Service Monitor - False Alerts
Scenario: We have a custom Service Monitor that was created long ago and enabled for all servers. I just noticed that the service is present on only a few of the servers. I also noticed that there are overrides for 5 servers, which do not make any sense to me, as they were set to enable this monitor.
Issue: I got alerts for 5 servers on which the service is not installed at all. How is that possible?
S.Arun Prasath HP ARDE TEAM
How did you create this service monitor?
You can create your service monitor either
1) From Authoring workspace --> Management pack Templates --> Windows services
http://www.bictt.com/blogs/bictt.php/2011/03/17/scom-monitoring-a-service-part3
OR
2) Authoring workspace --> Monitors
http://www.bictt.com/blogs/bictt.php/2011/03/17/scom-monitoring-a-service-part2
Your understanding is correct if you used method 1) to create the service monitor. Otherwise, you should enable the monitor only for computers that have the service present.
Roger -
I have a service monitor set up to monitor the up/down state of a service. Yesterday the service started to terminate and restart itself, but the monitor is not catching that it is going down. The monitor works fine when we take it down manually. I think the restart is happening too quickly. Has anyone seen something like this, and is there a fix for it?
Hi,
The default frequency of the basic service monitor is 60 seconds, if I remember correctly, so you may not get an alert for every service restart.
If you want to get an alert for every service restart you may have to use an Event Monitor to check the Service Start/Stop Events.
Have a look at this thread also:
http://www.systemcentercentral.com/forums-archive/topic/question-about-the-poll-time-of-the-basic-service-monitor/
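Why a fast restart slips past a polling monitor can be shown with a short simulation. This is a sketch in Python under simplifying assumptions: polls happen at exact multiples of the interval starting at time 0, and an outage is noticed only if a poll lands inside it. The function name and the example outage times are made up for illustration; the 60-second interval is the figure recalled in the reply above.

```python
import math
from typing import List, Tuple


def outages_seen_by_polling(
    outages: List[Tuple[float, float]], interval: float = 60.0
) -> List[Tuple[float, float]]:
    """Return the outages (start, end) that contain at least one poll instant.

    Polls are assumed to occur at t = 0, interval, 2 * interval, ...
    A poll at time t sees the service down when start <= t < end.
    """
    seen = []
    for start, end in outages:
        # First poll instant at or after the outage begins.
        first_poll = math.ceil(start / interval) * interval
        if first_poll < end:
            seen.append((start, end))
    return seen


if __name__ == "__main__":
    # A 5-second crash-and-restart and a 30-second manual stop (seconds).
    outages = [(65.0, 70.0), (100.0, 130.0)]
    print("detected:", outages_seen_by_polling(outages))
```

The short crash between two polls is never observed, while the longer manual stop is, which matches the symptom in the question. An event-based monitor avoids this because it reacts to the stop event itself instead of sampling the state.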
Cheers,
Christoph
Blog: http://blog.cmaresch.at/
-
Is there a way to create an AUDIO alert when my MacBook Air (Mavericks) battery runs low (say 10%)? I often miss the visual power percentage in the top bar, so would like an audio alert as well.
For global alerts, yes:
There is no need to do anything before the alert appears, and the MBA will not lose data should it go to sleep. If it goes to sleep at 0% you still have several hours to plug it in before it shuts down.
I wouldn't even bother displaying the percentage value. -
Why there's no beep sound alert when charging my iphone5 from totally drained battery?
Why there's no beep sound alert when charging my iphone5 from totally drained battery?
Thanks for your reply, but there's still no beep sound alert when it starts to charge.
-
Service Monitoring in SCOM which triggers an alert after a wait time
Hi ,
I have a requirement to amend one of our service monitors to alert only after waiting 3 minutes once the service goes down.
Is this possible using SCOM, or do we need to script this?
Please advise.
Jesty
To configure that using a script, you can refer to the link below:
http://blogs.technet.com/b/fesiro/archive/2012/11/26/how-to-configure-command-notification-in-scom-2012-with-powershell-script.aspx
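If the delay ends up being scripted, the core of it is a "down for at least N seconds" check. The sketch below shows that logic in Python; it is not taken from the linked article, and the function name, the sample format (timestamp in seconds plus a running flag) and the 180-second default are assumptions for illustration.

```python
from typing import Iterable, List, Optional, Tuple


def delayed_alerts(
    samples: Iterable[Tuple[float, bool]], wait_seconds: float = 180.0
) -> List[float]:
    """Return the timestamps at which a delayed alert would fire.

    samples: (timestamp_seconds, is_running) pairs in time order.
    An alert fires once per outage, at the first sample where the service
    has been continuously down for at least wait_seconds.
    """
    alerts: List[float] = []
    down_since: Optional[float] = None
    alerted = False
    for timestamp, is_running in samples:
        if is_running:
            # Recovery resets the timer and re-arms the alert.
            down_since = None
            alerted = False
            continue
        if down_since is None:
            down_since = timestamp
        if not alerted and timestamp - down_since >= wait_seconds:
            alerts.append(timestamp)
            alerted = True
    return alerts


if __name__ == "__main__":
    checks = [
        (0, True), (60, False), (120, False), (180, True),
        (240, False), (300, False), (360, False), (420, False),
    ]
    print("alert at:", delayed_alerts(checks))
```

The first outage in the example recovers before 3 minutes pass and stays silent; only the second, longer outage produces an alert.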
Mai Ali -
I have a test 2-node Failover cluster using Server 2012 R2
As of last night the cluster service on one of the 2 nodes is down with this error:
The Cluster Service service terminated with the following service-specific error:
Cannot create a file when that file already exists.
EventID 7024
The Cluster service waits 60 sec, tries to start, and the same error occurs again.
Any idea where to look to identify which file this error is referring to, or how to go about identifying root cause and getting a solution?
thank you.
samb
Hi Yeswanth
Then you can try "Add Counter". This creates a new file each time with the same name, but with a counter appended to the file name indicating the number of times it has been created.
You can also specify the format of the counter: once you select this option, you can fill in the Format and Step fields accordingly.
Will this be fine?
Regards
Ashmi
Maybe you are looking for
-
How can i edit Iweb published page to open as a Popup?
I want to edit my iWeb site so that when a link is clicked it opens a different page as a popup with no other toolbars, scrollbars etc - just the image i want to show. I found some code online (http://www.quackit.com/html/codes/html_popup_window_code
-
Problem with FrameMaker license
Our 3 FrameMaker products says that the license is expired. Why?
-
Getting error message and do not know how to address....."Time Machine error" Pro.sparsebundle is already in use (Latest successful backup @ 4:17PM yesterday April 9 2013. Looking for best actions to take to continue automatic backups without g
-
Creating & Using .jar files and manifests
I've read through hundreds of online tutorials on creating and using .jar's, but it's surprising how unhelpful they really are. I've been trying to make a java program of mine standalone, so I can distribute it to others and allow them to run it with
-
Hello, I have some short dumps about memory problems in my BI system (3.5) when I´m consulting a report through BEx. It shows the error STORAGE_PARAMETERS_WRONG_SET, and recommend to increase value of profile parameters: - abap/heap_area_dia - abap/h