SCOM 2012 - Event ID 6024 (Launching Restart Health Service. Health Service exceeded Process\Handle Count or Private Bytes threshold.)

I am getting event ID 6024 (LaunchRestartHealthService.js : Launching Restart Health Service. Health Service exceeded Process\Handle Count or Private Bytes threshold.) at intervals of 12 to 17 minutes.
I am using SCOM (2012 SP1 and 2012 R2) on Windows Server (2008 R2 / 2012 / 2012 R2).
This issue occurs only on the agent-managed computer that I use to monitor my device (the "act as a proxy and discover managed objects on other computers" setting is enabled on it). All discovery scripts (PowerShell) and monitors are targeted at this agent-managed computer.
There are 80 discoveries and 900 monitors in total. 55 discoveries and 550 monitors are enabled by default; the rest are disabled.
I see event ID 6024 frequently, and only on the agent-managed computer. Can anyone help me resolve this issue?
Thanks,
Mukul

To fix the 6024 issue, follow the steps below:
1. Open the SCOM console. Go to Monitors -> Agent -> Entity Health -> Performance -> Health Service Performance -> Health Service State.
2. Double-click the Health Service Handle Count Threshold monitor and go to the Overrides page.
3. Click Override -> For a specific object of class: Agent. Select the affected SCOM agent, QMXServer.
4. Select the parameter Agent Performance Monitor Type - Threshold and change the default value of 2000 to an appropriate value, such as 4000. To choose a value, check the Health Service handle count alert in the SCOM console for the value recorded when the alert was generated. You can also launch Health Explorer against QMXServer to check the value at the time the monitor state changed from healthy to critical.
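As a rough illustration of step 4, the sketch below picks an override value from handle counts you have observed in the alert or in Health Explorer. The function name, the 25% headroom and the rounding step of 500 are assumptions made for this example, not SCOM recommendations.

```python
import math

def suggest_threshold(observed_counts, headroom=1.25, step=500):
    """Return a handle-count threshold above the highest observed value.

    headroom and step are illustrative assumptions, not SCOM guidance.
    """
    if not observed_counts:
        raise ValueError("need at least one observed handle count")
    peak = max(observed_counts)
    # Add headroom to the peak, then round up to the next multiple of `step`.
    return int(math.ceil(peak * headroom / step) * step)

# Hypothetical values read from the alert context at the time of each restart.
print(suggest_threshold([2100, 2650, 3050]))  # 4000
```

If the observed counts keep climbing between restarts rather than levelling off, a higher threshold will only delay the restart.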
You can also refer to the link below:
http://blogs.technet.com/b/omx/archive/2013/10/17/health-service-restarts-on-service-manager-servers-with-scom-agents.aspx
Mai Ali

Similar Messages

  • SCOM 2012 Event ID 21006 and 21016

    I'm having a connection issue between a newly created gateway server, which sits in an untrusted DMZ, and my management server. I have been able to get one gateway working from the DMZ, but the one in question is receiving:
    Event ID 21006 :  The OpsMgr Connector could not connect to frw0725.gecio.corp.net:5723.  The error code is 11004L(The requested
    name is valid, but no data of the requested type was found.).  Please verify there is network connectivity, the server is running and has registered it's listening port, and there are no firewalls blocking traffic to the destination.
    Event ID 21016 : OpsMgr was unable to set up a communications channel to frw0725.gecio.corp.net and there are no failover hosts. 
    Communication will resume when frw0725.gecio.corp.net is available and communication from this computer is allowed.
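Error 11004 in event 21006 is a Winsock name-resolution error (WSANO_DATA), so it can help to test name resolution and the TCP connection separately from the machine that logs the event. The sketch below does that; the function name and the timeout are assumptions for illustration, and 5723 is the port shown in the event text.

```python
import socket

def check_endpoint(host, port=5723, timeout=5.0):
    """Return (resolved_ok, connected_ok) for host:port."""
    try:
        # Resolution step: this is where a 11004-style failure would surface.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return (False, False)
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return (True, True)
        except OSError:
            continue  # try the next resolved address, if any
    return (True, False)
```

Calling check_endpoint("frw0725.gecio.corp.net") on the gateway would show whether the name resolves there and whether port 5723 accepts a connection. A result of (False, False) points at DNS or the hosts file rather than the firewall.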
    I have performed the following actions and verifications:
    Services have been restarted on both servers.
    The certificate import (momcertimport) and the gateway approval have been completed.
    I was able to telnet from the MS to the GW and from the GW to the MS, so the connection through the firewall is OK.
    DNS appears to be OK: pings issued from both servers resolve to the correct IP address, although they time out, which is expected.
    Event ID 20053 is being received, stating that the OpsMgr Connector has loaded the specified authentication certificate successfully.
    I checked the serial number of the personal certificate against what is listed in the registry (reversed) and it matches.
    The private key is in place and the certificate path is correct.
    I also verified in HKLM\Software\Microsoft\Microsoft Operations Manager\3.0\Agent Management Groups\ that the correct configuration is being picked up.
    I'm looking for some additional guidance or suggestions on what else I can check to get this gateway to show as monitored in the console. Thanks for the help.
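The serial-number comparison mentioned above is easy to get wrong by hand, because the registry holds the bytes in reverse order. A minimal sketch of the reversal follows; the function name and the sample serial are made up for illustration.

```python
def reverse_serial_bytes(serial):
    """Reverse the byte order of a hex serial number, e.g. '1a2b3c' -> '3c2b1a'."""
    # Strip the separators that certificate tools commonly display.
    hex_digits = "".join(ch for ch in serial if ch not in " :-").lower()
    if len(hex_digits) % 2 != 0:
        raise ValueError("serial must contain an even number of hex digits")
    int(hex_digits, 16)  # raises ValueError if the input is not hexadecimal
    pairs = [hex_digits[i:i + 2] for i in range(0, len(hex_digits), 2)]
    return "".join(reversed(pairs))

# Hypothetical serial as shown in the certificate details.
print(reverse_serial_bytes("61 0f ab 12"))  # 12ab0f61
```

The output can then be compared with the value stored in the registry, ignoring case and spacing.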

    Please check whether the certificate was stored in the GW server's Computer Personal store when you first installed it.
    Assuming the certificate itself is fine, since it is working on another GW, it may be in the wrong store (the Current User Personal store instead of the Computer Personal store). In that case you only need to move the certificate to the right store and run momcertimport.exe again. Check the link below for a detailed step-by-step.
    If you still want to clear certificates from the server's Personal store, you can do so through either the Certificates MMC snap-in or the certutil.exe -delstore command.
    You may also want to check this step-by-step article about installing an OpsMgr GW server:
    http://blogs.technet.com/b/pfesweplat/archive/2012/10/15/step-by-step-walkthrough-installing-an-operations-manager-2012-gateway.aspx 
      Regards

  • SCOM 2012 Event Log Alarming

    I currently am using a Unit Monitor - Windows Events - Simple Event Detection - Windows Event Reset Monitor.
    This monitor looks for event ID 3003 and looks for "Down" or "Up".  This will open or close an alert depending on its operational status.
    The problem I am having is that the application I am monitoring always writes its events under event ID 3003. If multiple devices go down at the same time and then go down or come up in a different order, how can I get SCOM to differentiate between the events so it properly opens and closes the correct alerts?
    Your help or ideas are greatly appreciated.
    Thanks!

    The only option is to create an event monitor for every device, matching on event ID 3003, the event source (I assume the event source indicates which device generated the event), and an event description of "Up" or "Down". This means you need one monitor per device: 10 for 10 devices, 20 for 20 devices.
    Roger
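The reason a single monitor cannot work is that the open/closed state has to be tracked per device, not per event ID. The sketch below shows that bookkeeping outside SCOM. It assumes, as the reply does, that each event carries something identifying the device; the tuple layout and device names are hypothetical.

```python
import re

EVENT_ID = 3003
STATE_RE = re.compile(r"\b(Up|Down)\b")

def open_alerts(events):
    """Return the set of devices whose last matching 3003 event said "Down".

    events: iterable of (event_id, device, description) tuples.
    """
    alerts = set()
    for event_id, device, description in events:
        if event_id != EVENT_ID:
            continue
        match = STATE_RE.search(description)
        if match is None:
            continue
        if match.group(1) == "Down":
            alerts.add(device)      # open (or keep open) this device's alert
        else:
            alerts.discard(device)  # close only this device's alert
    return alerts

events = [
    (3003, "switch-a", "Link Down"),
    (3003, "switch-b", "Link Down"),
    (3003, "switch-b", "Link Up"),
]
print(sorted(open_alerts(events)))  # ['switch-a']
```

One monitor per device gives SCOM the same per-device state, which is why the recovery of switch-b does not close the alert for switch-a.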

  • SCOM 2012 - Event log analyzing?

    Hey all,
    I am new to SCOM and have a question about event logs. Is it possible to analyze the event logs of servers, or can SCOM only be used for monitoring events?
    Does anybody have more information about that?
    Please help, and thanks in advance.

    If you dump those logs on a server with a SCOM agent, you can create a monitor that uses the log reader module to parse the log, find matching data and raise alerts. The log reader provider can read a few different types of files. Check it out:
    http://social.technet.microsoft.com/Forums/systemcenter/en-US/aa3ad3b8-9a28-48c2-959a-cb628db1d647/text-log-monitorrule-and-cleared-logfile?forum=operationsmanagerauthoring
    Regards, Blake

  • SCOM 2012 event ID 10801: cluster disks are not discovered

    Hello.
    I have errors in Operations Manager log:
    Discovery data couldn't be inserted to the database. This could have happened because  of one of the following reasons:
         - Discovery data is stale. The discovery data is generated by an MP recently deleted.
         - Database connectivity problems or database running out of space.
         - Discovery data received is not valid.
     The following details should help to further diagnose:
     DiscoveryId: 5a84ee62-20c2-46a2-10b9-3dedaff65df6
     HealthServiceId: 3aeaca7c-48de-c0fc-0441-ffd5ef7aa7c3
     Microsoft.EnterpriseManagement.Common.DiscoveryDataInvalidRelationshipTargetException,The relationship target specified in the discovery data item is not valid.
    Relationship target ID: 2478193e-1a5f-4087-1b5f-95459123321e
    Rule ID: 5a84ee62-20c2-46a2-10b9-3dedaff65df6
    Instance:
    <?xml version="1.0" encoding="utf-16"?><RelationshipInstance TypeId="{acfe2f40-0a73-6764-21a5-bf59c41b2844}" SourceTypeId="{00000000-0000-0000-0000-000000000000}" TargetTypeId="{00000000-0000-0000-0000-000000000000}"><Settings /><SourceRole><Settings><Setting><Name>5c324096-d928-76db-e9e7-e629dcc261b1</Name><Value>SQL-01</Value></Setting><Setting><Name>af13c36e-9197-95f7-393c-84aa6638fec9</Name><Value>\\.\PHYSICALDRIVE18</Value></Setting></Settings></SourceRole><TargetRole><Settings><Setting><Name>5c324096-d928-76db-e9e7-e629dcc261b1</Name><Value>PDC-S-SQL-01.sibgenco.local</Value></Setting><Setting><Name>af13c36e-9197-95f7-393c-84aa6638fec9</Name><Value>Disk #18, Partition #0</Value></Setting></Settings></TargetRole></RelationshipInstance>.
    SQL-01 is a server with cluster disks, and the cluster disks are not discovered.
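The instance XML in event 10801 is hard to read as a single line. The sketch below pulls out the source and target key values so they can be compared side by side. The sample is a trimmed copy of the XML from the event, and the function name is an assumption for illustration.

```python
import re
import xml.etree.ElementTree as ET

SAMPLE = (
    r'<?xml version="1.0" encoding="utf-16"?>'
    r'<RelationshipInstance TypeId="{acfe2f40-0a73-6764-21a5-bf59c41b2844}">'
    r'<Settings />'
    r'<SourceRole><Settings>'
    r'<Setting><Name>5c324096-d928-76db-e9e7-e629dcc261b1</Name><Value>SQL-01</Value></Setting>'
    r'<Setting><Name>af13c36e-9197-95f7-393c-84aa6638fec9</Name><Value>\\.\PHYSICALDRIVE18</Value></Setting>'
    r'</Settings></SourceRole>'
    r'<TargetRole><Settings>'
    r'<Setting><Name>5c324096-d928-76db-e9e7-e629dcc261b1</Name><Value>PDC-S-SQL-01.sibgenco.local</Value></Setting>'
    r'<Setting><Name>af13c36e-9197-95f7-393c-84aa6638fec9</Name><Value>Disk #18, Partition #0</Value></Setting>'
    r'</Settings></TargetRole>'
    r'</RelationshipInstance>'
)

def relationship_endpoints(xml_text):
    """Return (source_values, target_values) from a RelationshipInstance."""
    # Drop the XML declaration so parsing does not depend on the declared encoding.
    body = re.sub(r"^\s*<\?xml[^>]*\?>", "", xml_text)
    root = ET.fromstring(body)

    def values(role):
        return [v.text for v in root.findall(f"./{role}/Settings/Setting/Value")]

    return values("SourceRole"), values("TargetRole")

source, target = relationship_endpoints(SAMPLE)
print(source)
print(target)
```

In this event the source side names the computer SQL-01 while the target side uses PDC-S-SQL-01.sibgenco.local. The event text alone does not settle whether that difference is the cause or an artifact of how the post was anonymized, but it is the kind of mismatch worth checking first.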

    Hi,
    Hope the below articles can be helpful:
    Cluster resource groups are not monitored! Is there anything I can do?
    http://blogs.msdn.com/b/mariussutara/archive/2008/05/03/cluster-resource-groups-are-not-monitored-is-there-anything-i-can-do.aspx
    Event ID 10801 and 33333 in Operations Manager log
    http://www.itbl0b.com/2014/02/event-id-10801-33333-operations-manager-log.html#.U-QwunmKBes
    Please note: since the site is not hosted by Microsoft, the link may change without notice. Microsoft does not guarantee the accuracy of this information.
    Regards, Yan Li

  • SCOM 2012 R2 & Exchange Server 2010 SP3 - Mailbox services not monitored

    Having done everything described in the Exchange Server 2010 MP, I still have trouble with some Exchange services. They are in a "not monitored" state (five days have passed):
    Mailbox All Database Services - Default-First-Site-Name
    Mailbox Database Service - Management - test.local
    Mailbox Database Service - PublicFolders - test.local
    Mailbox Database Service - Users - test.local
    Mailbox Many Database Service - Default-First-Site-Name
    OfflineAddressBook Service - Default-First-Site-Name
    Why is this happening? Since monitoring Exchange mailbox databases is one of the most important tasks SCOM should be doing, I assume something is wrong here. However, there are no errors or alerts in the SCOM console or in the OpsMgr log on the Exchange server itself.

    Hi There,
    Did you install the Microsoft Exchange Monitoring Correlation service when importing the Exchange 2010 management pack?
    Does your MS hold the RMS Emulator role? Many alerts in the Exchange 2010 MP are targeted at the Root Management Server class.
    Did you also configure the test mailboxes and related settings after importing the Exchange 2010 MP?
    Did you analyze the Operations Manager event logs on the MS and the agent? Can you post the critical and warning events here?
    Is agent proxy enabled?
    Is the action account of the agents the System account or a domain account?
    Also refer to: http://blogs.catapultsystems.com/jcowan/archive/2013/03/26/opsmgr-2012-and-2007-exchange-server-2010-monitoring-management-pack-how-to-perform-the-optional-configurations-for-synthetic-transactions-and-kerberos-authentication.aspx
    Gautam.75801

  • SCOM 2012 - Dynamic Group based on a windows service

    So here is the situation: I have 12 servers with a Windows service called acidicalien. What I'd like to do is dynamically group all the servers running this service together. I have worked out that I need a group and that I need to define the dynamic criteria, but I can't find which elements I need to select to get the group to dynamically include only servers with that service.

    Raju - I am trying to accomplish the same with this statement. Can someone help me with the right syntax or documentation? Microsoft has very poor documentation on how to use dynamic groups in SCOM.
    ( ( Object is Windows Computer AND True ) ( Object is Windows Service AND ( Service Name Equals SnapDrive ) AND ( ( Service Name Equals SWSvc ) OR False ) ) OR False )
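One thing visible in the criteria as posted: the two Service Name conditions are joined with AND, and a single service object cannot have two names at once. The sketch below shows only that boolean logic in Python, not SCOM group syntax. It assumes the intent is to match objects whose service is either SnapDrive or SWSvc.

```python
def matches_and(service_name):
    # Mirrors the posted criteria: both equality tests on the same property.
    return service_name == "SnapDrive" and service_name == "SWSvc"

def matches_or(service_name):
    # Either name is accepted.
    return service_name in ("SnapDrive", "SWSvc")

names = ["SnapDrive", "SWSvc", "Spooler"]
print([n for n in names if matches_and(n)])  # []
print([n for n in names if matches_or(n)])   # ['SnapDrive', 'SWSvc']
```

If the intent is instead "servers that run both services", that is a condition across two service objects on the same computer and cannot be expressed by combining the two name tests on one object.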

  • SCOM 2012 SP1 UR4 management servers grey state

    Hi,
    My SCOM environment is made up of the below :-
    SCOM 2012 SP1 UR4.
    3 SCOM Management Servers all on Windows 2008 R2 SP1.
    Shared SQL 2008 cluster with 2 Windows nodes also on same OS.
    Just recently, all our SCOM management servers have been flipping between grey and green state. Gateways and agents all look OK and show green. Alerting from agents appears normal, as we can see lots of alerts in the console.
    We have flushed the health state cache folder on all 3 SCOM management servers and still have the same issue.
    Appreciate any help on this one.

    Event ID 7011: was your server recently patched (for example, by automatic updates)?
    Is SCCM configured on your MS? If yes, disable it and check.
    Is the Windows Update service running? Stop it for one or two days and check whether the issue still appears.
    Reference threads:
    http://social.technet.microsoft.com/Forums/en-US/b86e5a3d-0c2e-4d5e-9d3d-905da91fc982/scom-2012-event-id-7011-service-control-manager-error-when-fep-definition-updates-apply?forum=configmanagersecurity
    http://stefanroth.net/2012/09/26/scom-2012-event-id-7011-service-control-manager-error/
    Solution also available in: http://technet.microsoft.com/en-us/library/cc756319(v=ws.10).aspx
    ===========================================
    For Event id 20026 - 
    1. Does your OperationsManager database have enough space? Check that first.
    What is your DB size?
    How much free space is left?
    2. Was there any recent change to the SCOM action account password, or has the password expired? Try re-entering the SCOM action account password: go to the Administration tab --> Run As Configuration --> Accounts --> SCOM Action account.
    The description would be: "This is the user account under which all rules run by default on the agent."
    Right-click it, go to Properties, re-enter the account name and password there, and check.
    Check this article as well:
    http://social.technet.microsoft.com/Forums/systemcenter/en-US/102d443c-db0e-4bf2-b0d6-31d7f9328537/all-agents-greyed-out-multiple-event-errors-with-ids-20026-20028?forum=operationsmanagergeneral
    ============================================
    Event ID 11904: as far as I know, this appears when an incorrect alert string display name is selected in a rule or monitor.
    Also, the description you pasted for event ID 11904 mentions Microsoft.SystemCenter.HealthService.ActionAccountConfigured.Error, as shown below.
    I suggest re-entering the action account password and reporting the results.
    Also, is the Health Service on the MS running under the System account or a domain account?
    =================================================================
    Description : The Microsoft Operations Manager Expression filter Module failed to query the delivered item, item was dropped.
    Property Expression: Reachability/State
    Error : 0XC00EE22
    One or more workflows were affected by this. Workflow
    name: Microsoft.SystemCenter.HealthService.ActionAccountConfigured.Error
    Gautam.75801

  • Service Availability Report in SCOM 2012

    Hi, we are running SCOM 2012 SP1 and need a Service Availability report for the following services running on our servers:
    Active Directory
    DFS
    DNS
    PKI
    DHCP
    Also, I would like to understand how SCOM reports service availability when network issues (primarily WAN issues) prevent the SCOM server from reaching the server hosting the service.
    Thanks,
    Naren.

    Hi,
    To generate reports on the services you mentioned, you need to go through two steps to start with.
    1. Create Distributed Applications for the services (Active Directory excluded, since that DA is already created when the AD MP is imported); see https://technet.microsoft.com/en-us/library/hh457612.aspx for how it can be done.
    2. Create SLAs (Service Level Tracking in the Authoring pane of the console); see https://technet.microsoft.com/en-us/library/hh230719.aspx
    The behaviour can be adjusted to some extent, since you decide what counts as downtime when creating your SLA. However, a server that becomes unreachable would most likely be classified as critical, and it would therefore "bring down" your service in SCOM.
    Once you have created the DAs you want, you can use the default "Service Level Tracking Summary" report under the category "Microsoft Service Level Report Library". There you can choose how far back to display your SLA levels. Be aware, though, that you can never go further back than the date on which you created the DA and the SLA, since that is when availability measurement starts.
    Regards,
    Daniel

  • SCOM 2012 UR2 - Flush health service state causes infinite agent resets

    This appears on several of our 2008 R2 servers in our SCOM 2012 UR2 environment, which is set up with several management servers:
    If I select a monitored server's agent in the Operations console and click the task Flush Health Service State and Cache, the selected agent restarts as intended.
    However, it then restarts again every 2-3 minutes indefinitely, unless you stop the agent service and delete the health service cache folder on the monitored server.
    This is visible in the event log:
    event 103, HealthService. A task to reset the health service store has been submitted.  The service store will be deleted and re-created.
    This never happened with our SCOM 2007 R2 setup and started appearing with 2012.
    Has anyone seen this before? Any ideas?
    Regards / Jon

    Might save someone some time....
    Stopping the agent, removing the "Health Service State" directory and restarting doesn't appear to help. I did this *and* deleted some agents from the console (then re-added them; we have auto-approve turned off), and it appears that might have fixed it. I'll know more in the morning whether this actually does the trick without having to do a full uninstall/reinstall of the agent.
    If you don't know which agents are constantly restarting, search all management servers for *many* file transfers (Event ID 2110 in the Operations Manager event log) all to the same instance/GUID. Run this PowerShell command on a management server to get the affected agent: Get-SCOMClassInstance -Id <GUID>. Alternatively, create a SCOM rule to catch event 103 and look for many repeats from the same server/agent:
    Event Type: Warning
    Event Source: HealthService
    Event Category: Health Service
    Event ID: 103
    Description:
    A task to reset the health service store has been submitted.  The service store will be deleted and re-created.
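The "look for many repeats from the same server" step can be sketched as a simple count over exported events. The tuple layout, the server names and the minimum count of 3 are assumptions for illustration; how you export the events is up to you.

```python
from collections import Counter

def repeat_offenders(events, event_id=103, min_count=3):
    """Return {computer: count} for computers that logged event_id at least min_count times.

    events: iterable of (event_id, computer) tuples.
    """
    counts = Counter(computer for eid, computer in events if eid == event_id)
    return {computer: n for computer, n in counts.items() if n >= min_count}

# Hypothetical export: srv01 keeps resetting its store, srv02 did it once.
events = [(103, "srv01"), (103, "srv02"), (103, "srv01"), (2110, "srv01"), (103, "srv01")]
print(repeat_offenders(events))  # {'srv01': 3}
```

The same counting works for event 2110 grouped by instance GUID instead of computer name.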

  • SCOM 2012 R2 Exchange Correlation Service: Event 720 in the Application event log almost every day

    Hi,
    Since SCOM was upgraded to R2, almost every day we receive Event 720 in the Application event log from the correlation service (source: MSExchangeMonitoring Correlation).
    It always arrives around 7:20 AM; some days at 7:19, others at 7:21. It is always at approximately the same time, but we never have the problem during the weekend.
    The description of the event:
    Exceeded maximum time (15 minutes) to wait for completion of all CorrelateBatchTask threads.
    After that, the correlation stops working. At the same time, if we try to open the SCOM console on that server, we are unable to open it. We are also unable to open the SCOM PowerShell,
    and from that server we cannot find out which server is the RMS by running Get-SCOMRMSEmulator. (This is the RMS server.)
    When this happens, the only fix we have found is to reboot the server or restart the SCOM service; after the reboot, the correlation starts working again.
    We also get many Event 714 (Critical) entries, followed by Event 711 (Warning).
    Thanks

    Have a look at: https://social.technet.microsoft.com/Forums/systemcenter/en-US/e75e84d9-0c9e-4d83-b3da-45a143757f85/exchange-2010-monitoring-with-scom-2012-correlation-service-issue
    One user reported an issue with the Exchange correlation engine after the upgrade and said:
    "I had issues with the correlation engine after upgrading SCOM 2012 to R2.
    The MomBidLdr.dll version changed in the SCOM directories, and needs to be updated in the
    C:\Program Files\Microsoft\Exchange Server\v14\Bin directory.
    That seemed to stop the errors for me."
    Some troubleshooting steps are also listed here:
    https://technet.microsoft.com/en-us/library/ff360495(v=exchg.140).aspx
    Cheers,
    Martin

  • System Center Data Access service crashes with event IDs 26339 and 26380 in SCOM 2012 SP1 RTM

    Hi all,
    I have deployed SCOM 2012 SP1 RTM on Windows Server 2012 Standard, with the database on another VM running SQL Server 2012 SP1 on Windows Server 2008 R2 SP1 Standard. The SQL VM suddenly stopped and could not be started, so I deleted it from the Hyper-V 2012 console and imported the VM again into the Hyper-V 2012 cluster.
    Now the System Center Data Access service crashes again and again with the following events:
    event 26339, OpsMgr SDKService
    event 26380, opsmgr SDkService
    An exception was thrown while initializing the service container.
    Exception message: Initialize
    Full exception: HTTP could not register URL
    http://+:51905/ConnectorFramework/ because TCP port 51905 is being used by another application.
    The System Center Data Access service failed due to an unhandled exception. 
    The service will attempt to restart.
    Exception:
    System.ServiceModel.AddressAlreadyInUseException: HTTP could not register URL http://+:51905/ConnectorFramework/ because TCP port 51905 is being used by another application. ---> System.Net.HttpListenerException: The process cannot access the file
    because it is being used by another process
       at System.Net.HttpListener.AddAllPrefixes()
       at System.Net.HttpListener.Start()
       at System.ServiceModel.Channels.SharedHttpTransportManager.OnOpen()
       --- End of inner exception stack trace ---
       at System.ServiceModel.Channels.SharedHttpTransportManager.OnOpen()
       at System.ServiceModel.Channels.TransportManager.Open(TransportChannelListener channelListener)
       at System.ServiceModel.Channels.TransportManagerContainer.Open(SelectTransportManagersCallback selectTransportManagerCallback)
       at System.ServiceModel.Channels.HttpChannelListener`1.OnOpen(TimeSpan timeout)
       at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
       at System.ServiceModel.Dispatcher.ChannelDispatcher.OnOpen(TimeSpan timeout)
       at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
       at System.ServiceModel.ServiceHostBase.OnOpen(TimeSpan timeout)
       at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
       at Microsoft.EnterpriseManagement.ConnectorFramework.ServiceDataLayer.ConnectorFrameworkDataAccessChannel.Initialize()
       at Microsoft.EnterpriseManagement.ServiceDataLayer.DispatcherService.Initialize(InProcEnterpriseManagementConnectionSettings configuration)
       at Microsoft.EnterpriseManagement.ServiceDataLayer.DispatcherService.InitializeRunner(Object state)
       at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
       at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
    Kirpal Singh
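The exception says TCP port 51905 is already in use. As a rough first check, the sketch below tries to bind the port locally. It only reports whether the bind succeeds and does not identify which process holds the port; the function name is an assumption for illustration.

```python
import socket

def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False  # something else already holds the port
    return True
```

Running port_is_free(51905) on the management server while the Data Access service is stopped would show whether something else is holding the port.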

    Hi Kirpal,
    Can you try this? It might help, as getting event IDs 26361 and 26380 is a known issue. Try it out and let's see if it helps.
    The manifest files are located on the RMS in the \Program Files\System Center Operations Manager 2007\ root directory. The manifest files for the Config and SDK services need to be edited on the affected RMS. The file names are:
    Microsoft.Mom.Sdk.ServiceHost.exe.config
    Microsoft.Mom.ConfigServiceHost.exe.config
    Between the EXISTING <runtime> and </runtime> lines, ADD a NEW LINE with the following:
    <generatePublisherEvidence enabled="false"/>
    This solution permanently disables CRL checking for the specified executables.
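The edit described above can be sketched on a configuration string as follows. This is an illustration only: the function name and the sample configuration are hypothetical, and ElementTree drops comments and may rewrite namespace prefixes, so for the real files a manual edit on a backed-up copy is probably safer.

```python
import xml.etree.ElementTree as ET

def add_generate_publisher_evidence(config_xml):
    """Return config_xml with <generatePublisherEvidence enabled="false"/> inside <runtime>."""
    root = ET.fromstring(config_xml)
    runtime = root.find("runtime")
    if runtime is None:
        raise ValueError("no <runtime> element found")
    element = runtime.find("generatePublisherEvidence")
    if element is None:
        # Only add the element if it is not already there.
        element = ET.SubElement(runtime, "generatePublisherEvidence")
    element.set("enabled", "false")
    return ET.tostring(root, encoding="unicode")

# Hypothetical minimal config; the real files contain more settings.
sample = '<configuration><runtime><gcServer enabled="true" /></runtime></configuration>'
print(add_generate_publisher_evidence(sample))
```

Applying the function twice leaves a single generatePublisherEvidence element, so it is safe to rerun.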

  • Event id 31551 in scom 2012 server

    Dear Team,
    We are continuously getting the following error, event ID 31551, on our SCOM 2012 SP1 server.
    Please let me know how to resolve this.
    Log Name:      Operations Manager
    Source:        Health Service Modules
    Date:          02/01/2015 13:52:42
    Event ID:      31551
    Task Category: Data Warehouse
    Level:         Error
    Keywords:      Classic
    User:          N/A
    Computer:      LOXXXXXXX.XXXX
    Description:
    Failed to store data in the Data Warehouse. The operation will be retried.
    Exception 'SqlException': Cannot open database "OperationsManagerDW" requested by the login. The login failed.
    Login failed for user 'WREN\SVC-SC-OM12-DW'. 
    One or more workflows were affected by this.  
    Workflow name: Microsoft.SystemCenter.DataWarehouse.CollectEntityHealthStateChange 
    Instance name: LONSCOM001.wren.co.uk 
    Instance ID: {0F89A4D1-B7D5-8658-29A8-E0CAFAA602CF} 
    Management group: Brit Insurance
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="Health Service Modules" />
        <EventID Qualifiers="49152">31551</EventID>
        <Level>2</Level>
        <Task>3</Task>
        <Keywords>0x80000000000000</Keywords>
        <TimeCreated SystemTime="2015-01-02T13:52:42.000000000Z" />
        <EventRecordID>933223</EventRecordID>
        <Channel>Operations Manager</Channel>
        <Computer>LOXXXXXXX.XXXX</Computer>
        <Security />
      </System>
      <EventData>
        <Data>Brit Insurance</Data>
        <Data>Microsoft.SystemCenter.DataWarehouse.CollectEntityHealthStateChange</Data>
        <Data>LONSCOM001.wren.co.uk</Data>
        <Data>{0F89A4D1-B7D5-8658-29A8-E0CAFAA602CF}</Data>
        <Data>SqlException</Data>
        <Data>Cannot open database "OperationsManagerDW" requested by the login. The login failed.
    Login failed for user 'WREN\SVC-SC-OM12-DW'.</Data>
      </EventData>
    </Event>
    Saravana Raja

    Hi,
    Based on the error message, the login failed for user 'WREN\SVC-SC-OM12-DW'. Have you changed its password? Please make sure the account can access the SQL Server where your data warehouse is installed.
    Otherwise, you may need to reset the account. The article below should be helpful for changing the password of the data warehouse account:
    Changing Password on SCOM Data Warehouse run as accounts
    http://blogs.technet.com/b/randymonteleone/archive/2010/03/12/changing-password-on-scom-data-warehouse-run-as-accounts.aspx
    Regards,
    Yan Li

  • SCOM 2012 SP1 Datawarehouse event id 31553 and after 2 minutes successful id 31572

    Hello guys,
    I'm running SCOM 2012 SP1, and in one of my 3 management groups I get the following events every few minutes and cannot explain them. I have checked the aggregation of the DW and it's fine. I have extended the timeout from 5 to 15 minutes. What should I do next? Please help with this. I have checked all the 2007 posts, but they are not related to my problem. I have tried a number of queries, but they don't return any results. Do I have a SQL Server performance issue?
    Date and Time: 10/27/2014 11:05:10 AM
    Log Name: Operations Manager
    Source: Health Service Modules
    Generating Rule: Data Warehouse related event collection
    Event Number: 31553
    Level: Error
    Logging Computer:
    User: N/A
    Description:
    Data was written to the Data Warehouse staging area but processing failed on one of the subsequent operations. Exception 'SqlException': Transaction (Process ID 71) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
    One or more workflows were affected by this.
    Workflow name: Microsoft.SystemCenter.DataWarehouse.CollectPerformanceData
    Instance name: server2.mymg.corp
    Instance ID: {EBD49B2A-D314-07DF-9E2C-B79CF86B1A72}
    Management group: MyMG
    Date and Time: 10/27/2014 11:26:36 AM
    Log Name: Operations Manager
    Source: Health Service Modules
    Generating Rule: Data Warehouse related event collection
    Event Number: 31553
    Level: Error
    Logging Computer: server10.mymg.corp
    User: N/A
    Description:
    Data was written to the Data Warehouse staging area but processing failed on one of the subsequent operations. Exception 'SqlException': Transaction (Process ID 68) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
    One or more workflows were affected by this.
    Workflow name: Microsoft.SystemCenter.DataWarehouse.CollectPerformanceData
    Instance name: server10.mymg.corp
    Instance ID: {DBDD19A5-98B3-F90A-641E-7C4693BFD6EB}
    Management group: MyMG
    Event Data:
    <DataItem type="System.XmlData" time="2014-10-27T11:26:36.2773982-05:00" sourceHealthServiceId="DBDD19A5-98B3-F90A-641E-7C4693BFD6EB">
      <EventData>
        <Data>MyMG</Data>
        <Data>Microsoft.SystemCenter.DataWarehouse.CollectPerformanceData</Data>
        <Data>server10.mymg.corp</Data>
        <Data>{DBDD19A5-98B3-F90A-641E-7C4693BFD6EB}</Data>
        <Data>SqlException</Data>
        <Data>Transaction (Process ID 68) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.</Data>
      </EventData>
    </DataItem>
    User:
    N/A
    After 2 minutes I get event 31572, which is "Data writer successfully performed data maintenance operations." What would be the reason for this?

    There are a few things that can cause this. Your hint here is the deadlock message; processes are competing to write data in the DW, so you need to find out what that is. You also have a message stating that data is written to staging, but timing out moving
    the data. If you happen to be running the Exchange MP, this creates an additional data set that can quietly clog up the works--but you would see that dataset in the event message as well.
    Because you say aggregation is moving, try following these steps, they have served me well. I think you are seeing the results of data getting 'stuck' that causes the system to slowly choke itself. Note: You can focus on just the perf dataset, but always
    a good idea to make sure everything is healthy while you're at it.
    Steps:
    Check to see if the staging area for data written to the DW is clear and data is moving:
    select count(*) from Alert.AlertStage
    select count (*) from Event.eventstage
    select count (*) from Perf.PerformanceStage
    select count (*) from state.statestage
    These values normally rise and fall rapidly. If you open each table, they should also only contain recent data. If you find any data points older than a day, something is stuck. In one case I found state.statestage stuck around 14300. A look at the data
    showed a few hundred rows of data that had information 4 months old.
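The "older than a day" check described above can be sketched as follows, once you have pulled the date values out of a staging table. The function name and the sample timestamps are assumptions for illustration; the source does not say which column holds the date, so that part is left to you.

```python
from datetime import datetime, timedelta

def stale_rows(timestamps, now, max_age=timedelta(days=1)):
    """Return the timestamps older than max_age relative to now."""
    cutoff = now - max_age
    return [t for t in timestamps if t < cutoff]

# Hypothetical values: two recent rows and one that is months old.
now = datetime(2014, 10, 27, 12, 0)
rows = [
    datetime(2014, 10, 27, 11, 55),
    datetime(2014, 10, 27, 9, 30),
    datetime(2014, 6, 20, 8, 0),
]
print(len(stale_rows(rows, now)))  # 1
```

Any non-empty result suggests the staging table holds data that is not being processed.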
    See if data is moving (if you have old data, the answer is evident):
    1. select * from StandardDataset
    2. Plug the appropriate GUID into exec StandardDatasetMaintenance @DatasetId='<GUID from query>'
    3. See if the count decreases
    If it doesn't, and the data is expendable, just truncate the stuck staging table (don't be afraid to do this: if OpsMgr is choking, stop the choking, and don't fret over missing data points). For the state staging table, for example:
    truncate table State.StateStage
    Optional query:
    Plug the DatasetId into the below query
    USE [OperationsManagerDW]
    DECLARE @DataSet uniqueidentifier
    SET @DataSet = (SELECT DatasetId FROM StandardDataset WHERE DatasetId = '138B1324-BE31-42D7-A6CB-EA10139E309C')
    EXEC StandardDatasetMaintenance @DataSet
    Note: The valid dataset schema names (the SchemaName column in StandardDataset) are currently Event, Alert, CM, Perf, and State.
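    When GUIDs from StandardDataset are pasted into the maintenance call by hand, a small helper can catch a mistyped GUID before it reaches SQL Server. The sketch below only builds the statement text shown earlier; `maintenance_statement` is a hypothetical name, and nothing here connects to the database.

```python
import uuid


def maintenance_statement(dataset_id: str) -> str:
    """Build the StandardDatasetMaintenance call for one dataset GUID.

    uuid.UUID raises ValueError if the GUID is malformed, so a typo
    fails here instead of at the SQL prompt.
    """
    guid = str(uuid.UUID(dataset_id)).upper()
    return f"EXEC StandardDatasetMaintenance @DatasetId='{guid}'"


if __name__ == "__main__":
    # GUID taken from the optional query above.
    print(maintenance_statement("138B1324-BE31-42D7-A6CB-EA10139E309C"))
```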
    If these check out, it's time to:
    See if any SQL jobs are running that could be interfering.
    Check whether you are close to running out of space, in which case autogrow may be killing your performance.
    Run the large table query to see if you have unchecked perf data growth that exceeds the performance capability of the SQL box.
    You may indeed have a SQL performance issue, but that's the last thing you troubleshoot. Make sure OpsMgr is healthy first.

  • SCOM 2012 SP1 - Show all SNMP traps in an event view (SNMP monitoring works)

    Hello everybody,
    Sorry for my English; I normally write in French, but searching in English gives more results.
    I have a problem with SCOM 2012. I am trying to catch all SNMP traps sent by a Cisco 2960 switch and show them in an event view, using a specific rule (Authoring -> Rules -> Collection Rules -> Event Based -> SNMP Trap (Event), targeted at the "Node" object).
    I created a dedicated management pack just for the rule and the views.
    SNMP monitoring, Cisco 2960 => OK, I can get the processor state, utilization, etc.
    SNMP monitoring, Ubuntu computer => OK, I can get every state I want.
    SNMP traps => The switch and the computer send traps over the network, and I can see in Wireshark that the server receives them
    SNMP Service (Windows service) => Disabled
    SNMP Trap (Windows service) => Disabled
    Health Service (Windows service) => Enabled
    Port 162 UDP => Open, with MonitoringHost.exe listening on it
    Firewall rules => Everything is OK
    SNMP trap version sent => 2c
    SNMP monitoring device version => 2c
    I have tried many solutions from different websites, such as:
    http://scom-2012.blogspot.ch/2012/07/setting-up-snmp-monitoring-in-scom-2012.html
    http://social.technet.microsoft.com/Forums/systemcenter/en-US/731661b9-10a1-4d3f-ba83-8e84d25ab760/event-collection-for-network-devices-scom-2012
    http://social.technet.microsoft.com/Forums/systemcenter/en-US/a15bce49-fb62-4fd4-93cf-f87c3b734d58/snmp-trap-based-monitoring?forum=operationsmanagergeneral
    http://social.technet.microsoft.com/Forums/systemcenter/en-US/41f5b6ef-c8b9-461d-bdcb-81fde5a89f50/scom-2012-unable-to-monitor-snmp-traps?forum=operationsmanagergeneral
    http://social.technet.microsoft.com/Forums/systemcenter/en-US/4051fbd1-06f1-49e0-9ad4-4cbe4d2d7d4d/discover-windows-computer-as-network-device-w-snmp?forum=operationsmanagerauthoring
    http://technet.microsoft.com/en-us/library/hh563870.aspx
    http://social.technet.microsoft.com/Forums/en-US/cad1d3f9-594f-4f06-a5aa-660ccc2e9192/snmp-trap-based-monitoring-in-scom-2012-sp1?forum=operationsmanagerauthoring
    http://social.technet.microsoft.com/Forums/en-US/41f5b6ef-c8b9-461d-bdcb-81fde5a89f50/scom-2012-unable-to-monitor-snmp-traps?forum=operationsmanagergeneral
    http://social.technet.microsoft.com/Forums/en-US/e05a1c8f-7280-4f80-86cf-aabb4269bb87/scom-2012-customizing-snmp-trap-event-data?forum=operationsmanagergeneral
    http://social.technet.microsoft.com/Forums/en-US/6826f6a6-bbc3-444b-9b18-288d7fedac3e/scom-unable-to-monitor-snmp-traps?forum=operationsmanagergeneral
    http://social.technet.microsoft.com/Forums/en-US/7cd1571a-d292-4efc-9921-5a068f6f1691/scom-2012-sp1-ur2-snmp-monitoring?forum=operationsmanagermgmtpacks
    Do you know a workaround, or a different way to catch all the traps from a network device and show them in an event view?
    Thank you in advance.
    KimBaxZ
    Computer expert system technology

    Hello Yan Li,
    I read your link and found this:
    The network devices must be discovered and registered as ICMPSNMP devices.
    When I ran the discovery the first time, ICMP did not work, so I selected SNMP only. This morning I tried with both ICMP and SNMP, but I hit the same problem. I then found the root cause thanks to this post: http://www.code4ward.net/main/Blog/tabid/70/EntryId/105/Troubleshooting-Network-Discovery-in-SCOM-2012.aspx
    I allowed the SNMP service, ping, and the Health Service, then tried a second time to discover my device, and it worked (ICMP and SNMP).
    I recreated my whole management pack and the rule, and now it works! Thank you very much for your help!
    Have a nice day
    Best regards
    KimBAxZ
    Computer expert system technology
