My root management server goes to grayed state frequently

Hi All,
I have a strange issue with my SCOM 2007 R2 CU4 installation.
My root management server grays out frequently, and when I check the Operations Manager console it throws these two errors. Can anyone suggest a solution?
Note:  The following information was gathered when the operation was attempted.  The information may appear cryptic but provides context for the error.  The application will continue to run.
Microsoft.EnterpriseManagement.Common.UnknownAuthorizationStoreException: Unable to perform the operation because of authorization store errors. ---> System.Runtime.InteropServices.COMException (0x800705AA): Insufficient system resources exist to complete the requested service. (Exception from HRESULT: 0x800705AA)
   at Microsoft.Interop.Security.AzRoles.IAzApplication2.InitializeClientContextFromToken(UInt64 ullTokenHandle, Object varReserved)
   at Microsoft.EnterpriseManagement.Mom.Sdk.Authorization.AzManHelper.AccessCheck(String accessCheckContext, Int32[] operationIds, IntPtr hToken, String stringSid, Int32[] accessCheckReturnCodes, List`1[] accessCheckScopes)
   --- End of inner exception stack trace ---
   at Microsoft.EnterpriseManagement.DataAbstractionLayer.SdkDataAbstractionLayer.HandleIndigoExceptions(Exception ex)
   at Microsoft.EnterpriseManagement.DataAbstractionLayer.InstanceSpaceOperations.GetMonitoringObjectByMonitoringObjectIds(List`1 monitoringObjectIds, String languageCode, MonitoringObjectMode monitoringObjectMode)
   at Microsoft.EnterpriseManagement.ManagementGroup.GetPartialMonitoringObjects(ICollection`1 ids)
   at Microsoft.EnterpriseManagement.Mom.Internal.UI.Common.TaskPaneContext.GetManagedEntities(ICollection`1 items)
   at Microsoft.EnterpriseManagement.Mom.Internal.UI.Common.TaskPaneContext.UpdateContextJob(Object sender, ConsoleJobEventArgs args)
   at Microsoft.EnterpriseManagement.Mom.Internal.UI.Console.ConsoleJobExceptionHandler.ExecuteJob(IComponent component, EventHandler`1 job, Object sender, ConsoleJobEventArgs args)
System.Runtime.InteropServices.COMException (0x800705AA): Insufficient system resources exist to complete the requested service. (Exception from HRESULT: 0x800705AA)
   at Microsoft.Interop.Security.AzRoles.IAzApplication2.InitializeClientContextFromToken(UInt64 ullTokenHandle, Object varReserved)
   at Microsoft.EnterpriseManagement.Mom.Sdk.Authorization.AzManHelper.AccessCheck(String accessCheckContext, Int32[] operationIds, IntPtr hToken, String stringSid, Int32[] accessCheckReturnCodes, List`1[] accessCheckScopes)
====================================================================================
Note:  The following information was gathered when the operation was attempted.  The information may appear cryptic but provides context for the error.  The application will continue to run.
Microsoft.Mom.Isam.IsamOutOfMemoryException: Out of Memory (-1011)
   at Microsoft.Mom.Isam.?A0x5359b89f.HandleError(Int32 err)
   at Microsoft.Mom.Isam.EseInterop.JetGetIndexInfo(JET_SESID sesid, JET_DBID dbid, String table, String index)
   at Microsoft.Mom.Isam.IndexCollection.get_Item(String indexName)
   at Microsoft.Mom.Isam.Cursor.get_CurrentIndexDefinition()
   at Microsoft.Mom.Isam.Cursor.MakeKey(Key key, Boolean end)
   at Microsoft.Mom.Isam.Cursor.GotoKey(Key key)
   at Microsoft.EnterpriseManagement.Mom.Internal.UI.Cache.EseCursor.Find(Object[] values)
   at Microsoft.EnterpriseManagement.Mom.Internal.UI.MPImageCache.RetrieveImageIdFromCache(CacheTable classToImageMapTable, ManagementPackElementImageKey imageKey, Guid& imageId)
   at Microsoft.EnterpriseManagement.Mom.Internal.UI.MPImageCache.GetImageCore(CacheSession session, ManagementGroup managementGroup, ManagementPackElementImageKey imageKey, Guid& imageId)
   at Microsoft.EnterpriseManagement.Mom.Internal.UI.MPImageCache.GetImageJob(Object sender, ConsoleJobEventArgs jobArgs)
   at Microsoft.EnterpriseManagement.Mom.Internal.UI.Console.ConsoleJobExceptionHandler.ExecuteJob(IComponent component, EventHandler`1 job, Object sender, ConsoleJobEventArgs args)
[Screenshot of the error]
I also restarted the Health Service, and it fails to start. [Screenshot of the service start failure]
I analysed the event logs and found the following:
Event ID 29104 - OpsMgr Config Service failed to send the dirty state notifications to the dirty OpsMgr Health Services. This may be happening because the Root OpsMgr Health Service is not running.
Event ID 4000 - A monitoring host is unresponsive or has crashed.  The status code for the host failure was 2164195371.
Event ID 4503 - A module reported an error 0x8007000E from a callback which was running as part of rule "_44E9A997_6A02_4298_8430_8E01952AB6F3_.RaiseAlert" running for instance "Root managementserver FQDN" with id:"{6C3444C3-3990-BB5A-F25E-289D8F427570}" in management group "CINSCOM".
Event ID 21044 - The OpsMgr Connector cannot uncompress package, received from IP XXXXXXXXXXXX
Event ID 1103 - Summary: 28 rule(s)/monitor(s) failed and got unloaded, 0 of them reached the failure limit that prevents automatic reload. Management group "CINSCOM". This is summary only event, please see other events with descriptions of unloaded rule(s)/monitor(s).
Event ID 26380 - The System Center Operations Manager SDK Service failed due to an unhandled exception.
The service will attempt to restart.
Exception:
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at Bid.TraceError(String fmtPrintfW, Object a1, Object a2)
   at Microsoft.EnterpriseManagement.Mom.DataAccess.SqlRetryHandler.Execute[T](ExecuteArguments executeArguments, RetryPolicy retryPolicy, GenericExecute`1 genericExecute)
   at Microsoft.EnterpriseManagement.Mom.DataAccess.SqlRetryHandler.ExecuteReaderSingleRow(SqlDataReader sqlDataReader, SqlConnection sqlConnection, IList`1 prologEpilogList, RetryPolicy retryPolicy)
   at Microsoft.EnterpriseManagement.Mom.DataAccess.QueryResultsReader.Read()
   at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.ClientReaderManager.GetObjects(Guid id, Int32 count)
   at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccess.GetObjectsFromReader(Guid readerId, Int32 count)
   at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccessTieringWrapper.GetObjectsFromReader(Guid readerId, Int32 count)
   at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccessExceptionTracingWrapper.GetObjectsFromReader(Guid readerId, Int32 count)
   at SyncInvokeGetObjectsFromReader(Object , Object[] , Object[] )
   at System.ServiceModel.Dispatcher.SyncMethodInvoker.Invoke(Object instance, Object[] inputs, Object[]& outputs)
   at System.ServiceModel.Dispatcher.DispatchOperationRuntime.InvokeBegin(MessageRpc& rpc)
   at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage5(MessageRpc& rpc)
   at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage4(MessageRpc& rpc)
   at System.ServiceModel.Dispatcher.MessageRpc.Process(Boolean isOperationContextSet)
   at System.ServiceModel.Dispatcher.ChannelHandler.DispatchAndReleasePump(RequestContext request, Boolean cleanThread, OperationContext currentOperationContext)
   at System.ServiceModel.Dispatcher.ChannelHandler.HandleRequest(RequestContext request, OperationContext currentOperationContext)
   at System.ServiceModel.Dispatcher.ChannelHandler.AsyncMessagePump(IAsyncResult result)
   at System.ServiceModel.Diagnostics.Utility.AsyncThunk.UnhandledExceptionFrame(IAsyncResult result)
   at System.ServiceModel.AsyncResult.Complete(Boolean completedSynchronously)
   at System.ServiceModel.Channels.FramingDuplexSessionChannel.TryReceiveAsyncResult.OnReceive(IAsyncResult result)
   at System.ServiceModel.Diagnostics.Utility.AsyncThunk.UnhandledExceptionFrame(IAsyncResult result)
   at System.ServiceModel.AsyncResult.Complete(Boolean completedSynchronously)
   at System.ServiceModel.Channels.SynchronizedMessageSource.SynchronizedAsyncResult`1.CompleteWithUnlock(Boolean synchronous, Exception exception)
   at System.ServiceModel.Channels.SynchronizedMessageSource.ReceiveAsyncResult.OnReceiveComplete(Object state)
   at System.ServiceModel.Channels.SessionConnectionReader.OnAsyncReadComplete(Object state)
   at System.ServiceModel.Channels.StreamConnection.OnRead(IAsyncResult result)
   at System.ServiceModel.Diagnostics.Utility.AsyncThunk.UnhandledExceptionFrame(IAsyncResult result)
   at System.Net.LazyAsyncResult.Complete(IntPtr userToken)
   at System.Net.LazyAsyncResult.ProtectedInvokeCallback(Object result, IntPtr userToken)
   at System.Net.Security.NegotiateStream.ProcessFrameBody(Int32 readBytes, Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.NegotiateStream.ReadCallback(AsyncProtocolRequest asyncRequest)
   at System.Net.FixedSizeReader.CheckCompletionBeforeNextRead(Int32 bytes)
   at System.Net.FixedSizeReader.ReadCallback(IAsyncResult transportResult)
   at System.ServiceModel.AsyncResult.Complete(Boolean completedSynchronously)
   at System.ServiceModel.Channels.ConnectionStream.ReadAsyncResult.OnAsyncReadComplete(Object state)
   at System.ServiceModel.Channels.SocketConnection.FinishRead()
   at System.ServiceModel.Channels.SocketConnection.AsyncReadCallback(Boolean haveResult, Int32 error, Int32 bytesRead)
   at System.ServiceModel.Diagnostics.Utility.IOCompletionThunk.UnhandledExceptionFrame(UInt32 error, UInt32 bytesRead, NativeOverlapped* nativeOverlapped)
   at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP)
======================================================
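The status codes quoted in the errors and events above (0x800705AA, 0x8007000E, and the decimal 2164195371) are 32-bit HRESULT-style values, and splitting them into their bit fields can make them easier to compare. The following is only a minimal sketch: the helper names are hypothetical, and the name table is a small hand-picked subset of Win32 error codes. Both Win32 codes point at resource exhaustion, which fits the OutOfMemory exceptions in the traces. The third value decodes to 0x80FF002B (facility 255, code 43); its meaning is not looked up here.

```python
FACILITY_WIN32 = 7

# Hand-picked subset of Win32 error codes (names as in winerror.h).
WIN32_NAMES = {
    14: "ERROR_OUTOFMEMORY",
    1450: "ERROR_NO_SYSTEM_RESOURCES",
}


def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its standard bit fields."""
    value &= 0xFFFFFFFF
    return {
        "hex": f"0x{value:08X}",
        "failure": bool(value >> 31),        # bit 31: severity
        "facility": (value >> 16) & 0x7FF,   # bits 16-26: facility
        "code": value & 0xFFFF,              # bits 0-15: code
    }


def describe(value: int) -> dict:
    """Decode a value and attach a Win32 name when the facility is Win32."""
    fields = decode_hresult(value)
    if fields["facility"] == FACILITY_WIN32:
        fields["win32_name"] = WIN32_NAMES.get(fields["code"], "unknown")
    return fields


if __name__ == "__main__":
    # Values taken from the console errors and event log entries above.
    for raw in (0x800705AA, 0x8007000E, 2164195371):
        print(describe(raw))
```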
Below is the configuration of the server:
Windows Server 2008 R2 Enterprise (without SP1)
SQL Server 2008 Standard Edition
Intel Xeon 2.6 GHz (2 processors), 8 cores in total
12 GB physical memory (always 95-98% used)
427 agents being monitored (Exchange 2010, BlackBerry, etc.)
All the roles (SCOM, RMS, SQL) are on the same box
Can anyone please help?
Gautam.75801

Did you install any software before this issue appeared? If so, try uninstalling it, because it may be causing a conflict.
Also, for event 29104, you can refer to the links below:
http://blog.tyang.org/2011/09/30/event-id-29104-on-scom-rms-cluster/
http://support.microsoft.com/kb/946417/en-us
Mai Ali

Similar Messages

  • SCOM 2007 R2 Root Management server showing Not Monitored State in Ops Mgr Console

    Hello Experts,
    In my production SCOM 2007 R2 environment, the RMS server state is "Not Monitored", but we are still receiving alerts, with limitations. I mistakenly put the RMS server into Maintenance Mode while rebooting it because of slow performance.
    Can anybody help me restore the RMS health state?

    We can identify the Performance Signature Data Collection Rules in this example by executing the following SQL Query. This query should be executed in SQL Management Studio against the Operations Manager database.
    -- Return all Performance Signature Collection Rules
    USE OperationsManager
    SELECT
        managementpack.mpname,
        rules.rulename
    FROM performancesignature WITH (NOLOCK)
    INNER JOIN rules WITH (NOLOCK)
        ON rules.ruleid = performancesignature.learningruleid
    INNER JOIN managementpack WITH (NOLOCK)
        ON rules.managementpackid = managementpack.managementpackid
    GROUP BY managementpack.mpname, rules.rulename
    ORDER BY managementpack.mpname, rules.rulename
    This query will return all Performance Signature Collection Rules and their respective Management Pack name. A column is returned for Management Pack name and Rule name.
    The following Performance Monitor counters on a Management Server provide information about the batch size and processing time of Database and Data Warehouse write action insertions. If the batch size is growing (the default batch size is 5000 items), this indicates that either the Management Server is slow inserting the data into the Database or Data Warehouse, or it is receiving a burst of data items from the agents or Gateway Servers.
    · OpsMgr DB Write Action Modules(*)\Avg. Batch Size 
    · OpsMgr DB Write Action Modules(*)\Avg. Processing Time 
    · OpsMgr DW Writer Module(*)\Avg. Batch Processing Time, ms 
    · OpsMgr DW Writer Module(*)\Avg. Batch Size 
    From the Database and Data Warehouse write action Avg. Processing Time counters, we can see how long it takes on average to write a batch of data to the Database and Data Warehouse. Depending on how long it takes to write a batch of data to the Database, this may present an opportunity for tuning.
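As a rough illustration of watching the Avg. Batch Size counter for growth, the sketch below flags a series of samples that keeps rising and ends above the default of 5000 items. This is only one reading of "growing larger": the function name, the window of three samples, and the sample values are assumptions for illustration, and collecting the counter values themselves is not shown.

```python
from typing import Sequence

DEFAULT_BATCH_SIZE = 5000  # default batch size quoted in the text above


def batch_size_is_growing(samples: Sequence[float], window: int = 3) -> bool:
    """Return True if the last `window` samples rise strictly
    and the most recent one exceeds the default batch size."""
    if len(samples) < window:
        return False
    recent = list(samples[-window:])
    rising = all(a < b for a, b in zip(recent, recent[1:]))
    return rising and recent[-1] > DEFAULT_BATCH_SIZE


if __name__ == "__main__":
    # Hypothetical Avg. Batch Size readings taken at regular intervals.
    readings = [4000, 5200, 6100, 7500]
    print(batch_size_is_growing(readings))
```

A steady series at or below the default would not be flagged; only a sustained climb past it is treated as a sign that inserts are falling behind or that a burst of data is arriving.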
    Event ID 2115 A Bind Data Source in Management Group
    http://blogs.technet.com/b/kevinholman/archive/2008/04/21/event-id-2115-a-bind-data-source-in-management-group.aspx
    Niki Han
    TechNet Community Support

  • Operations Manager Failed to Access the Windows Event Log and management server is showing warning state

    Hi,
    I am monitoring an AD server from SCOM 2012 R2. My management server goes into a warning state. When I run Health Explorer it comes back to a healthy state, but after some time it goes into the warning state again. Looking at the alerts, I found that one alert keeps recurring:
    Operations Manager Failed to Access the Windows Event Log. The description of the alert is given below.
    The Windows Event Log Provider is still unable to open the DhcpAdminEvents event log on computer 'nc2vws12ad5.corp.nathcorp.com'.
    The Provider has been unable to open the DhcpAdminEvents event log for 64080 seconds.
    Most recent error details: The RPC server is unavailable.
    Please suggest how to resolve this so that my management server comes back to a healthy state.
    Thanks
    Abhishek

    Hi Abhishek,
    As I mentioned earlier, the alert resolution makes the same points.
    Can you give details on the following?
    Is there really a log named "Dhcpadminevents" in the MS's Event Viewer?
    Did you recently configure any new alert where you specified "Dhcpadminevents" as an event log location?
    If yes, what target did you select for the rule / monitor?
    Can you post the results for analysis?
    Gautam.75801

  • Root management server stops alerting frequently

    Hi All,
    I have a root management server running SCOM 2007 R2 CU4. All the roles are on the same RMS server (RMS, SQL, Web console).
    About once a day our alerting suddenly stops, even though the SCOM services are running and the management server is in a healthy state. We restart the SDK, Health Service, and System Center Management Configuration services, but we still do not get alerts.
    We have to fully reboot the RMS, after which it works fine for about a day. Then the same issue recurs, so we reboot it once a day.
    Any idea what the issue is? We cannot afford to reboot this daily.
    We had installed a few security patches on the RMS. We uninstalled them, but the issue remains.
    I analysed the event logs and found the following.
    Lots of Event ID 2115 events.
    Event id 26380 - The System Center Operations Manager SDK Service failed due to an unhandled exception. 
    The service will attempt to restart. Exception:
    System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
       at Bid.TraceError(String fmtPrintfW, Object a1, Object a2)
       at Microsoft.EnterpriseManagement.Mom.DataAccess.SqlRetryHandler.Execute[T](ExecuteArguments executeArguments, RetryPolicy retryPolicy, GenericExecute`1 genericExecute)
       at Microsoft.EnterpriseManagement.Mom.DataAccess.SqlRetryHandler.ExecuteReaderSingleRow(SqlDataReader sqlDataReader, SqlConnection sqlConnection, IList`1 prologEpilogList, RetryPolicy retryPolicy)
       at Microsoft.EnterpriseManagement.Mom.DataAccess.QueryResultsReader.Read()
       at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.ClientReaderManager.GetObjects(Guid id, Int32 count)
       at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccess.GetObjectsFromReader(Guid readerId, Int32 count)
       at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccessTieringWrapper.GetObjectsFromReader(Guid readerId, Int32 count)
       at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccessExceptionTracingWrapper.GetObjectsFromReader(Guid readerId, Int32 count)
       at SyncInvokeGetObjectsFromReader(Object , Object[] , Object[] )
       at System.ServiceModel.Dispatcher.SyncMethodInvoker.Invoke(Object instance, Object[] inputs, Object[]& outputs)
       at System.ServiceModel.Dispatcher.DispatchOperationRuntime.InvokeBegin(MessageRpc& rpc)
       at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage5(MessageRpc& rpc)
       at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage4(MessageRpc& rpc)
       at System.ServiceModel.Dispatcher.MessageRpc.Process(Boolean isOperationContextSet)
       at System.ServiceModel.Dispatcher.ChannelHandler.DispatchAndReleasePump(RequestContext request, Boolean cleanThread, OperationContext currentOperationContext)
       at System.ServiceModel.Dispatcher.ChannelHandler.HandleRequest(RequestContext request, OperationContext currentOperationContext)
       at System.ServiceModel.Dispatcher.ChannelHandler.AsyncMessagePump(IAsyncResult result)
       at System.ServiceModel.Diagnostics.Utility.AsyncThunk.UnhandledExceptionFrame(IAsyncResult result)
       at System.ServiceModel.AsyncResult.Complete(Boolean completedSynchronously)
       at System.ServiceModel.Channels.FramingDuplexSessionChannel.TryReceiveAsyncResult.OnReceive(IAsyncResult result)
       at System.ServiceModel.Diagnostics.Utility.AsyncThunk.UnhandledExceptionFrame(IAsyncResult result)
       at System.ServiceModel.AsyncResult.Complete(Boolean completedSynchronously)
       at System.ServiceModel.Channels.SynchronizedMessageSource.SynchronizedAsyncResult`1.CompleteWithUnlock(Boolean synchronous, Exception exception)
       at System.ServiceModel.Channels.SynchronizedMessageSource.ReceiveAsyncResult.OnReceiveComplete(Object state)
       at System.ServiceModel.Channels.SessionConnectionReader.OnAsyncReadComplete(Object state)
       at System.ServiceModel.Channels.StreamConnection.OnRead(IAsyncResult result)
       at System.ServiceModel.Diagnostics.Utility.AsyncThunk.UnhandledExceptionFrame(IAsyncResult result)
       at System.Net.LazyAsyncResult.Complete(IntPtr userToken)
       at System.Net.LazyAsyncResult.ProtectedInvokeCallback(Object result, IntPtr userToken)
       at System.Net.Security.NegotiateStream.ProcessFrameBody(Int32 readBytes, Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
       at System.Net.Security.NegotiateStream.ReadCallback(AsyncProtocolRequest asyncRequest)
       at System.Net.FixedSizeReader.CheckCompletionBeforeNextRead(Int32 bytes)
       at System.Net.FixedSizeReader.ReadCallback(IAsyncResult transportResult)
       at System.ServiceModel.AsyncResult.Complete(Boolean completedSynchronously)
       at System.ServiceModel.Channels.ConnectionStream.ReadAsyncResult.OnAsyncReadComplete(Object state)
       at System.ServiceModel.Channels.SocketConnection.FinishRead()
       at System.ServiceModel.Channels.SocketConnection.AsyncReadCallback(Boolean haveResult, Int32 error, Int32 bytesRead)
       at System.ServiceModel.Diagnostics.Utility.IOCompletionThunk.UnhandledExceptionFrame(UInt32 error, UInt32 bytesRead, NativeOverlapped* nativeOverlapped)
       at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP)
    Event ID 1103 - Summary: 28 rule(s)/monitor(s) failed and got unloaded, 0 of them reached the failure limit that prevents automatic reload. Management group "My Management Group". This is summary only event, please see other events with descriptions of unloaded rule(s)/monitor(s).
    Event id: 26319
    The description for Event ID 26319 from source OpsMgr SDK Service cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local
    computer.
    If the event originated on another computer, the display information had to be saved with the event.
    The following information was included with the event:
    Connect uuid:6856a52d-56c3-4aa2-92e3-2ca16e644c12;id=1 The creator of this fault did not specify a Reason. System.ServiceModel.FaultException`1[Microsoft.EnterpriseManagement.Common.SdkServiceNotInitializedException]: The creator of this fault did not specify
    a Reason. (Fault Detail is equal to Microsoft.EnterpriseManagement.Common.SdkServiceNotInitializedException: Sdk Service has not yet initialized. Please retry).
    The handle is invalid
    The description for Event ID 26319 from source OpsMgr SDK Service cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local
    computer.
    If the event originated on another computer, the display information had to be saved with the event.
    The following information was included with the event:
    Connect
    uuid:6856a52d-56c3-4aa2-92e3-2ca16e644c12;id=2
    The creator of this fault did not specify a Reason.
    System.ServiceModel.FaultException`1[Microsoft.EnterpriseManagement.Common.SdkServiceNotInitializedException]: The creator of this fault did not specify a Reason. (Fault Detail is equal to Microsoft.EnterpriseManagement.Common.SdkServiceNotInitializedException:
    Sdk Service has not yet initialized. Please retry).
    The handle is invalid
    Event id: 1103
    The description for Event ID 1103 from source HealthService cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
    If the event originated on another computer, the display information had to be saved with the event.
    The following information was included with the event:
    Name of my management group
    1
    0
    The handle is invalid
    Event ID 29104 - OpsMgr Config Service failed to send the dirty state notifications to the dirty OpsMgr Health Services. This may be happening because the Root OpsMgr Health Service is not running.
    Event ID 4000 - A monitoring host is unresponsive or has crashed.  The status code for the host failure was 2164195371.
    Event ID 4503 - A module reported an error 0x8007000E from a callback which was running as part of rule "_44E9A997_6A02_4298_8430_8E01952AB6F3_.RaiseAlert" running for instance "Root managementserver FQDN" with id:"{6C3444C3-3990-BB5A-F25E-289D8F427570}" in management group "My Management Group".
    ==================================
    Can anyone help us?

    Hi,
    How to troubleshoot Event ID 2115 in Operations Manager
    http://support2.microsoft.com/kb/2681388
    SCOM 2007 R2 - SDK Service Exception with Event ID's 27000, 26371, 26338, 26380, 26319
    http://social.technet.microsoft.com/Forums/systemcenter/en-US/93977ed8-ff95-49d0-bc85-42217526c5b0/scom-2007-r2-sdk-service-exception-with-event-ids-27000-26371-26338-26380-26319?forum=operationsmanagergeneral

  • Identifying the Root Management Server in Operations Manager 2012

    Hello,
    I want to replace my primary management server.
    Do I need to do the following steps if I have SCOM 2012?
    https://technet.microsoft.com/en-us/library/cc540401.aspx?f=255&MSPPError=-2147217396
    Promote a management server to a root management server role
    Configure the reporting server with the name of the new root management server.
    Configure the Web console with the name of the new root management server.
    Set ENABLE_BROKER to 1 if needed
    Thanks!
    TechNet

    Hello,
    I am not sure I understand how it works.
    The SQL server and the application server are in the same domain (no firewalls) but not on the same server.
    Reporting is installed on the SQL server.
    We only have one domain admin account, which runs as the "data warehouse report deployment account".
    In the reporting configuration manager, the report link is correct.
    On the reporting server, under "c:\ProgramFiles\Microsoft SQL Server\MSRS10_50.MSSQLSERVER\Reporting Services\ReportServer", there is a file called "reportserver"; inside this file I can see the FQDN of my primary SCOM server under the "Security" and "Authentication" sections.
    Is that why the reporting URL doesn't work when the primary server is down?
    Remember, I set the RMS emulator to a different server.
    If I need to change that manually, then the first question I asked is also relevant for SCOM 2012.
    Thanks.
    TechNet

  • IDM server goes into recovered state

    Hi everybody,
    When I install a second IDM server (on another machine) that shares the same Oracle repository as the first one, I notice that the first server goes from the "active" state to the "recovered" state. This is seen under the Configuration->Servers tab.
    Any ideas why this happens?
    Thanks!

    What are you looking into? Can you provide more info? What does the process do?
    Best Regards, Uri Dimant, SQL Server MVP

  • Sporadic issue: connections to server going to SYN_RECV state

    Hello all,
    I have been facing this sporadic issue with my RMI server for the past few months. I have searched the internet for help but did not come across any solution. Any help from you would be much appreciated!
    Our RMI server is accessed by 100-200 clients at any point in time. 99% of the time the server works fine, but then it suddenly hangs, i.e., no new connections can be made from clients. The connection attempts from clients fail with the error "SocketTimeOutException: read timed out", and on the server the TCP connections are observed to be in the SYN_RECV state (from netstat output). The server runs on Red Hat Linux, and clients can run on Windows, Linux, or Mac. The frequency of the issue is random, from once in a few months to multiple times in a single day; the average is 2-3 times a month. After restarting the server, everything works fine again.
    During one such event, we were able to get our network team to capture tcpdump traces on the server port. Analyzing the dump, the behavior is as follows:
    client -> SYN -> server (server accepts)
    server -> SYN/ACK -> client (client accepts)
    client -> ACK -> server (server does not accept)
    server -> SYN/ACK -> client
    client -> ACK -> server (server does not accept). This is repeated 6 times, based on the TCP retry setting.
    server -> RST -> client (client gets timed out error at this point)
    I am no TCP expert, so for me the strange thing is seeing the ACK arrive at the server port while the server keeps resending SYN/ACK messages. Could you tell me what this behavior could mean? Any guidance on debugging this issue would be really helpful.
    Thanks in advance
    PS: I am new to the forum, so please let me know if my post has any issues with respect to forum etiquette, guidelines, etc.
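To track how many connections sit in SYN_RECV when the hang occurs, the netstat output mentioned in the post can be tallied by state. This is a minimal sketch that assumes the usual Linux `netstat -ant` column layout. The function name is hypothetical, and the sample rows (addresses and port 1099) are invented for illustration, not taken from the poster's server.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connection states in `netstat -ant` style output."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected row layout on Linux:
        # Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


# Invented sample rows; real output would come from the affected server.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:1099            0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.5:1099           10.0.0.21:51234         SYN_RECV
tcp        0      0 10.0.0.5:1099           10.0.0.22:51877         SYN_RECV
tcp        0      0 10.0.0.5:1099           10.0.0.23:40112         ESTABLISHED
"""

if __name__ == "__main__":
    print(count_tcp_states(SAMPLE))
```

Running a tally like this periodically and logging the SYN_RECV count would show whether the number climbs gradually before a hang or jumps all at once.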

    I think you have an intermittent hardware problem somewhere in a router or switch in this path.

  • Managed server in WARNING state in Obiee 11.1.1.7.0 prod env

    Hi All,
    We have OBIEE 11.1.1.7.0 running in our company. For the last couple of days, the managed server "bi_server1" has been changing to the "warning" state every 20 to 30 minutes or so. Reports are very slow and behave oddly after the warning. When I restart opmn, the health shows "OK", but it changes back to "WARNING" after some time. It says: Thread pool has stuck threads.
    We have a RAC database environment, and our OBIEE points to one of the nodes, which I can tnsping. I could increase the StuckThread time to 1200, 1200 or so instead of 600, 60 for bi_server1 in the console, as I found suggested somewhere as a tuning step, but I am not sure it will help much. I also can't cancel the long-running queries in the "managed session" of OBIEE analytics.
    Kindly advise what exactly is causing this problem and what I must do to resolve it.
    The bi_server1 log file displays the following error:
    ####<Oct 15, 2013 1:31:59 PM SAST> <Error> <WebLogicServer> <SWAOBIEE> <bi_server1> <[ACTIVE] ExecuteThread: '5' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <705beb8dd8c1b40f:-c2b96b4:141a7012d14:-8000-000000000004373a> <1381836719713> <BEA-000337> <[STUCK] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "709" seconds working on the request "weblogic.servlet.internal.ServletRequestImpl@46631799[
    GET /analytics/saw.dll?Sessions HTTP/1.1
    Connection: keep-alive
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.35 (KHTML, like Gecko) Chrome/27.0.1448.0 Safari/537.35
    Referer: http://10.20.16.72:9704/analytics/saw.dll?Admin
    Accept-Encoding: gzip,deflate,sdch
    Accept-Language: en-US,en;q=0.8
    Cookie: ORA_BIPS_LBINFO=141bbd5ed78; ORA_BIPS_NQID=v1b3rdkjen27ul12vejs054ra8v5mgmsk25pu7a; JSESSIONID=5xfZSdkK6N0HN49H2hKFT9SmQ0ngVp6vLJvw81QfTychYjpwpmFZ!-2050741508; ADMINCONSOLESESSION=Q2TLSdkdv1GT79BFgzPyw5njCtPDXmmyWbhb523hgV5DRqtv7csD!1865413782
    ]", which is more than the configured time (StuckThreadMaxTime) of "600" seconds. Stack trace:
            java.net.SocketInputStream.socketRead0(Native Method)
            java.net.SocketInputStream.read(SocketInputStream.java:129)
            com.siebel.analytics.web.sawconnect.SAWConnection$NotifyInputStream.read(SAWConnection.java:165)
            java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
            java.io.BufferedInputStream.read(BufferedInputStream.java:237)
            com.siebel.analytics.web.sawconnect.sawprotocol.SAWProtocol.readInt(SAWProtocol.java:188)
            com.siebel.analytics.web.sawconnect.sawprotocol.SAWProtocolInputStreamImpl.readChunkHeader(SAWProtocolInputStreamImpl.java:282)
            com.siebel.analytics.web.sawconnect.sawprotocol.SAWProtocolInputStreamImpl.startReadingNewMessage(SAWProtocolInputStreamImpl.java:49)
            com.siebel.analytics.web.sawconnect.SAWServletHttpBinding.forwardResponse(SAWServletHttpBinding.java:201)
            com.siebel.analytics.web.SAWBridge.processRequest(SAWBridge.java:189)
            com.siebel.analytics.web.SAWBridge.doGet(SAWBridge.java:224)
            javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    Thanks,
    BK.
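When many BEA-000337 entries accumulate, it can help to extract the stuck thread number, the busy time, and the configured StuckThreadMaxTime from each one. The sketch below does this with a regular expression written against the message wording shown in the log above. The function name is hypothetical, and the sample string is an abbreviated copy of that entry with the request body replaced by "...".

```python
import re

# Pattern based on the BEA-000337 wording quoted in the post above.
STUCK_RE = re.compile(
    r"\[STUCK\] ExecuteThread: '(?P<thread>\d+)'.*?"
    r'has been busy for "(?P<busy>\d+)" seconds.*?'
    r'\(StuckThreadMaxTime\) of "(?P<limit>\d+)" seconds',
    re.DOTALL,  # the request dump spans several lines
)


def parse_stuck_thread(log_text: str):
    """Return stuck-thread details from a BEA-000337 entry, or None."""
    match = STUCK_RE.search(log_text)
    if match is None:
        return None
    busy = int(match["busy"])
    limit = int(match["limit"])
    return {
        "thread": int(match["thread"]),
        "busy_seconds": busy,
        "limit_seconds": limit,
        "over_by": busy - limit,
    }


# Abbreviated copy of the log entry; the request body is elided as "...".
SAMPLE = (
    "<BEA-000337> <[STUCK] ExecuteThread: '2' for queue: "
    "'weblogic.kernel.Default (self-tuning)' has been busy for \"709\" seconds "
    "working on the request \"...\", which is more than the configured time "
    "(StuckThreadMaxTime) of \"600\" seconds."
)

if __name__ == "__main__":
    print(parse_stuck_thread(SAMPLE))
```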

    I had a similar issue on Windows 7 (64-bit).
    I found that the Node Manager Windows service was not present, so I had to create it by executing installNodeMgrSvc.cmd, located in the WLS server directory ('C:\Oracle\Middleware\wlserver_10.3\server\bin' in my case), and then go to the BI WLS console and 'Resume' the bi_server1 managed server. You should see progress in the installer once the managed server bi_server1 is in the 'Running' state.
    Hope this helps.

  • Managed Server state "STARTING" after shutting it down.

    Hello,
    I have a really odd situation here. I have a Managed Server that shows a "STARTING" state although it is actually shut down. I have tried "Force Shutdown Now", but all I get is this:
    Status of Last Action = TASK COMPLETED
    State = STARTING
    I cannot start the Managed Server because WebLogic says it is in an incompatible state. I have tried restarting the Admin Server and deleting the /cache and /tmp folders, but that did not solve it. Any help please? Where does WebLogic keep track of the status of its Managed Servers?
    Thank you,
    Oscar

    Hello Fabian,
    I am using Solaris (Unix). The managed server is not running. At least there is no WebLogic process running on that machine (ps -ef | grep weblogic). Tomorrow I will try searching for the name of the managed server as you suggested.
    It seems that the Admin Server stores the "STARTING" status somewhere for that managed server. Tomorrow I will also try to start the managed server from the command line and later shut it down from the Admin Server (GUI). Let's see whether the managed server notifies the Admin Server that it was shut down correctly.
    Thanks for your reply,
    Oscar
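A plain `grep weblogic` can miss or over-match, so one way to make the process check above more precise is to look for the `-Dweblogic.Name=<server>` argument that the standard start scripts pass to the JVM. The following is only a sketch: the server name `ms1` is hypothetical, and on Solaris `ps -ef` may truncate long command lines, in which case the marker may not be visible in its output.

```python
import subprocess


def find_weblogic_processes(ps_output, server_name):
    """Return the ps lines whose command line names the given WebLogic server."""
    marker = "-Dweblogic.Name=" + server_name
    matches = []
    for line in ps_output.splitlines():
        # Compare whole whitespace-separated tokens so that "ms1"
        # does not also match "ms10".
        if marker in line.split():
            matches.append(line)
    return matches


def list_processes():
    """Capture a full process listing with POSIX `ps -ef`."""
    result = subprocess.run(["ps", "-ef"], capture_output=True, text=True, check=True)
    return result.stdout


# Usage (server name is a placeholder):
#   for line in find_weblogic_processes(list_processes(), "ms1"):
#       print(line)
```

An empty result only shows that no matching process was visible to `ps`; it says nothing about what state the Admin Server has recorded for that server.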

  • Assignment Server Component (AsgnSrvr) goes to unavailable state

    Our AsgnSrvr component goes to an unavailable state frequently, about 2-3 times a day. In the component log, we can see "SBL-SVR-09016: Failed to get task instance: task number 35651599.  The task may have exited or does not exist". Our enterprise is on 8.0.0.13.
    As per Doc ID 1531224.1, the MaxMTServers:MaxTasks ratio should be 1:5, but ours is 1:20. Is this configuration fine for 8.0.0.13? If not, how can we verify that this configuration is unsuitable for our enterprise, and what data should I collect to characterize the issue?
    We see the errors below in almost all the AsgnSrvr component logs:
    GenericLog    GenericError    1    0000000253561cf4:0    2014-04-22 19:13:38    Unable to find table CX_DATATMPLVIEW in the Siebel Repository
    GenericLog    GenericError    1    0000000453561cf4:0    2014-04-22 20:03:16    (scfdata.cpp (9620) err=1319736 sys=0) SBL-SVR-09016: Failed to get task instance: task number 35651599.  The task may have exited or does not exist.
    Does this indicate a problem, and how can we relate it to the component going unavailable?
    It is happening on all the servers in the enterprise.
    Thanks,
    Abhishek

    Hello Abhishek,
    It might be timing out.
    Please verify the parameter 'MaxSkillsAge'.
    Please see the following knowledge article for further details:
    Assignment Manager timing out and terminating intermittently with various errors (Doc ID 1085215.1)
    Best Regards,
    Chetan
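To help answer the "what data should I collect" question above, one option is to tally which errors appear in the AsgnSrvr logs and how often. The sketch below assumes log lines shaped like the two quoted in the question (a `GenericError` field, with an optional `SBL-...` code in the message); the label used for lines without a code is my own.

```python
import re
from collections import Counter

# Matches codes such as SBL-SVR-09016.
SBL_CODE = re.compile(r"SBL-[A-Z]+-\d+")


def count_error_codes(lines):
    """Count GenericError lines per SBL code; lines without a code are grouped together."""
    counts = Counter()
    for line in lines:
        if "GenericError" not in line:
            continue
        codes = SBL_CODE.findall(line)
        if codes:
            counts.update(codes)
        else:
            counts["(no SBL code)"] += 1
    return counts


# Usage (file name is a placeholder):
#   with open("AsgnSrvr_0001.log", errors="replace") as fh:
#       print(count_error_codes(fh).most_common())
```

Comparing the tallies from logs written just before the component went unavailable with those from a quiet period may show whether an error such as SBL-SVR-09016 actually correlates with the outages.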

  • Managed Server Health Goes Blank in Admin Console

    Hello Team,
    I am continuously facing an issue with an OSB managed server. It is running, but its health is blank in the Admin Console.
    Because of this, it is not responding to any user requests.
    Kindly guide me in troubleshooting this issue.

    Hi,
    This is due to a problem with the JMX communication between the Admin Server and your OSB managed servers. Did you observe any messages related to it?
    If the communication uses SSL, make sure the certificates are valid.
    As a general practice, make sure a listen address is configured for your servers. If it is left blank (the default), the server listens on all available addresses of the host, and communication can get messy in these situations.
    If you are still facing the problem, I would suggest you enable the debug flags below and raise a support case:
    1. Enable the three debug flags below from the console (for the Admin Server as well as the managed server)
       weblogic->management->JMXCore
       weblogic->management->JMXDomain
       weblogic->management->JMXRuntime
    2. Restart all the servers
    If the issue occurs, provide the debug logs along with thread dumps (take 5 thread dumps at an interval of 5 seconds) for the managed server (the one in unknown state) and for the Admin Server.
    Regards
    Rosario
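The "5 thread dumps at an interval of 5 seconds" step can be scripted. This is a minimal sketch that assumes a HotSpot JDK whose `jstack` tool is on the PATH and that you already know the server's process ID; with other JVMs (or if `jstack` is unavailable), sending `kill -3 <pid>` and collecting the dumps from the server's standard output is the usual alternative. The `run` and `sleep` parameters exist only so the function can be exercised without a real JVM.

```python
import subprocess
import time


def capture_thread_dumps(pid, count=5, interval=5.0,
                         run=subprocess.run, sleep=time.sleep):
    """Run `jstack <pid>` `count` times, pausing `interval` seconds between runs."""
    dumps = []
    for i in range(count):
        result = run(["jstack", str(pid)], capture_output=True, text=True)
        dumps.append(result.stdout)
        # No pause is needed after the final dump.
        if i < count - 1:
            sleep(interval)
    return dumps


# Usage (PID is a placeholder):
#   for n, dump in enumerate(capture_thread_dumps(12345), start=1):
#       with open("threaddump_%d.txt" % n, "w") as fh:
#           fh.write(dump)
```

Taking several dumps a few seconds apart is what makes stuck threads visible: a thread that shows the same stack in every dump is a candidate for the hang.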

  • One Management Server is Grey and it's not our RMS server.

    Greetings -
    One of our SCOM 2007 servers is grey, and in the event log I see a bunch of 4508 errors. This is our SCOM 2007 environment, and it never gets touched much as there is only Exchange on it; until they move to Exchange 2013 they won't be moving to SCOM 2012.
    So the RMS server is working fine, but the other server, which we set up as our WebConsole, Reporting, and Collecting server, is not. It has been grey for a month or so, and we have tried all the normal "Grey Agent" ways to get it talking again: cleared the agent cache and such.
    What could this be, and where could we look to see what is wrong? We are still getting alerts, which is the main function of this SCOM environment; I don't think anyone uses reporting or the web console, but I am sure it impacts something we use as well.

    Hi,
    As the problematic management server is in a grey state, I would suggest you run the Show Gray Agent Connectivity Data task.
    The Show Gray Agent Connectivity Data task will help you identify why an agent is gray.
    Please also turn off the firewall on both the RMS and the grey management server, and flush the health service state and cache again.
    Regards,
    Yan Li

  • Health Service Heartbeat Failure Alerts Generated When One Management Server Is Down

    Hi,
    I have two management servers, each managing about 100 servers. When one management server goes down unexpectedly, I receive 100 Health Service Heartbeat Failure alerts, one for each of its 100 servers.
    My question: why, when the management server is down, is a Health Service Heartbeat Failure raised for every managed agent?
    Is there a way to change this?

    SCOM 2012 agents fail over automatically when their primary management server is down. You can check each agent's failover management servers with the following PowerShell:
    # Verify failover for agents reporting to MS1
    $Agents = Get-SCOMAgent | where {$_.PrimaryManagementServerName -eq 'MS1.DOMAIN.COM'}
    $Agents | sort | foreach {
        Write-Host ""
        "Agent :: " + $_.Name
        "--Primary MS :: " + ($_.GetPrimaryManagementServer()).ComputerName
        $failoverServers = $_.GetFailoverManagementServers()
        # List every failover management server configured for this agent
        foreach ($managementServer in $failoverServers) {
            "--Failover MS :: " + ($managementServer.ComputerName)
        }
    }
    Write-Host ""
    http://www.systemcentercentral.com/how-does-the-failover-process-work-in-opsmgr-2012-scom-sysctr/

  • Managed Server Startup - Part Deux

    Hello:
    Is it possible to auto-start managed servers on WLS 10.3.5 under Solaris 10?
    I read the thread "managed server startup" and did not want to hijack the thread. In my case, I have an init script that starts up Node Manager and one that starts up the Admin Server.
    So, once those two processes are up, is there a way for the defined managed servers to auto-start without any manual intervention on a user's part? Yes, I do have CrashRecoveryEnabled=true but as I understand it, that is for when a managed server goes down. In my case, the Solaris server is booting up.
    Thank you,
    Perry

    Unless you have 50 managed servers then I can't understand why you would want that. Just start them all up at the same time from the admin console under 'control' tab of server page. What if you have a server or two that you don't want on for some reason? This would power it up regardless. I would think you would want to control that from node manager lifecycle. There may be a way to do that through scripting but it's probably not a best practice. We'll see if someone else chimes in on the subject. I am curious as well...
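If scripting is the route taken, one possible shape for a boot-time helper is sketched below: wait until the Admin Server's listen port accepts connections, then invoke the domain's managed-server start script for each server you do want started. This is only a sketch under assumptions: the script location (`<domain>/bin/startManagedWebLogic.sh`), the domain path, the server names, and the admin host are all placeholders, and starting the servers through Node Manager may well be the better-supported approach.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, poll=5.0):
    """Return True once host:port accepts a TCP connection, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=poll):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)


def start_command(domain_home, server_name, admin_url):
    """Build the argument list for the domain's managed-server start script."""
    return [domain_home + "/bin/startManagedWebLogic.sh", server_name, admin_url]


# Usage from an init script (all values are placeholders):
#   import subprocess
#   if wait_for_port("adminhost", 7001):
#       for name in ["ms1", "ms2"]:  # only the servers that should auto-start
#           subprocess.Popen(start_command("/u01/domains/mydomain", name,
#                                          "http://adminhost:7001"))
```

Keeping the list of server names explicit addresses the concern raised above: a server you do not want running is simply left out of the list.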

  • Start managed server in a cluster fail

    Hi,
    I'm new to Oracle WebLogic, and I'm testing WebLogic 10.3.
    I set up multiple IP addresses on my laptop as follows:
    10.0.0.183 devpc
    10.0.2.183 node1
    10.0.3.183 node2
    And I configured a domain with Configuration Wizard as follows:
    AdminServer
    Listen Address:10.0.0.183
    Listen Port:7001
    ManagedServer_1
    Machine:(none)
    Cluster:Cluster_1
    Listen Address:10.0.2.183
    Listen Port:7001
    ManagedServer_2
    Machine:(none)
    Cluster:Cluster_1
    Listen Address:10.0.3.183
    Listen Port:7001
    When I started ManagedServer_1 with the command "startManagedWebLogic.cmd ManagedServer_1 http://10.0.0.183:7001",
    the managed server reached the RUNNING state successfully, and after a few seconds I got these errors.
    <Notice> <WebLogicServer> <devpc> <ManagedServer_1> <Main Thread> <<WLS Kernel>> <> <> <1239098617688> <BEA-000365> <Server state changed to RUNNING>
    <Notice> <WebLogicServer> <devpc> <ManagedServer_1> <Main Thread> <<WLS Kernel>> <> <> <1239098617688> <BEA-000360> <Server started in RUNNING mode>
    <Info> <J2EE> <devpc> <ManagedServer_1> <[STANDBY] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1239098617703> <BEA-160151> <Registered library Extension-Name: bea_wls_async_response (JAR).>
    <Error> <Cluster> <devpc> <ManagedServer_1> <[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1239098635062> <BEA-000170> <Server ManagedServer_1 did not receive the multicast packets that were sent by itself>
    <Critical> <Health> <devpc> <ManagedServer_1> <[STANDBY] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1239098635062> <BEA-310006> <Critical Subsystem Cluster has failed. Setting server state to FAILED. Reason: Unable to receive self generated multicast messages>
    <Critical> <WebLogicServer> <devpc> <ManagedServer_1> <[STANDBY] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1239098635078> <BEA-000385> <Server health failed. Reason: health of critical service 'Cluster' failed>
    <Notice> <WebLogicServer> <devpc> <ManagedServer_1> <[STANDBY] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1239098635078> <BEA-000365> <Server state changed to FAILED>
    <Error> <> <devpc> <ManagedServer_1> <[STANDBY] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1239098635234> <BEA-000000> <
    ===== FULL THREAD DUMP ===============
    Tue Apr 07 18:03:55 2009
    BEA JRockit(R) R27.6.0-50_o-100423-1.6.0_05-20080626-2105-windows-ia32
    "Main Thread" id=1 idx=0x4 tid=2208 prio=5 alive, in native, waiting
    -- Waiting for notification on: weblogic/t3/srvr/T3Srvr@0x09E9E9E8[fat lock]
    at jrockit/vm/Threads.waitForNotifySignal(JLjava/lang/Object;)Z(Native Method)
    at java/lang/Object.wait(J)V(Native Method)
    at java/lang/Object.wait(Object.java:485)
    at weblogic/t3/srvr/T3Srvr.waitForDeath(T3Srvr.java:811)
    ^-- Lock released while waiting: weblogic/t3/srvr/T3Srvr@0x09E9E9E8[fat lock]
    at weblogic/t3/srvr/T3Srvr.run(T3Srvr.java:459)
    at weblogic/Server.main(Server.java:67)
    at jrockit/vm/RNI.c2java(IIIII)V(Native Method)
    -- end of trace
    .
    .
    .
    How to fix it?
    Thanks in advance.

    Hi,
    I changed the settings as follows:
    AdminServer
    Listen Address:devpc
    Listen Port:7001
    ManagedServer_1
    Machine:(none)
    Cluster:Cluster_1
    Listen Address:node1
    Listen Port:7001
    ManagedServer_2
    Machine:(none)
    Cluster:Cluster_1
    Listen Address:node2
    Listen Port:7001
    Cluster_1
    Cluster Address:node1:7001,node2:7001
    The exceptions still occur. Any suggestions would be appreciated.
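Since BEA-000170 says the server did not receive its own multicast packets, a quick check outside WebLogic is to see whether the machine loops back multicast traffic at all on the interface in question. The sketch below joins a multicast group, sends one datagram to it, and reports whether that datagram comes back. The default group and port (239.192.0.0 and 7001) are assumptions; use the values from your cluster's multicast configuration, and pass the interface address (for example 10.0.2.183 from the setup above) as `interface_ip`.

```python
import socket
import time


def receives_own_multicast(group="239.192.0.0", port=7001,
                           interface_ip="0.0.0.0", timeout=3.0):
    """Send one datagram to a multicast group and report whether it is received back."""
    payload = b"multicast-self-test-%d" % time.time_ns()
    receiver = sender = None
    try:
        # Receiver: bind to the port and join the group on the chosen interface.
        receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        receiver.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        receiver.bind(("", port))
        membership = socket.inet_aton(group) + socket.inet_aton(interface_ip)
        receiver.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)

        # Sender: stay on the local segment and keep loopback enabled.
        sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sender.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        sender.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
        if interface_ip != "0.0.0.0":
            sender.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                              socket.inet_aton(interface_ip))
        sender.sendto(payload, (group, port))

        deadline = time.monotonic() + timeout
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            receiver.settimeout(remaining)
            data, _ = receiver.recvfrom(1024)
            if data == payload:
                return True
    except OSError:
        # Covers timeouts as well as "no route", "no such device", and similar errors.
        return False
    finally:
        for sock in (receiver, sender):
            if sock is not None:
                sock.close()


# Usage: print(receives_own_multicast(interface_ip="10.0.2.183"))
```

A `False` result for a given interface suggests the problem lies in the operating system or network setup (for example a disconnected adapter, a firewall, or multicast routed out of a different interface) rather than in the WebLogic domain configuration.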
