My root management server goes to grayed state frequently
Hi All,
I have a strange issue with my SCOM 2007 R2 CU4 installation.
My management server grays out frequently, and when I check the Operations Manager console it throws these two errors. Can anyone suggest a solution?
Note: The following information was gathered when the operation was attempted. The information may appear cryptic but provides context for the error. The application will continue to run.
Microsoft.EnterpriseManagement.Common.UnknownAuthorizationStoreException: Unable to perform the operation because of authorization store errors. ---> System.Runtime.InteropServices.COMException (0x800705AA): Insufficient system resources exist to complete the requested service. (Exception from HRESULT: 0x800705AA)
at Microsoft.Interop.Security.AzRoles.IAzApplication2.InitializeClientContextFromToken(UInt64 ullTokenHandle, Object varReserved)
at Microsoft.EnterpriseManagement.Mom.Sdk.Authorization.AzManHelper.AccessCheck(String accessCheckContext, Int32[] operationIds, IntPtr hToken, String stringSid, Int32[] accessCheckReturnCodes, List`1[] accessCheckScopes)
--- End of inner exception stack trace ---
at Microsoft.EnterpriseManagement.DataAbstractionLayer.SdkDataAbstractionLayer.HandleIndigoExceptions(Exception ex)
at Microsoft.EnterpriseManagement.DataAbstractionLayer.InstanceSpaceOperations.GetMonitoringObjectByMonitoringObjectIds(List`1 monitoringObjectIds, String languageCode, MonitoringObjectMode monitoringObjectMode)
at Microsoft.EnterpriseManagement.ManagementGroup.GetPartialMonitoringObjects(ICollection`1 ids)
at Microsoft.EnterpriseManagement.Mom.Internal.UI.Common.TaskPaneContext.GetManagedEntities(ICollection`1 items)
at Microsoft.EnterpriseManagement.Mom.Internal.UI.Common.TaskPaneContext.UpdateContextJob(Object sender, ConsoleJobEventArgs args)
at Microsoft.EnterpriseManagement.Mom.Internal.UI.Console.ConsoleJobExceptionHandler.ExecuteJob(IComponent component, EventHandler`1 job, Object sender, ConsoleJobEventArgs args)
System.Runtime.InteropServices.COMException (0x800705AA): Insufficient system resources exist to complete the requested service. (Exception from HRESULT: 0x800705AA)
at Microsoft.Interop.Security.AzRoles.IAzApplication2.InitializeClientContextFromToken(UInt64 ullTokenHandle, Object varReserved)
at Microsoft.EnterpriseManagement.Mom.Sdk.Authorization.AzManHelper.AccessCheck(String accessCheckContext, Int32[] operationIds, IntPtr hToken, String stringSid, Int32[] accessCheckReturnCodes, List`1[] accessCheckScopes)
====================================================================================
Note: The following information was gathered when the operation was attempted. The information may appear cryptic but provides context for the error. The application will continue to run.
Microsoft.Mom.Isam.IsamOutOfMemoryException: Out of Memory (-1011)
at Microsoft.Mom.Isam.?A0x5359b89f.HandleError(Int32 err)
at Microsoft.Mom.Isam.EseInterop.JetGetIndexInfo(JET_SESID sesid, JET_DBID dbid, String table, String index)
at Microsoft.Mom.Isam.IndexCollection.get_Item(String indexName)
at Microsoft.Mom.Isam.Cursor.get_CurrentIndexDefinition()
at Microsoft.Mom.Isam.Cursor.MakeKey(Key key, Boolean end)
at Microsoft.Mom.Isam.Cursor.GotoKey(Key key)
at Microsoft.EnterpriseManagement.Mom.Internal.UI.Cache.EseCursor.Find(Object[] values)
at Microsoft.EnterpriseManagement.Mom.Internal.UI.MPImageCache.RetrieveImageIdFromCache(CacheTable classToImageMapTable, ManagementPackElementImageKey imageKey, Guid& imageId)
at Microsoft.EnterpriseManagement.Mom.Internal.UI.MPImageCache.GetImageCore(CacheSession session, ManagementGroup managementGroup, ManagementPackElementImageKey imageKey, Guid& imageId)
at Microsoft.EnterpriseManagement.Mom.Internal.UI.MPImageCache.GetImageJob(Object sender, ConsoleJobEventArgs jobArgs)
at Microsoft.EnterpriseManagement.Mom.Internal.UI.Console.ConsoleJobExceptionHandler.ExecuteJob(IComponent component, EventHandler`1 job, Object sender, ConsoleJobEventArgs args)
I also restarted the Health Service, and it fails to start.
I analysed the event logs and found the following:
Event id 29104 - OpsMgr Config Service failed to send the dirty state notifications to the dirty OpsMgr Health Services. This may be happening because the Root OpsMgr Health Service is not running.
Event id 4000 - A monitoring host is unresponsive or has crashed. The status code for the host failure was 2164195371.
Event id 4503 - A module reported an error 0x8007000E from a callback which was running as part of rule "_44E9A997_6A02_4298_8430_8E01952AB6F3_.RaiseAlert" running for instance "Root managementserver FQDN" with id:"{6C3444C3-3990-BB5A-F25E-289D8F427570}" in management group "CINSCOM".
Event id 21044 - The OpsMgr Connector cannot uncompress package, received from IP XXXXXXXXXXXX
Event id 1103 - Summary: 28 rule(s)/monitor(s) failed and got unloaded, 0 of them reached the failure limit that prevents automatic reload. Management group "CINSCOM". This is summary only event, please see other events with descriptions of unloaded rule(s)/monitor(s).
Event id 26380 - The System Center Operations Manager SDK Service failed due to an unhandled exception. The service will attempt to restart.
Exception:
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
at Bid.TraceError(String fmtPrintfW, Object a1, Object a2)
at Microsoft.EnterpriseManagement.Mom.DataAccess.SqlRetryHandler.Execute[T](ExecuteArguments executeArguments, RetryPolicy retryPolicy, GenericExecute`1 genericExecute)
at Microsoft.EnterpriseManagement.Mom.DataAccess.SqlRetryHandler.ExecuteReaderSingleRow(SqlDataReader sqlDataReader, SqlConnection sqlConnection, IList`1 prologEpilogList, RetryPolicy retryPolicy)
at Microsoft.EnterpriseManagement.Mom.DataAccess.QueryResultsReader.Read()
at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.ClientReaderManager.GetObjects(Guid id, Int32 count)
at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccess.GetObjectsFromReader(Guid readerId, Int32 count)
at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccessTieringWrapper.GetObjectsFromReader(Guid readerId, Int32 count)
at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccessExceptionTracingWrapper.GetObjectsFromReader(Guid readerId, Int32 count)
at SyncInvokeGetObjectsFromReader(Object , Object[] , Object[] )
at System.ServiceModel.Dispatcher.SyncMethodInvoker.Invoke(Object instance, Object[] inputs, Object[]& outputs)
at System.ServiceModel.Dispatcher.DispatchOperationRuntime.InvokeBegin(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage5(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage4(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.MessageRpc.Process(Boolean isOperationContextSet)
at System.ServiceModel.Dispatcher.ChannelHandler.DispatchAndReleasePump(RequestContext request, Boolean cleanThread, OperationContext currentOperationContext)
at System.ServiceModel.Dispatcher.ChannelHandler.HandleRequest(RequestContext request, OperationContext currentOperationContext)
at System.ServiceModel.Dispatcher.ChannelHandler.AsyncMessagePump(IAsyncResult result)
at System.ServiceModel.Diagnostics.Utility.AsyncThunk.UnhandledExceptionFrame(IAsyncResult result)
at System.ServiceModel.AsyncResult.Complete(Boolean completedSynchronously)
at System.ServiceModel.Channels.FramingDuplexSessionChannel.TryReceiveAsyncResult.OnReceive(IAsyncResult result)
at System.ServiceModel.Diagnostics.Utility.AsyncThunk.UnhandledExceptionFrame(IAsyncResult result)
at System.ServiceModel.AsyncResult.Complete(Boolean completedSynchronously)
at System.ServiceModel.Channels.SynchronizedMessageSource.SynchronizedAsyncResult`1.CompleteWithUnlock(Boolean synchronous, Exception exception)
at System.ServiceModel.Channels.SynchronizedMessageSource.ReceiveAsyncResult.OnReceiveComplete(Object state)
at System.ServiceModel.Channels.SessionConnectionReader.OnAsyncReadComplete(Object state)
at System.ServiceModel.Channels.StreamConnection.OnRead(IAsyncResult result)
at System.ServiceModel.Diagnostics.Utility.AsyncThunk.UnhandledExceptionFrame(IAsyncResult result)
at System.Net.LazyAsyncResult.Complete(IntPtr userToken)
at System.Net.LazyAsyncResult.ProtectedInvokeCallback(Object result, IntPtr userToken)
at System.Net.Security.NegotiateStream.ProcessFrameBody(Int32 readBytes, Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.NegotiateStream.ReadCallback(AsyncProtocolRequest asyncRequest)
at System.Net.FixedSizeReader.CheckCompletionBeforeNextRead(Int32 bytes)
at System.Net.FixedSizeReader.ReadCallback(IAsyncResult transportResult)
at System.ServiceModel.AsyncResult.Complete(Boolean completedSynchronously)
at System.ServiceModel.Channels.ConnectionStream.ReadAsyncResult.OnAsyncReadComplete(Object state)
at System.ServiceModel.Channels.SocketConnection.FinishRead()
at System.ServiceModel.Channels.SocketConnection.AsyncReadCallback(Boolean haveResult, Int32 error, Int32 bytesRead)
at System.ServiceModel.Diagnostics.Utility.IOCompletionThunk.UnhandledExceptionFrame(UInt32 error, UInt32 bytesRead, NativeOverlapped* nativeOverlapped)
at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP)
======================================================
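The status codes quoted in these events are easier to compare once they are split into their HRESULT fields. The following is a minimal Python sketch, not part of the original post; `decode_hresult` is a hypothetical helper name, and it only applies the standard HRESULT bit layout (failure bit, 11-bit facility, 16-bit code) to the three values quoted above.

```python
def decode_hresult(value):
    """Split a 32-bit HRESULT into its standard fields (hypothetical helper)."""
    value &= 0xFFFFFFFF  # normalise signed inputs to an unsigned 32-bit value
    return {
        "hex": f"0x{value:08X}",
        "failure": bool(value >> 31),       # top bit set means failure
        "facility": (value >> 16) & 0x7FF,  # 11-bit facility field
        "code": value & 0xFFFF,             # low 16 bits: error code
    }


if __name__ == "__main__":
    # Values taken from the console error, event 4503, and event 4000.
    for raw in (0x800705AA, 0x8007000E, 2164195371):
        print(decode_hresult(raw))
```

Facility 7 is FACILITY_WIN32, so 0x8007000E carries Win32 error 14 (ERROR_OUTOFMEMORY) and 0x800705AA carries Win32 error 1450 (ERROR_NO_SYSTEM_RESOURCES), which matches the "Insufficient system resources" text in the console error. The decimal status 2164195371 from event 4000 is 0x80FF002B in hexadecimal.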
Below is the configuration of the server:
Windows Server 2008 R2 Enterprise (without SP1)
SQL Server 2008 Standard Edition
Intel Xeon 2.6 GHz (2 processors), 8 cores in total
12 GB physical memory (always 95-98% used)
427 agents being monitored (Exchange 2010, BlackBerry, etc.)
All the roles (SCOM, RMS, SQL) are on the same box
Can anyone please help?
Gautam.75801
Did you install any software before this issue appeared? If so, try uninstalling it, because there may be a software conflict.
Also, for event 29104, you can refer to the links below:
http://blog.tyang.org/2011/09/30/event-id-29104-on-scom-rms-cluster/
http://support.microsoft.com/kb/946417/en-us
Mai Ali
Similar Messages
-
SCOM 2007 R2 Root Management server showing Not Monitored State in Ops Mgr Console
Hello Experts,
In my production SCOM 2007 R2 environment the RMS server state is "Not Monitored", but we are still receiving alerts, with limitations. I mistakenly put the RMS server into Maintenance Mode while rebooting it because of its slow performance.
Can anybody help me revert the RMS to a healthy state?
We can identify the Performance Signature Data Collection Rules in this example by executing the following SQL query. This query should be executed in SQL Management Studio against the Operations Manager database.
-- Return all Performance Signature Collection Rules
Use OperationsManager
select
managementpack.mpname,
rules.rulename
from performancesignature with (nolock)
inner join rules with (nolock)
on rules.ruleid = performancesignature.learningruleid
inner join managementpack with(nolock)
on rules.managementpackid = managementpack.managementpackid
group by managementpack.mpname, rules.rulename
order by managementpack.mpname, rules.rulename
This query returns all Performance Signature Collection Rules and their respective Management Pack names, with one column for the Management Pack name and one for the Rule name.
The following Performance Monitor counters on a Management Server provide information about the batch size and insertion time of the Database and Data Warehouse write actions. If the batch size keeps growing (the default batch size is 5000 items, for example), either the Management Server is slow to insert data into the Database or Data Warehouse, or it is receiving a burst of data items from the Agents or Gateway Servers.
· OpsMgr DB Write Action Modules(*)\Avg. Batch Size
· OpsMgr DB Write Action Modules(*)\Avg. Processing Time
· OpsMgr DW Writer Module(*)\Avg. Batch Processing Time, ms
· OpsMgr DW Writer Module(*)\Avg. Batch Size
The Average Processing Time counters for the Database and Data Warehouse write actions show how long it takes on average to write a batch of data to the Database and Data Warehouse. Depending on how long it takes to write a batch of data to the Database, this may present an opportunity for tuning.
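As a rough illustration of reading these counters, the sketch below checks a series of sampled Avg. Batch Size values against the 5000-item default mentioned above and reports whether the samples are rising. The function name, the sample values, and the rule used for "rising" (no sample smaller than the previous one, and the last larger than the first) are assumptions made for illustration; they are not OpsMgr behaviour.

```python
DEFAULT_BATCH_SIZE = 5000  # default batch size quoted in the text above


def assess_batch_sizes(samples, limit=DEFAULT_BATCH_SIZE):
    """Summarise sampled Avg. Batch Size values (hypothetical helper)."""
    if not samples:
        raise ValueError("no samples supplied")
    # "Rising" here means non-decreasing overall with a net increase.
    non_decreasing = all(b >= a for a, b in zip(samples, samples[1:]))
    rising = non_decreasing and samples[-1] > samples[0]
    return {
        "peak": max(samples),
        "over_limit": sum(1 for s in samples if s > limit),
        "rising": rising,
    }


if __name__ == "__main__":
    # Made-up counter samples, oldest first.
    print(assess_batch_sizes([1200, 3400, 5200, 7100]))
```

A steadily rising series would point toward slow inserts or a burst of incoming data, as described above; the counters themselves still have to be collected with Performance Monitor.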
Event ID 2115 A Bind Data Source in Management Group
http://blogs.technet.com/b/kevinholman/archive/2008/04/21/event-id-2115-a-bind-data-source-in-management-group.aspx
Niki Han
TechNet Community Support -
Hi,
I am monitoring an AD server from SCOM 2012 R2. My management server goes into a warning state. When I run Health Explorer it comes back to a healthy state, but after some time it goes into a warning state again. Looking at the alerts, I found that one alert keeps recurring: "Operations Manager Failed to Access the Windows Event Log". The description of the alert is given below:
The Windows Event Log Provider is still unable to open the DhcpAdminEvents event log on computer 'nc2vws12ad5.corp.nathcorp.com'.
The Provider has been unable to open the DhcpAdminEvents event log for 64080 seconds.
Most recent error details: The RPC server is unavailable.
Please suggest how to resolve this so that my management server returns to a healthy state.
Thanks
Abhishek
Hi Abhishek,
As I mentioned earlier, the alert resolution lists the same points.
Can you give details on the following?
Is there really a log named "Dhcpadminevents" in the management server's Event Viewer?
Did you recently configure any new alert where you specified "Dhcpadminevents" as an event log location?
If yes, what target did you select for the rule/monitor?
Can you post the results for analysis?
Gautam.75801 -
Root management server stops alerting frequently
Hi All,
I have a Root Management Server running SCOM 2007 R2 CU4. All the roles are on the same RMS server (RMS, SQL, Web console).
About once a day our alerting suddenly stops, even though the SCOM services are running and the management server is in a healthy state. We restart the SDK, Health Service, and System Center Management Configuration services, but we still do not get alerts.
We have to fully reboot the RMS, after which it works fine for about a day. Then the same issue returns, so we reboot it once a day.
Any idea what the issue is? We cannot afford to reboot this daily.
We had installed a few security patches on the RMS. We uninstalled them, but the issue remains.
I analysed the event logs and found the following.
Lots of Event ID 2115 events.
Event id 26380 - The System Center Operations Manager SDK Service failed due to an unhandled exception.
The service will attempt to restart. Exception:
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
at Bid.TraceError(String fmtPrintfW, Object a1, Object a2)
at Microsoft.EnterpriseManagement.Mom.DataAccess.SqlRetryHandler.Execute[T](ExecuteArguments executeArguments, RetryPolicy retryPolicy, GenericExecute`1 genericExecute)
at Microsoft.EnterpriseManagement.Mom.DataAccess.SqlRetryHandler.ExecuteReaderSingleRow(SqlDataReader sqlDataReader, SqlConnection sqlConnection, IList`1 prologEpilogList, RetryPolicy retryPolicy)
at Microsoft.EnterpriseManagement.Mom.DataAccess.QueryResultsReader.Read()
at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.ClientReaderManager.GetObjects(Guid id, Int32 count)
at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccess.GetObjectsFromReader(Guid readerId, Int32 count)
at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccessTieringWrapper.GetObjectsFromReader(Guid readerId, Int32 count)
at Microsoft.EnterpriseManagement.Mom.ServiceDataLayer.SdkDataAccessExceptionTracingWrapper.GetObjectsFromReader(Guid readerId, Int32 count)
at SyncInvokeGetObjectsFromReader(Object , Object[] , Object[] )
at System.ServiceModel.Dispatcher.SyncMethodInvoker.Invoke(Object instance, Object[] inputs, Object[]& outputs)
at System.ServiceModel.Dispatcher.DispatchOperationRuntime.InvokeBegin(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage5(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage4(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.MessageRpc.Process(Boolean isOperationContextSet)
at System.ServiceModel.Dispatcher.ChannelHandler.DispatchAndReleasePump(RequestContext request, Boolean cleanThread, OperationContext currentOperationContext)
at System.ServiceModel.Dispatcher.ChannelHandler.HandleRequest(RequestContext request, OperationContext currentOperationContext)
at System.ServiceModel.Dispatcher.ChannelHandler.AsyncMessagePump(IAsyncResult result)
at System.ServiceModel.Diagnostics.Utility.AsyncThunk.UnhandledExceptionFrame(IAsyncResult result)
at System.ServiceModel.AsyncResult.Complete(Boolean completedSynchronously)
at System.ServiceModel.Channels.FramingDuplexSessionChannel.TryReceiveAsyncResult.OnReceive(IAsyncResult result)
at System.ServiceModel.Diagnostics.Utility.AsyncThunk.UnhandledExceptionFrame(IAsyncResult result)
at System.ServiceModel.AsyncResult.Complete(Boolean completedSynchronously)
at System.ServiceModel.Channels.SynchronizedMessageSource.SynchronizedAsyncResult`1.CompleteWithUnlock(Boolean synchronous, Exception exception)
at System.ServiceModel.Channels.SynchronizedMessageSource.ReceiveAsyncResult.OnReceiveComplete(Object state)
at System.ServiceModel.Channels.SessionConnectionReader.OnAsyncReadComplete(Object state)
at System.ServiceModel.Channels.StreamConnection.OnRead(IAsyncResult result)
at System.ServiceModel.Diagnostics.Utility.AsyncThunk.UnhandledExceptionFrame(IAsyncResult result)
at System.Net.LazyAsyncResult.Complete(IntPtr userToken)
at System.Net.LazyAsyncResult.ProtectedInvokeCallback(Object result, IntPtr userToken)
at System.Net.Security.NegotiateStream.ProcessFrameBody(Int32 readBytes, Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.NegotiateStream.ReadCallback(AsyncProtocolRequest asyncRequest)
at System.Net.FixedSizeReader.CheckCompletionBeforeNextRead(Int32 bytes)
at System.Net.FixedSizeReader.ReadCallback(IAsyncResult transportResult)
at System.ServiceModel.AsyncResult.Complete(Boolean completedSynchronously)
at System.ServiceModel.Channels.ConnectionStream.ReadAsyncResult.OnAsyncReadComplete(Object state)
at System.ServiceModel.Channels.SocketConnection.FinishRead()
at System.ServiceModel.Channels.SocketConnection.AsyncReadCallback(Boolean haveResult, Int32 error, Int32 bytesRead)
at System.ServiceModel.Diagnostics.Utility.IOCompletionThunk.UnhandledExceptionFrame(UInt32 error, UInt32 bytesRead, NativeOverlapped* nativeOverlapped)
at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP)
Event id 1103 - Summary: 28 rule(s)/monitor(s) failed and got unloaded, 0 of them reached the failure limit that prevents automatic reload. Management group "My Management Group". This is summary only event, please see other events with descriptions of unloaded rule(s)/monitor(s).
Event id: 26319
The description for Event ID 26319 from source OpsMgr SDK Service cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
Connect uuid:6856a52d-56c3-4aa2-92e3-2ca16e644c12;id=1 The creator of this fault did not specify a Reason. System.ServiceModel.FaultException`1[Microsoft.EnterpriseManagement.Common.SdkServiceNotInitializedException]: The creator of this fault did not specify a Reason. (Fault Detail is equal to Microsoft.EnterpriseManagement.Common.SdkServiceNotInitializedException: Sdk Service has not yet initialized. Please retry).
The handle is invalid
The description for Event ID 26319 from source OpsMgr SDK Service cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
Connect
uuid:6856a52d-56c3-4aa2-92e3-2ca16e644c12;id=2
The creator of this fault did not specify a Reason.
System.ServiceModel.FaultException`1[Microsoft.EnterpriseManagement.Common.SdkServiceNotInitializedException]: The creator of this fault did not specify a Reason. (Fault Detail is equal to Microsoft.EnterpriseManagement.Common.SdkServiceNotInitializedException: Sdk Service has not yet initialized. Please retry).
The handle is invalid
Event id: 1103
The description for Event ID 1103 from source HealthService cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
Name of my management group
1
0
The handle is invalid
Event id 29104 - OpsMgr Config Service failed to send the dirty state notifications to the dirty OpsMgr Health Services. This may be happening because the Root OpsMgr Health Service is not running.
Event id : 4000 - A monitoring host is unresponsive or has crashed. The status code for the host failure was 2164195371.
Event id 4503 - A module reported an error 0x8007000E from a callback which was running as part of rule "_44E9A997_6A02_4298_8430_8E01952AB6F3_.RaiseAlert" running for instance "Root managementserver FQDN" with id:"{6C3444C3-3990-BB5A-F25E-289D8F427570}" in management group "My Management Group".
==================================
Can anyone help us?
Hi,
How to troubleshoot Event ID 2115 in Operations Manager
http://support2.microsoft.com/kb/2681388
SCOM 2007 R2 - SDK Service Exception with Event ID's 27000, 26371, 26338, 26380, 26319
http://social.technet.microsoft.com/Forums/systemcenter/en-US/93977ed8-ff95-49d0-bc85-42217526c5b0/scom-2007-r2-sdk-service-exception-with-event-ids-27000-26371-26338-26380-26319?forum=operationsmanagergeneral
-
Identifying the Root Management Server in Operations Manager 2012
Hello,
I want to replace my primary management server.
Do I need to perform the following steps if I have SCOM 2012?
https://technet.microsoft.com/en-us/library/cc540401.aspx?f=255&MSPPError=-2147217396
Promote a management server to a root management server role
Configure the reporting server with the name of the new root management server.
Configure the Web console with the name of the new root management server.
Set ENABLE_BROKER to 1 if needed
Thanks!
TechNet
Hello,
I am not sure I understand how it works.
The SQL server and the application server are in the same domain (no firewalls) but not on the same server.
Reporting is installed on the SQL server.
We have only one domain admin account, which is used as the "data warehouse report deployment account".
In the reporting configuration manager, the report link is correct.
On the reporting server, under "c:\ProgramFiles\Microsoft SQL Server\MSRS10_50.MSSQLSERVER\Reporting Services\ReportServer", there is a file called "reportserver"; inside this file I can see the FQDN of my primary SCOM server under the "Security" and "Authentication" sections.
Is that why the reporting URL doesn't work when the primary server is down?
Remember, I set the RMS emulator to a different server.
If I need to change that manually, then my first question is also relevant for SCOM 2012.
Thanks.
TechNet -
IDM server goes into recovered state
Hi everybody,
When I install a second IDM server (on another machine) that shares the same Oracle repository as the first one, I notice that the first server goes from the "active" state to the "recovered" state. This is seen under the Configuration->Servers tab.
Any ideas why this happens?
Thanks!
What exactly are you looking into? Can you provide more information? What does the process do?
Best Regards,
Uri Dimant, SQL Server MVP
-
Sporadic issue: connections to server going to SYN_RECV state
Hello all,
I have been facing this sporadic issue with my RMI server for the past few months. I have searched the internet but did not come across any solution that helps. Any help from you would be much appreciated!
Our RMI server is accessed by 100-200 clients at any point in time. 99% of the time the server works fine, but occasionally it hangs, i.e., no new connections can be made from clients. The connection attempts from clients fail with the error "SocketTimeOutException: read timed out", and on the server the TCP connections are observed to be in the SYN_RECV state (from netstat output). The server runs on Red Hat Linux, and clients can run on Windows, Linux, or Mac. The frequency of the issue is random, from once in a few months to multiple times in a single day; the average is 2-3 times a month. After restarting the server, everything works fine again.
During one such event, we were able to get our network team to capture tcpdump traces on the server port. Analyzing the dump shows the following behavior:
client -> SYN -> server (server accepts)
server -> SYN/ACK -> client (client accepts)
client -> ACK -> server (server does not accept)
server -> SYN/ACK -> client
client -> ACK -> server (server does not accept). This is repeated 6 times, based on the TCP retry setting
server -> RST -> client (client gets timed out error at this point)
I am no TCP expert, so what seems strange to me is that the ACK reaches the server port and yet the server keeps resending SYN/ACK. Could you tell me what this behavior could mean? Any guidance on debugging this issue would be really helpful.
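To track how many connections are stuck in SYN_RECV over time, the netstat output mentioned above can be summarised per port. The following is a minimal Python sketch, assuming the usual Linux `netstat -ant` column order (protocol, Recv-Q, Send-Q, local address, remote address, state). The helper name, the sample addresses, and port 1099 are made up for illustration and are not taken from the post.

```python
from collections import Counter

# Made-up sample in `netstat -ant` layout; real output would come from the server.
SAMPLE = """\
tcp        0      0 10.0.0.5:1099      10.0.1.21:51512    SYN_RECV
tcp        0      0 10.0.0.5:1099      10.0.1.22:40022    SYN_RECV
tcp        0      0 10.0.0.5:1099      10.0.1.23:38100    ESTABLISHED
tcp        0      0 10.0.0.5:22        10.0.1.9:50211     ESTABLISHED
"""


def count_states(netstat_text, local_port):
    """Count TCP states for connections to one local port (hypothetical helper)."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP row with a state column.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[3].rsplit(":", 1)[-1] == str(local_port):
            counts[fields[5]] += 1
    return counts


if __name__ == "__main__":
    print(count_states(SAMPLE, 1099))
```

Running this periodically and logging the SYN_RECV count would show whether half-open connections build up gradually before a hang or appear all at once.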
Thanks in advance
PS: I am new to the forum, so please let me know if my post has any issues with respect to forum decorum/guidelines, etc.
I think you have an intermittent hardware problem somewhere in a router or switch in this path.
-
Managed server in WARNING state in Obiee 11.1.1.7.0 prod env
Hi All,
We have OBIEE 11.1.1.7.0 running in our company. For the last couple of days, the managed server "bi_server1" has been changing to the "WARNING" state every 20 to 30 minutes or so. Reports are very slow and behave oddly once the warning appears. When I restart OPMN, the health shows "OK", but it changes to "WARNING" again after some time. It says: Thread pool has stuck threads.
Our database is a RAC environment; OBIEE points to one of the nodes, and I can tnsping it. I could increase the stuck thread settings for bi_server1 in the console to 1200, 1200 or so instead of 600, 60, as I found suggested somewhere as a tuning step, but I am not sure it will help much. I can't even cancel the long-running queries in the "managed session" page of OBIEE Analytics.
Kindly advise what exactly is causing this problem and what I must do to resolve it.
The bi_server1 log file displays the following error:
####<Oct 15, 2013 1:31:59 PM SAST> <Error> <WebLogicServer> <SWAOBIEE> <bi_server1> <[ACTIVE] ExecuteThread: '5' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <705beb8dd8c1b40f:-c2b96b4:141a7012d14:-8000-000000000004373a> <1381836719713> <BEA-000337> <[STUCK] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "709" seconds working on the request "weblogic.servlet.internal.ServletRequestImpl@46631799[
GET /analytics/saw.dll?Sessions HTTP/1.1
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.35 (KHTML, like Gecko) Chrome/27.0.1448.0 Safari/537.35
Referer: http://10.20.16.72:9704/analytics/saw.dll?Admin
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Cookie: ORA_BIPS_LBINFO=141bbd5ed78; ORA_BIPS_NQID=v1b3rdkjen27ul12vejs054ra8v5mgmsk25pu7a; JSESSIONID=5xfZSdkK6N0HN49H2hKFT9SmQ0ngVp6vLJvw81QfTychYjpwpmFZ!-2050741508; ADMINCONSOLESESSION=Q2TLSdkdv1GT79BFgzPyw5njCtPDXmmyWbhb523hgV5DRqtv7csD!1865413782
]", which is more than the configured time (StuckThreadMaxTime) of "600" seconds. Stack trace:
java.net.SocketInputStream.socketRead0(Native Method)
java.net.SocketInputStream.read(SocketInputStream.java:129)
com.siebel.analytics.web.sawconnect.SAWConnection$NotifyInputStream.read(SAWConnection.java:165)
java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
java.io.BufferedInputStream.read(BufferedInputStream.java:237)
com.siebel.analytics.web.sawconnect.sawprotocol.SAWProtocol.readInt(SAWProtocol.java:188)
com.siebel.analytics.web.sawconnect.sawprotocol.SAWProtocolInputStreamImpl.readChunkHeader(SAWProtocolInputStreamImpl.java:282)
com.siebel.analytics.web.sawconnect.sawprotocol.SAWProtocolInputStreamImpl.startReadingNewMessage(SAWProtocolInputStreamImpl.java:49)
com.siebel.analytics.web.sawconnect.SAWServletHttpBinding.forwardResponse(SAWServletHttpBinding.java:201)
com.siebel.analytics.web.SAWBridge.processRequest(SAWBridge.java:189)
com.siebel.analytics.web.SAWBridge.doGet(SAWBridge.java:224)
javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
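To see how far past the limit each stuck thread runs, the BEA-000337 message above can be parsed for the busy time and the configured StuckThreadMaxTime. This is a minimal Python sketch based only on the wording of the log entry quoted in this post; the helper name is hypothetical, and the regular expressions are an assumption about that message format rather than a documented WebLogic log grammar.

```python
import re

# Shortened copy of the BEA-000337 message from the log above.
SAMPLE = (
    "<BEA-000337> <[STUCK] ExecuteThread: '2' for queue: "
    "'weblogic.kernel.Default (self-tuning)' has been busy for \"709\" seconds "
    "working on the request \"...\", which is more than the configured time "
    "(StuckThreadMaxTime) of \"600\" seconds."
)

STUCK_RE = re.compile(
    r"\[STUCK\] ExecuteThread: '(?P<thread>\d+)'.*?"
    r'busy for "(?P<busy>\d+)" seconds',
    re.DOTALL,  # the real message spans several lines
)
MAX_RE = re.compile(r'\(StuckThreadMaxTime\) of "(?P<limit>\d+)" seconds')


def parse_stuck_thread(message):
    """Extract thread id, busy time, and limit from one message (hypothetical helper)."""
    stuck = STUCK_RE.search(message)
    limit = MAX_RE.search(message)
    if stuck is None or limit is None:
        return None
    busy = int(stuck.group("busy"))
    max_time = int(limit.group("limit"))
    return {
        "thread": int(stuck.group("thread")),
        "busy_s": busy,
        "limit_s": max_time,
        "over_by_s": busy - max_time,
    }


if __name__ == "__main__":
    print(parse_stuck_thread(SAMPLE))
```

In the entry above the thread had been busy for 709 seconds against a 600-second limit, and the stack trace shows it waiting on a socket read while handling the /analytics/saw.dll?Sessions request.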
Thanks,
BK.
I had a similar issue on Windows 7 (64-bit).
It turned out that the Node Manager Windows service was not present, so I had to create it by executing installNodeMgrSvc.cmd, located in the WLS server directory ('C:\Oracle\Middleware\wlserver_10.3\server\bin' in my case), and then go to the BI WLS console and 'Resume' the bi_server1 managed server. You should see progress in the installer once the managed server bi_server1 is in the 'Running' state.
Hope this helps. -
Managed Server state "STARTING" after shut it down.
Hello,
I have a really odd situation here. I have a Managed Server in the "STARTING" state, but it is actually shut down. I have tried "Force Shutdown Now", but all I get is this:
Status of Last Action = TASK COMPLETED
State = STARTING
I cannot start the Managed Server because WebLogic says it is in an incompatible state. I have tried restarting the Admin Server and deleting the /cache and /tmp folders, but that did not solve it. Any help, please? Where does WebLogic keep track of the status of its Managed Servers?
Thank you,
Oscar
Hello Fabian,
I am using Solaris (Unix). The managed server is not running; at least there is no WebLogic process running on that machine (ps -ef | grep weblogic). Tomorrow I will try searching for the name of the managed server, as you suggested.
It seems that the Admin Server stores the "STARTING" status for that managed server somewhere. Tomorrow I will also try starting the managed server from the command line and then shutting it down from the Admin Server (GUI). Let's see whether the managed server notifies the Admin Server that it was shut down correctly.
Thanks for your reply,
Oscar -
Assignment Server Component (AsgnSrvr) goes to unavailable state
Our AsgnSrvr component goes into an unavailable state frequently, about 2-3 times a day. In the component log we can see "SBL-SVR-09016: Failed to get task instance: task number 35651599. The task may have exited or does not exist". Our enterprise is on 8.0.0.13.
As per Doc ID 1531224.1, maxMTServer:MaxTask should be 1:5, but ours is 1:20. Is this configuration fine for 8.0.0.13? If not, how can we verify that this configuration is unsuitable for our enterprise, and what data should I collect to pin down the issue?
We are getting the errors below in almost all of the AsgnSrvr component logs:
GenericLog GenericError 1 0000000253561cf4:0 2014-04-22 19:13:38 Unable to find table CX_DATATMPLVIEW in the Siebel Repository
GenericLog GenericError 1 0000000453561cf4:0 2014-04-22 20:03:16 (scfdata.cpp (9620) err=1319736 sys=0) SBL-SVR-09016: Failed to get task instance: task number 35651599. The task may have exited or does not exist.
Does this indicate a problem, and how can we relate it to the issue of the component going unavailable?
It is happening on all the servers of the enterprise.
Thanks,
Abhishek
Hello Abhishek,
It might be timing out.
Please verify the parameter 'MaxSkillsAge'.
Please see the following knowledge article for further details:
Assignment Manager timing out and terminating intermittently with various errors (Doc ID 1085215.1)
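To check whether the logged errors line up with the times the component went unavailable, the log lines can be parsed and filtered by time. The following is a minimal Python sketch, not a Siebel tool: the line layout is assumed from the two sample lines quoted above, and the function names and the 10-minute window are hypothetical choices for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed layout, inferred only from the sample lines in this thread:
# <event> <subevent> <level> <id> <YYYY-MM-DD HH:MM:SS> <message>
LINE_RE = re.compile(
    r"^(?P<event>\S+)\s+(?P<subevent>\S+)\s+(?P<level>\d+)\s+(?P<id>\S+)\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(?P<msg>.*)$"
)
# Error codes such as SBL-SVR-09016.
CODE_RE = re.compile(r"SBL-[A-Z]{3}-\d{5}")


def parse_line(line):
    """Return a dict with time, code and message, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    code = CODE_RE.search(match.group("msg"))
    return {
        "time": datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S"),
        "code": code.group(0) if code else None,
        "message": match.group("msg"),
    }


def errors_near(lines, outage_time, window_minutes=10):
    """Return parsed entries logged within window_minutes of outage_time."""
    window = timedelta(minutes=window_minutes)
    entries = (parse_line(line) for line in lines)
    return [e for e in entries if e and abs(e["time"] - outage_time) <= window]
```

Calling errors_near() with the lines of each AsgnSrvr log and the time the component was seen as unavailable shows which errors, if any, were logged around that moment on each server.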
Best Regards,
Chetan -
Manage Server Health Goes Blank in Admin Console
Hello Team,
I am continuously facing an issue with an OSB managed server. It is running fine, but its health is blank in the Admin Console.
Because of this, it is not responding to any user requests.
Kindly guide me on how to troubleshoot this issue.
Hi,
This is due to a problem with the JMX communication between the Admin Server and your OSB managed servers. Did you observe any messages related to it?
If it is SSL communication, make sure that the certificates are valid.
As a general practice, make sure you have a listen address configured for your servers. If it is left blank (the default), the server listens on all available host addresses, and communication can get messy in these situations.
If you are still facing the problem, I would suggest you enable the debug flags below and raise a support case.
1. Enable the three debug flags below from the console (for the Admin Server as well as the Managed Server):
weblogic->management->JMXCore
weblogic->management->JMXDomain
weblogic->management->JMXRuntime
2. Restart all the servers
If the issue occurs, provide the debug logs along with thread dumps (take 5 thread dumps at 5-second intervals) for the Managed Server (the one in the unknown state) and for the Admin Server.
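Taking the five dumps by hand is easy to mistime, so the collection can be scripted. This is a minimal Python sketch under stated assumptions: it assumes a HotSpot JDK with `jstack` on the PATH (on JRockit the equivalent would be `jrcmd <pid> print_threads`), and the function names and output file names are hypothetical.

```python
import subprocess
import time
from pathlib import Path


def jstack_dump(pid):
    """Return one thread dump as text. Assumes `jstack` from a HotSpot JDK is on PATH."""
    result = subprocess.run(
        ["jstack", str(pid)], capture_output=True, text=True, check=True
    )
    return result.stdout


def collect_thread_dumps(pid, out_dir, count=5, interval=5.0,
                         dump=jstack_dump, sleep=time.sleep):
    """Write `count` thread dumps for `pid`, `interval` seconds apart.

    `dump` and `sleep` are injectable so a different dump command can be
    plugged in without changing the loop.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(1, count + 1):
        path = out / f"threaddump_{pid}_{i}.txt"
        path.write_text(dump(pid))
        paths.append(path)
        if i < count:  # no need to wait after the last dump
            sleep(interval)
    return paths
```

Run it once for the Managed Server process and once for the Admin Server process, for example collect_thread_dumps(12345, "dumps/managed"), where 12345 stands in for the actual process ID.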
Regards
Rosario -
One Management Server is Grey and it's not our RMS server.
Greetings -
One of our SCOM 2007 servers is grey, and in the event log I see a bunch of 4508 errors. This is our SCOM 2007 environment and it rarely gets touched, as there is only Exchange on it, and until they move to Exchange 2013 they won't be moving to SCOM 2012.
So the RMS server is working fine, but the other server, which we set up as our Web Console, Reporting, and Collecting server, is not. It has been grey for a month or so, and we have tried all the normal "Grey Agent" ways to get it talking again.
We cleared the agent cache and such.
What could this be, and where could we look to see what is wrong? We are still getting alerts, which is the main function of this SCOM environment, and I don't think anyone uses reporting or the web console, but I am sure it impacts something we use as well.
Hi,
As the problematic management server is in a grey state, I would suggest you run the Show Gray Agent Connectivity Data task.
The Show Gray Agent Connectivity Data task will help you identify why an agent is gray.
Please also turn off the firewall on both the RMS and the grey management server, and flush the health service state and caches again.
Regards,
Yan Li -
Health Service Heartbeat Failure Alert Generated when one Management Server is Down
Hi,
I have two management servers, each managing about 100 servers. When one management server goes down unexpectedly, I receive 100 Health Service Heartbeat Failure alerts, one for each server.
My question: why, when a management server goes down, does it raise a Health Service Heartbeat Failure for every managed agent?
Is there a way to change this?
A SCOM 2012 agent fails over automatically when its primary management server is down. You can check the failover management servers with the following PowerShell script:
# Verify failover for agents reporting to MS1
$Agents = Get-SCOMAgent | where {$_.PrimaryManagementServerName -eq 'MS1.DOMAIN.COM'}
$Agents | sort | foreach {
    Write-Host "";
    # Print the agent name and its primary management server
    "Agent :: " + $_.Name;
    "--Primary MS :: " + ($_.GetPrimaryManagementServer()).ComputerName;
    # List every failover management server assigned to this agent
    $failoverServers = $_.getFailoverManagementServers();
    foreach ($managementServer in $failoverServers) {
        "--Failover MS :: " + ($managementServer.ComputerName);
    }
}
Write-Host "";
http://www.systemcentercentral.com/how-does-the-failover-process-work-in-opsmgr-2012-scom-sysctr/ -
Managed Server Startup - Part Duex
Hello:
Is it possible to auto-start managed servers on WLS 10.3.5 under Solaris 10?
I read the thread "managed server startup" and did not want to hijack the thread. In my case, I have an init script that starts up Node Manager and one that starts up the Admin Server.
So, once those two processes are up, is there a way for the defined managed servers to auto-start without any manual intervention on a user's part? Yes, I do have CrashRecoveryEnabled=true but as I understand it, that is for when a managed server goes down. In my case, the Solaris server is booting up.
Thank you,
Perry
Unless you have 50 managed servers, I can't understand why you would want that. Just start them all at the same time from the Admin Console, under the 'Control' tab of the Servers page. What if you have a server or two that you don't want running for some reason? This would power them up regardless. I would think you would want to control that from the Node Manager lifecycle. There may be a way to do that through scripting, but it's probably not a best practice. We'll see if someone else chimes in on the subject. I am curious as well...
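For reference, a scripted approach could look roughly like the Python sketch below, which an init script might call after the Admin Server start script. It is only an illustration under assumptions: the domain path, server names, and function names are hypothetical, it relies on the stock startManagedWebLogic.sh script in the domain's bin directory, and an open TCP port only shows that the Admin Server is accepting connections, not that it has reached the RUNNING state.

```python
import socket
import subprocess
import time


def wait_for_port(host, port, timeout=300.0, poll=5.0):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=poll):
                return True
        except OSError:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            time.sleep(min(poll, remaining))


def start_command(domain_home, server_name, admin_url):
    """Build the command line: startManagedWebLogic.sh <server> <admin URL>."""
    return [f"{domain_home}/bin/startManagedWebLogic.sh", server_name, admin_url]


def start_managed_servers(domain_home, servers, admin_host, admin_port):
    """Start each listed managed server once the Admin Server port is reachable."""
    if not wait_for_port(admin_host, admin_port):
        raise RuntimeError("Admin Server did not come up in time")
    admin_url = f"http://{admin_host}:{admin_port}"
    return [subprocess.Popen(start_command(domain_home, name, admin_url))
            for name in servers]
```

Listing the servers explicitly, rather than starting everything defined in the domain, keeps the concern raised above in check: a server that should stay down is simply left out of the list.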
-
Start managed server in a cluster fail
Hi,
I'm new to Oracle WebLogic, and I'm testing WebLogic 10.3.
I set up multiple IP addresses on my laptop as follows:
10.0.0.183 devpc
10.0.2.183 node1
10.0.3.183 node2
And I configured a domain with Configuration Wizard as follows:
AdminServer
Listen Address:10.0.0.183
Listen Port:7001
ManagedServer_1
Machine:(none)
Cluster:Cluster_1
Listen Address:10.0.2.183
Listen Port:7001
ManagedServer_2
Machine:(none)
Cluster:Cluster_1
Listen Address:10.0.3.183
Listen Port:7001
When I started ManagedServer_1 with the command "startManagedWebLogic.cmd ManagedServer_1 http://10.0.0.183:7001",
the managed server reached the RUNNING state successfully, and after a few seconds I got these errors:
<Notice> <WebLogicServer> <devpc> <ManagedServer_1> <Main Thread> <<WLS Kernel>> <> <> <1239098617688> <BEA-000365> <Server state changed to RUNNING>
<Notice> <WebLogicServer> <devpc> <ManagedServer_1> <Main Thread> <<WLS Kernel>> <> <> <1239098617688> <BEA-000360> <Server started in RUNNING mode>
<Info> <J2EE> <devpc> <ManagedServer_1> <[STANDBY] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1239098617703> <BEA-160151> <Registered library Extension-Name: bea_wls_async_response (JAR).>
<Error> <Cluster> <devpc> <ManagedServer_1> <[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1239098635062> <BEA-000170> <Server ManagedServer_1 did not receive the multicast packets that were sent by itself>
<Critical> <Health> <devpc> <ManagedServer_1> <[STANDBY] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1239098635062> <BEA-310006> <Critical Subsystem Cluster has failed. Setting server state to FAILED. Reason: Unable to receive self generated multicast messages>
<Critical> <WebLogicServer> <devpc> <ManagedServer_1> <[STANDBY] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1239098635078> <BEA-000385> <Server health failed. Reason: health of critical service 'Cluster' failed>
<Notice> <WebLogicServer> <devpc> <ManagedServer_1> <[STANDBY] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1239098635078> <BEA-000365> <Server state changed to FAILED>
<Error> <> <devpc> <ManagedServer_1> <[STANDBY] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1239098635234> <BEA-000000> <
===== FULL THREAD DUMP ===============
Tue Apr 07 18:03:55 2009
BEA JRockit(R) R27.6.0-50_o-100423-1.6.0_05-20080626-2105-windows-ia32
"Main Thread" id=1 idx=0x4 tid=2208 prio=5 alive, in native, waiting
-- Waiting for notification on: weblogic/t3/srvr/T3Srvr@0x09E9E9E8[fat lock]
at jrockit/vm/Threads.waitForNotifySignal(JLjava/lang/Object;)Z(Native Method)
at java/lang/Object.wait(J)V(Native Method)
at java/lang/Object.wait(Object.java:485)
at weblogic/t3/srvr/T3Srvr.waitForDeath(T3Srvr.java:811)
^-- Lock released while waiting: weblogic/t3/srvr/T3Srvr@0x09E9E9E8[fat lock]
at weblogic/t3/srvr/T3Srvr.run(T3Srvr.java:459)
at weblogic/Server.main(Server.java:67)
at jrockit/vm/RNI.c2java(IIIII)V(Native Method)
-- end of trace
.
.
.
How to fix it?
Thanks in advance.
Hi,
I changed the settings as follows:
AdminServer
Listen Address:devpc
Listen Port:7001
ManagedServer_1
Machine:(none)
Cluster:Cluster_1
Listen Address:node1
Listen Port:7001
ManagedServer_2
Machine:(none)
Cluster:Cluster_1
Listen Address:node2
Listen Port:7001
Cluster_1
Cluster Address:node1:7001,node2:7001
The exceptions still remain; any suggestions would be appreciated.
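BEA-000170 and BEA-310006 above both say the server could not receive its own multicast packets, so one thing that can be checked independently of WebLogic is whether a process using a node's address can hear its own multicast traffic. The Python sketch below is a rough diagnostic only, not a WebLogic utility: the function names are hypothetical, and the multicast group and port must be replaced with the values actually configured for Cluster_1.

```python
import ipaddress
import socket
import time


def is_multicast(address):
    """True if the address is in the IPv4/IPv6 multicast range."""
    return ipaddress.ip_address(address).is_multicast


def multicast_self_test(group, port, iface_ip, timeout=3.0):
    """Send one packet to group:port via iface_ip and report whether it comes back."""
    if not is_multicast(group):
        raise ValueError(f"{group} is not a multicast address")
    payload = b"multicast-self-test"
    recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        # Receiver: join the group on the chosen interface.
        recv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        recv.bind(("", port))
        membership = socket.inet_aton(group) + socket.inet_aton(iface_ip)
        recv.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)
        # Sender: use the same interface, keep loopback on, stay on the local segment.
        send.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                        socket.inet_aton(iface_ip))
        send.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
        send.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        send.sendto(payload, (group, port))
        deadline = time.monotonic() + timeout
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            recv.settimeout(remaining)
            try:
                data, _ = recv.recvfrom(1024)
            except socket.timeout:
                return False
            if data == payload:
                return True
    finally:
        recv.close()
        send.close()
```

For example, multicast_self_test("239.192.0.0", 7001, "10.0.2.183") would test node1's address, where 239.192.0.0 and 7001 are assumed values for illustration. If it returns False with the managed server stopped, the problem probably lies in how the laptop's extra IP addresses or its firewall handle multicast rather than in the listen address settings.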