RTMT ALERT ERROR CiscoDRFFailure
For the past few days I have been getting this RTMT alert constantly:
Reason : DRF was unable to backup component PHX_CONFIG. Error : Unknown Database Error AppID : Cisco DRF Master ClusterID :
Any ideas what this error might be?
Hi Kamalakar,
This looks like the following bug:
https://tools.cisco.com/bugsearch/bug/CSCur24834/?reffering_site=dumpcr
Symptom:
UCCX backup fails on CUIC component for second node (phx_config)
Conditions:
UCCX 10.5 SU1 HA
Workaround:
1. Enable Root access on both first node and the second node
2. Copy the /opt/cisco/desktop/openfire/passphrase from Primary Node to the same location on Secondary Node.
3. Restart the Unified CCX Notification Service and the Cisco Finesse Tomcat service.
4. Run the backup again. It should work this time.
HTH
Manish
Similar Messages
-
[RTMT-ALERT-StandAloneCluster] CiscoDRFFailure
Hi All,
I am getting alerts from my Unity Connection servers for the DRF component. Yesterday I got the alert for the pub node, and today I got it for the sub node at around the same time. I have verified the backup status and everything seems to be fine. I am attaching the traces here. Could you tell me what is going on in my network?
Subject: [RTMT-ALERT-StandAloneCluster] CiscoDRFFailure
Reason : Unable to access SFTP server or SFTP server too slow to respond.
AppID : Cisco DRF Master
ClusterID :
NodeID : eccun005-sub
TimeStamp : Tue Apr 08 21:01:00 GMT+00:00 2014.
The alarm is generated on Tue Apr 08 21:01:00 GMT+00:00 2014
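For anyone who wants to triage these alert emails automatically, here is a minimal Python sketch that pulls the "Key : Value" fields out of an alert body like the one above. The function name `parse_alert` is hypothetical, and the sketch assumes every field sits on its own line with a single-word key, as in this alert. It is not part of RTMT or DRF.

```python
import re

# Matches lines such as "NodeID : eccun005-sub" or "ClusterID :".
# Assumption: keys are single words; free-text lines are ignored.
FIELD_RE = re.compile(r"^\s*(\w+)\s*:\s*(.*)$")

def parse_alert(text):
    """Return a dict of the 'Key : Value' fields found in an alert body."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields

sample = """Reason : Unable to access SFTP server or SFTP server too slow to respond.
AppID : Cisco DRF Master
ClusterID :
NodeID : eccun005-sub
TimeStamp : Tue Apr 08 21:01:00 GMT+00:00 2014.
The alarm is generated on Tue Apr 08 21:01:00 GMT+00:00 2014"""

alert = parse_alert(sample)
print(alert["NodeID"], "-", alert["Reason"])
```

Splitting only at the first colon keeps the timestamp intact, and the last line of the sample is skipped because it does not start with a single-word key.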
Thanks,
Lajith P
I got the alert again today, and I could see this line in the traces:
2014-04-08 21:01:00,257 DEBUG [NetMessageDispatch] - drfAlarm:sendAlarm: Sending Alarm: DRFSftpFailure
-
[RTMT-ALERT] CallProcessingNodeCpuPegging
We are receiving an RTMT alert for one of the nodes in the cluster.
The error is [RTMT-ALERT] CallProcessingNodeCpuPegging.
Processor load over 90 Percent.
Aupair (98 percent) uses most of the CPU.
Please suggest how to resolve this. I would also like to know what AUPAIR is and why it is running on the CCM server.
It's urgent.
Hi Sam,
If the CPU pegging alert is reported at a specific time of day, it is most likely due to a scheduled activity such as a DRS backup or a CDR load. Please check the following link:
http://www.cisco.com/en/US/products/sw/voicesw/ps556/products_tech_note09186a00808ef0f4.shtml
CPU Pegging Alerts
CPUPegging/CallProcessNodeCPUPegging alerts monitor CPU usage based on configured thresholds:
Note: %CPU is calculated as %system + %user + %nice + %iowait + %softirq + %irq
Alert messages include these:
%system, %user, %nice, %iowait, %softirq, and %irq
The process that uses the most CPU
The processes that wait on Uninterruptible disk sleep
CPU Pegging alerts can come up in RTMT when CPU usage is higher than the configured watermark level. Because CDR is a CPU-intensive application when it loads, check whether you receive the alerts in the same period in which CDR is configured to run reports. In that case, you may need to increase the threshold values in RTMT. Refer to Alerts for more information about RTMT alerts.
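To make the %CPU note concrete, here is a small Python sketch of the sum described above, compared against a threshold. The sample values and the function names are made up for illustration. The 90 percent default mirrors the "Processor load over 90 Percent" text in the alert, not a value read from RTMT.

```python
# Components that the note above says are added together to get %CPU.
COMPONENTS = ("system", "user", "nice", "iowait", "softirq", "irq")

def total_cpu_percent(sample):
    """Sum the six components into a single %CPU figure."""
    return sum(sample[name] for name in COMPONENTS)

def is_pegging(sample, threshold=90.0):
    """True when the summed %CPU is above the (assumed) alert threshold."""
    return total_cpu_percent(sample) > threshold

# Hypothetical reading taken during a scheduled backup.
sample = {"system": 12.0, "user": 70.0, "nice": 0.5,
          "iowait": 8.0, "softirq": 1.0, "irq": 0.5}

print(total_cpu_percent(sample), is_pegging(sample))
```

With these numbers, iowait alone is enough to push the total over 90 even though %user is only 70, which is one reason disk-heavy jobs such as backups can trigger the alert.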
HTH
Manish
-
Hi,
Is there any way to get more detail from the alerts generated by RTMT? I would like to know which gateway went down, but we cannot tell from the messages that the system sends. Here is an example of what is received when we lose a PRI gateway:
Thanks!
Chris
Hi,
From the problem description, I understand that you are trying to figure out which device is registering and unregistering. Assuming that you are looking at these messages in Alert Central, could you also review the Application logs under System > Tools > System Viewer for related details?
The RTMT alert severity is probably set to Error. Can you change it to "Informational" in UCM? You should then be able to find more details the next time you get a device unregistration message.
Remember to check all servers: the registration and unregistration events are always displayed under the UCM server to which the gateway is registered.
-
We are getting this alert on a fair few of our VMs with VHDXs and dynamic VHDs. Everything seems OK, but I am not sure what this actually means or what I need to do to resolve the issue. How do I reset the error count, if that is what is required? Thanks in advance.
Alert: Error Count Monitor Resolution state: New
Sat 09/02
To:Administrator
09 February 2013 14:09
Alert: Error Count Monitor
Source: MyVm01
Path: MyHost.MyDomain.local;MyHost.MyDomain.local;FE71577B-A2E2-45C0-B757-2FBCEC9311DE
Last modified by: System
Last modified time: 2/9/2013 2:08:48 PM
Alert description: Instance c:-clusterstorage-volume1-MyVm01-virtual hard disks-MyVm01-DATA02.vhdx
Object Hyper-V Virtual Storage Device
Counter Error Count
Has a value 9
At time 2013-02-09T14:08:48.0000000+00:00
Darren
But I am getting this alert from SCOM, and SCOM has no information about the alert to help me find out what to do. I thought the point of SCOM was to let you know about problems and how to resolve them. :)
The alert is coming from the Error Count Monitor that is part of the Hyper-V Management Pack Extensions (v 4.0.0.0)
I have tried looking in the Event Logs on the Host and there doesn't seem to be any storage related errors there. I am trying to establish if this is a false positive, why it is happening and if it is safe to override and ignore.
There is nothing on the Product Knowledge tab and nothing on the Alert Context other than what I have already mentioned (see below).
Thanks for responding.
Time Sampled:
09/02/2013 14:08:48
Object Name:
Hyper-V Virtual Storage Device
Counter Name:
Error Count
Instance Name:
c:-clusterstorage-volume1-myvm-virtual hard disks-MyVM-DATA02.vhdx
Value:
9
Darren
-
How to monitor the alerts (error handling) in the Seeburger adapter?
Hi,
Is it like normal monitoring for any scenario? I have to do the monitoring in the Seeburger adapter.
Is there anything else specific I have to take care of when monitoring the alerts (error handling) in the Seeburger adapter for the AS2 and FTP adapters?
Please suggest other ways to monitor interfaces for the Seeburger adapter.
Regards
Raja
Hi Raja,
All Seeburger adapters have an option for some kind of acknowledgement to be sent to the sender or any partner. You need to check the respective adapter documentation.
For example, the FTP adapter can be configured with the message protocol FTP-REPORTS
for sending various types of reports to any business partner, such as dispatch reports, transmission reports, etc.
For monitoring errors in XI, you can configure alerts as you normally do.
*Reward points if useful*
Regards,
Sushil.
-
My application may become corrupted, in which case operations fail and post the alert "error: PANIC: fatal region error detected; run recovery ...". This alert is printed directly in the main process's window.
My question is: how can I confine this alert to my thread, which deals only with the Berkeley DB database?
Thanks
Hi,
You can configure an error callback function. See the run-time error configuration section of the Reference Guide here:
http://www.oracle.com/technology/documentation/berkeley-db/db/ref/debug/runtime.html
It sounds like DB_ENV->set_errcall is what you want:
http://www.oracle.com/technology/documentation/berkeley-db/db/api_c/env_set_errcall.html
Regards,
Alex Gorrod, Oracle
-
I am getting the following RTMT alert in CallManager for the low water mark being exceeded. In my experience, setting the log alert value to a lower level helps with this issue, but I am not sure. Can somebody advise?
===============================================================================================
12/21/2011 7:30 PM : UC_RTMT-2-RTMT_ALERT 2138: Dec 22 2011 00:30:17.348 UTC : %UC_RTMT-2-RTMT_ALERT: %[Name=LogPartitionLowWaterMarkExceeded][Detail=
UsedDiskSpace : 62
MessageString : Common Disk utilization hits LWM!
AppID : Cisco Log Partition Monitoring Tool ClusterID :
NodeID : NY1PUB01
TimeStamp : Wed Dec 21 19:29:52 EST 2011.
The alarm is generated on Wed Dec 21 19:29:52 EST 2011.][App ID=Cisco AMC Service][Cluster ID=][Node ID=NY1PUB01]: RTMT Alert
=======================================================================================================
ccm 6.1.3
Abhishek,
It looks like the lower threshold for UsedDiskSpace is set at around 60%, and that is why the alert was raised. Keeping it at 75-80% is also fine; that just means an alert is raised whenever 75-80% of the disk space is full. I feel an alert at 60% is too low and can be ignored.
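As a rough illustration of what the threshold change does, here is a Python sketch that compares a used-space percentage against a low water mark. The function names are hypothetical and this is only the arithmetic behind the alert, not how the Log Partition Monitoring Tool is implemented.

```python
import shutil

def used_percent(path="."):
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

def lwm_exceeded(used_pct, low_water_mark):
    """True when usage is above the configured low water mark."""
    return used_pct > low_water_mark

# UsedDiskSpace from the alert above was 62.
print(lwm_exceeded(62, 60))  # threshold around 60%, as in the alert
print(lwm_exceeded(62, 80))  # threshold raised to 80%
print(round(used_percent("."), 1))  # local disk, for comparison only
```

At 62% used, the check trips with a 60% mark and stays quiet with an 80% mark, which matches the suggestion to raise the threshold.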
Please remember to rate helpful posts.
GP.
-
Hi all,
I need some help understanding the following RTMT alert that came in last night:
At Sat Jan 24 01:29:52 PST 2015 on node , the following CoreDumpFileFound events generated:
TotalCoresFound : 1
CoreDetails : The following lists up to 6 cores dumped by corresponding applications.
Core1 : Unknown (core.17118.11.cimlistener.1422095365)
AppID : Cisco Log Partition Monitoring Tool
ClusterID :
NodeID : IPTSUB01
TimeStamp : Sat Jan 24 01:29:28 PST 2015
Is this a self-rectifying event?
Would a therapeutic reboot be warranted?
Thank you very much for your help.
Hi Carlo,
Thank you very much for your feedback. I followed your suggested approach, and it looks like the scenario matches Matthew's "Example 3: Core Stack Corruption" (please see the backtrace below).
====================================
backtrace
===================================
#0 0x004dabae in ?? ()
#1 0xb542ee80 in ?? ()
#2 0x00000001 in ?? ()
#3 0x00000002 in ?? ()
#4 0x00000000 in ?? ()
Per Matthew's article, the next steps are: "It is recommended that the corresponding service log (e.g. ccm traces, tomcat logs) and the complete core file be retrieved from the affected system for TAC review." Could you suggest which ccm traces (and for what period of time) will be needed? I also need a little clarification on how to pull the tomcat logs. Lastly, could you suggest the best method to copy the core dump file for TAC, or will the output of utils core analyze <coredump filename> be sufficient?
Thank you!
-
Hi All,
I have a query related to RTMT alerts. Is it possible to edit an alert or add more information to it?
For example, I recently got the alert below, which lacks necessary information such as the router/gateway, the port number, and cluster details. It is just a blind message.
MGCP DChannel is out-of-service. Current total of 1 MGCP gateway device(s) with D-Channel-Out-Of-Service status. The alert is generated on Thu May 19 11:10:29 EDT 2011 on cluster StandAloneCluster.
Is there any option to enable or edit additional information for RTMT alerts in CUCM?
Thanks in advance!
Hi,
I think you have to go into RTMT and click on the alert details to view the actual name of the device.
The only other quick way to handle this is to send SNMP traps from the router itself and turn them into email.
Cheers,
Tim
-
Is there any way to generate an RTMT alert as soon as a PRI goes down on a Cisco voice gateway (MGCP)? I have almost 20 gateways, all with multiple T1 circuits, and need to set this up on all of them.
Thanks Yosh. I have alerts set up for "Number of Registered Gateways Decreased", but my management wants an alert specific to each PRI.
I wonder, if I set up an alert for the D-Channel "DatalinkService", will it send an alert for the status ("Down/Up") of an individual PRI?
-
[RTMT-ALERT] DirectoryConnectionFailed - CM 4.1(3)SR8
Hi,
I am continually getting an alert relating to the directory connection after configuring RTMT, even though everything appears OK.
[RTMT-ALERT] DirectoryConnectionFailed
Directory connection failed .
Monitored precanned object has value of 0.
My CallManager cluster is integrated into our Active Directory, so we are not using DC Directory, and I am wondering whether this alert relates only to DC Directory.
If it does, I will disable it. However, if it also monitors the connection to Active Directory, then I want to leave it enabled.
Everything looks OK on the CallManagers.
Cheers,
G
It should be fixed in your version, but since you mention it's all good:
K07469952
In the Cisco CallManager 4.1(3), the RTMT generates directory replication and connection alarms even when the directory replication appears to function normally
http://www.ciscotaccc.com/kaidara-advisor/voice/showcase?case=K07469952
HTH
java
If this helps, please rate.
-
RTMT alerts for UCM and UCCX:
We have had issues with UCCX agents being unable to log in because the CTI service or the Tomcat service is hung. We usually have to restart either CTI or Tomcat on the publisher to correct this. Does anyone know of any RTMT alerts that can be set up to notify us that one of those services is hung or not responding?
Hi Hariharan,
In addition to Atul's link, please refer to the link below on CPU and memory usage:
http://www.cisco.com/en/US/docs/voice_ip_comm/cucm/managed_services/cucm_health.html#wp1101115
Tx,
Hope this helps
Shalu
-
I can't connect to the iTunes Store. The alert error says "iTunes could not connect to the iTunes Store. An unknown error occurred (0x80092013)"
Hi all,
I have the same problem with my Windows 8.1 laptop. My guess is that the problem started when I changed my Internet provider, which now delivers IPv6 addresses. A friend who has a different provider does not have the problem at his home, but when he tries using iTunes at my home on my network, he gets the same problem.
I have already solved a previous problem where Outlook 2003 was unable to receive emails from my wife's Gmail account although everything else worked fine. The solution there was to enter the IPv4 address of Google's POP server in the hosts file, overriding the IPv6 address.
Maybe something similar helps with this iTunes error.
If someone knows the hostname of the iTunes Store server, it could be pinged and the IPv4 address could be determined and then entered in the HOSTS file.
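Once the hostname is known, the lookup described above could be sketched in Python like this. It asks the resolver for IPv4 addresses only and formats a line for the hosts file. The hostname `store.example.com` is a placeholder, since the real iTunes Store server name is not known here, and the function names are made up for this sketch.

```python
import socket

def resolve_ipv4(hostname):
    """Return the sorted, unique IPv4 addresses the resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None,
                               family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})

def hosts_entry(ip, hostname):
    """Format one hosts-file line: address, a tab, then the hostname."""
    return f"{ip}\t{hostname}"

if __name__ == "__main__":
    name = "store.example.com"  # placeholder, not the real server name
    try:
        for ip in resolve_ipv4(name):
            print(hosts_entry(ip, name))
    except socket.gaierror as err:
        print(f"Could not resolve {name}: {err}")
```

Note that pinning an address in the hosts file stops working if the server's IP changes later, so this is a diagnostic step rather than a permanent fix.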
I'll continue trying in this direction and will post a solution if I find one.
If someone has the iTunes server name or IPv4 address, please post it.
-
OSB SB Console - Operations Dashboard /Monitoring/Dashboard/Server Health
has lots of errors/alerts/warnings that have not been purged. I don't see any provision to purge these alerts/messages.
1> How do I purge these alerts/errors/messages?
2> How do I configure these?
Thank you!
Hi,
Go to Operations, then Pipeline Alerts (or SLA Alerts, depending on the type). On the right side, click Extended Alert History.
On the page that follows, you can click the "Purge Alert History" link to purge all messages.
For auto purging you can follow the steps below:
Go in the Administration Console to Diagnostics -> Archives
Then, for each OSB server add a Data Retirement Policy
Choose a Custom Archive, and its name is the well known “CUSTOM/com.bea.wli.monitoring.pipeline.alert”.
You can set the retirement age in hours, for example 336 (14 days), and the purge interval, for example every 24 hours (retirement period) at 00:00 (retirement time = 0).
Cheers,
Robert van Mölken
Oracle Integration Specialist