How to Monitor RAC "Health" and how to Administer

We are now using the RAC environments we set up. I currently manage an 11.2.0.3 RAC on Windows and a 10.2.0.5 RAC on OEL.
What needs to be checked in these setups? For example, which logs do I need to check, and how do I monitor the clusters well?
In other words, what do I have to monitor constantly in RAC, a sort of RAC "health" monitoring?
Thanks!

Hi
There are lots of options...
Enterprise Manager Database Control / Enterprise Manager Grid Control
CRSCTL Utility Reference:
http://docs.oracle.com/cd/E11882_01/rac.112/e16794/crsref.htm
Database health reports: AWR, ADDM, ASH
Logs: alert logs and cluster logs
Cheers
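
A rough sketch of how the options above could be combined into a routine check on the 11.2 cluster. `crsctl check crs`, `crsctl stat res -t` and `srvctl status database -d` are standard 11.2 commands (on 10.2, `crs_stat -t` takes the place of `crsctl stat res -t`). The database name, the alert log path and the function names are assumptions for illustration, not something from the thread.

```shell
#!/bin/sh
# Minimal RAC health-check sketch.
# Assumptions: DB_NAME and ALERT_LOG are placeholders to adapt to your site.
DB_NAME="${DB_NAME:-ORCL}"                          # hypothetical database name
ALERT_LOG="${ALERT_LOG:-/path/to/alert_ORCL1.log}"  # hypothetical alert log path

# Count the lines on stdin that contain an ORA- error code.
count_ora_errors() {
    grep -c 'ORA-[0-9]' || true   # grep exits 1 on zero matches; keep going
}

run_checks() {
    crsctl check crs                      # state of the clusterware daemons
    crsctl stat res -t                    # resource states (11.2); 10.2: crs_stat -t
    srvctl status database -d "$DB_NAME"  # which instances are running
    if [ -r "$ALERT_LOG" ]; then
        n=$(tail -n 1000 "$ALERT_LOG" | count_ora_errors)
        echo "ORA- lines in the last 1000 alert log lines: $n"
    fi
}

# Only touch the cluster when explicitly asked to: ./rac_health.sh run
case "${1:-}" in
    run) run_checks ;;
esac
```

Run from cron on each node, something like this gives a quick first look; AWR, ADDM and ASH reports remain the place to go for performance questions.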

Similar Messages

  • How to monitor RAC services and nodeapps in Grid Control

    Hi,
    I have created a number of RAC service names such as GL on a 2-node RAC and would like to use Grid Control 10.2.0.2 to monitor the availability of those services and nodeapps. I was not able to find anything in Grid Control that would allow me to configure that. What is being monitored now is the listeners, database instances and nodes. Would it be possible to monitor more than just the RAC listeners, instances and nodes?
    thanks.

    I don't think that there is an out-of-the-box metric. However, CRS monitors your services and Grid Control monitors CRS errors. If you need more granular monitoring, I have 2 suggestions:
    1) user callouts:
    [http://download.oracle.com/docs/cd/B28359_01/rac.111/b28254/hafeats.htm#RACAD7133]
    2) extending Oracle Enterprise Manager
    I have written a paper on how to extend OEM at [http://www.ora-solutions.net/web/papers/]
    "Extending Oracle Enterprise Manager to collect HP-UX glance data"
    You can follow the instructions to build a new target type called "RAC_SERVICE" and add your services as targets, e.g. S_BATCH, S_ONLINE, S_HR.
    Best regards,
    Martin Decker
    www.ora-solutions.net
    Edited by: mdecker on Jan 2, 2009 10:57 AM
    Edited by: mdecker on Jan 2, 2009 10:59 AM
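
A simpler alternative to the two suggestions above is to poll `srvctl status service` and alert on anything that is down. The sketch below assumes srvctl prints a line of the form "Service <name> is not running." for a stopped service; check the exact wording on your release. The database name and function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: report RAC services that srvctl says are not running.
# Assumption: a stopped service is reported as "Service <name> is not running."
DB_NAME="${DB_NAME:-ORCL}"   # hypothetical database name

# Read srvctl status output on stdin; print the names of stopped services.
list_down_services() {
    awk '/^Service .* is not running/ { print $2 }'
}

check_services() {
    down=$(srvctl status service -d "$DB_NAME" | list_down_services)
    if [ -n "$down" ]; then
        echo "ALERT: services down on $DB_NAME: $down"
        return 1
    fi
    echo "All services running on $DB_NAME"
}

# Only call srvctl when explicitly asked to: ./check_services.sh run
case "${1:-}" in
    run) check_services ;;
esac
```

The non-zero return code makes it easy to hook into cron mail or whatever alerting you already have.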

  • How do i monitor the health of my Raid array?

    First, I want to thank Harm, Bill, and all the countless others who continue to give great advice on this forum.  My question is how do I monitor the health of my RAID array and how do I determine which disk is acting up.   I am using a 3-disk soft RAID 0 off my motherboard (Gigabyte UD3P).  It seems to work pretty well but occasionally has a hiccup in certain programs.  I wonder if it is a sign of an impending problem or if it is just because it is a soft RAID.  I've tried several HD diagnostics (Crystal Disk Info, Active@, HD Tune, etc.) but aside from temperature, they don't give any info about the impending death of my RAID 0.  I have the SMART feature turned on in the BIOS.
    To preemptively address the critics about the RAID 0: I only do about one video a week and do a backup every night, so I figured if (and when) it crashes, I'll just lose a day's work. The motherboard is supposed to do RAID 5 but it performed really poorly.  My system is configured with additional drives (SSD boot, RAID 0 scratch, and final video) as recommended.  Any advice would be appreciated.
    michael

    The problem with almost all RAID controllers is that they do not support SMART. So that is out. With software RAIDs you are even more limited.
    Hardware RAID controllers have web-based interfaces that show some basic information, but software RAIDs do not. There are two ways to determine possible problems, at least that I know of:
    1. Use drive cages with LED's for the individual drives to show activity and inspect them visually.
    2. Use old-fashioned manual labour to feel vibrations, temperatures and hear clicks on individual drives.
    With only 3 drives in the raid, the chances of guessing correctly are 33.3% to start with and they only increase with manual inspection. A far easier job than in the case of 6 or more disks.
    Sorry I can not offer better suggestions.

  • How do I remove LG Health and Smart Tips ?

    How do I remove LG Health and Smart Tips ?

    Please note that any discussion of rooting/hacking is a violation of the Verizon Wireless Terms of Service
    Message was edited by: Admin Moderator

  • I hit the wrong monitor refresh rate and now lost monitor picture.  How can I get back to the right refresh rate with no pic to view?

    I hit the wrong monitor refresh rate and now lost monitor picture.  How can I get back to the right refresh rate with no pic to view?

    Try this:
    Mac OS X 10.6 Help: If you changed your display’s resolution and now it doesn’t display a picture

  • How to start Oracle 10g RAC database and clusterware?

    I have steps to stop the 10g RAC Database and clusterware but not sure about starting it.
    I have heard that executing
    $ crsctl start crs    (as root)
    on each node
    will start the database, ASM and nodeapps. Is that true?
    Or do we have to do it step by step, as we do when stopping the clusterware and database below?
    1. Stop the agent:
    cd to $AGENT_HOME/corpng04.amhc.amhealthways.net/bin, then run: ./emctl stop agent
    2. Stop the full database
    $ oracle_home/bin/srvctl stop database -d db_name
    3. Stop the ASM instances on node1, node2
    $ oracle_home/bin/srvctl stop asm -n node -- I guess you can't give multiple nodes in one command separated by commas; you need to run this once per node with a different node name
    4. Stop the nodeapps: VIP, listener, ONS and GSD
    $ oracle_home/bin/srvctl stop nodeapps -n node -- I guess you can't give multiple nodes in one command separated by commas; you need to run this once per node with a different node name
    5. Stop the CRS cluster processes: those bloody 3, evmd, ocssd and crsd
    $ su - root
    $ CRS_home/bin/crsctl stop crs

    Paul R @ NL wrote:
    Before shutting down CRS I tend to stop the instances and services via srvctl, then stop CRS via crsctl.
    Just the way I do it. Not saying it's the right way, but it is the one I am comfortable with.
    Good. If we stop CRS but forget to shut down the Oracle instances, we'll see a shutdown abort in the alert log file (meaning the instances were shut down with abort).
    We should shut down the instances before stopping CRS anyway.
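
For the start-up direction the original question asks about, the shutdown steps can be reversed. This is only a sketch: `crsctl start crs` has to run as root on each node, the srvctl commands run as the Oracle software owner, and with the default autostart settings CRS normally brings the registered resources up by itself, so the srvctl steps are only needed for whatever did not come up. Node names, database name and the dry-run switch are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: start a 10g RAC stack in the reverse order of the shutdown steps.
# Assumptions: placeholder node/database names; crsctl and srvctl are in PATH.
DB_NAME="${DB_NAME:-db_name}"
NODES="${NODES:-node1 node2}"
DRY_RUN="${DRY_RUN:-1}"        # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_stack() {
    # As root, on every node (shown once here for the local node).
    run crsctl start crs
    # As the Oracle software owner, once the clusterware is up.
    for node in $NODES; do
        run srvctl start nodeapps -n "$node"
        run srvctl start asm -n "$node"
    done
    run srvctl start database -d "$DB_NAME"
}

# ./start_rac.sh start   (prints the plan unless DRY_RUN=0)
case "${1:-}" in
    start) start_stack ;;
esac
```

Expect srvctl to complain if a resource is already running because CRS started it automatically; that message is harmless.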

  • Best practice on monitoring Endeca health / defining outage

    (This is a double post from the Endeca Experience Management forum)
    I am looking for best practices on how to define an Endeca service outage and monitor the health of the system. I understand this depends on your user requirements and may vary from customer to customer. Specifically, what criteria do you use to notify your engineers that there is a problem? We have our load balancers pinging dgraphs on an interval. However, the ping operation is not sufficient in our use case. We are also experimenting with running a "low cost" query against the dgraphs on an interval and using some query latency thresholds to determine an outage. I want to hear from people in the field running large commercial web sites about your best practices for monitoring the health of the system and sending notifications.
    Thanks.

    The performance metric should help to analyse the query and metrics for fine tuning.
    Here are few best practices:
    1. Reduce the number of components per page
    2. Avoid complex LQL queries
    3. Keep the LQL threshold small
    4. Display the minimum number of columns needed
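
The "low cost query plus latency thresholds" idea from the question could be sketched like this. The thresholds, the probe URL you pass in and the function names are all hypothetical; nothing here is Endeca-specific beyond timing one HTTP request with curl.

```shell
#!/bin/sh
# Sketch: classify the latency of a cheap probe query against two thresholds.
WARN_MS="${WARN_MS:-500}"     # hypothetical warning threshold (milliseconds)
CRIT_MS="${CRIT_MS:-2000}"    # hypothetical outage threshold (milliseconds)

# classify_latency <milliseconds>: prints OK, WARN or OUTAGE.
classify_latency() {
    ms="$1"
    if [ "$ms" -ge "$CRIT_MS" ]; then
        echo OUTAGE
    elif [ "$ms" -ge "$WARN_MS" ]; then
        echo WARN
    else
        echo OK
    fi
}

# probe <url>: time one request; a failed or timed-out request is an outage.
probe() {
    secs=$(curl -sf -o /dev/null --max-time 5 -w '%{time_total}' "$1") || {
        echo OUTAGE
        return
    }
    ms=$(awk -v s="$secs" 'BEGIN { printf "%d", s * 1000 }')
    classify_latency "$ms"
}

# ./probe.sh probe <url-of-your-cheap-query>
case "${1:-}" in
    probe) probe "$2" ;;
esac
```

In practice you would probably require several consecutive WARN or OUTAGE results before paging anyone, so one slow query does not wake up the on-call engineer.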

  • Script for monitoring RAC services when they failover or relocate to other

    How can we catch it when a SERVICE relocates or fails over to NODE1? Is there any monitoring for services and where they are running?

    Hi,
    Yes, the Event Monitor of the clusterware will report this.
    You can react to this information (on the server) with the help of a FAN callout:
    http://docs.oracle.com/cd/E11882_01/rac.112/e16795/hafeats.htm#RACAD7133
    You could even react on the client side. In this case you would use ONS (Oracle Notification Service).
    http://www.oracle.com/technetwork/products/clustering/overview/awm11gr2-130711.pdf
    Regards
    Sebastian
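
A FAN callout is just an executable placed in the racg/usrco directory under the Grid (or CRS) home on each node; the clusterware runs it for every event and passes the event type followed by name=value pairs as arguments. A minimal sketch that logs service events is below. The log file location and the function names are assumptions, and the set of fields you receive depends on the event type and release.

```shell
#!/bin/sh
# Sketch of a FAN server-side callout that logs service events.
# Assumption: arguments look like
#   SERVICEMEMBER VERSION=1.0 service=GL database=ORCL instance=ORCL1 host=node1 status=up ...
LOG_FILE="${FAN_LOG:-/tmp/fan_events.log}"   # hypothetical log location

# fan_field <name> <args...>: print the value of the first name=value argument.
fan_field() {
    key="$1"
    shift
    for arg in "$@"; do
        case "$arg" in
            "$key"=*) echo "${arg#*=}"; return 0 ;;
        esac
    done
    return 1
}

# Append one line per service-related event to the log file.
handle_event() {
    event_type="$1"
    case "$event_type" in
        SERVICE|SERVICEMEMBER)
            svc=$(fan_field service "$@")
            host=$(fan_field host "$@")
            status=$(fan_field status "$@")
            echo "$(date '+%Y-%m-%d %H:%M:%S') $event_type service=$svc host=$host status=$status" >> "$LOG_FILE"
            ;;
    esac
}

# The clusterware supplies the arguments; do nothing when called without any.
if [ $# -gt 0 ]; then
    handle_event "$@"
fi
```

From there it is a small step to send mail when status is down, or when a service comes up on a host other than its preferred one.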

  • Install RAC 11gR2 and Apply Patches with OUI New Software Updates Option

    Has anyone tried the installation with the New Software Updates Option?
    I will install 11gR2 RAC Grid and I plan to do the following steps:
    1) Install Oracle Grid Infrastructure Patchset 11.2.0.2
    2) Do all the configuration with OUI
    3) Create ASM disks
    4) Install Oracle Database 11.2.0.2 Software Only
    5) Apply Oracle Recommended Patches 11.2.0.2.2 (12311357) and 12431716. (ref. MOS doc [ID 756671.1])
    6) Create RAC Database (advanced option) with DBCA
    With these steps:
    Grid and ASM disks would be configured without the recommended patches, but the database will be created with the patches applied.
    I would not like to do a "Software only" install of GI and then apply the patches because later I would have to manually do all the configuration steps.
    If I use the OUI Software Updates Option to apply the patches, will OUI do all the patching and then let me configure GRID with OUI ?
    Quote from Oracle® Grid Infrastructure Installation Guide 11g Release 2 (11.2) for Linux - Part Number E17212-11:
    " Use the Software Updates feature to dynamically download and apply software updates as part of the Oracle Database installation. You can also download the updates separately using the downloadUpdates option and later apply them during the installation by providing the location where the updates are present."
    I could not find more detailed information about this feature on documentation and MOS .
    Thanks

    Hi,
    You can download the latest updates and patches using the -downloadUpdates option:
    ./runInstaller -downloadUpdates
    You can use this note:
    *How To Download The Latest Updates And Patches Using 11.2.0.2 OUI [ID 1295074.1]*
    Don't miss it ..
    *Error: INS-20704 While Installing 11.2.0.2 with "Use pre-downloaded software updates" Option [ID 1265270.1]*
    Note: Please make sure that the user downloading the patch updates has the correct permissions to download them from MOS (My Oracle Support).
    Do not use the /tmp/oraInstall* directories for the download location. Unpublished bug 9975999
    Documentation explaining ...
    4.5.1 Running Oracle Universal Installer
    Downloading Updates Before Installation
    http://download.oracle.com/docs/cd/E11882_01/install.112/e16763/inst_task.htm#BABJGGJH
    Regards,
    Levi Pereira

  • CCMS monitoring of WE02 and SM58

    Hi,
    I need to setup monitoring for WE02 and SM58 to trigger an auto-alert for errors.
    For WE02 errors, this only needs to be done for status code 64 - awaiting transmission, if they have been idling too long.
    For SM58 any errors should have alerts triggered.
    Just wondering how I can go about to resolve this issue?
    Thank you.

    Assuming that you know CCMS basis configuration.
    For SM58 (transactional RFC), RZ20 has some MTEs; the tree navigation is:
    RZ20 --> SAP CCMS Technical Expert Monitors --> All Monitoring Contexts --> Transactional RFC and Queued RFC. Here, we can monitor for all inbound/outbound RFC connections and queues as well.
    For WE02:
    RZ20 --> SAP CCMS Technical Expert Monitors --> All Monitoring Contexts --> ALE/EDI BD4(Client_nr) Log.sys <Logical_system>. Here, you have all the MTEs for IDocs.
    Hope this info is useful to you.
    - Hari.

  • Second Monitor with nVidia and Openbox

    I am going to be picking up a second monitor this afternoon, and I am wondering how easy it will be to set up. Can I just run the nVidia xorg configure command and will that do most of it for me? I also noticed there are different methods of using 2 monitors. Is twinview the best or should I look at something else?

    The GUI works fine, and I always run nvidia-settings as root.
    su -c 'nvidia-settings'
    or just set up an alias
    alias nvidia-settings="su -c 'nvidia-settings'"
    Good Luck!
    Last edited by JuseBox (2010-03-24 12:45:37)

  • What do you use to monitoring the IDs and Alerts?

    I have installed a 4250, built out some custom signatures, and got email working from VMS 2.2 for some of the high-priority alerts. This is all great, but I was wondering what other people use to monitor the alerts and how they get notifications. I see what I have as being very limited in scope, and there are a number of parts in VMS that just do not work.
    Do you use other console products? I have seen a couple of applications advertised that say they aggregate alerts from the PIX, IDS and McAfee. Just wondering what, if anything, others use. I only have one IDS but I do have other items: PIX, McAfee.

    I wish I could but I can't offer up anything that won't come at a cost.
    The original poster in this thread alluded to solutions capable of looking at disparate data in one console. That's a Security Information Management System (SIMS), pure and simple. Cisco sells one, as do many other vendors. The unfortunate thing is that there is no inexpensive SIMS solution out there that offers multi-product monitoring. They all come at a significant price point.
    Since you both have a single sensor, I agree that VMS Basic is more than you need. The hiccups with it far outweigh the benefits. IEV might be "uncomfortable" to use but it works. In my experience (in a single-sensor monitoring environment), using IEV for monitoring and IDM for configuration management is the easiest, if not the most eloquent, solution.
    If you want fancy visualization or reporting, you're going to have to be willing to plunk down some cash to buy it. With a single IDS sensor, it's just not economically viable IMHO.
    I hope this helps,
    Alex Arndt

  • M93p dual monitor problems; monitor turns off and cannot be turned on again

    Hi,
    We have multiple M93p Tiny computers causing problems in a dual-monitor setup (VGA / DisplayPort->DVI). Our users (already more than 5) keep telling us that they leave their workplace, return, and then the DVI-attached monitor is off and they cannot find a way to turn it on. This is true; the only workaround our users have found is to put Win7/32bit into standby mode and wake it again.
    Needless to say, we turned all windows energy saving options off (computer standby/hibernate, monitor turn off). Still the problem appears.
    Maybe there is something wrong with DVI power/DVI signal?
    - Monitor independent, as far as I can see
    - Independent from Windows energy saving modes
    - BIOS updates did not help
    [todo]: Try Intel/(AMD?) drivers, let users test.
    I haven't seen exactly when the monitor turns off because it is reported by users. I had a testing machine, but was not able to reproduce.
    http://forums.lenovo.com/t5/A-M-and-Edge-Series-ThinkCentre/Dual-monitors-problem-VGA-and-Displaypor... is related.
    Anyone?
    Thx

    Make a "Genius" appointment at an Apple Store to have the machine tested.
    Back up all data on the internal drive(s) before you hand over your computer to anyone. If privacy is a concern, erase the data partition(s) with the option to write zeros* (do this only if you have at least two complete, independent backups, and you know how to restore to an empty drive from any of them.) Don’t erase the recovery partition, if present.
    Keeping your confidential data secure during hardware repair
    *An SSD doesn't need to be zeroed.

  • Lock monitor, top sql and trace data viewer

    Can anyone please tell me how I can access Lock Monitor, Top SQL and Trace Data Viewer?
    I am also not sure whether these tools are installed on my machine. If they are, please tell me how I can access them; if not, please tell me how I can install them.
    thanks a lot

    Hey,
    You can launch OEM from a client machine and then:
    - Log in as system for example;
    - go to tools,
    - then - Diagnostics Packs, then you will have:
    a) Lock Monitor
    b) Performance Manager
    c) Performance Overview
    d) Top Sessions
    e) Top SQL.
    I hope this can help you a little.
    Regards,
    Marcello

  • Monitoring RAC services in Windows

    I have a 2 node RAC database installed on Win 2003. I've had the Database service go down a few times kind of "magically". I'm looking into the kind of magic that caused this (suspect it was a storage issue), but in the mean-time I need to send out an alert to someone if this goes down. Does anyone have a good way/tool to monitor OS services and send out an e-mail when one goes down?

    Aren't you using grid control?
    http://download.oracle.com/docs/cd/B19306_01/server.102/b25159/monitor.htm#i1006661
