ILOM on X4x00 - temperature monitoring snmp, automatic shutdown

Hello,
we run some X4100/X4200 servers. I want to monitor temperatures via ILOM's SNMP interface. Unfortunately, snmpwalk gives me all kinds of information, such as sensor type, hysteresis, unit, and so on, except the current sensor readings. Did anybody succeed in reporting temperature sensor readings via SNMP with ILOM? We run ILOM firmware 1.0, build 6464.
Additionally I need to verify that some automatic shutdown happens once a system gets too hot. To test I would like to lower the temperature sensor threshold. How can I do this, how can I lower the temperature thresholds in ILOM?
Any help is greatly appreciated, with kind regards,
Heinrich

I'm trying to get thermal shutdowns to happen, too, for a cluster of X4100s managed with N1SM. I don't use SNMP, though. I've found that the ILOMs have good support for IPMI. e.g.
echo changeme > changeme
ipmitool -U root -f changeme -H 192.168.1.20 sensor list
I can set the sensor thresholds with ... sensor thresh fp.t_amb unr 33
Thresholds set this way don't stick across power-cycling the ILOM, so I'll probably set up a cron job to set them.
The machine does a hard shutdown a few seconds after a sensor exceeds its non-recoverable threshold. It never seems to shut down or signal the OS with ACPI or anything when the critical or non-critical (warning) thresholds are exceeded. It does log it in the sensor log, and I think it sends an IPMI alert, since N1SM configures the master machine as an alert target (in the ILOM CLI, show /SP/alerts, or something).
ipmitool(1m) says that the soft shutdown command is done by having ACPI signal the OS that there is a "fatal overtemperature". I had assumed that that was what N1SM used to soft-power-off machines, but it seems that ipmitool ... power soft doesn't actually do anything to Solaris 10u1 amd64. So that's what I need to fix, I guess. I could try booting GNU/Linux from a CD to maybe see the ACPI events and make sure they're really being sent...
Anyway, I'd love to hear how to get Solaris to soft shutdown on critical temps. If I don't figure it out soon, I'll just lower the "non-recoverable" thresholds, since I'm just running grid engine on nodes provisioned from a flash archive, so I don't need to worry too much about clean shutdowns.
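
The cron job mentioned above could be sketched roughly as follows. This is a minimal sketch, not a tested ILOM tool: the host, user name, and password file are taken from the example commands in this post, the threshold table is a placeholder, and the helper names (build_thresh_cmd, apply_thresholds) are made up for illustration. It only wraps the same "sensor thresh" call shown above.

```python
import subprocess

# Values copied from the example commands in the post; adjust for your site.
ILOM_HOST = "192.168.1.20"
ILOM_USER = "root"
PASSWORD_FILE = "changeme"

# Placeholder threshold table (sensor -> {level: value}); not a recommendation.
THRESHOLDS = {"fp.t_amb": {"unr": 33}}


def build_thresh_cmd(host, user, password_file, sensor, level, value):
    """Return the ipmitool argument list that sets one sensor threshold."""
    return [
        "ipmitool", "-H", host, "-U", user, "-f", password_file,
        "sensor", "thresh", sensor, level, str(value),
    ]


def apply_thresholds(thresholds, run=subprocess.run):
    """Issue one ipmitool call per (sensor, level) pair; return the commands."""
    commands = []
    for sensor, levels in thresholds.items():
        for level, value in levels.items():
            cmd = build_thresh_cmd(ILOM_HOST, ILOM_USER, PASSWORD_FILE,
                                   sensor, level, value)
            run(cmd, check=True)  # raises if ipmitool exits non-zero
            commands.append(cmd)
    return commands
```

A cron entry would then call apply_thresholds(THRESHOLDS) periodically, so the thresholds are re-applied after the ILOM has been power-cycled. The run parameter exists only so the function can be exercised without a real service processor.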

Similar Messages

  • 1130AG temperature monitoring (SNMP / MIB)

    Hi
    I'm trying to get some SNMP monitoring running for multiple 1130AG APs. I keep running into walls though. The APs are running the following image: c1130-k9w7-mx.124-25d.JA1.
    Problem 1.
    I can't use the show environment command on the APs. According to the following official Cisco document, the show environment command is supported from image version 12.2(11)JA, but it is not available on my APs running 12.4(25d)JA1:
    http://www.csd.uoc.gr/~hy435/material/cr12425d.pdf
    Problem 2.
    According to the Cisco IOS MIB locator, the 1130AG AP does not support the ENVMON-MIB. That MIB is generally used for temperature monitoring on many Cisco devices. The 1130AG does however support ENTITY-MIB, but I am not entirely sure that this MIB can return temperature info. There is very little or no documentation on this.
    Can anyone help me out? I would very much appreciate it :)
    \Christian

  • Cisco Temperature Monitor

    Hi,
    I am using Cisco 3750,3560,2960 and 1800 Series routers in my network. I want to see temperature graph of said equipment on NMS (OP Manager).
    Kindly let me know whether these devices support such SNMP traps, and how I can see their temperature information on the NMS.
    Regards,
    Arshad Ahmed

    Please refer to, for example, the entries listed at the bottom of this post taken from the envmon MIB (full MIB at link I posted earlier).
    I also confirmed that a 3560, for example, will report its temperature via SNMP. The attached screenshot shows CiscoWorks LMS reporting a 3560G-24TS temperature graph (a function similar to what you want to do with a third-party NMS):
    If you want to query directly from cli, that can be done as well. For instance:
    C:\Program Files (x86)\CSCOpx\objects\jt\bin>snmpwalk -c [community_string] -O s -v 2c [device_address] 1.3.6.1.4.1.9.9.13.1.3.1.3
    iso.3.6.1.4.1.9.9.13.1.3.1.3.1005 = Gauge32: 40
    Tells me the Cisco 3560 device queried in the screenshot above has a temperature of 40 degrees C.
    I got the OID values to translate the entries below into numeric form used in the CLI query above from this table.
    ciscoEnvMonTemperatureStatusTable OBJECT-TYPE
            SYNTAX     SEQUENCE OF CiscoEnvMonTemperatureStatusEntry
            MAX-ACCESS not-accessible
            STATUS     current
            DESCRIPTION
                    "The table of ambient temperature status maintained by the
                    environmental monitor."
            ::= { ciscoEnvMonObjects 3 }
    ciscoEnvMonTemperatureStatusEntry OBJECT-TYPE
            SYNTAX     CiscoEnvMonTemperatureStatusEntry
            MAX-ACCESS not-accessible
            STATUS     current
            DESCRIPTION
                    "An entry in the ambient temperature status table, representing
                    the status of the associated testpoint maintained by the
                    environmental monitor."
            INDEX      { ciscoEnvMonTemperatureStatusIndex }
            ::= { ciscoEnvMonTemperatureStatusTable 1 }
    CiscoEnvMonTemperatureStatusEntry ::=
            SEQUENCE {
                    ciscoEnvMonTemperatureStatusIndex       Integer32,
                    ciscoEnvMonTemperatureStatusDescr       DisplayString,
                    ciscoEnvMonTemperatureStatusValue       Gauge32,
                    ciscoEnvMonTemperatureThreshold         Integer32,
                    ciscoEnvMonTemperatureLastShutdown      Integer32,
                    ciscoEnvMonTemperatureState             CiscoEnvMonState
            }
    ciscoEnvMonTemperatureStatusIndex OBJECT-TYPE
            SYNTAX     Integer32 (0..2147483647)
            MAX-ACCESS not-accessible
            STATUS     current
            DESCRIPTION
                    "Unique index for the testpoint being instrumented.
                    This index is for SNMP purposes only, and has no
                    intrinsic meaning."
            ::= { ciscoEnvMonTemperatureStatusEntry 1 }
    ciscoEnvMonTemperatureStatusDescr OBJECT-TYPE
            SYNTAX     DisplayString (SIZE (0..32))
            MAX-ACCESS read-only
            STATUS     current
            DESCRIPTION
                    "Textual description of the testpoint being instrumented.
                    This description is a short textual label, suitable as a
                    human-sensible identification for the rest of the
                    information in the entry."
            ::= { ciscoEnvMonTemperatureStatusEntry 2 }
    ciscoEnvMonTemperatureStatusValue OBJECT-TYPE
            SYNTAX     Gauge32
            UNITS      "degrees Celsius"
            MAX-ACCESS read-only
            STATUS     current
            DESCRIPTION
                    "The current measurement of the testpoint being instrumented."
            ::= { ciscoEnvMonTemperatureStatusEntry 3 }
    ciscoEnvMonTemperatureThreshold OBJECT-TYPE
            SYNTAX     Integer32
            UNITS      "degrees Celsius"
            MAX-ACCESS read-only
            STATUS     current
            DESCRIPTION
                    "The highest value that the associated instance of the
                    object ciscoEnvMonTemperatureStatusValue may obtain
                    before an emergency shutdown of the managed device is
                    initiated."
            ::= { ciscoEnvMonTemperatureStatusEntry 4 }
    ciscoEnvMonTemperatureLastShutdown OBJECT-TYPE
            SYNTAX     Integer32
            UNITS      "degrees Celsius"
            MAX-ACCESS read-only
            STATUS     current
            DESCRIPTION
                    "The value of the associated instance of the object
                    ciscoEnvMonTemperatureStatusValue at the time an emergency
                    shutdown of the managed device was last initiated.  This
                    value is stored in non-volatile RAM and hence is able to
                    survive the shutdown."
            ::= { ciscoEnvMonTemperatureStatusEntry 5 }
    ciscoEnvMonTemperatureState OBJECT-TYPE
            SYNTAX     CiscoEnvMonState
            MAX-ACCESS read-only
            STATUS     current
            DESCRIPTION
                    "The current state of the testpoint being instrumented."
            ::= { ciscoEnvMonTemperatureStatusEntry 6 }
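
To connect the MIB excerpt above with the numeric OID used in the snmpwalk query, here is a small sketch. It assumes the ciscoEnvMonObjects subtree is 1.3.6.1.4.1.9.9.13.1, which matches the query above, and that snmpwalk prints lines in the format shown in this post. The function name parse_temperature_line is made up for illustration.

```python
import re

# Assumed base: ciscoEnvMonObjects = 1.3.6.1.4.1.9.9.13.1
CISCO_ENVMON_OBJECTS = "1.3.6.1.4.1.9.9.13.1"

# From the "::=" lines in the MIB excerpt:
#   ciscoEnvMonTemperatureStatusTable  -> { ciscoEnvMonObjects 3 }
#   ciscoEnvMonTemperatureStatusEntry  -> { ...StatusTable 1 }
#   ciscoEnvMonTemperatureStatusValue  -> { ...StatusEntry 3 }
TEMPERATURE_STATUS_VALUE = CISCO_ENVMON_OBJECTS + ".3.1.3"

# Matches the trailing table index and the Gauge32 reading of one output line.
LINE_RE = re.compile(r"\.(\d+) = Gauge32: (\d+)\s*$")


def parse_temperature_line(line):
    """Return (testpoint_index, degrees_celsius), or None if the line differs."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(TEMPERATURE_STATUS_VALUE)
print(parse_temperature_line("iso.3.6.1.4.1.9.9.13.1.3.1.3.1005 = Gauge32: 40"))
```

The same pattern gives the other columns: replacing the final .3 with .4 should address ciscoEnvMonTemperatureThreshold, and .6 the state column, following the entry numbers listed in the excerpt.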

  • Laptop automatic shutdown when Arch startup

    My laptop shuts down automatically during Arch Linux startup, even though I have already enabled phc_cpufreq undervolting with its voltages and added the cpufreq_ondemand module.
    While the system is booting (black background with text), the startup suddenly reports that the maximum temperature of 70C has been reached and announces a shutdown, and about 5 seconds later the laptop powers off, even though my Pentium M can reach 100C.
    If I wait about 10 minutes and turn the laptop back on, it starts up smoothly and I can log in to the DE. It never shuts down after that, even when I use the laptop at 100% load, but if I restart the PC it goes through startup, shows the maximum-temperature message, and shuts down automatically again.
    Can anyone here help me disable whatever Arch Linux kernel feature or startup process automatically shuts down the laptop?
    The PHC VIDs show my correct VIDs, so I have a feeling the undervolting only takes effect once the system reaches the DE (KDE) and I log in, not during Arch Linux boot.

    I had this problem on my ThinkPad r51p which would suddenly say critical temperature reached and shut down. It was the same with different linux distros and FreeBSD as well.
    I found no other solution to the problem but writing a small script that I always had running in the background:
    #!/usr/local/bin/php
    <?php
    // Throttle the CPU above 85 C and restore full speed below 80 C.
    $slow = FALSE;
    while (1) {
        $temperatur = `cat /proc/acpi/thermal_zone/THM0/temperature | grep temperature | awk ' { print $2 } ' `;
        if ((int)$temperatur > 85 && !$slow) {
            // print "Slowing down".$temperatur." C";
            system("echo -n 600000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq");
            $slow = TRUE;
        } elseif ((int)$temperatur < 80 && $slow) {
            // print "Speeding up".$temperatur." C";
            system("echo -n 1700000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq");
            $slow = FALSE;
        }
        // Poll every 3 seconds.
        sleep(3);
    }
    ?>
    But most likely there is a more elegant solution to it than this.
    And, btw, do NOT try this script unless you know exactly what you're doing and the settings are right for your CPU.
    Last edited by dr_relling (2009-02-24 23:48:09)
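
The core of the script above is a two-threshold (hysteresis) decision: throttle above 85 C, and only restore full speed once the temperature has dropped below 80 C. A minimal sketch of just that decision, kept separate from any file access so it can be checked in isolation, might look like this. The constants mirror the values in the script; the function name next_state is made up for illustration.

```python
# Thresholds and frequencies copied from the PHP script above.
SLOW_ABOVE_C = 85
FAST_BELOW_C = 80
SLOW_FREQ_KHZ = 600000
FAST_FREQ_KHZ = 1700000


def next_state(temp_c, slow):
    """Return (new_slow_flag, frequency_to_write_or_None) for one reading."""
    if temp_c > SLOW_ABOVE_C and not slow:
        return True, SLOW_FREQ_KHZ      # too hot: cap the CPU frequency
    if temp_c < FAST_BELOW_C and slow:
        return False, FAST_FREQ_KHZ     # cooled down: lift the cap
    return slow, None                   # inside the band: change nothing


slow = False
for reading in (70, 86, 83, 79):
    slow, freq = next_state(reading, slow)
    print(reading, slow, freq)
```

The gap between the two thresholds keeps the CPU from flapping between speeds when the temperature hovers around a single limit. As with the original script, the values are only right for that particular CPU.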

  • Is it possible to transfer the temperature readings through "Serial I/O" port of Lakeshore temperature monitor?

    With this monitor, you can have 2 analog outputs that you can transfer to the computer by using DAQ boards. In my case, I need temperature measurements at 3 different locations so I need 3 analog outputs, which is not possible with this device.
    However, the monitor can also store temperature data in its memory and print the data. What I need is not to print the data but to send it to the computer in digital form and postprocess it. I am using the "serial I/O" port to connect the device to the computer. I think I can do some configuration on the temperature monitor with the drivers in the link (http://sine.ni.com/apps/utf8/niid_web_display.download_page?p_id_guid=0241DD1B3AAF0341E0440003BA7CCD... ) but I am not sure if the device recognizes the commands that I send through the I/O port.
    Any advice on how to transfer (in digital form) the stored data on the monitor to the computer?

    http://sine.ni.com/apps/utf8/niid_web_display.download_page?p_id_guid=0241DD1B3AAF0341E0440003BA7CCD...
    and the temperature monitor is the one in this link: http://www.lakeshore.com/products/cryogenic-temperature-monitors/model-218/Pages/Overview.aspx
    Actually, the manual says that the stored data can either be sent to a printer or be retrieved through the computer interface. However, there is not enough information about how to perform this task.
    I am using LabVIEW to read the data from the analog outputs, and the drivers in the first link are also written in LabVIEW. But with those drivers, I cannot retrieve the data to the computer.
    I will check MAX to test the communication with the device, though. Thanks for the advice!

  • Zen Stone Plus 4GB - Automatic Shutdown - IT DOES NOT WORK

    Hello!
    I do possess this MP3 Player (with integrated speakers) for more than 2 years now and have never been using the Automatic Shutdown function of the player (in the menu settings). Yesterday, I went to bed, wanted to listen to some music and tried out this function the first time, but it did not work. I tried it with 5, 0 and 30 minutes. What am I doing wrong or is this setting just a joke?
    I am using the current software/firmware and am not a newbie to technical instruments so I do know how to - theoretically - start this function by going to setting - Automatic Shutdown - scrolling to the minutes I want to have - Enter/Confirm with the middle ball-knob at the front of the player.
    Maybe some of you might help me, I would highly appreciate that.
    Kind regards
    Guyfromgermany

    Hi JLM, am running .0.05e fine and in fact listening to it at the moment. Is there any way you can regress back to .0.03 to test your theory? Testing a bunch of 28k CBR MP3s is an easy thing to try also.
    Message Edited by BlueRobin on 0-02-2008 02:49 PM

  • Good Temperature Monitor for C2D MBP??

    Just got my 17" MBP and it's great!!
    I am new to the laptop world, so if someone can recommend a temperature monitor (preferably free) that would be great.
    It seems fine, but I'd like to be able to monitor this.
    Cheers

    It may actually be overkill......
    But this is the best one I have found:
    http://www.bresink.com/osx/HardwareMonitor.html
    It is an amazing product.
    SG
    C2D 2.16(XP)/G52.0 DUAL/MDD 1.25 DUAL/867 12 inch/ and many more   Mac OS X (10.4.8)  

  • WebLogic Datasources : Automatically Shutdown and Start up

    Hi there,
    I have a problem with my datasources. I have a WLST script which automatically shuts down and starts some datasources in my environment, all of them from my crontab!
    Here is my exact script:
    [weblogic@wls_server restartDS]$ cat shutdownDS1.py | more
    connect('weblogic','password','t3://server:10000')
    domainConfig()
    servers = cmo.getServers()
    domainRuntime()
    for server in servers:
        try:
            print 'check ' + server.getName()
            print '/ServerRuntimes/' + server.getName() + '/JDBCServiceRuntime/' + server.getName() + '/JDBCDataSourceRuntimeMBeans/MYDATASOURCE-XA1'
            cd('ServerRuntimes/' + server.getName() + '/JDBCServiceRuntime/' + server.getName() + '/JDBCDataSourceRuntimeMBeans/MYDATASOURCE-XA1')
            objectArray = jarray.array([], java.lang.Object)
            stringArray = jarray.array([], java.lang.String)
            invoke('shutdown', objectArray, stringArray)
        except WLSTException,e:
            print server.getName() + 'is not running.'
    for server in servers:
        try:
            print 'check ' + server.getName()
            print '/ServerRuntimes/' + server.getName() + '/JDBCServiceRuntime/' + server.getName() + '/JDBCDataSourceRuntimeMBeans/MYDATASOURCE-XA2'
            cd('ServerRuntimes/' + server.getName() + '/JDBCServiceRuntime/' + server.getName() + '/JDBCDataSourceRuntimeMBeans/MYDATASOURCE-XA2')
            objectArray = jarray.array([], java.lang.Object)
            stringArray = jarray.array([], java.lang.String)
            invoke('shutdown', objectArray, stringArray)
        except WLSTException,e:
            print server.getName() + 'is not running.'
    exit()
    Don't worry, I have another script which kills the XA3 and XA4 datasources as well, so this way I have load balancing here!
    It works with crontab!
    The problem is this: it doesn't matter whether the datasources are running or not, they are simply shut down and then started again on schedule. I would like to improve this by implementing a datasource status condition, meaning that when a datasource becomes overloaded it is killed automatically. That way I would not use crontab but another kind of trigger!
    Did you see this before?
    I would like to improve this workaround!
    Just keep in mind that my team and I haven't discovered the root cause yet, so this automatic datasource restart is a kind of workaround!
    Thanks in advance !
    Edson
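
The "status condition" asked for above could start from a plain decision function like the sketch below. Everything here is an assumption for illustration: the function name should_restart, the two limits, and the idea that the inputs would be read from the datasource runtime MBean (its state and connection counters) before calling invoke('shutdown', ...). The attribute names to read should be checked against the WebLogic MBean reference for your version.

```python
def should_restart(state, active, capacity, waiting,
                   max_waiting=10, max_utilisation=0.95):
    """Decide whether a datasource looks overloaded enough to restart.

    state    -- datasource state string, e.g. 'Running'
    active   -- connections currently in use
    capacity -- current pool size
    waiting  -- requests currently waiting for a connection
    The two limits are hypothetical defaults, not recommended values.
    """
    if state != 'Running':
        return False                     # nothing to shut down
    if capacity <= 0:
        return False                     # avoid dividing by zero
    utilisation = active / float(capacity)
    return utilisation >= max_utilisation and waiting >= max_waiting


print(should_restart('Running', 20, 20, 15))
print(should_restart('Running', 5, 20, 0))
```

The function avoids Python 3 only syntax so that it can be pasted into a WLST (Jython) script and called inside the existing loop in place of the unconditional shutdown. It does nothing about the unknown root cause; it only narrows when the workaround fires.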

    Hmm. what was the problem app?
    Force a shut down by holding the power button for an extended amount.
    Reboot into SafeBoot mode to clear cache files, then reboot as normal.
    SafeBoot  http://support.apple.com/kb/HT1564

  • Temperature Monitor for OS 10.3.9

    I need an application or console command that will show me the temperature of my CPU.
    I'm running a Beige G3 that is using a G4 processor. It's running OS 10.3.9 through the use of the application xPostFacto. I've found the apps "Temperature Monitor 4.8" and "Hardware Monitor 4.8," but those don't support any Mac OSs below 10.4.x. Temperature Monitor 4.7 supports Mac OS 10.3.9, but I can't find a download of it for the life of me. All of the links redirect to version 4.8.
    Please, any help would be much appreciated, my project is at a stand still until I can find something that works.
    Thanks!

    Click on my TMM name at left & send me an email. I'll send you the 10.3 version. Don't know if your Mac has the sensors to monitor the temps.
     Cheers, Tom

  • Temperature Monitor Alert:Memory controller heatsink Results!!!

    Hello all...
    I have a question about the Memory Controller Heatsink sensor in Temperature Monitor, and I would appreciate it if you could shed any light on my problem.
    It had been a long time since I checked the temperatures on my G5 DP 2.3. Yesterday, while I was doing some MP3 encoding, I heard the fans making a lot of noise: nothing unbearable, but not the usual silent behaviour I was used to, even under heavy load and with nap off.
    Because of that noise I decided to check with Temperature Monitor, but nothing unusual came up except, you guessed it, the Memory Controller Heatsink.
    An alert came up and informed me that the temperature was over 75 C/167 F.
    The threshold was by default at 75 C/167 F.
    Let me add that it's summer here, and a hot one at 37 C, and the room is not air-conditioned, but it was exactly the same one year ago (June 2005) and I never got an alert.
    Also, I have no problem with the other readings.
    So what do you think: is this normal, or might I have a broken fan?
    And finally what is the Memory Controller Heatsink?
    Thanx in advance!

    The memory controller heatsink is shown, with the associated cooling tubes and fins, in the right-hand photo of the 'back' of the main logic board here
    http://homepage.mac.com/jerrycube/jerrycubepix/22601bluebord.jpg
    Air is pulled over the back of the main logic board and through the cooling fins, by the fan in a plastic housing, at 90deg to all the other fans, behind the hard drives.
    This fan is called "Main Logic Board Backside" by Hardware Monitor in this ancient DP2.0 - and is showing "20%" in 22deg C room temperature. "Memory Controller Heatsink" is at 54deg C.
    37deg C is above the specified max. 'Operating Temperature' of 35deg C shown here
    http://support.apple.com/specs/powermac/PowerMac_G5_Late2005.html
    I think it would probably be advisable to find some way of lowering the temperature of the room the G5 is in...

  • We are trying automate shutdown VM listed in hyper -v using C# &WMI?

    We are trying to automate shutting down the VMs listed in Hyper-V using C# and WMI. I think there is a difference between VM shutdown and VM turn off? I have the RequestStateChange value for turn off (that is 3); what is the RequestStateChange value to shut down a VM?
    Thanks in advance.
    Venkatesh

    "shut down" attempts to trigger a clean shutdown of the VM by using the Integration Components to signal the OS in the VM to cleanly shut down.  If the ICs are not available or that times out, then the VM is Powered Off.
    "turn off" simply hits the power switch of the VM.  As if the power cord was pulled.  This is a hard down state, and generally one we don't want to see.
    http://blogs.msdn.com/b/virtual_pc_guy/archive/2011/01/11/shutting-down-a-virtual-machine.aspx
    http://blogs.msdn.com/b/virtual_pc_guy/archive/2011/01/10/turning-off-a-virtual-machine.aspx
    http://blogs.msdn.com/b/virtual_pc_guy/archive/2008/01/30/shutting-down-a-hyper-v-virtual-machine.aspx  (old namespace)
    Brian Ehlert
    http://ITProctology.blogspot.com
    Learn. Apply. Repeat.

  • Temperature monitor lite

    temperature monitor lite has detected the SMART sensor on my 1.83 mbp... can anyone tell me what temperature range is "normal" for my machine?
    thanks in advance.
    MacBook Pro   Mac OS X (10.4.6)  

    i know the whole heat thing has been done to death, but i was still hoping somone would humor me. i couldn't find a "normal" range listed in any other thread, or i wouldn't have posted.
    and actually, sadly, my mbp isn't running just fine. among other issues, when it runs hot, my fan spins up and then down right away... i know, i know... the infamous "moo" has also been done to death. but before i send my machine away for service for two weeks (two weeks! an eternity) i thought a comparison with other users' experiences would be prudent...
    anyone?

  • New MacPro: Good temperature Monitor Application for the nMP?

    Can anyone suggest a good temperature monitoring application they have used for the new Mac Pro?
    Thanks

    I would recommend iStat Menus (http://bjango.com/mac/istatmenus/). I switched to it when I got the Mac Pro 2013 because Hardware Monitor doesn't work with the new Mac Pro. It's pretty good, although the history is only visual, with graphs showing the evolution of the different sensors' temperatures. With Hardware Monitor it was possible to read the temperatures that were recorded at a certain point in time.

  • Best Temperature Monitoring Program?

    Hey all, first time posting in the forums.
    I recently got a mid-2012 MBP for college, so of course I immediately began putting games like Borderlands and GTA:SA on it. It runs anything I throw at it beautifully, but it gets awfully hot while it does. I know this is common for MBP's, even up to 90 degrees celsius, but it is something a tad worrisome when it gets over 80 (some games get it there every time....). So I've been trying to monitor it to at least be sure I'm not letting it get up there too often or for too long.
    So had a few questions about temperature monitoring with Macs. I've currently got 3 programs running monitor temperature and system usage- System Pal, iStat Pro, and SMCfancontrol (which I also kick up a bit when playing the more GPU intensive games). The problem is that all three show different temperatures. Right now SystemPal reads 41 degrees celsius, SMCfancontrol reads 49, and iStat Pro has temps between 30 and 42 degrees. Which one's right? Is there a more accurate/definitive program for temp monitoring?
    If it makes any difference, I usually play projected onto a 32 in LG TV using a thunderbolt to HDMI. I know this is more GPU intensive to begin with, but after going HD it feels a necessary evil.
    If anyone happens to know of ways to keep temps down while gaming, monitor temps during full screen apps/games, or ways to monitor GPU usage (right now I have none ) I'd greatly appreciate advice on that too!
    Thanks so much for answers!

    Try iStat Menus 4.02 or Temperature Monitor 4.96 or Hardware Monitor 4.96.

  • CPU temperature monitoring utility - It works!!!

    After a long search and testing many HW monitoring programs, it works! CPU temperature monitoring on Toshiba notebooks! Hmonitor v4.2.5.1 at http://hmonitor.com/
    Good luck!

    Hi sauliaus
    Thanks for your information.
    I have checked some postings and it seems that some user would like to install the temperature monitoring utility.
    I'm sure this post will help many users.
    Thanks again ;)
    Bye
