Linux server crashing: neither the CLI nor the GUI is accessible, server down

Problem:
Background:
A Linux server became unresponsive today from both the GUI and the CLI. This can happen on either Airwave or Clearpass.
The following data was collected from the server:
memory usage, CPU usage, and /var/log/messages
After a hard reboot the server was accessible again.
Diagnostics:
Check Memory usage
Following log shows server memory usage
[root@localhost mercury]# sar -r
15:00:01 kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit
15:20:01 476604 1396772 74.56 110140 707116 1201652 30.64
15:30:02 526240 1347136 71.91 110412 710536 1165148 29.71
15:55:53 LINUX RESTART
16:00:01 kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit
16:10:01 517168 1356208 72.39 136040 588964 1196724 30.52
16:20:01 510580 1362796 72.75 137488 596560 1191664 30.39
As we can see, memory usage is not that high (about 72-75%, a large part of which is buffers and cache), so the server had plenty of free memory.
Check CPU usage
Following log shows CPU usage.
[root@localhost mercury]# sar -u
15:00:01 CPU %user %nice %system %iowait %steal %idle
15:20:01 all 6.01 0.04 1.74 1.59 0.14 90.48
15:30:02 all 4.97 0.04 1.54 7.87 0.15 85.44
Average: all 7.20 0.06 2.19 2.69 0.26 87.60
15:55:53 LINUX RESTART
16:00:01 CPU %user %nice %system %iowait %steal %idle
16:10:01 all 9.13 0.04 2.78 6.98 0.31 80.76
16:20:01 all 4.21 0.04 1.39 3.49 0.15 90.73
Again, the CPU was nowhere near 100%; it was 85-90% idle before the restart.
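The %iowait column in the sar -u output is also worth a look, because a slow I/O subsystem tends to show up there rather than in %user or %system. The following is a minimal sketch, assuming the 24-hour, eight-column layout shown above; max_iowait is a hypothetical helper name and not part of sysstat.

```shell
# Hypothetical helper: print the timestamp and value of the highest %iowait
# sample found in `sar -u` text read from stdin.
# Assumed column layout: time CPU %user %nice %system %iowait %steal %idle
max_iowait() {
    awk '$2 == "all" && $1 != "Average:" {
             if ($6 + 0 > max) { max = $6 + 0; t = $1 }
         }
         END { if (t != "") printf "%s %.2f\n", t, max }'
}

# Usage: sar -u | max_iowait
```

For the samples listed above this reports the 15:30:02 interval (7.87% iowait), the last sample before the server stopped responding.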
Check kernel hung-task messages in logs
However, when I checked /var/log/messages, I saw the following:
Aug 22 15:38:05 servercore kernel: INFO: task jbd2/vda3-8:250 blocked for more than 120 seconds.
Aug 22 15:38:05 servercore kernel: Not tainted 2.6.32-431.11.2.el6.x86_64 #1
Aug 22 15:38:05 servercore kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 22 15:38:05 servercore kernel: jbd2/vda3-8 D 0000000000000000 0 250 2 0x00000000
Aug 22 15:38:06 servercore kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 22 15:38:06 servercore kernel: Call Trace:
Aug 22 15:38:06 servercore kernel: INFO: task rs:main Q:Reg:1035 blocked for more than 120 seconds.
Aug 22 15:38:06 servercore kernel: Not tainted 2.6.32-431.11.2.el6.x86_64 #1
Aug 22 15:38:06 servercore kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 22 15:38:06 servercore kernel: rs:main Q:Reg D 0000000000000000 0 1035 1 0x00000080
Aug 22 15:38:06 servercore kernel: Call Trace:
Aug 22 15:38:06 servercore kernel: INFO: task queueprocd - qu:1793 blocked for more than 120 seconds.
Aug 22 15:38:06 servercore kernel: Not tainted 2.6.32-431.11.2.el6.x86_64 #1
Aug 22 15:38:06 servercore kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 22 15:38:06 servercore kernel: queueprocd - D 0000000000000000 0 1793 1 0x00000080
Aug 22 15:38:06 servercore kernel: Call Trace:
Aug 22 15:38:06 servercore kernel: Not tainted 2.6.32-431.11.2.el6.x86_64 #1
Aug 22 15:38:06 servercore kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 22 15:38:06 servercore kernel: Call Trace:
Aug 22 15:38:06 servercore kernel: INFO: task httpd:30439 blocked for more than 120 seconds.
Aug 22 15:38:06 servercore kernel: Not tainted 2.6.32-431.11.2.el6.x86_64 #1
Aug 22 15:38:07 servercore kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 22 15:38:07 servercore kernel: httpd D 0000000000000000 0 30439 2223 0x00000080
Aug 22 15:38:07 servercore kernel: Call Trace:
Aug 22 15:38:11 servercore kernel: INFO: task httpd:30482 blocked for more than 120 seconds.
Aug 22 15:38:11 servercore kernel: Not tainted 2.6.32-431.11.2.el6.x86_64 #1
Aug 22 15:38:11 servercore kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 22 15:38:11 servercore kernel: httpd D 0000000000000000 0 30482 2223 0x00000080
Aug 22 15:38:11 servercore kernel: Call Trace:
Aug 22 15:39:54 servercore kernel: INFO: task jbd2/vda3-8:250 blocked for more than 120 seconds.
Aug 22 15:39:54 servercore kernel: Not tainted 2.6.32-431.11.2.el6.x86_64 #1
Aug 22 15:39:54 servercore kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 22 15:39:54 servercore kernel: jbd2/vda3-8 D 0000000000000000 0 250 2 0x00000000
Aug 22 15:39:54 servercore kernel: Call Trace:
Aug 22 15:39:54 servercore kernel: INFO: task flush-253:0:263 blocked for more than 120 seconds.
Aug 22 15:39:54 servercore kernel: Not tainted 2.6.32-431.11.2.el6.x86_64 #1
Aug 22 15:39:54 servercore kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 22 15:39:54 servercore kernel: flush-253:0 D 0000000000000000 0 263 2 0x00000000
Aug 22 15:39:54 servercore kernel: Call Trace:
Aug 22 15:39:56 servercore kernel: INFO: task rs:main Q:Reg:1035 blocked for more than 120 seconds.
Aug 22 15:39:56 servercore kernel: Not tainted 2.6.32-431.11.2.el6.x86_64 #1
Aug 22 15:39:56 servercore kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 22 15:39:56 servercore kernel: rs:main Q:Reg D 0000000000000000 0 1035 1 0x00000080
Aug 22 15:39:56 servercore kernel: Call Trace:
Aug 22 15:42:11 servercore kernel: Clocksource tsc unstable (delta = -8589964877 ns)
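As we can see, every entry contains "blocked for more than 120 seconds", followed by the hint that "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. These are kernel hung-task warnings rather than kernel panics: each one reports a task that has been stuck in uninterruptible (D) state, here waiting on disk I/O (jbd2 is the filesystem journal thread and flush-253:0 is a writeback thread).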
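To see at a glance which tasks were reported and how often, the log can be summarised per task name. This is only a sketch: hung_task_summary is a hypothetical helper, and it assumes the message format shown in the excerpt above.

```shell
# Hypothetical helper: count hung-task warnings per task name.
# Reads syslog-style lines (such as /var/log/messages) from stdin.
hung_task_summary() {
    # Keep only the task name from "INFO: task NAME:PID blocked for more than N seconds"
    sed -n 's/.*INFO: task \(.*\):[0-9][0-9]* blocked for more than [0-9][0-9]* seconds.*/\1/p' |
        awk '{ count[$0]++ } END { for (name in count) print count[name], name }' |
        sort -rn
}

# Usage: hung_task_summary < /var/log/messages
```

Tasks such as jbd2 or flush near the top of the list point towards storage rather than CPU or memory.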
Explanation
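Linux caches file writes in memory as "dirty" pages and lets them accumulate up to a percentage of memory set by vm.dirty_ratio (40% on some older kernels and distributions; check the running value with sysctl vm.dirty_ratio). Once this mark has been reached, the kernel flushes the outstanding data to disk and all following writes become synchronous. The 120 seconds in the log is the kernel's hung-task timeout (hung_task_timeout_secs): a warning is logged whenever a task stays blocked for longer than that. In this case the I/O subsystem was not fast enough to flush the data within 120 seconds, so the tasks waiting on it were reported as hung. As the I/O subsystem responds slowly while more requests keep arriving, dirty data piles up in memory, resulting in the errors above.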
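The amount of data currently waiting to be written can be read from the Dirty and Writeback fields of /proc/meminfo. A small sketch, where dirty_writeback_kb is a hypothetical helper name:

```shell
# Hypothetical helper: print the Dirty and Writeback counters (in kB)
# from /proc/meminfo-style text read on stdin.
dirty_writeback_kb() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2 }'
}

# Usage: dirty_writeback_kb < /proc/meminfo
```

Values that stay high for minutes at a time suggest the disk cannot keep up with the write load.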
Solution
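Setting vm.dirty_background_ratio to 5 and vm.dirty_ratio to 10 (lower than the usual defaults) makes the kernel start writing data back earlier and in smaller batches, which should give the server some headroom.
Change vm.dirty_ratio and vm.dirty_background_ratio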
[root@localhost mercury]# sysctl -w vm.dirty_ratio=10
vm.dirty_ratio=10
[root@localhost mercury]# sysctl -w vm.dirty_background_ratio=5
vm.dirty_background_ratio=5
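Values set with sysctl -w take effect immediately, so no separate commit step is needed (sysctl -p only reloads /etc/sysctl.conf). Confirm the running values with:
[root@localhost mercury]# sysctl vm.dirty_ratio vm.dirty_background_ratio
This should stop the hung-task warnings. However, a reboot would reset these changes.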
So, we could monitor the server for a week or so and after confirming there are no more errors in the messages log, we could make this change permanent by doing the following:
[root@localhost mercury]# vi /etc/sysctl.conf
Add these 2 lines at the bottom:
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
Save and exit by pressing Esc and then typing :wq!
Load the new settings from the file with sysctl -p (a reboot also applies them):
[root@localhost mercury]# sysctl -p
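As an alternative to editing the file by hand, the two entries can be written by a small script so that running it twice does not leave duplicate lines. This is a sketch only: set_sysctl_entry is a hypothetical helper, and the key is used as a regular expression, so the dots match any character.

```shell
# Hypothetical helper: set "KEY = VALUE" in a sysctl-style config file,
# replacing an existing entry for KEY or appending a new one.
set_sysctl_entry() {
    file=$1; key=$2; value=$3
    tmp=$(mktemp) || return 1
    [ -f "$file" ] || : > "$file"
    # Copy every line except an existing assignment of this key.
    grep -v -E "^[[:space:]]*${key}[[:space:]]*=" "$file" > "$tmp" || true
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Usage (as root):
#   set_sysctl_entry /etc/sysctl.conf vm.dirty_background_ratio 5
#   set_sysctl_entry /etc/sysctl.conf vm.dirty_ratio 10
#   sysctl -p
```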
More Explanation:
vm.dirty_background_ratio is the percentage of system memory that can be filled with "dirty" pages (memory pages that still need to be written to disk) before the pdflush/flush/kdmflush background processes kick in to write them to disk. My example is 5%, so if my virtual server has 32 GB of memory, that's 1.6 GB of data that can be sitting in RAM before something is done.
vm.dirty_ratio is the absolute maximum amount of system memory that can be filled with dirty pages before everything must get committed to disk. When the system gets to this point all new I/O blocks until dirty pages have been written to disk. This is often the source of long I/O pauses, but is a safeguard against too much data being cached unsafely in memory.
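The arithmetic behind the 32 GB example can be sketched as follows. dirty_threshold_mb is a hypothetical helper. It applies the percentage to whatever memory figure it is given, whereas the kernel applies it to available memory (free plus reclaimable pages), so the real thresholds are somewhat lower than a calculation on total RAM suggests.

```shell
# Hypothetical helper: approximate dirty-page threshold in MB.
# $1 = memory in MB, $2 = ratio in percent (integer arithmetic, rounds down)
dirty_threshold_mb() {
    echo $(( $1 * $2 / 100 ))
}

# 32 GB of memory = 32768 MB
dirty_threshold_mb 32768 5     # background writeback starts: 1638 MB (about 1.6 GB)
dirty_threshold_mb 32768 10    # writers start blocking:      3276 MB (about 3.2 GB)
```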

Hi,
Please apply the following correction manually.
1. Go to transaction ST03N
2. Change user from 'Administrator' to 'Expert Mode'.
3. Go to Collector and performance analysis -> Performance database
   -> Monitoring database -> Contents
4. Search where further info contains the string "h/2"
like the following monikeys:
  - 'days  h/2'
  - 'weeks  h/2'
  - 'months  h/2'  
5. Double click on each, so that they become red and show ** delete
6. Finally SAVE
This will remove the corresponding database related history up to the deletion date. It will accumulate anew afterwards.
How can I identify the monikey that has to be deleted?
When you open the text of the dump and jump to the point where the error occurred, the failing statement may look like:
"IMPORT HIST2 FROM DATABASE MONI(DB) ID MONIKEY".
To find the right monikey entry causing the dump you can search for the word 'MONIKEY' within the text of the dump. This can be 'days  h/3' or
'tabgrowth  2'...
You can then go back to the procedure above and search where further info contains the monikey that you found in the text of the dump.
997535     DB02: Problems with History Data.
Award points if helpful.
Thanks,
Tanuj

Similar Messages

  • Unable to load youtube videos. Neither browsing nor the youtube app are working, this happened after I updated to iOS6

    Unable to load youtube videos. Neither browsing nor the youtube app is working; the video starts loading but nothing more happens. This issue has been happening since I updated to iOS 6.

    Try this first.
    Reboot the iPad by holding down on the sleep and home buttons at the same time for about 10-15 seconds until the Apple Logo appears - ignore the red slider - let go of the buttons.
    And then try this.
    Go to Settings>Safari>Clear History, Cookies and Data. Restart the iPad. Restart the iPad by holding down on the sleep button until the red slider appears and then slide to shut off. To power up hold the sleep button until the Apple logo appears and let go of the button.
    And ... Are you using the mobile YouTube website?
    http://m.youtube.com/home

  • On my iPad2 the swoosh sound is checked in Settings but neither it nor the jingle when the battery is being recharged now work

    On my iPad2 the swoosh sound is checked in Settings but neither it nor the jingle when the iPad is connected for recharging are working. Any ideas ?

    Ensure that the mute switch is not activated. If it is not, be sure the volume is at a level at which you can hear it. If you're still having issues, test the sounds in other applications. If it is affecting multiple things, you may want to attempt a reset by holding the lock/power button and the home button until you see an Apple logo.

  • Neither LR nor PS CC will recognize raw files from a Nikon D5500 camera, although it is indicated as being supported.

    Neither LR nor PS CC will recognize raw files from a Nikon D5500 camera.

    Drostow wrote:
    I have been using Photoshop CS5 with no problems.  I just bought a Nikon D600 which is not supported by CS5.  I heard that Lightroom 4.2 will support it so I shelled out $199 for that product.  Now I see that if I import a raw file into LR, work on it and then try to send it to PS for more editing I can't do that.  I am told I have to spend another $199 to upgrade to CS6 because Adobe has no plans to have CS5 support my camera.  This is a rip-off.
    Uh...no...you are wrong.
    CS5 can open D600 files if you convert the D600 files using the free DNG Converter 7.3 RC which is available at labs.adobe.com (note DNG Converter 7.2, ACR 7.2 and LR 4.2 offer preliminary support). LR 4.3 and ACR 7.3 RC (both are release candidates hence the RC category) can deal with D600 raw files directly. If you process the D600 file in LR 4.3 RC, you can open that image in CS5 if you render the image in Lightroom 4.3 RC.
    Yes, Adobe will no longer update CS5.x because Adobe started shipping CS6. It's Adobe's policy to cease support for versions of Photoshop that is no longer shipping–Adobe is now shipping CS6. It ain't a rip-off, it's the standard Adobe policy to support CURRENT customers, not former customers.
    You really need to catch up bud...software and new cameras change all the time. Buying a new cameras generally means you'll need to update your software. Foolish you for not knowing that. All of this could be mitigated if Nikon and Canon (and the other main camera brands) adopted some sort of raw file format standards...sadly, they don't which allows people who fail to grasp the reality of the situation to blame Adobe.

  • How can I make my Apple tv remote stop interacting with my MacBook and my iMac? Every time I press the remote, the sound will go up or down on my computers or they will wake up.

    How can I make my Apple tv remote stop interacting with my MacBook and my iMac? Every time I press the remote, the sound will go up or down on my computers or they will wake up.

    Welcome to the Apple community.
    If you don't want to use a remote with your computer the easiest thing to do is just to turn the IR off. (System Preferences/Security/General)
    If you still want to use your computer remote then you must pair your computers remote with the computer AND the Apple TV remote with the Apple TV.
    To pair a remote with a device hold down the menu and FF buttons together for six seconds or until you see a chain icon on screen (best take the computer into another room, or turn it off, when you do this)

  • When im on my apple tv my mac will either start playing music or the volume will go up and down.  Its not on airplay or anything.  Everything work except for this.  thank you please answer ASAP. thanks

    When im on my apple tv my computer will either start playing music, or the voume will go up and down.  Its not on airplay, so i dont know whats wrong... please answer ASAP. Thank you so much!

    The computer is picking up the signal from the remote. Go into system preferences - security - general. Click the padlock and go into advanced. Check disable remote control infrared receiver.

  • Power Manager and Access Connections crash, neither help nor solution from Lenovo!?!

    Hello
    I've already posted here on the forum several weeks ago about the obvious problem that Lenovo has with the latest version of the Power Manager and the Access Connections, which both crash either on startup or while trying to open them.
    So far there has been neither an update, nor a solution, nor an official statement from Lenovo about this problem, which has now existed for over two months!
    When searching the forum I see the same all over again: Unanswered threads, solutions that don't solve anything at all and so on...
    And just if someone here comes up with comments like:
    "work's here" or "just rollback to the old version", what is the point of writing software that doesn't work?? And why not giving ANY SUPPORT AT ALL???!!!
    I really would at least like to see an official statement saying "we are working on it". After buying one of the most expensive notebooks on the market (and in my opinion still one of the best), that would be the least one could expect from an "award winning" company like Lenovo...
    Regards
    Martin Pauli

    Hello, I am a Help Desk Consultant at Rensselaer Polytechnic Institute and currently, we've had many problems with access connections, 4965 agn wireless card, and its drivers.
    Symptoms:
    1) Wireless card is detected by Windows Device Manager
    2) Wireless card is NOT detected by Access Connections, Intel wireless utilities/tests/etc.
    Other related information:
    1) Events with ID 7036 + source: NETw#v## (the # stand for numbers; I can't quite remember which ones, but the one after the v is a 5)
    2) Rollback driver is NEVER an available option
    3) Every failure case with same symptoms came in starting mid September
    4) If you select "show hidden devices" in device manager, there is a huge number of devices, most of which seem to be repeats
    After many hours of debugging and frustrated complaints, I have determined that it could be one of many problems:
    1) Teredo tunnelling: Apparently, atheros/intel/access connections won't find the wireless card if teredo has been enabled or something
    2) Power management: A default setting is "let windows turn the wireless off if there's not enough power" in wireless properties>configure>power management
    3) Driver update combination conflict: between an update on 9/21 and the present time, something has gone horribly wrong
    If anyone has a definitive solution to this, please post.

  • Neither Firefox nor the profile manager will start. In my task manager, the process is using 50% of my cpu but does nothing. I love browsing with firefox, how can I fix this?

    I've tried restarting my computer, then I tried uninstalling and reinstalling, then uninstalling, erasing all firefox files and folders, then restarting and reinstalling. I've also tried opening the profile manager to change my profile. After reinstalling a few times and trying to open the profile manager before launching firefox, I got the "firefox has crashed" window, and the process in the task manager is still using 50% of my cpu. The most recent expunge then re-download pops up a "file is corrupt" error and won't even install the program.

    Kill those processes that you are seeing, including any plugincontainer.exe, only once firefox has stopped running should you try to update it.
    Totally unexpected instances of firefox running may sometimes be the result of malware activity.

  • I updated to FF 7.0.1 and now neither FF nor the profile manager will open, not even in safe mode. Did reinstall, then clean reinstall, still no help.

    FF automatically updated to v7. Since then, it won't open. I've tried opening it with the taskmanager open, and firefox.exe appears under processes for maybe 2 seconds, then disappears. I did a reinstall, then a clean reinstall (after copying my profiles folder to a flash drive). Still no luck. Won't open in safe mode, won't open Profiles Manager. I tried to use my son's FF under his login, and it updated and also refuses to work. This is Windows XP.

    Thanks for this mha007 - I can now open FF with a new profile. Can I copy my settings from the old profile or will this bring over the same problem, maybe a corrupt file. If it would bring the same problem, is there any way I can check which file is corrupt, apart from taking them over one by one?

  • I chged my internet provider and new email address and  cloud is not recognizing this chg.  i deleted my old email acct from ipad and added new acct and since then i cannot send email nor the cloud will recognize the new email address.  do i need to sync

    I changed my internet provider and therefore my email address. I deleted my old email address from my iPad, and now I am not able to send email from the new address, nor will iCloud recognize my new email address, which is now my Apple ID. Since I deleted my old email account, iCloud is not allowing me to change this address in Settings; it still has my old email address listed.

    See Here > Apple ID: Contacting Apple for help with Apple ID account security
              Ask to speak with the Account Security Team...

  • The load process runs in the SAP GUI and the GUI will timeout if someone do

    Hi
    Our main issue: for loading DRW/BKT/Characteristics/Doc Files, I have already tested processing the Idocs in Background mode. All the Idocs failed because the background process on the SAP application server cannot access the doc files. (it has to use saphttp.exe on the client)
    Please help me to solve this problem.
    regards,
    jagadish

    https://jdic.dev.java.net/
    The JDesktop Integration Components (JDIC) project aims to make Java technology-based applications ("Java applications") first-class citizens of current desktop platforms without sacrificing platform independence.
    This demo application demonstrates functionality in the JDIC APIs org.jdesktop.jdic.tray package.
    It creates a tray icon on the desktop (in the System Tray Area for Windows platforms, or in the Notification Area for Unix platforms), with a caption (text), an animated icon, and an associated Swing menu containing icons. It also has a tooltip displayed when the mouse hovers over the tray icon.

  • Neither Reader nor Flash Player will download on my mac

    Every time I try to start up the installation on my computer they always stop at a certain percent--so far it's been 40%, 44%, and 25%
    I can leave my computer for hours and it still won't move.
    I've gone through all the troubleshooting already, and I've made sure my browsers are closed each time
    Help!!

    Try un-installing and then re-installing.
    Adobe Flash Uninstaller
    Adobe Flash Player

  • Indesign CS4 installed in German but the GUI will be English

    Hallo,
    After you install InDesign CS4 (German) on a Windows XP Pro SP3 machine as administrator and then log in as a normal user, the interface language will be English.
    Solutions are:
    1. make the user an administrator
    2. run regedit and then give the key
    hklm\software\adobe\indesign
    all rights
    Why didn't Adobe write such articles in the knowledge base??
    greetings Jürgen L.

    or another solution:
    in registry key - XP
    [HKEY_LOCAL_MACHINE\SOFTWARE\Adobe\InDesign\]
    or Vista 64bit
    [HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Adobe\InDesign]
    set
    "User Interface Locale Setting"=dword:00000002
    and you will get permanent English interface
    - don't work with ID CS2 (in PL version)
    - in CS3 this will cause resetting workspace layout
    - work in CS4 and reset layout ;)
    in CS4 IE - I think you can set ANY language for user interface - 5 is Chinese or something like that, 6 Italy, 10 Polish, 12 Magyar(?), etc. ;)
    robin
    www.adobescripts.com

  • Neither iTunes nor my computer will recognize my iPod

    My iPod has a white screen and will not turn off until the battery dies.
    I suspect it needs reformatting but, when I connect it to my computer, Windows recognizes only the USB Mass Storage Device aspect of the iPod and nothing else...So, it doesn't show up in iTunes for reformatting.
    I haven't seen anything like this before and I can't find any similar discussions. Please help! (I have already completed the first four "r"'s so, please don't suggest them!)

    1. Update iTunes to the latest version. Plug in your iPod. If iTunes still can't recognize it, then in iTunes in the top left corner click help> run diagnostics. On the box that comes up, check the last two things. Click next and it should identify your iPod.
    2. Click on your windows start menu. Type in "services". Click on it and when it pops up, on the bottom of it click on "standard". Now Scroll down to find "Apple Mobile Device" Right click it when you see it and click on "Start". When it has started, close iTunes and replug in your iPod and it should show up.
    3. Check the USB cable
    4. Verify that Apple Mobile Device Support is installed
    5. Restart the Apple Mobile Device Service and verify that the Apple Mobile Device USB Driver is installed.

  • Neither Cloud nor Backup Assistant will backup any of my contacts?

    On Razr M, running the awful 4.4.2 kitkat OS. Everything else backs up fine, but it says there are no contacts to backup.

    Thanks, I was unaware of Google contacts... I thought contacts were only stored on the phone, until manually backed up somewhere. I just found a way to check, and apparently they are in Google. Planning to reset my phone, hoping to regain the battery life and functionality I had before the kitkat update, and wanted to be sure everything was backed up properly.
