[SOLVED] RAID status

Last night I created a RAID5 array using mdadm and 5 x 3TB HDDs. I let the sync happen overnight, and on returning home this evening
watch -d cat /proc/mdstat
returned:
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sda1[0] sdc1[2] sde1[5] sdb1[1] sdd1[3]
11720536064 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
bitmap: 2/22 pages [8KB], 65536KB chunk
unused devices: <none>
which to me pretty much looks like the array sync has completed.
I then updated the config file, assembled the array and formatted it using:
mdadm --detail --scan >> /etc/mdadm.conf
mdadm --assemble --scan
mkfs.ext4 -v -L offsitestorage -b 4096 -E stride=128,stripe-width=512 /dev/md0
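(As an aside, the stride and stripe-width figures above follow directly from the chunk size and the number of data disks; a quick sanity check, assuming the 4 KiB ext4 block size given with -b:)
# stride       = chunk size / block size = 512 KiB / 4 KiB = 128
# stripe-width = stride x data disks     = 128 x 4 = 512   (a 5-disk RAID5 has 4 data disks)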
Running
mdadm --detail /dev/md0
returns:
/dev/md0:
Version : 1.2
Creation Time : Thu Apr 17 01:13:52 2014
Raid Level : raid5
Array Size : 11720536064 (11177.57 GiB 12001.83 GB)
Used Dev Size : 2930134016 (2794.39 GiB 3000.46 GB)
Raid Devices : 5
Total Devices : 5
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Thu Apr 17 18:55:01 2014
State : active
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Name : audioliboffsite:0 (local to host audioliboffsite)
UUID : aba348c6:8dc7b4a7:4e282ab5:40431aff
Events : 11306
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
2 8 33 2 active sync /dev/sdc1
3 8 49 3 active sync /dev/sdd1
5 8 65 4 active sync /dev/sde1
So, I'm now left wondering why the state of the array isn't "clean". Is it normal for arrays to show a state of "active" instead of "clean" under Arch?
Last edited by audiomuze (2014-07-02 20:10:33)

Contrasting two RAID5 arrays - the first created in Arch, the second in Ubuntu Server. Both were created using the same command set (the only differences being the number of drives and the stride/stripe-width optimised for that number of drives). Why the difference in status out of the starting blocks? I've not been able to find anything in the documentation or in mdadm's manpage to explain this. If additional info is required in order to assist, please let me know and I'll provide it.
Thanks in advance for your consideration and assistance.
/dev/md0:
Version : 1.2
Creation Time : Thu Apr 17 01:13:52 2014
Raid Level : raid5
Array Size : 11720536064 (11177.57 GiB 12001.83 GB)
Used Dev Size : 2930134016 (2794.39 GiB 3000.46 GB)
Raid Devices : 5
Total Devices : 5
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Mon May 5 05:35:28 2014
State : active
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Name : audioliboffsite:0
UUID : aba348c6:8dc7b4a7:4e282ab5:40431aff
Events : 11307
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
2 8 33 2 active sync /dev/sdc1
3 8 49 3 active sync /dev/sdd1
5 8 65 4 active sync /dev/sde1
/dev/md0:
Version : 1.2
Creation Time : Sun Feb 2 21:40:15 2014
Raid Level : raid5
Array Size : 8790400512 (8383.18 GiB 9001.37 GB)
Used Dev Size : 2930133504 (2794.39 GiB 3000.46 GB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Update Time : Mon May 5 06:45:45 2014
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Name : fileserver:0 (local to host fileserver)
UUID : 8389cd99:a86f705a:15c33960:9f1d7cbe
Events : 208
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
2 8 33 2 active sync /dev/sdc1
4 8 49 3 active sync /dev/sdd1
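(For reference, a way to confirm that an array really is fully in sync, independent of the clean/active label that mdadm --detail shows, is the md sysfs interface; a sketch, assuming the /dev/md0 name used above:)
cat /sys/block/md0/md/sync_action          # "idle" means no resync, recovery or check is running
echo check > /sys/block/md0/md/sync_action # as root: start a full read/parity consistency check
cat /sys/block/md0/md/mismatch_cnt         # 0 after a completed check means no parity mismatches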

Similar Messages

  • Has anyone upgraded the Ironport ESA to 8.5.6-074 and had the issues of Raid status showing unknown?

Has anyone upgraded the IronPort ESA to 8.5.6-074 and had issues with the RAID status showing unknown? After we upgraded our appliances, our ESA appliances are showing the RAID status as unknown. When we reported the issue to Cisco, we were told there were no known issues. Could anyone please confirm whether you have experienced the same issue?

You should see OPTIMAL - meaning the drives in the appliance are in good health/status:
    myc680.local> version
    Current Version
    ===============
    UDI: C680 V FCH1611V0B2
    Name: C680
    Product: Cisco IronPort C680 Messaging Gateway(tm) Appliance
    Model: C680
    Version: 8.5.6-074
    Build Date: 2014-07-21
    Install Date: 2014-07-29 11:16:34
    Serial #: xxx-yyy1611Vzzz
    BIOS: C240M3.1.4.5.2.STBU
    RAID: 3.220.75-2196, 5.38.00_4.12.05.00_0x05180000
    RAID Status: Optimal
    RAID Type: 10
    BMC: 1.05
There are times post-reboot when you'll get a notification of RAID sub-optimal, meaning that the appliance is running through a health check of its RAID. You should get a notification once the RAID status has returned to OPTIMAL or, as per the older OS revisions, READY:
    myc170.local> version
    Current Version
    ===============
    UDI: C170 V01 FCH1428V06A
    Name: C170
    Description: Cisco IronPort C170
    Product: Cisco IronPort C170 Messaging Gateway(tm) Appliance
    Model: C170
    Version: 7.6.3-019
    Build Date: 2013-06-09
    Install Date: 2014-09-12 13:52:24
    Serial #: xxxxxxD87B39-yyyyyy8V06A
    BIOS: 9B1C115A
    RAID: 02
    RAID Status: READY
    RAID Type: 1
    BMC: 2.01

  • My Duo 12TB says RAID Status: Cannot Access Data

When I started up my computer today, my My Duo 12TB (RAID 1, mirrored) came up with a RAID error. When I ran the WD Drive Utilities program, it gives the status: RAID Status: Cannot Access Data; Drive 1 Status: Online; Drive 2 Status: Online. I have tried unplugging both the power and the USB connections and restarted the computer. What needs to be done to fix this without losing any data on the HDs? Thanks,

     
    Hi, 
    Welcome to the WD Community.
    As a recommendation, please contact WD Support for direct assistance on this case.
    WD contact info: 
http://support.wdc.com/country/index.asp?lang=en
     

  • WSA RAID Status: Degraded

    Hi,
       I Have the next status on WSA
    RAID Status:
    Degraded
    Any idea how to get Optimal ?

    Hello Oscar,
Please try rebooting the appliance and leaving it for a day for the RAID to rebuild. If the RAID still does not rebuild, there might be a disk issue; kindly open a TAC case.
    Regards,
    Puja

  • Disk utility says "RAID STATUS: Disk missing"?

When I boot up OS 10.3.9 there is an error message "detected volume OS X cannot read - initialize, ignore or eject" - then it disappears and the system drive behaves normally. But in Disk Utility it says "RAID STATUS: Disk missing".
    The disk is not partitioned and it will not let me repair permissions. Any ideas?

    Disk Utility monitors the status of the disks in a RAID set. If you see a message indicating that a disk is missing or has failed, try these troubleshooting steps:
    If you are using a striped RAID set, delete the damaged RAID set. Your data may be lost. Be sure to back up your RAID sets and other data regularly.
    If you are using a mirrored RAID set, there may have been an error writing data to the disk. Click Rebuild in the RAID pane of Disk Utility.
    If a problem persists, replace the damaged disk and click Rebuild to rebuild the RAID set.
    Use the First Aid pane to repair the RAID disk, then check the RAID set to see if it still reports an error. If the problem is still present, quit and reopen Disk Utility, select the RAID disk, and click RAID. Check the RAID set to see if it still reports an error. You may need to restart your computer.
    iBook G4   Mac OS X (10.3.9)  
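(If the Disk Utility GUI won't cooperate, the same check can reportedly be run from Terminal with diskutil, as mentioned further down this page; a sketch only:)
sudo diskutil checkraid    # reports the status of AppleRAID sets and their members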

  • [Solved] RAID 5 degraded after 3.13 upgrade

    Hi there,
after upgrading my home server to the latest kernel, 3.13, I found that my RAID 5 had become degraded. One of the drives was kicked out, but I don't know why. The drive seems okay; I've also done a SMART short test, which completed without any errors. The only suspicious-looking error message when upgrading to Linux 3.13 was:
    ERROR: Module 'hci_vhci' has devname (vhci) but lacks major and minor information. Ignoring.
    This is mdstat output:
    [tolga@Ragnarok ~]$ cat /proc/mdstat
    Personalities : [raid6] [raid5] [raid4]
    md127 : active raid5 sda1[0] sdc1[3] sdb1[1]
    5860145664 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
    unused devices: <none>
    smartctl:
    [tolga@Ragnarok ~]$ sudo smartctl -a /dev/sdd
    smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.4-1-ARCH] (local build)
    Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
    === START OF INFORMATION SECTION ===
    Model Family: Western Digital Red (AF)
    Device Model: WDC WD20EFRX-68AX9N0
    Serial Number: [removed]
    LU WWN Device Id: 5 0014ee 2b2cd537a
    Firmware Version: 80.00A80
    User Capacity: 2,000,398,934,016 bytes [2.00 TB]
    Sector Sizes: 512 bytes logical, 4096 bytes physical
    Device is: In smartctl database [for details use: -P show]
    ATA Version is: ACS-2 (minor revision not indicated)
    SATA Version is: SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
    Local Time is: Fri Feb 21 22:26:30 2014 CET
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    General SMART Values:
    Offline data collection status: (0x00) Offline data collection activity
    was never started.
    Auto Offline Data Collection: Disabled.
    Self-test execution status: ( 0) The previous self-test routine completed
    without error or no self-test has ever
    been run.
    Total time to complete Offline
    data collection: (26580) seconds.
    Offline data collection
    capabilities: (0x7b) SMART execute Offline immediate.
    Auto Offline data collection on/off support.
    Suspend Offline collection upon new
    command.
    Offline surface scan supported.
    Self-test supported.
    Conveyance Self-test supported.
    Selective Self-test supported.
    SMART capabilities: (0x0003) Saves SMART data before entering
    power-saving mode.
    Supports SMART auto save timer.
    Error logging capability: (0x01) Error logging supported.
    General Purpose Logging supported.
    Short self-test routine
    recommended polling time: ( 2) minutes.
    Extended self-test routine
    recommended polling time: ( 268) minutes.
    Conveyance self-test routine
    recommended polling time: ( 5) minutes.
    SCT capabilities: (0x70bd) SCT Status supported.
    SCT Error Recovery Control supported.
    SCT Feature Control supported.
    SCT Data Table supported.
    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
    3 Spin_Up_Time 0x0027 164 163 021 Pre-fail Always - 6766
    4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 273
    5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
    7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
    9 Power_On_Hours 0x0032 098 098 000 Old_age Always - 1954
    10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
    11 Calibration_Retry_Count 0x0032 100 100 000 Old_age Always - 0
    12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 273
    192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 6
    193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 266
    194 Temperature_Celsius 0x0022 115 104 000 Old_age Always - 35
    196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
    197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
    198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
    199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
    200 Multi_Zone_Error_Rate 0x0008 100 253 000 Old_age Offline - 0
    SMART Error Log Version: 1
    ATA Error Count: 306 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
    Powered_Up_Time is measured from power on, and printed as
    DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
    SS=sec, and sss=millisec. It "wraps" after 49.710 days.
    Error 306 occurred at disk power-on lifetime: 1706 hours (71 days + 2 hours)
    When the command that caused the error occurred, the device was active or idle.
    After command completion occurred, registers were:
    ER ST SC SN CL CH DH
    04 61 02 00 00 00 a0 Device Fault; Error: ABRT
    Commands leading to the command that caused the error were:
    CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
    ef 10 02 00 00 00 a0 08 22:17:38.065 SET FEATURES [Enable SATA feature]
    ec 00 00 00 00 00 a0 08 22:17:38.065 IDENTIFY DEVICE
    ef 03 46 00 00 00 a0 08 22:17:38.064 SET FEATURES [Set transfer mode]
    ef 10 02 00 00 00 a0 08 22:17:38.064 SET FEATURES [Enable SATA feature]
    ec 00 00 00 00 00 a0 08 22:17:38.064 IDENTIFY DEVICE
    Error 305 occurred at disk power-on lifetime: 1706 hours (71 days + 2 hours)
    When the command that caused the error occurred, the device was active or idle.
    After command completion occurred, registers were:
    ER ST SC SN CL CH DH
    04 61 46 00 00 00 a0 Device Fault; Error: ABRT
    Commands leading to the command that caused the error were:
    CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
    ef 03 46 00 00 00 a0 08 22:17:38.064 SET FEATURES [Set transfer mode]
    ef 10 02 00 00 00 a0 08 22:17:38.064 SET FEATURES [Enable SATA feature]
    ec 00 00 00 00 00 a0 08 22:17:38.064 IDENTIFY DEVICE
    ef 10 02 00 00 00 a0 08 22:17:38.063 SET FEATURES [Enable SATA feature]
    Error 304 occurred at disk power-on lifetime: 1706 hours (71 days + 2 hours)
    When the command that caused the error occurred, the device was active or idle.
    After command completion occurred, registers were:
    ER ST SC SN CL CH DH
    04 61 02 00 00 00 a0 Device Fault; Error: ABRT
    Commands leading to the command that caused the error were:
    CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
    ef 10 02 00 00 00 a0 08 22:17:38.064 SET FEATURES [Enable SATA feature]
    ec 00 00 00 00 00 a0 08 22:17:38.064 IDENTIFY DEVICE
    ef 10 02 00 00 00 a0 08 22:17:38.063 SET FEATURES [Enable SATA feature]
    ec 00 00 00 00 00 a0 08 22:17:38.063 IDENTIFY DEVICE
    Error 303 occurred at disk power-on lifetime: 1706 hours (71 days + 2 hours)
    When the command that caused the error occurred, the device was active or idle.
    After command completion occurred, registers were:
    ER ST SC SN CL CH DH
    04 61 02 00 00 00 a0 Device Fault; Error: ABRT
    Commands leading to the command that caused the error were:
    CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
    ef 10 02 00 00 00 a0 08 22:17:38.063 SET FEATURES [Enable SATA feature]
    ec 00 00 00 00 00 a0 08 22:17:38.063 IDENTIFY DEVICE
    ef 03 46 00 00 00 a0 08 22:17:38.063 SET FEATURES [Set transfer mode]
    ef 10 02 00 00 00 a0 08 22:17:38.063 SET FEATURES [Enable SATA feature]
    ec 00 00 00 00 00 a0 08 22:17:38.062 IDENTIFY DEVICE
    Error 302 occurred at disk power-on lifetime: 1706 hours (71 days + 2 hours)
    When the command that caused the error occurred, the device was active or idle.
    After command completion occurred, registers were:
    ER ST SC SN CL CH DH
    04 61 46 00 00 00 a0 Device Fault; Error: ABRT
    Commands leading to the command that caused the error were:
    CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
    ef 03 46 00 00 00 a0 08 22:17:38.063 SET FEATURES [Set transfer mode]
    ef 10 02 00 00 00 a0 08 22:17:38.063 SET FEATURES [Enable SATA feature]
    ec 00 00 00 00 00 a0 08 22:17:38.062 IDENTIFY DEVICE
    ef 10 02 00 00 00 a0 08 22:17:38.062 SET FEATURES [Enable SATA feature]
    SMART Self-test log structure revision number 1
    Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
    # 1 Short offline Completed without error 00% 1954 -
    # 2 Short offline Completed without error 00% 0 -
    # 3 Conveyance offline Completed without error 00% 0 -
    SMART Selective self-test log data structure revision number 1
    SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
    1 0 0 Not_testing
    2 0 0 Not_testing
    3 0 0 Not_testing
    4 0 0 Not_testing
    5 0 0 Not_testing
    Selective self-test flags (0x0):
    After scanning selected spans, do NOT read-scan remainder of disk.
    If Selective self-test is pending on power-up, resume after 0 minute delay.
    This is my mdadm configuration:
    [tolga@Ragnarok ~]$ cat /etc/mdadm.conf
    ARRAY /dev/md/Asura metadata=1.2 UUID=34bab60a:4d640b50:6228c429:0679bb34 name=Ragnarok:Asura
I've checked all partition tables; everything seems OK. "Error 30[x] occurred at disk power-on lifetime: 1706 hours (71 days + 2 hours)" seems to be a one-time event, which happened at 1706 hours (I don't know why; there was no power loss or anything similar). Other than those smartctl errors, everything seems fine. I've also inspected the drive; no suspicious noises or anything else, it works like the other 3 drives. Am I safe to simply re-add the drive using "sudo mdadm --manage --re-add /dev/md127 /dev/sdd1" and let it re-sync, or should I flag it as failed and then re-add it to the RAID?
    I am using 4x 2TB Western Digital Red drives in a RAID 5, which are about 1 year old and they ran perfectly fine until now. The server is currently shut down until this problem is fixed. I currently got a partial backup of my data (most important ones) and will make a full backup, before attempting a repair. At the moment, I'm still able to access all my data, so nothing's wrong there.
    So, what do you guys think, what should I do?
    Last edited by tolga9009 (2014-09-13 12:48:13)
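(One way to inform that decision is to compare the event counters that mdadm records on each member; a sketch, using the device names from this post:)
sudo mdadm --examine /dev/sd[abcd]1 | egrep 'dev/sd|Events'
# a member whose event count lags far behind the others is stale and will need a full resync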

    Thank you brian for the fast reply. I've backed up all my important data and tried the command. It's not working ...
    [tolga@Ragnarok ~]$ sudo mdadm --manage --re-add /dev/md127 /dev/sdd1
    mdadm: --re-add for /dev/sdd1 to /dev/md127 is not possible
    [tolga@Ragnarok ~]$ lsblk
    NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
    sda 8:0 0 1.8T 0 disk
└─sda1 8:1 0 1.8T 0 part
  └─md127 9:127 0 5.5T 0 raid5 /media/Asura
sdb 8:16 0 1.8T 0 disk
└─sdb1 8:17 0 1.8T 0 part
  └─md127 9:127 0 5.5T 0 raid5 /media/Asura
sdc 8:32 0 1.8T 0 disk
└─sdc1 8:33 0 1.8T 0 part
  └─md127 9:127 0 5.5T 0 raid5 /media/Asura
sdd 8:48 0 1.8T 0 disk
└─sdd1 8:49 0 1.8T 0 part
sde 8:64 0 59.6G 0 disk
├─sde1 8:65 0 512M 0 part /boot/efi
├─sde2 8:66 0 4G 0 part [SWAP]
├─sde3 8:67 0 54.6G 0 part /
└─sde4 8:68 0 512M 0 part /boot
    Out of curiosity, I've compared "mdadm -E" of the corrupted and a healthy drive. Here's the output:
    [tolga@Ragnarok ~]$ diff -u sdc sdd
    --- sdc 2014-02-21 23:28:51.051674496 +0100
    +++ sdd 2014-02-21 23:28:55.911816816 +0100
    @@ -1,4 +1,4 @@
    -/dev/sdc1:
    +/dev/sdd1:
    Magic : a92b4efc
    Version : 1.2
    Feature Map : 0x0
    @@ -14,15 +14,15 @@
    Data Offset : 262144 sectors
    Super Offset : 8 sectors
    Unused Space : before=262064 sectors, after=1167 sectors
    - State : clean
    - Device UUID : 4ce2ba99:645b1cc6:60c23336:c4428e2f
    + State : active
    + Device UUID : 4aeef598:64ff6631:826f445e:dbf77ab5
    - Update Time : Fri Feb 21 23:18:20 2014
    - Checksum : a6c42392 - correct
    - Events : 16736
    + Update Time : Sun Jan 12 06:40:56 2014
    + Checksum : bf106b2a - correct
    + Events : 7295
    Layout : left-symmetric
    Chunk Size : 512K
    - Device Role : Active device 2
    - Array State : AAA. ('A' == active, '.' == missing, 'R' == replacing)
    + Device Role : Active device 3
    + Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
So, I guess my only way to fix this is to remove the faulty drive from the RAID, zero out the superblock and then re-add it as a new drive. Or is there any other way to fix this?
//Edit: I've used "mdadm --detail /dev/md127" and found out that the faulty drive wasn't even listed anymore. So instead of using "re-add", I simply added it as a new drive and it's resyncing now. In about 220 mins, I'll know more! Is there a way to check for corruption after syncing the drives?
//Edit: ^ this worked. My drive probably wasn't kicked after the 3.13 upgrade; I simply noticed it then. The drive seems to have been kicked after ~1700 hours for some unknown reason - I've now disconnected and reconnected all drives to rule out any wiring issues. Since the drive was out of sync, simply re-adding it didn't work. I had to manually add it to the array again, and this caused a resync, which took around 3.5 hours. I think that's okay for a 4x 2TB RAID 5 array. Everything is working fine again, no data corruption, nothing. I'll mark it as solved.
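(For anyone landing here later, the sequence described above corresponds roughly to the following commands; a sketch using this thread's device names, not a verified transcript:)
sudo mdadm --manage /dev/md127 --fail /dev/sdd1 --remove /dev/sdd1   # only needed if the member is still listed
sudo mdadm --zero-superblock /dev/sdd1                               # wipe the stale metadata
sudo mdadm --manage /dev/md127 --add /dev/sdd1                       # add it back as a "new" drive, triggering a resync
watch -d cat /proc/mdstat                                            # follow the rebuild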
    Last edited by tolga9009 (2014-09-13 12:59:22)

  • TS130 RAID Status

    I am running Ubuntu 12.04 LTS Server on a headless TS130. I've got my operating system on a single SSD and my data on a RAID Array consisting of two 2TB drives in a RAID 1 configuration.
Everything is running perfectly; however, I am wondering how I would know if there is a problem with the RAID array... for instance, if one of the drives failed and needed to be swapped out, how would I know? Ubuntu recognizes the drives as a RAID array, but there is not a "driver" or a utility for checking the status of the hardware RAID array.
    Thanks in advance for your help!

I believe upon boot it will flash a screen, looking similar to your BIOS, that gives the status of your RAID configuration. It only holds for a few seconds, but the text is either green, yellow, or (I assume) red, based on the status.
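(If Ubuntu is actually assembling the mirror with mdadm or dmraid rather than a true hardware controller, which is only an assumption about this setup, the usual software checks also apply; a sketch:)
cat /proc/mdstat                 # a degraded array shows an underscore in the [UU] field
sudo mdadm --detail /dev/md0     # assumes an md device name; shows per-member state and failure counts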

  • RAID status in Conky

    I wrote a tiny script to monitor the status of RAID arrays. It is written in awk and is meant to be used with conky. I have this in my conky config file:
    RAID: ${execi 60 gawk -f path-to-the-script /proc/mdstat}
    #!/bin/gawk -f
    # This script checks the status of RAID arrays from /proc/mdstat
    /_/ { state = "Warning!"; exit }
    /blocks/ { state = $NF }
    END { print state }
If everything is working correctly the output will be: "RAID: [UU]". It means that both drives are up and running.
If there is something wrong with a drive it will give an error message: "RAID: [WARNING!]".
Maybe someone will find this useful.

    Thanks!
    I have multiple RAIDs so I used the following command to check the status of a specific RAID:
    ${execi 60 cat /proc/mdstat | grep md126 -A 1 | gawk -f path-to-the-script --}
    (replace md126 with the array you want to check)

  • Monitoring Raid Status on SRE-910 module in 3945 router

I'm at my wit's end here. We just recently purchased a 3945 ISR G2 router and have an SRE-910 module (with two hard drives) configured in a RAID 1. We are running a stand-alone version of ESXi on the service module and I'm trying to figure out how to monitor the status of the RAID on the drives (along with other health issues). SNMP has revealed nothing so far, and even opening a support case to ask which MIBs to use has proved fruitless. All the documents I find on monitoring the modules say to use LMS, which is now Cisco Prime. I've downloaded the trial copy, put in the SNMP settings and scanned the router. I get device results and it shows that I have the SRE-910 module installed, but I get no other configuration / device information from the module itself.
I tried to create a new monitoring template using the NAM health template as the base (which I'm assuming is the correct template). Unfortunately, when I actually try to deploy the template against the discovered router, I get an 'Unexpected end of list' error, which makes me assume I'm still doing something wrong. Is ANYONE out there monitoring the device health of their service modules in a 3945 router? What am I missing????

    Oh, and by the way, I tried to monitor this through the ESXi host / vCenter, but even after pulling one of the hard drives from the module, neither software detected that there was an issue.  That is why I'm assuming that this will have to be monitored on the router side somehow.

  • [SOLVED] vim status line invisible/wrapped in terminal unless I resize

    Hi all,
Whenever I open a file in Vim within a terminal (urxvt, Terminator, xterm) the status line at the very bottom is either invisible or wrapped onto a new line (i.e. "bot" or "All" will span a line) until I manually resize the window by dragging an edge - this seems to fix the issue (and Vim continues to display correctly even if I continue resizing). I tried changing the window geometries (in .Xresources for urxvt, and in Terminator via --geometry) without luck. Changing fonts (type, size) doesn't help either.
    I'm running Openbox on a recent Retina MacBook Pro with nVidia drivers, with infinality-bundle installed.
The same behavior occurs under Awesome, and I tried various Openbox themes, all with the same result. So perhaps it's a bash or nVidia issue.
    gVim works perfectly though.
    Any thoughts?
    Thanks!
    Last edited by iamjerico (2014-03-26 19:54:12)

The status line - the very bottom line - is either (1) below the visible portion of the terminal window, and remains so even if I page down repeatedly (urxvt), or (2) visible, but the text appearing on it, like "bot" or "All" that is normally visible in the lower right corner, wraps around (Terminator). That is, I'll see "b" in the bottom right corner and then "ot" on the bottom left but a line lower. In either case, if I manually resize the terminal window, it "catches up" and everything appears correctly.
Update: when running Vim without opening a file (just "vim" from the command line), in Terminator ALL of the visible startup text is wrapped (i.e. "Vim is open source and freely distributable" wraps onto a second line). In urxvt, that text appears normally but the status line is not visible. In both cases, a window resize corrects the display.
    Very strange.

  • [Solved] dwm status bar

    I am learning about DWM...slowly...
    So, I have conky piped to dzen2 as per the screenshot;
    But how do I now remove the 'dwm-6.0' from the status bar?
    Thanks for your patience.
    Last edited by ancleessen4 (2013-04-13 11:05:17)

    dzconky;
    #!/bin/sh
    FG='#aaaaaa'
    BG='#333333'
    FONT='-*-inconsolata-*-r-normal-*-*-90-*-*-*-*-iso8859-*'
    conky | dzen2 -e - -h '14' -w '1100' -ta r -x '750' -fg $FG -bg $BG -fn $FONT &
    conky;
    out_to_console yes
    #out_to_x no
    background no
    update_interval 2
    total_run_times 0
    use_spacer none
    TEXT
    #Arch 64 :: $kernel :: $uptime_short :: ^fg(\#60B48A)$mpd_smart ^fg():: ${cpu cpu1}% / ${cpu cpu2}% :: cpu0 ${execi 5 sensors | grep "Core 0" | cut -d "+" -f2 | cut -c1-2}C cpu1 ${execi 5 sensors | grep "Core 1" | cut -d "+" -f2 | cut -c1-2}C :: ^fg(\#DC8CC3)$memperc% ($mem) ^fg():: ${downspeed enp0s20} ${upspeed enp0s20} :: ${time %a %b %d %H:%M}
    $kernel :: Updates: ${if_match ${execi 1800 (checkupdates | wc -l)} == 0}${color2}up to date${color}${else}${color3}${execi 1800 (checkupdates | wc -l)} new packages${color}${endif} :: ^fg(\#60B48A)$mpd_smart ^fg():: ^fg(\#DC8CC3)$memperc% = $mem ^fg():: ${time %H:%M} :: ^fg(\#8CD0D3)${execi 300 /home/neil/bin/weather.sh "EUR|LU|LU001|Luxembourg"}
    I have tried changing '-x' and '-w' but no change to the layout...
    Just tried to recompile with '-x=850'...to try and push dzen output to the right but no change...
    makepkg -efi --skipinteg
    I am sure this is a simple solution that I am missing...
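(If the leftover 'dwm-6.0' text is just dwm falling back to its default because no root window name is set, which is how stock dwm behaves, blanking the root window name may be enough; a guess, not tested against this config:)
xsetroot -name ""    # dwm prints the root window name in its built-in status area; empty it so only the dzen2 bar shows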

  • [SOLVED] dwm-status.sh, possible to include scripts within the script?

I have 'dwm-status.sh' printing some basic info (mpd, vol, date) into the status bar. Is it possible to make it include e.g. gmail, weather, and battery scripts in a similar way that conky would? (execpi 60 python ~/.scripts/gmail.py)
Also, off topic: how can I see which other special characters represent which icons in the Ohsnap font? e.g. "Ñ" "Î" "¨" "¹" "ê" "í" "È" represent cpu, memory, hdd, vol, clock.... how about the rest?
    dwm-status.sh:
#!/bin/sh
icons=("Ñ" "Î" "¨" "¹" "ê" "í" "È")
getMusic() {
    msc="$(ncmpcpp --now-playing '{%a - %t}|{%f}')"
    if [ ! $msc ]; then
        echo -ne "\x0A${icons[4]}\x01 music off"
    else
        echo -ne "\x0A${icons[4]}\x01 ${msc}"
    fi
}
getVolume() {
    vol="$(amixer get Master | egrep -o "[0-9]+%" | head -1 | egrep -o "[0-9]*")"
    echo -ne "\x0A${icons[5]}\x01 ${vol}"
}
getDate() {
    dte="$(date '+%b %d %a')"
    echo -ne "\x0A${icons[1]}\x01 ${dte}"
}
getTime() {
    tme="$(date '+ %H:%M')"
    echo -ne "\x0A${icons[6]}\x01 ${tme}"
}
while true; do
    xsetroot -name "$(getMusic) $(getVolume) $(getDate) $(getTime)"
    sleep 1
done
    Last edited by Winston-Wolfe (2013-06-26 13:10:11)

    Thanks guys, I've managed to get my notification scripts printing by simply not using execpi:
getMail() {
    mai="$(~/.scripts/gmail.py)"
    echo -ne "\x0A${icons[3]}\x01 ${mai}"
}
    But like you said, I now have a calling frequency challenge.
    Trilby, that looks great. I've got it partially working, but I'm at a loss on how to go about printing the variables in the loop for those blocks that happen periodically.
    otherwise your email indicator (for example) will only pop up for one second every 5 minutes.
    ^ exactly what's currently happening.
Is there something that just prints the variable, without calling for an update the way $(getMail) does, that I could put into the loop?
let loop=0
while true; do
    # stuff that happens every second here
    xsetroot -name "$(getMusic) $(getVolume) $(getDate) $(getTime)"
    if [[ $loop%60 -eq 0 ]]; then
        # stuff that happens every minute here
        xsetroot -name "$(getMusic) $(getMail) $(getVolume) $(getDate) $(getTime)"
    fi
    if [[ $loop%300 -eq 0 ]]; then
        # stuff that happens every 5 minutes here
        xsetroot -name "$(getMusic) $(getMail) $(getWeather) $(getVolume) $(getDate) $(getTime)"
        let loop=0 # this prevents an eventual overflow
    fi
    let loop=$loop+1
    sleep 1
done
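(One way around that is to cache the slow items in shell variables and refresh them only when the counter wraps, so the one-second update can keep printing the last known value; a sketch with hypothetical mail/weather variable names:)
loop=0
while true; do
    if (( loop % 300 == 0 )); then weather="$(getWeather)"; fi   # refresh every 5 minutes
    if (( loop % 60 == 0 )); then mail="$(getMail)"; fi          # refresh every minute
    xsetroot -name "$(getMusic) ${mail} ${weather} $(getVolume) $(getDate) $(getTime)"
    loop=$(( (loop + 1) % 300 ))    # wrap the counter to avoid overflow
    sleep 1
done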
    Last edited by Winston-Wolfe (2013-06-26 08:54:47)

  • RAID status: degraded or failed? is there help?

We are using Mac OS X Server 10.4.3. When we recently restarted the servers we noticed that 2 drives on each of the servers have yellow warning lights showing in the monitoring software, but not on the actual drives themselves. We have 2 servers, one a G4 and the other a G5. The G4 has 2 drives and both are registering yellow; the G5 has 4 drives with 2 registering yellow. The monitor says the drives are "degraded" but everything seems to be transferring and backing up fine. Should we do a backup and use Disk Utility to repair the degraded/damaged drives?
    Mac OS Server   Mac OS X (10.4.3)   Processor 2 GHz G5 Memory 2 GB

You must have RAID 1 mirrored drives set up then, and are running from the working disk of the mirrored pair.
Of course you should fix these types of problems/failing disks.
Having recent backups is always a good thing. A RAID doesn't save you from human errors, but it should prevent "downed" servers.
You're supposed to be able to repair a degraded mirror from within Disk Utility, but I have only done it from the CLI (Terminal) with diskutil.
If you do a: sudo diskutil checkraid
What do you get?
If the servers were updated from 10.3.x you can (if not done already) also update the RAID version from the older ver. 1 to ver. 2, available from 10.4.0 onwards, using the command:
    sudo diskutil convertraid

  • (solved)raid 0 questions

I have two 40 GB drives. The first has my /, /boot, and /home; the other mostly my music.
I would like to combine the 2 drives and was wondering how to go about this without a reinstall, if possible.
Can I combine the 2 the way they are now? Can I combine them if I removed all the data off the second? Or do I need to do a clean install?
    Last edited by somairotevoli (2007-09-21 06:26:41)

    here's my fstab
    # /etc/fstab: static file system information
    # <file system> <dir> <type> <options> <dump> <pass>
    none /dev/pts devpts defaults 0 0
    none /dev/shm tmpfs defaults 0 0
    /dev/md0 / ext3 defaults,data=journal 0 0
    /dev/md2 swap swap defaults 0 0
    /dev/md1 /boot ext3 defaults 0 0
    /dev/md3 /home ext3 defaults,data=journal 0 0
    /dev/cdrom /mnt/cd iso9660 ro,user,noauto,unhide 0 0
    /dev/dvd /mnt/dvd udf ro,user,noauto,unhide 0 0
    /dev/fd0 /mnt/fl vfat user,noauto 0 0
    and here's my menu.lst
    # (0) Arch Linux
    title Arch Linux [/boot/vmlinuz26]
    root (hd0,0)
    kernel /vmlinuz26 root=/dev/md0 ro md=0,/dev/sda3,/dev/sdb3 vga=792 rootflags=data=journal
    initrd /kernel26.img
and yet the output of tune2fs -l /dev/md0 still says
    Default mount options: (none)
    I was thinking maybe I need to force a filesystem check on it to make it stick, yet
    shutdown -F -r now
    or
    touch /forcefsck
do nothing after a reboot. No file check gets run.
I built my md devices according to the wiki on RAID/LVM (skipping the LVM part), using RAID 0 for / and /home and RAID 1 for /boot and swap.
My RAID setup isn't the problem here, as all works as it should... I just don't understand why the full journal does not want to take.
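(Worth noting: the "Default mount options" field reported by tune2fs -l lives in the ext3 superblock and is independent of whatever fstab or rootflags pass at mount time, so data=journal from fstab will never show up there. If the goal is to store it as a superblock default, something like the following should do it; a sketch, not verified on this setup:)
tune2fs -o journal_data /dev/md0                      # record journal_data as a default mount option in the superblock
tune2fs -l /dev/md0 | grep 'Default mount options'    # should now list journal_data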
    Last edited by somairotevoli (2007-09-20 20:17:31)

  • [Solved] RAID 0 with Arch Linux and Intel Matrix Storage Manager

I just bought a brand new Dell Studio desktop and it has 2 x 500 GB HDDs which I would like to run in RAID 0. I've set up the BIOS to run the RAID 0. Now I want the Arch Linux installation to recognise the volume. I followed the guide on the wiki page
    http://wiki.archlinux.org/index.php/Ins … _Fake-RAID
but /dev/mapper doesn't show anything after modprobing the modules and installing dmraid. My controller is an Intel Matrix Storage controller, so which module should I modprobe, since sata_sil isn't the right one?
    Regards
    André
    Last edited by fettouhi (2008-12-15 10:57:18)

You have a separate /boot partition, so your menu.lst must say:
    root (hd0,0) # the partition including /boot
    kernel /vmlinuz26 ... # without leading /boot
    initrd /kernel26.img # without leading /boot
    (in grub, / is the root of the partition that was set with the root command, your vmlinuz26 and kernel26.img are in (hd0,0)/, not in (hd0,0)/boot/)
Did you install from the core CD or from FTP?
There was a dmraid update a few weeks ago that modified the device names that dmraid creates:
it adds a "p" before the number of the partition (your iws...Volume0 doesn't change, but your iws...Volume0# changes to iws...Volume0p#).
If you installed from the core CD, you will only need to add this "p" when you upgrade dmraid (or the full system)
(you will then need to edit both /boot/grub/menu.lst and /etc/fstab).
If you installed from FTP, you need to add this "p" now, as you installed the latest version of dmraid.
