RAID 5 Set Suddenly NOT VIABLE!

I'm running an early 2008 Mac Pro with an Apple Raid Card with 4 internal, 1TB drives formatted into a single Raid 5 set. The raid card has had numerous issues in recent years, mainly drives that would unexpectedly drop out of the Raid and list as "roaming".  In those past cases, Apple had me shut down the computer, remove the drive sled of the problem drive, wait a few minutes and replace the drive then restart.  On one occasion, the Raid set was back to normal, on other occasions the drive was seen and I would have to assign it as a spare and rebuild the set.  Everything would work fine until the next time.
In the last few days, I discovered the Mac Pro shut down, after I had left it idle for a while with the screen asleep. When I tried to restart, I was greeted with a blank grey screen, no Apple, no anything.  I tried booting from CDs and external Boot Drives with no luck.  This morning I was finally able to boot onto an external drive and discovered my Raid 5 set is now "NOT VIABLE!"
ALL drives are listed "Verified" and "Good." Bays 1 and 2 are "assigned" while Bays 3 and 4 are listed as "Roaming."  The Volume is showing "Degraded" and the Raid Set is listed as "Not Viable."  I have tried shutting down and removing all the drives then reinserting them (in their original slots) and restarting, but nothing changes.
Is there any way I can get these two drives to assign back to the Raid Set WITHOUT losing all my data?  Is there any way to recover all my data?  Am I hosed here and everything is a loss?
Any help will be greatly appreciated.
Thanks in advance.
Brett B.
Here is a screen capture of what Raid Utility reported:

In my experience, Apple RAID  in general (with RAID card or without) does not tolerate a very high error-retry rate from the Drives.
I had a mirrored RAID that repeatedly became Degraded, and decided to take strong action. I made sure I had two Backups, then pulled out the Drives and erased them with Write Zeroes, one pass. One of the drives had more than 10 blocks that needed to be "spared", so Initialization Failed. I ran the procedure multiple times and found that 30 blocks had needed to be spared. If any of these had been marginal in daily use, it could have caused the drive to get (what appears to be) stuck doing dozens to hundreds of retries, but still not throw an I/O Error that would bring the system to a Halt.
Google did a very large study and concluded that there is a very high probability that drives with multiple errors will end up being replaced within about six months. They attributed this to a cascade of additional errors that often followed the first group detected, leading to the drive being declared unusable.
In my opinion, just reseating the drive (combined with rebuilding, which does re-write the data blocks) may not be strong enough medicine to fix a recurring problem. A drive may repeatedly be doing multiple retries on some data blocks, or may have Bad Blocks.
I suggest you Zero the troublesome drives (which takes many hours each). If Disk Utility finds more than 10 blocks that need to be spared, or if the drive runs out of available spares, it will report "Initialization Failed!" Consider such a drive not good enough for use in a RAID and replace it with a new one. Or just rebuild your RAID with different (e.g., larger) new drives.
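As a rough illustration of why the set in the question reports "Not Viable" rather than merely "Degraded": RAID 5 keeps one drive's worth of parity, so it survives one missing member but not two. The sketch below only models that rule. The function name and the state labels are made up for the example and are not output from RAID Utility.

```python
# Toy model of RAID-level fault tolerance; not how the Apple RAID card decides.
# How many members each level can lose and still serve data.
TOLERATED_LOSSES = {
    "RAID 0": lambda members: 0,            # striping only, no redundancy
    "RAID 1": lambda members: members - 1,  # every member is a full copy
    "RAID 5": lambda members: 1,            # one drive's worth of parity
}

def set_state(level: str, members: int, missing: int) -> str:
    """Classify a set as optimal, degraded, or not viable (hypothetical labels)."""
    if missing == 0:
        return "optimal"
    if missing <= TOLERATED_LOSSES[level](members):
        return "degraded"
    return "not viable"

print(set_state("RAID 5", 4, 1))  # one roaming drive
print(set_state("RAID 5", 4, 2))  # two roaming drives, as in the question
```

With Bays 3 and 4 both roaming, the four-drive set is past what parity can cover, which is why getting at least one of those drives accepted back matters more than anything else.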

Similar Messages

  • P5800 speaker set suddenly not working

    Hi all,
    I have a CL Inspire P5800 speaker set. I went to turn it on today and got nothing. I have checked the output of the power supply and that is good. I checked and rechecked the control wheel connection to the sub and the power connector to the sub but still get nothing when I turn it on.
    Is there a problem here that someone else has had or are my 2 month old speakers dead ?
    Cheers for any help.

    The remote still clicks when I move the dial. Also, I have stuck it with double sided tape to the side of my desk so it has not been knocked at all. I will give it a try, but I don't think the remote is the problem.
    Cheers

  • Possible to increase the total size of a software raid set?

    Hi,
    I need to increase the size of a software raid set which is internally in one of the Xserves - originally it had 2x 400G drives.
    I've swapped drives into the raid so they are both now 500G drives.
    diskutil info drive XX (ie raid set) shows that the raid is still a 400G raid volume as expected.
    Question is - can I grow the Raid Total Size to use all the capacity of the member volumes?
    I guess the alternative is to remove the drives from the raid, enable raid on one of them and then add the other drive as a member... but this would mean I would have to take the raid offline.
    The raid sets are not my system disks.
    Any clues?
    TIA
    Campbell
    XServes   Mac OS X (10.4.5)   12 Macs & far too many PCs

    Thanks for your help.
    I rebuilt the array using enableRaid. Worked fine with Raid Volume offline for approx 5 minutes - although I had to degrade the array (and go bare with no mirror) twice in the process.
    In case anyone is interested, this is what the process was:
    1. Swap a larger drive into the Raid array (say, disk 2) and let the array rebuild.
    2. Disable file services.
    3. removefromraid disk 2 (leave it mounted).
    4. Unmount the raid set and eject the remaining drive (say, disk 1). Remove that drive; it's the only original backup.
    5. enableRaid on disk 2.
    Providing enableRaid is ok:
    6. Insert a fresh larger size drive in the place of the removed drive (disk 1).
    7. Unmount the new drive and addToRaid.
    8. (Rebuild array). Providing rebuilding is ok:
    9. Start file services again.
    Data, permissions & ACLs were all intact. So now users can fill the rest of the raid up with more MP3s this week <sigh>
    Maybe not the best approach but it worked well for me.
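The steps above can be sketched as a small model, mainly to show where the array runs without a mirror. It assumes a mirror's size is fixed when the set is created (which matches diskutil still reporting 400G after the drive swap). The `Mirror` class and its methods are hypothetical, and no diskutil commands are run.

```python
# Toy model of the procedure above; no diskutil commands are run.
# Assumption: a mirror's size is fixed when it is created, at the size of
# its smallest member at that time.
class Mirror:
    def __init__(self, member_sizes_gb):
        self.members = list(member_sizes_gb)
        self.size_gb = min(self.members)   # fixed at creation

    def replace_member(self, index, new_size_gb):
        # Swapping in a bigger drive and rebuilding does not grow the set.
        self.members[index] = new_size_gb

    def remove_member(self, index):
        return self.members.pop(index)

    def add_member(self, size_gb):
        self.members.append(size_gb)

    @property
    def degraded(self):
        return len(self.members) < 2

old = Mirror([400, 400])
old.replace_member(1, 500)     # step 1: swap in a 500G drive, rebuild
assert old.size_gb == 400      # still a 400G set
disk2 = old.remove_member(1)   # step 3: old set is now degraded
assert old.degraded            # step 4: the last 400G disk is pulled as backup

new = Mirror([disk2])          # step 5: new set created on the 500G drive
assert new.degraded            # single member until another drive is added
new.add_member(500)            # steps 6-8: add a fresh 500G drive, rebuild
print(new.size_gb, new.degraded)
```

The two `degraded` assertions correspond to the two points in the process where there was no mirror.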
    XServes   Mac OS X (10.4.5)   12 Macs & several hundred too many PCs

  • Problem with raid set on xserve

    Hi to all!
    I hope you can help me with a big problem that I have with an Xserve at work.
    Yesterday afternoon I came in to work and my colleagues told me that the web pages were not loading. I tried to connect with ARD but the server was not answering. I had to go to the building where the server is racked, and when I connected a monitor it showed a folder icon with a question mark.
    Since this had happened once before on my office Mac, I booted from the installation DVD to see if I could repair permissions and/or the disks with Disk Utility. To my surprise (and terror) it does not show any volume or disk except the DVD.
    I opened RAID Utility from the menu and there I could see the disks, but the RAID sets do not appear... Since we installed the machine we have had a RAID 5 with 3 disks and 2 volumes on this RAID set.
    Then, from the Terminal, "diskutil list" shows no volume on any disk.
    I suppose that something has happened to the volumes and the RAID cannot be mounted...
    Is there any way to mount the volumes again? Something similar to the "fixmbr" that I believe exists in other OSes to restore the volume tables...
    Is there any way to mount the RAID again without losing the information on the disks?
    Do you know what could be happening?
    PS: It is one of the last Xserve models, with Mac OS X 10.6.8. I have already tried resetting the PRAM and it keeps doing the same.

    Server Monitor doesn't say anything. It doesn't pick up any server. I tried to add the XServe in Server Monitor, but it keeps waiting for a reply after I have put in the username, password and IP address.
    It is an XServe with a Mirror software RAID on the 2 x Internal Hard Drives within the XServe
    I am not using the SAN software
    There have been reports of faulty latches on the Apple Drive Bay Modules. Do you think this could be causing the units to dismount?

  • Expanding XServe RAID set - stuck at 99%

    We have an XServe RAID.  One side had a RAID 5 set consisting of 6x500GB HDD's.  We came to within 60Gb of filling it up and therefore initiated expansion of RAID set to 7th drive on Saturday.  This was progressing 1% / 40 mins.  We calculated that it would not complete before Monday when users would start hammering it, but after testing it, it seemed to be working perfectly fine while expanding.  Writing/reading speed did not seem to be noticeably affected and we decided that all seemed well.
    Monday started off well.  Not a single user noticed reduced speed or any other problems - this was until about 3pm when the file server hosting AFP & SMB crashed!  Not good!  Upon restart, the RAID set would not mount again.
    RAID Admin shows the expansion at 99%.   I was hoping that the RAID set could be mounted again once it completed to 100%, but it does not seem like this will be happening anytime soon - it has been stuck on 99% for about 6 hours now and I don't know when/if it will ever finish.  (Previously 1% every 40mins)
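For a sense of scale, here is the arithmetic behind the rate quoted above, assuming it stayed constant at 1% every 40 minutes. The start time is not given in the post, so the example only computes durations; the function name is made up for the illustration.

```python
# Back-of-the-envelope estimate from the rate quoted in the post.
MINUTES_PER_PERCENT = 40

def minutes_remaining(percent_done: float) -> float:
    """Minutes left at a constant rate; real expansions need not be linear."""
    return (100 - percent_done) * MINUTES_PER_PERCENT

total_hours = minutes_remaining(0) / 60
print(f"full expansion: about {total_hours:.1f} hours ({total_hours / 24:.1f} days)")
print(f"last 1%: about {minutes_remaining(99):.0f} minutes")
```

At that pace the final 1% would be expected to take around 40 minutes, so six hours at 99% is well outside the earlier rate.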
    The fibre channel preference pane shows "Link Established" to the XServe RAID, similarly the RAID Utility shows "Link Up" to server.  The RAID, however, does not appear in Disk Utility - even after restarts.
    Does anyone out there have experience expanding a RAID Set on an XServe RAID?  Does it usually stall at 99%?  How long does it take to finish once it has reached 99%?
    Is there anything I can do to try and get the RAID Set mounted before it finishes its expansion?  It's now 8pm here (GMT+8) and users have finished for the day, but we only have about 12 hours before it needs to be up and running again!
    I am out of ideas to try.  Any help would be greatly appreciated.
    Thanks

    Curiously, more output from the above-mentioned 3rd party tool gives:
    $ ./xserve-raid-info
    Use of uninitialized value in string at ./xserve-raid-info line 655.
    Use of uninitialized value in string at ./xserve-raid-info line 655.
    Use of uninitialized value in string at ./xserve-raid-info line 655.
    Use of uninitialized value in string at ./xserve-raid-info line 655.
    Name: Xserve-RAID
    States:
      RAIDs: optimal
      Components: optimal
      Fibre Channel: optimal
      Network: warning
    Warnings:
      The top network link is down.
    Upper Controller Info:
      Status: ok
      Firmware Version: 1.5.1f2/50
      Up Time: 262 days 1 hr
      Temperature: 75.2 deg F
      Write Cache: disabled
      Prefetch Size: 8 stripes (512 KB/disk)
      Fibre Channel: link up
        Topology: arbitrated loop
        Speed: 2Gb/sec
      Network: link down
        IP Address: 192.168.30.11
        Subnet Mask: 255.255.255.0
    Lower Controller Info:
      Status: ok
      Firmware Version: 1.5.1f2/50
      Up Time: 262 days 1 hr
      Temperature: 82.4 deg F
      Write Cache: disabled
      Prefetch Size: 8 stripes (512 KB/disk)
      Fibre Channel: link up
        Topology: arbitrated loop
        Speed: 2Gb/sec
      Network: link up
        IP Address: 192.168.30.21
        Subnet Mask: 255.255.255.0
    In RAID Admin, I've added systems using 192.168.30.11 (supposedly - link down).
    I can NOT connect using 192.168.30.21 - it says incorrect password even though I know I am entering the correct one...

  • Apple RAID Utility Raid 0 one disk roaming making raid set not viable

    Hello!
    I have a problem with RAID 0. In the picture below you can see that one drive was set to missing and RAID set RS1 has been set to non viable. However all four drives(Bay1-4) are set to be verified with status good. However as can be seen from the picture Bay 3 is set to status: Roaming
    I don't know what I can do here. There was some talk on other discussions but nothing that helped me
    Thanks to anyone that read this and try to help me. If you need more detail I will be more than happy to give them to you. 
    http://shrani.si/f/G/Z9/1KO4CaVc/screen-shot-2012-10-17-a.png
    http://shrani.si/f/41/yG/2o1mIfV/screen-shot-2012-10-17-a.png
    http://shrani.si/f/3p/v/35kiFJyD/screen-shot-2012-10-17-a.png
    http://shrani.si/f/1m/lg/KcXG8j/screen-shot-2012-10-17-a.png
    http://shrani.si/f/1Q/ET/2UkxU4ml/screen-shot-2012-10-17-a.png
    http://shrani.si/f/R/ma/6jYOaty/screen-shot-2012-10-17-a.png

    An error with the drive in bay 3 threw it out of the RAID, breaking the Striped RAID. Your data will be very difficult to recover, since the data are striped across several drives, and one of them is no longer part of that RAID set.
    You now need to re-initialize the RAID set (which destroys what is left of the data) using the same or different drives, and restore the data from your backups if desired.
    I recommend writing zeroes to every block on the troublesome drive before even thinking of returning it to service.
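To see why a striped set cannot survive a lost member, here is a minimal sketch of round-robin striping. The four-drive count matches the bays in the question; the stripe mapping is the textbook scheme and is only an assumption about how this particular card lays data out. Both function names are hypothetical.

```python
# Textbook round-robin striping; the real on-disk layout of the Apple RAID
# card is not documented here and may differ.
def drive_for_stripe(stripe_index: int, drive_count: int) -> int:
    """Return which drive (0-based) holds a given stripe."""
    return stripe_index % drive_count

def readable_stripes(total_stripes: int, drive_count: int, lost_drive: int):
    """Stripes that remain readable after one drive drops out."""
    return [s for s in range(total_stripes)
            if drive_for_stripe(s, drive_count) != lost_drive]

# 12 stripes over 4 drives, with the drive in bay 3 (index 2) gone.
print(readable_stripes(12, 4, lost_drive=2))
# → [0, 1, 3, 4, 5, 7, 8, 9, 11]
```

Every fourth stripe is missing, so any file longer than a few stripes has holes in it, and there is no parity or mirror copy to fill them.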

  • Non viable RAID set error

    This is one of those problems that I don't know where to seek help from but here goes... I'm pretty much hosed anyway...
    Since August 09 I have been running a RAID set consisting of 2 Western Digital RE4GP 2TB drives (resulting in a roughly 3.xTB volume). Everything has worked fine, every 6 months the Apple RAID card reconditions its battery... no problems... until last night at 10:55pm... I get this error message out of nowhere (I'm not doing anything special or taxing on the computer) saying that the RAID Utility app needs my attention. All of a sudden I see these 3 "severe" events that have occurred:
    Wednesday, December 29, 2010 10:55:23 PM PT Non-viable RAID set RS1 and all associated volumes are offline critical
    Wednesday, December 29, 2010 10:55:19 PM PT Drive 3:50014ee2583cccfd missing - Replace immediately or acknowledge loss of RAID set RS1 and associated volumes critical
    Wednesday, December 29, 2010 10:55:19 PM PT Drive 3:50014ee2583cccfd missing - Previous drive status was inuse critical
    After some panic and thought of what data I might have lost, I decided to reboot and of course the computer won't reboot (before August 09 I was having some nightmares with the Apple RAID card that came with the computer and after calling Apple and complaining, they sent me a new RAID card that had been working fine until yesterday, I guess). I forced power down and then removed both WD RE4GP trays (bays 1 & 3), restarted the computer and after a while, the computer booted up fine (boot drive is a WD RE3 in bay 1 and I also have a Seagate ES2 in bay 3). Both drives are "associated" in some operational way with the RAID card so this would indicate that the RAID card is working okay but could the WD RE4GPs have failed?? There was no mechanical noise or indication that the drive hardware had failed; after all these are "enterprise" class drives and only a year and a half old and they have never experienced any trauma. I have recently used Disk Warrior and all drives and the RAID set and everything passed without issue. So today I tried to re-insert both WD RE4 sleds into the computer but again it wouldn't boot up just like yesterday (just stuck at the grey screen, no Apple logo appears). Removing both sleds allows the computer to boot so at least some comfort.
    Can anybody give me any tips to maybe save the data on the drives or what I should try? Thanks a lot!
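One inference from the numbers in the post: two 2TB drives showing up as a roughly 3.xTB volume is consistent with a striped or concatenated set whose size is reported in binary units. That would mean no redundancy, which fits the set going non-viable as soon as one drive went missing. This is an inference from the capacity only, not something the poster confirmed.

```python
# Capacity arithmetic only; the RAID level is inferred, not confirmed by the post.
DRIVE_BYTES = 2 * 10**12      # one "2TB" drive as marketed (decimal terabytes)
TIB = 2**40                   # one binary terabyte (tebibyte)

striped = 2 * DRIVE_BYTES     # RAID 0: member capacities add up
mirrored = DRIVE_BYTES        # RAID 1: capacity of a single member

print(f"striped:  {striped / TIB:.2f} TiB")
print(f"mirrored: {mirrored / TIB:.2f} TiB")
```

A mirrored pair would come out well under 2TB, so a volume in the 3.x range points to the striped case.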

    At what point does this error occur? Can you provide more of the stack trace?
    Geof

  • RAID Set not showing up - 3 severe events

    Intel Xserve running 10.6, RAID card running firmware v E-1.3.2.0, two it drives set up with, I think, RAID 5.
    The main person for the  system is away as well as his backup. So I'm running on very little information at the moment.
    Log:
    Called in due to user not able to access shared files (also shared FileMaker Pro databases and Portfolio server). Only access to server was via screen share.
    Could not connect. After trying all that I could, powered down the server. Restarted, got a login startup permissions error for an item to run at startup. (I'll have to check what exactly that was). Started the Portfolio server and the FileMaker Pro databases. Users able to access and use.
    Next day same thing. User lost connection with the server. Restarted, started FileMaker Pro databases and Portfolio services. User able to access.
    I suggested pulling over the files needed to work just in case it loses connection again. User starts to pull over some files, which are transferring.
    About 15 minutes later I get a call that the user lost connection again.
    Restart the server once again. This time get errors from the RAID utility:
    Non-viable RAID set RS1 and all associated volumes are offline
    Drive 3:500 missing - Replace immediately or acknowledge loss of RAID set RS1 and associated volumes
    Drive 3:500 missing - Previous drive status was inuse
    The RAID utility does not show a RAID set.
    The drives appear to be OK.
    The drive shows up on the desktop, but when I go into the drive the folders etc. are showing. Accessing the main folder results in a loop where it bails out and just shows the desktop again.
    I'm going to get some files to the user from the timemachine backup for now.
    Is it possible that the data is OK and maybe the drive are marked off line for the set but still OK?
    Any info would be great,
    Thanks

    Yes the 160GB has the OS. I'm coming into this cold as the main person and the backup are currently away.
    I have got the users a workaround so they can do their work. The backup person has very limited email contact and he had thought it was a RAID 5. But I would agree this looks to be a RAID 0 or 1 since there are only two drives available for data.

  • Rules set in mail are suddenly not working?

    I had my rules set in mail and now they are suddenly not working, what to do to fix?

    I'm having the same problem, but not with all messages. Only a few slip through. However, when I right-click and Apply Rules to individual emails, the rule suddenly works.
    What I'd like to work (and it doesn't), is to be able to select-all emails in inbox and then right-click, Apply Rules. But this never works. AND it'd be much faster than doing it individually. Thx!

  • Trying to buy an app, but it took me to re-set my credit card... My security code is all of a sudden not valid. I tried to re-enter it but it's the same MSG. Can u help?

    Trying to buy an app, but it took me to re-set my credit card... My security code is all of a sudden not valid. I tried to re-enter it but it's the same MSG. Can u help?

    Is the address on your iTunes account exactly the same (format and spacing etc) as on your credit card bill : http://support.apple.com/kb/TS1646 ? If it is then you could try what it says at the bottom of that page :
    If the issue persists, contact your credit card company and verify that they and any company they use to process credit card authorisations have the correct information on file.
    And/or try contacting iTunes support : http://www.apple.com/support/itunes/contact/ - click on Contact iTunes Store Support on the right-hand side of the page

  • TS4284 Disk Utility may not be able to verify or repair permissions on a software RAID set

    Disk Utility may not be able to verify or repair permissions on a software RAID set
    Symptoms
    When Lion is installed on a software RAID set, Disk Utility may not be able to verify or repair volume permissions. The process appears to start but immediately stops.
    Resolution
    Use the command line tool diskutil to verify and/or repair permissions on a software RAID volume. Note: The user running these commands must have administrative privileges.
    https://support.apple.com/kb/TS4284
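A hedged sketch of what the command-line route might look like when scripted. `verifyPermissions` and `repairPermissions` are diskutil verbs on Lion-era systems and were dropped in later OS X releases, so check `diskutil` on the machine in question first. The helper names and the example volume path are placeholders, not part of the Apple article.

```python
# Sketch only: builds the command and runs it only when asked to.
# Assumption: a Lion-era diskutil that still has the permissions verbs.
import shutil
import subprocess

def permissions_command(volume: str, repair: bool = False) -> list:
    """Return the diskutil argv for verifying or repairing permissions."""
    verb = "repairPermissions" if repair else "verifyPermissions"
    return ["diskutil", verb, volume]

def run(volume: str, repair: bool = False) -> int:
    """Run the command if diskutil is present; return its exit status."""
    if shutil.which("diskutil") is None:
        raise RuntimeError("diskutil not found; this only applies to Mac OS X")
    return subprocess.call(permissions_command(volume, repair))

# "/" is the startup volume; substitute the RAID volume's mount point.
print(" ".join(permissions_command("/")))
```

As the article notes, the user running the command needs administrative privileges.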
    How does a bug like this survive testing?

    "How does a bug like this survive testing?"
    Exactly.

  • Airport Express - my network suddenly not showing up. System preferences, advanced/network shows network. Opened Airport Utility options grayed out except configure other but no luck. No set up assistant option to reenter.  Not sure where to go from here?

    Airport Express.
    My network is suddenly not showing up.  Opened System Preferences/Network & my network is there. When I open Airport Utility, most options are grayed out except "configure other" with no change.  Utility does not provide option for setup assistant where I could just start over.  So no wifi without network set up and can't use ethernet/router to troubleshoot.  I am stuck; any assistance helpful.

    So I went back to the PC after my original post, and somehow I was able to connect to the AirPort wireless network in the Windows Wireless Network List. Once I did this, it showed up in the AirPort Utility, and I was able to access the internet. I have no idea what fixed the problem, but maybe it was simply a matter of waiting a few minutes before connecting to the network?

  • When trying to restore volume, does not detect my raid set.

    So I just bought a Mac Pro. After setting it up, I finally got my extra 320gb hd. So I back everything up with Time Machine, create a raid 0 set with two 320GB HDs, and then try to restore the machine. The problem is when I try to select the volume to restore the Time Machine HD onto, it doesn't detect my RAID set. How do I fix this?

    I'm guessing you created a volume on your raid 0 set? Also, is this an issue with software raid or hardware raid, using the apple raid card?

  • Degraded RAID set help

    Hello Support Community,
    I have a 4 bay RAID set that all of a sudden is showing as degraded.  I have two pairs of disks each striped, and then have those two pairs mirrored.  Not sure if this is a correct RAID format, but it is what it is right now.  So both striped sets say online, with no problems, but my mirrored set of those two pairs shows that it is degraded.  See below.
    Any thoughts as to what might be happening?  I have the options set to automatically rebuild the RAID.  How do I know this is happening?  How long should I expect this to take?  It's a 4TB raid in its current config, and there is about 3.7TB of data on this RAID.  I have everything backed up in two other locations.  Am I better off starting from scratch?  Or should I just let this thing run for 2 weeks and see what happens?  Any help would be greatly appreciated.
    Thanks
    Dave

    No, I was doing all of this in disk utility.  I left the array running overnight, and it's back online now.  Not sure what got screwed up.
    Thanks

  • Mirrored RAID set has degraded following power outage.

    Hello,
    Following a recent power outage our Mac Pro running Leopard OSX Server with 2 x 1TB discs in a Mirroring RAID configuration (with an installed RAID card) developed a 'severe error' message.
    The Raid Set R0-1 has a Viable (degraded) status.
    Drive One (or bay 1) is 'Good' and 'Assigned'
    Drive Two is 'Good' but with the State - 'Roaming'
    Also, the events display describes the failure of a drive 3 (there isn't one) and that the R0-1 is Degraded and no spare is available.
    So, we're a little confused.
    1. Is the Drive 2 no longer part of the RAID mirror (i.e. Roaming)
    OR
    2. Something more significant has happened hence the bogus Drive 3 message?
    Any suggestions or advice would be much appreciated as always.
    Thanks
    Steve

    Yes. Verification with the GUI tool is the first step. But... hmm, if you are dropping communications with the card, that is not a good sign. Make sure you have a backup and then try a PMU reset on the system. Maybe there is something wacky in the power manager. Then try any/all of these from Terminal to get more information:
    raidutil list status
    raidutil list eventinfo
    raidutil list raidsetinfo
    This should provide feedback. If these commands fail, then I fear that the card is not responding. Do you have AppleCare? It might be time to call for a replacement card.
    Hope this helps
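If it helps to capture all three reports in one go (for example, to paste into a support case), the commands above could be wrapped roughly like this. Only the three `raidutil list` queries named in the reply are used; the function names are made up, and `raidutil` may need administrator privileges on some systems.

```python
# Runs the three raidutil queries named in the reply and collects their output.
# Nothing here changes the RAID configuration.
import shutil
import subprocess

QUERIES = ["status", "eventinfo", "raidsetinfo"]   # from the reply above

def raidutil_commands():
    """Return the argv lists for the three read-only queries."""
    return [["raidutil", "list", q] for q in QUERIES]

def collect():
    """Run each query; return {command string: output or error text}."""
    if shutil.which("raidutil") is None:
        return {}   # no RAID card tools on this machine
    results = {}
    for argv in raidutil_commands():
        proc = subprocess.run(argv, capture_output=True, text=True)
        results[" ".join(argv)] = proc.stdout or proc.stderr
    return results

for command, output in collect().items():
    print(f"== {command} ==\n{output}")
```

An empty result, or error text in place of a report, would line up with the concern in the reply that the card itself is not responding.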
