Degraded Health Sets

Hi,
I am currently running an Exchange 2013 CU2 DAG with 2 database servers and 2 CAS servers. SCOM is reporting the following, but I can find very little information on it:
Alert: Health Set unhealthy
Source: <server name> - Outlook.Protocol
Last modified by: System
Last modified time: 4/2/2014 3:49:58 PM
Alert description: EMSMDB.DoRpc(Logon) step of OutlookRpcDeepTestProbe/<database name> has failed against <server name> proxying to <server name> for HealthMailboxb63d235bb56b428ebf56ea594d3ca0c7@CEOSMTPServer.
Latency: 00:00:00.0520000
ActivityContext: I32:ADS.C[Apollo]=1;F:ADS.AL[Apollo]=3.3585;I32:ADR.C[Apollo]=1;F:ADR.AL[Apollo]=3.0093;I32:ADS.C[Razor]=2;F:ADS.AL[Razor]=2.0185
Outline: [50] EMSMDB.Connect(); [1][FAILED!] EMSMDB.DoRpc(Logon); Likely root cause: Momt
Details:
Error: Error returned in LogonCallResult. Error code = WrongServer (0x00000478)
Log:     Mailbox logon verification
        EMSMDB.Connect()
        Task produced output:
        - TaskStarted = 2/04/2014 3:49:25 PM
        - TaskFinished = 2/04/2014 3:49:25 PM
        - ErrorDetails =
        - RespondingRpcClientAccessServerVersion = 15.0.712.4012
        - Latency = 00:00:00.0505291
        - ActivityContext = I32:ADS.C[Apollo]=1;F:ADS.AL[Apollo]=3.3585;I32:ADR.C[Apollo]=1;F:ADR.AL[Apollo]=3.0093;I32:ADS.C[Razor]=2;F:ADS.AL[Razor]=2.0185
    EMSMDB.Connect() completed successfully.
        EMSMDB.DoRpc(Logon)
        Task produced output:
        - TaskStarted = 2/04/2014 3:49:25 PM
        - TaskFinished = 2/04/2014 3:49:25 PM
        - Exception = Microsoft.Exchange.RpcClientAccess.RopExecutionException: Error returned in LogonCallResult. Error code = WrongServer (0x00000478)
        - ErrorDetails =
        - Latency = 00:00:00.0010381
        - ActivityContext = I32:ADS.C[Apollo]=1;F:ADS.AL[Apollo]=3.3585;I32:ADR.C[Apollo]=1;F:ADR.AL[Apollo]=3.0093;I32:ADS.C[Razor]=2
Any help would be greatly appreciated.
Thanks

Hi,
Please run the following command and post the output:
Get-ServerHealth -Identity Servername -HealthSet Outlook.Protocol
In addition, I recommend that you run Test-MapiConnectivity and check the Event Viewer on the Exchange server.
http://technet.microsoft.com/en-us/library/bb123681(v=exchg.150).aspx
Use the Test-MapiConnectivity cmdlet to verify server functionality by logging on to the mailbox that you specify. If you don't specify a mailbox, the cmdlet logs on to the SystemMailbox on the database that you specify.
Thanks.
Niko Cheng
TechNet Community Support
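
When reading an alert like the one in the question, the Outline field is a quick way to see which probe step failed. Below is a minimal Python sketch that pulls the failed step out of that string. It assumes the Outline always follows the "[number] step;" pattern seen in this alert, with "[FAILED!]" marking the failing step. The bracketed number appears to be the step latency in milliseconds (it matches the Latency values in the log), but that is a guess from this one sample. The function name is made up for illustration.

```python
import re

# One step looks like "[50] EMSMDB.Connect();" or "[1][FAILED!] EMSMDB.DoRpc(Logon);"
STEP_RE = re.compile(r"\[(\d+)\](\[FAILED!\])?\s*([^;]+);")

def failed_steps(outline):
    """Return (step_name, bracketed_number) for every step marked [FAILED!]."""
    return [
        (name.strip(), int(number))
        for number, failed, name in STEP_RE.findall(outline)
        if failed
    ]

# Outline string copied from the alert in the question.
outline = "[50] EMSMDB.Connect(); [1][FAILED!] EMSMDB.DoRpc(Logon); Likely root cause: Momt"
print(failed_steps(outline))
```

Here the connection step succeeds and only the Logon RPC fails, which fits the WrongServer (0x00000478) error code in the Details section.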

Similar Messages

  • Performance degradation after changing filesystemio_options from none to setall

    Hi All,
    We are facing performance degradation after changing filesystemio_options from none to setall on the two servers listed below.
    Red Hat Enterprise Linux AS release 4 (Nahant Update 7) 2.6.9-55.ELhugemem (32-bit)
    Red Hat Enterprise Linux Server release 5.2 (Tikanga) 2.6.18-92.1.10.el5 (64-bit)
    We are seeing a lot of disk I/O. We expected filesystemio_options=setall to improve performance, but it is degrading it, and we are getting complaints about slowness.
    Please let me know whether we need to set something else along with this, such as an optimizer parameter (e.g. optimizer_index_cost_adj, optimizer_index_caching).
    Please help.

    Hi Suraj,
    <speculation>
    You switched filesystemio_options from none to setall, so the most likely reason for the performance degradation is the implementation of direct I/O. Direct I/O skips the filesystem buffer cache and allows Oracle to read directly from disk into the database buffer cache. However, on a system where direct I/O is not implemented, which is what you had until you recently messed with that parameter, it's likely that you had an undersized database buffer cache, but that was ok, because many (most) of the physical I/Os your database was doing were actually being serviced by the O/S filesystem buffer cache. But you introduced direct I/O, and wiped out the ability of the O/S to service any physical I/Os from the filesystem buffer cache. This means that every cache miss on the database buffer cache turns into a real, physical, spin-the-disk, move-the-drive-head I/O. And you are suffering the performance consequences.
    </speculation>
    Ok, end of speculation. Now, assuming that what I've outlined above is actually going on, what to do? Why is direct I/O lower performing than buffered, non-direct I/O? Shouldn't its performance be superior?
    Well, when you have an established system that's using buffered I/O, and you switch to direct I/O, you almost always will have to increase the size of the database buffer cache. The problem is that you took a huge chunk of memory away from the O/S that it was using to buffer your I/Os and avoid physical I/O. So now you need to make up for it by increasing the size of the database buffer cache. You can do this without buying more memory for the box, because the O/S is no longer going to need so much memory for filesystem buffers.
    So, what to do? Is it worth switching? Well, on balance, it makes sense to use direct I/O, and give Oracle a larger database buffer cache, for the simple fact that (particularly on a server that's dedicated to being an Oracle database server), Oracle has far more sophisticated caching algorithms, and a better understanding of the various types of data being cached, and so should be able to make more efficient use of the memory, than the (relatively) brain dead caching algorithms of the kernel and filesystem mechanisms.
    But, once again, it all comes down to this:
    What problem are you trying to solve? Did you have any I/O related issues? Do you have any compelling reason to implement direct I/O? Rule #1 is "if it ain't broke, don't fix it." Did you just violate rule #1? :-)
    Finally, since you're on Linux, you can use the 'free' command to see how much memory is on the box, how much is free, and how much is dedicated to filesystem cache buffers. This response is already pretty long, so, I'm not going to get into details, however, if you're not familiar with the command, the results could be misleading. Read the man page, and try to be clear about understanding it before you make any assumptions about the output.
    Hope that helps,
    -Mark
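
To put numbers on the point about filesystem cache, here is a minimal Python sketch that sums the same figures 'free' reports, read from /proc/meminfo-style text. It assumes the standard MemTotal, MemFree, Buffers and Cached fields. The function names and the sample values are made up for illustration and are not taken from the servers in this thread. Note that Cached may also include shared memory, so treat the cache figure as an upper bound.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Key: value" shape
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values

def cache_summary(info):
    """Summarise how much memory the O/S holds as filesystem cache."""
    cache = info.get("Buffers", 0) + info.get("Cached", 0)
    return {
        "total_kb": info["MemTotal"],
        "free_kb": info["MemFree"],
        "fs_cache_kb": cache,
        "free_plus_cache_kb": info["MemFree"] + cache,
    }

# Hypothetical sample; on a real box, read the text from /proc/meminfo instead.
SAMPLE = """MemTotal: 16000000 kB
MemFree: 500000 kB
Buffers: 300000 kB
Cached: 9200000 kB
"""

print(cache_summary(parse_meminfo(SAMPLE)))
```

A box that looks almost out of free memory but has a large fs_cache_kb value is the situation described above: the O/S was doing much of the caching, and that memory becomes available for a larger database buffer cache once direct I/O is in use.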

  • Alert: Health Set unhealthy - Clustering

    We have SCOM 2012 R2 set up to monitor our Exchange 2013 CU5 environment, and a couple of times we have received this error message about our Clustering health set going into an unhealthy state. We have checked the FSW and everything seems OK on its end. I cannot find much out there on this message, so any help would be greatly appreciated:
    Alert: Health Set unhealthy
    Source: EXCHANGE04 - Clustering
    Path: EXCHANGE04.company.com;EXCHANGE04.company.com
    Last modified by: System
    Last modified time: 8/24/2014 1:36:35 PM
    Alert description: The Cluster Group has not been healthy for 7200 minutes. The most recent probe failure message is: Check 'Microsoft.Exchange.Monitoring.QuorumGroupCheck' thrown an Exception!
    Exception - Microsoft.Exchange.Monitoring.ReplicationCheckFailedException: QuorumGroup has failed. Specific error is: Quorum resource 'Cluster Group' is not online on server 'exchange06'. Database availability group 'exchDAG' might not be reachable or may have lost redundancy. Error:
      File Share Witness (\\FSW01.company.com\exchDAG.company.com): Offline  is offline. Please verify that the Cluster service is running on the server.
       at Microsoft.Exchange.Monitoring.ReplicationCheck.Fail(LocalizedString error)
       at Microsoft.Exchange.Monitoring.QuorumGroupCheck.RunCheck()
       at Microsoft.Exchange.Monitoring.DagMemberCheck.InternalRun()
       at Microsoft.Exchange.Monitoring.ReplicationCheck.Run()
       at Microsoft.Exchange.Monitoring.ActiveMonitoring.HighAvailability.Probes.ReplicationHealthChecksProbeBase.RunReplicationCheck(Type checkType)
    Check 'Microsoft.Exchange.Monitoring.QuorumGroupCheck' did not Pass!
    Detail Message - Quorum resource 'Cluster Group' is not online on server 'exchange06'. Database availability group 'exchDAG' might not be reachable or may have lost redundancy. Error:
      File Share Witness (\\FSW01.company.com\exchDAG.company.com): Offline  is offline. Please verify that the Cluster service is running on the server.
    To add some additional information, when I look in Failover cluster manager this is what I see.  I know when we setup the servers the correct FSW information was being displayed.

    Hi,
    The error message says "Offline  is offline. Please verify that the Cluster service is running on the server."
    I suggest double-checking whether the Cluster service is running. If it is not, please restart the service manually and see whether the issue persists.
    Please also refer to the blog post below to double-check whether the FSW is online:
    Verifying the file share witness server / directory in use for Exchange 2010
    http://blogs.technet.com/b/timmcmic/archive/2012/03/12/verifying-the-file-share-witness-server-directory-in-use-for-exchange-2010.aspx
    If there is nothing abnormal on the Exchange server, this looks like an issue on the SCOM side. Please contact the SCOM forum so that you can get more specialized suggestions. For your convenience:
    http://social.technet.microsoft.com/Forums/systemcenter/en-US/home?category=systemcenteroperationsmanager
    Thanks
    Mavis
    Mavis Huang
    TechNet Community Support

  • Exchange 2013 CU2, Alert for OWA Health set unhealthy from SCOM 2012

    I am facing an issue in Exchange 2013 CU2. I get this alert from SCOM 2012 at least 5-6 times a day: the OWA health set is unhealthy. I have done all the steps mentioned in the web link below. The authentication types for the OWA virtual directory are Integrated Windows and Basic.
    I have 2 CAS servers, and this alert is generated from both of them.
    http://technet.microsoft.com/en-us/library/ms.exch.scom.OWA(EXCHG.150).aspx?v=15.0.712.24
    Alert: Health Set unhealthy
    Source: EX-CAS - OWA
    Path: EX-CAS;EX-CAS
    Last modified by: System
    Last modified time: 1/5/2014 8:15:08 PM
    Alert description: Outlook Web Access logon is failing on ClientAccess server EX-CAS.
    Availability has dropped to 0%. You can find protocol level traces for the failures on C:\Program Files\Microsoft\Exchange Server\V15\Logging\Monitoring\OWA\ClientAccessProbe.
    Incident start time: 1/6/2014 4:05:08 AM
    Last failed result:
    Failing Component - Owa
    Failure Reason - CafeFailure
    Exception:
    System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> Microsoft.Exchange.Net.MonitoringWebClient.ScenarioException:
    Microsoft.Exchange.Net.MonitoringWebClient.ScenarioException:
    Failure source: Owa
    Failure reason: CafeFailure
    Failing component:Owa
    Exception hint: CafeErrorPage: CafeFailure Unauthorized Inner exception: Microsoft.Exchange.Net.MonitoringWebClient.CafeErrorPageException
    ErrorPageFailureReason: CafeFailure, RequestFailureContext: FailurePoint=FrontEnd, HttpStatusCode=401, Error=Unauthorized, Details=, HttpProxySubErrorCode=, WebExceptionStatus=
    Microsoft.Exchange.Net.MonitoringWebClient.CafeErrorPageException: An error occurred on the Client Access server while processing the request
    WebExceptionStatus: Success
    GET https://localhost/owa/ HTTP/1.1
    User-Agent: Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1; MSEXCHMON; ACTIVEMONITORING; OWACTP)
    Accept: */*
    Cache-Control: no-cache
    X-OWA-ActionName: Monitoring
    Cookie:
    HTTP/1.1 401 Unauthorized
    request-id: 211474d2-a43e-4fab-8038-3aab35353568
    X-FailureContext: FrontEnd;401;VW5hdXRob3JpemVk;;;
    Server: Microsoft-IIS/7.5
    WWW-Authenticate: Negotiate,NTLM,Basic realm="localhost"
    X-Powered-By: ASP.NET
    X-FEServer: EX-CAS
    Date: Mon, 06 Jan 2014 04:14:47 GMT
    Content-Length: 0
    Response time: 0s
     ---> Microsoft.Exchange.Net.MonitoringWebClient.CafeErrorPageException: Microsoft.Exchange.Net.MonitoringWebClient.CafeErrorPageException
    ErrorPageFailureReason: CafeFailure, RequestFailureContext: FailurePoint=FrontEnd, HttpStatusCode=401, Error=Unauthorized, Details=, HttpProxySubErrorCode=, WebExceptionStatus=
    Microsoft.Exchange.Net.MonitoringWebClient.CafeErrorPageException: An error occurred on the Client Access server while processing the request
    WebExceptionStatus: Success
    GET https://localhost/owa/ HTTP/1.1
    User-Agent: Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1; MSEXCHMON; ACTIVEMONITORING; OWACTP)
    Accept: */*
    Cache-Control: no-cache
    X-OWA-ActionName: Monitoring
    Cookie:
    HTTP/1.1 401 Unauthorized
    request-id: 211474d2-a43e-4fab-8038-3aab35353568
    X-FailureContext: FrontEnd;401;VW5hdXRob3JpemVk;;;
    Server: Microsoft-IIS/7.5
    WWW-Authenticate: Negotiate,NTLM,Basic realm="localhost"
    X-Powered-By: ASP.NET
    X-FEServer: EX-CAS
    Date: Mon, 06 Jan 2014 04:14:47 GMT
    Content-Length: 0
    Response time: 0s
       --- End of inner exception stack trace ---
       at Microsoft.Exchange.Net.MonitoringWebClient.BaseExceptionAnalyzer.Analyze(TestId currentTestStep, HttpWebRequestWrapper request, HttpWebResponseWrapper response, Exception exception, Action`1 trackingDelegate)
       at Microsoft.Exchange.Net.MonitoringWebClient.HttpSession.AnalyzeResponse[T](HttpWebRequestWrapper request, HttpWebResponseWrapper response, Exception exception, HttpStatusCode[] expectedStatusCodes, Func`2 processResponse)
       at Microsoft.Exchange.Net.MonitoringWebClient.HttpSession.EndSend[T](IAsyncResult result, HttpStatusCode[] expectedStatusCodes, Func`2 processResponse, Boolean fireResponseReceivedEvent)
       at Microsoft.Exchange.Net.MonitoringWebClient.HttpSession.EndGet[T](IAsyncResult result, HttpStatusCode[] expectedStatusCodes, Func`2 processResponse)
       at Microsoft.Exchange.Net.MonitoringWebClient.Authenticate.AuthenticationResponseReceived(IAsyncResult result)
       --- End of inner exception stack trace ---
       at Microsoft.Exchange.Net.MonitoringWebClient.BaseTestStep.EndExecute(IAsyncResult result)
       at Microsoft.Exchange.Net.MonitoringWebClient.Owa.OwaLogin.AuthenticationCompleted(IAsyncResult result)
       --- End of inner exception stack trace ---
       at Microsoft.Exchange.Net.MonitoringWebClient.BaseTestStep.EndExecute(IAsyncResult result)
       at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Bool
    States of all monitors within the health set:
    Note: Data may be stale. To get current data, run: Get-ServerHealth -Identity 'EX-CAS' -HealthSet 'OWA'
    State          Name           TargetResource  HealthSet  AlertValue  ServerComponent
    NotApplicable  OwaCtpMonitor                  OWA        Unhealthy   None
    States of all health sets:
    Note: Data may be stale. To get current data, run: Get-HealthReport -Identity 'EX-CAS'
    State          HealthSet                   AlertValue  LastTransitionTime      MonitorCount
    NotApplicable  ActiveSync                  Healthy     1/3/2014 5:21:13 AM     2
    NotApplicable  AD                          Healthy     11/24/2013 6:54:18 AM   10
    NotApplicable  ECP                         Healthy     1/5/2014 3:03:05 AM     1
    Online         Autodiscover.Proxy          Healthy     11/20/2013 10:06:37 AM  1
    NotApplicable  Autodiscover                Healthy     1/3/2014 10:18:17 PM    2
    Online         ActiveSync.Proxy            Healthy     11/20/2013 10:06:37 AM  1
    Online         ECP.Proxy                   Healthy     11/21/2013 6:16:08 PM   4
    Online         EWS.Proxy                   Healthy     11/20/2013 10:06:37 AM  1
    Online         OutlookMapi.Proxy           Healthy     11/24/2013 6:54:28 AM   4
    Online         OAB.Proxy                   Healthy     11/19/2013 7:14:34 PM   1
    Online         OWA.Proxy                   Healthy     11/20/2013 10:06:37 AM  2
    NotApplicable  EDS                         Healthy     1/3/2014 5:19:56 AM     10
    Online         RPS.Proxy                   Healthy     1/3/2014 5:21:27 AM     13
    Online         RWS.Proxy                   Healthy     1/3/2014 5:20:09 AM     10
    Online         Outlook.Proxy               Healthy     1/3/2014 5:21:12 AM     4
    NotApplicable  EWS                         Healthy     1/3/2014 10:18:17 PM    2
    Online         FrontendTransport           Healthy     1/5/2014 3:47:09 AM     11
    Online         HubTransport                Healthy     1/5/2014 3:47:09 AM     29
    NotApplicable  Monitoring                  Unhealthy   1/5/2014 4:05:57 AM     9
    NotApplicable  DataProtection              Healthy     1/3/2014 5:25:42 AM     1
    NotApplicable  Network                     Healthy     1/4/2014 1:51:16 PM     1
    NotApplicable  OWA                         Unhealthy   1/5/2014 8:05:08 PM     1
    NotApplicable  FIPS                        Healthy     1/3/2014 5:21:12 AM     3
    Online         Transport                   Healthy     1/5/2014 4:11:00 AM     9
    NotApplicable  RPS                         Healthy     11/20/2013 10:07:12 AM  2
    NotApplicable  Compliance                  Healthy     11/20/2013 10:08:10 AM  2
    NotApplicable  Outlook                     Healthy     11/21/2013 6:12:54 PM   2
    Online         UM.CallRouter               Healthy     1/5/2014 3:47:10 AM     7
    NotApplicable  UserThrottling              Healthy     1/5/2014 4:16:42 AM     7
    NotApplicable  Search                      Healthy     11/24/2013 6:55:06 AM   9
    NotApplicable  AntiSpam                    Healthy     1/3/2014 5:16:43 AM     3
    NotApplicable  Security                    Healthy     1/3/2014 5:19:28 AM     3
    NotApplicable  IMAP.Protocol               Healthy     1/3/2014 5:21:14 AM     3
    NotApplicable  Datamining                  Healthy     1/3/2014 5:18:34 AM     3
    NotApplicable  Provisioning                Healthy     1/3/2014 5:19:56 AM     3
    NotApplicable  POP.Protocol                Healthy     1/3/2014 5:20:44 AM     3
    NotApplicable  Outlook.Protocol            Healthy     1/3/2014 5:19:46 AM     3
    NotApplicable  ProcessIsolation            Healthy     1/3/2014 5:19:26 AM     9
    NotApplicable  Store                       Healthy     1/3/2014 5:20:38 AM     6
    NotApplicable  TransportSync               Healthy     11/24/2013 6:53:09 AM   3
    NotApplicable  MailboxTransport            Healthy     1/3/2014 5:21:11 AM     6
    NotApplicable  EventAssistants             Healthy     11/21/2013 6:22:01 PM   2
    NotApplicable  MRS                         Healthy     1/3/2014 5:20:29 AM     3
    NotApplicable  MessageTracing              Healthy     1/3/2014 5:18:15 AM     3
    NotApplicable  CentralAdmin                Healthy     1/3/2014 5:17:25 AM     3
    NotApplicable  UM.Protocol                 Healthy     1/3/2014 5:17:08 AM     3
    NotApplicable  Autodiscover.Protocol       Healthy     1/3/2014 5:17:13 AM     3
    NotApplicable  OAB                         Healthy     1/3/2014 5:20:51 AM     3
    NotApplicable  OWA.Protocol                Healthy     1/3/2014 5:20:52 AM     3
    NotApplicable  Calendaring                 Healthy     11/24/2013 6:56:59 AM   3
    NotApplicable  PushNotifications.Protocol  Healthy     11/21/2013 6:16:05 PM   3
    NotApplicable  EWS.Protocol                Healthy     1/3/2014 5:19:07 AM     3
    NotApplicable  ActiveSync.Protocol         Healthy     1/3/2014 5:20:16 AM     3
    NotApplicable  RemoteMonitoring            Healthy     1/5/2014 3:47:09 AM     3
    Is there any solution for this alert, and how can I rectify it? OWA itself is running perfectly for all users.

    Hi,
    Sorry for the late reply.
    Do you have Exchange 2010 coexistence?
    If so, there is a known issue described here:
    Release Notes for Exchange 2013
    http://technet.microsoft.com/en-us/library/jj150489%28v=exchg.150%29.aspx
    Please note the "Exchange 2010 coexistence" section.
    If it is not related to your problem, please check the IIS log.
    If there is a detailed error code, such as 401.1 or 401.2, please let me know.
    Hope it is helpful
    Thanks
    Mavis
    Mavis Huang
    TechNet Community Support
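
One detail in the trace above is easy to decode by hand: the third field of the X-FailureContext response header looks like Base64. Below is a minimal Python sketch that splits the header and decodes that field. The field names are an assumption, taken from the order of the RequestFailureContext line in the same trace, and only the Error field is assumed to be Base64-encoded. The function name is made up for illustration.

```python
import base64

# Field order assumed from the RequestFailureContext line in the trace.
FIELDS = [
    "FailurePoint",
    "HttpStatusCode",
    "Error",
    "Details",
    "HttpProxySubErrorCode",
    "WebExceptionStatus",
]

def decode_failure_context(header_value):
    """Split an X-FailureContext value on ';' and decode the Error field."""
    context = dict(zip(FIELDS, header_value.split(";")))
    if context.get("Error"):
        # Assumption: the Error field is Base64-encoded UTF-8 text.
        context["Error"] = base64.b64decode(context["Error"]).decode("utf-8")
    return context

# Header value copied from the trace in the question.
print(decode_failure_context("FrontEnd;401;VW5hdXRob3JpemVk;;;"))
```

The decoded Error value is "Unauthorized", so the header carries no more detail than the 401 status itself. That is why the IIS log sub-status (401.1, 401.2 and so on) is the next thing to check.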

  • Health set components seems to be unhealthy

    Hi,
    In my environment, several health sets are reported as unhealthy, but there is no problem on the user side.
    Below are the health sets:
     HealthSet           AlertValue
     MailboxTransport    Unhealthy
     HubTransport        Unhealthy
     ECP                 Unhealthy
     Search              Unhealthy
     Store               Unhealthy
     MSExchangeCertif... Disabled
     DataProtection      Unhealthy
     RPS                 Unhealthy
     RWS                 Unhealthy
     Compliance          Unhealthy
     Outlook             Unhealthy
    Can somebody help me through this please.

    Hello,
    I think you can compare the health reports with the application log.
    Are there any warnings or errors in it about these unhealthy items? If not, I think we can safely ignore these alerts.
    Thanks,
    Simon Wu
    TechNet Community Support

  • FrontendTransport health set unhealthy (OnPremisesSmtpClientSubmissionMonitor)

    FrontendTransport health set unhealthy (OnPremisesSmtpClientSubmissionMonitor) - The client submission probe failed 3 times over 15 minutes. 
    These alerts have started coming in for some of the servers where the Mailbox and CAS roles are installed together. When I checked the queues, everything seemed fine. I tried the steps below, but they did not fix the issue:
    1. The Invoke-MonitoringProbe command doesn't work.
    2. Restarting the Health Manager service didn't help.
    The alert value is still Unhealthy. Has anyone come across the same issue? If so, can you share the steps we should take?
    Your answers are much appreciated!

    Hi,
    Please check the Monitor Result and Probe Result in the following path and see if there is any related message.
    Event Viewer\Applications and Services Logs\Microsoft\Exchange\ActiveMonitoring\ProbeResult( or MonitorResult).
    Based on your description, everything works well except for this alert. If so, you can hide the alert by overriding the monitor with the command below:
    Add-GlobalMonitoringOverride -Identity "FrontendTransport\OnPremisesSmtpClientSubmissionMonitor" -PropertyName Enabled -PropertyValue 0 -ItemType Monitor -ApplyVersion "version"
    Hope this is helpful to you.
    Best regards,
    Belinda Ma
    TechNet Community Support

  • Degraded RAID set (mirror)

    I am running a pair of 2 GB external drives in RAID 1  (mirroring) using the OSX Disk Utility.  Recently I noticed that the set shows it's RAID status as Degraded and one of the two drives is indicated as "Missing," which keeps the "rebuild raid set" button grey.  However, I can verify each of them separately and they appear to be okay.  Only one of the drives appears to be partitioned correctly; the "missing" drive simply shows a RAID slice for the entire 2 GB.  
    I would recreate the mirror set, except that I don't have anywhere to store the 1.3 GB still on the good drive, and I believe I cannot create the set again without erasing the contents.  (I back up to the RAID set as well as use it for un-backed up storage, which I think is safe as long as the RAID set is working.)
    Any ideas how to get the "missing drive" to reappear, so the system can rebuild the set?  Or any other ideas to get out of this problem?  Thanks.

    It doesn't change the problem, but obviously I meant TB (not GB)

  • Degraded RAID set help

    Hello Support Community,
    I have a 4 bay RAID set that all of a sudden is showing as degraded.  I have two pairs of disks each striped, and then have those two pairs mirrored.  Not sure if this is a correct RAID format, but it is what it is right now.  So both striped sets say online, with no problems, but my mirrored set of those two pairs shows that it is degraded.  See below.
    Any thoughts as to what might be happening?  I have the options set to automatically rebuild the RAID.  How do I know this is happening?  How long should I expect this to take?  It's a 4TB RAID in its current config, and there is about 3.7TB of data on this RAID.  I have everything backed up in two other locations.  Am I better off starting from scratch?  Or should I just let this thing run for 2 weeks and see what happens?  Any help would be greatly appreciated.
    Thanks
    Dave

    No, I was doing all of this in Disk Utility.  I left the array running overnight, and it's back online now.  Not sure what got screwed up.
    Thanks

  • Degraded RAID set

    Hi All,
    Dual Core 2.0 running 10.4.2. I set up a raid set and one of the drives almost immediately reported a SMART failure. So I replaced it. No problem, no downtime, even. The machine functioned fine with just the one drive.
    Now...
    Bought a replacement second drive. Added it to the RAID set, rebuilt it. No problem, but the RAID set still reports as degraded. I cannot delete the damaged drive because I don't have it anymore (hindsight is 20/20). How can I delete the non-existent drive?
    Thanks,
    Danny
    Dualcore 2.0 G5   Mac OS X (10.4.2)  

    Yeah, but if you've only got 2 slices and one of them
    is out to lunch, well, it's not rocket science to
    know what your risk is ...
    Well, actually, one failed and has been replaced. It was pretty easy to add the new drive to the RAID set (just drag and drop into the RAID and then rebuild it). The machine worked ave without the failed drive hence...
    What worries me is that I don't know for sure
    if the RAID array is actually working as designed and
    I haven't found a definitive statement about what
    exactly 'degraded status' means. Does it mean
    'not working as well as it should but still doing the
    job' or does it mean 'this RAID array isn't doing
    diddly and sometime soon you're gonna be up the
    creek.'
    As you could see, the RAID was working very well. One drive died and the machine continued working fine, so there was NO downtime (except to switch off the G5 and yank out the failed drive). I'm guessing degraded means that one of the registered slices is missing. But if you have another registered slice, then you're fine.
    More specifically, if my primary drive actually does
    die, is there a complete dupe on the second drive in
    the array or not?
    As in my example, that's exactly right. The hard drives are exact copies.
    I guess I should go ahead and do a hard backup on an
    external drive and then just pull out the primary
    drive (it's hot-swappable) and see what happens.
    Yes, RAID is not back-up. If you have a corrupted directory structure due to software, then you would have to rely on a back-up. But it's nice to have both, really, as the most common problem with disks is hardware failure and this minimizes problems involved.
    To tell you the truth, though, I'm not sure I'd bother again. I find that using psync (available from bombich.com under the Carbon Copy cloner section) works fine. It creates a fully working system disk and can be set to clone the start-up drive daily. If the main disk fails, most people won't even notice that they're starting up from the second one.
    So back to the question: how to remove this degraded status?

  • Degraded raid set now missing

    I have, or should I say had, a mirrored RAID set of two 250GB drives. They would no longer boot, so I bought a new drive (same size) and attempted to rebuild the set. This went fine for a time but then failed. Now I can no longer see the RAID set. I have tried booting from the server CD, and it stays booted just long enough to show me that the RAID set is gone, or to open Terminal, and then it shuts itself down.
    I desperately need the data on this server. My most recent backup to tape was the end of January.
    G5 Server   Mac OS X (10.3.9)  

    Both hard drives were ruined by a power surge.

  • RAID degraded - RedundancyScrub error message

    Hi all,
    Even though my problem is not on an X-Serve box but on a MacPro (2008), I run OSX Server 10.5.4 and have not found anything in the MacPro or Server OS forums, hence my post here in the hope of an experienced admin being able to help here.
    After having a new RAID 5 setup on 4x500GB Seagate SATA drives and Apple's RAID card running fine for over 2 months, I ran the 10.5.4 update (after much deliberation on the pros and cons) and promptly had this issue. I only found out a few days after running the update, though, and after going through all the logs I could pin the problem to immediately after the OS update. I have searched high and low in the RAID and OS Server forums, but haven't found anything to resolve this error message, with or without the CLI:
    The "RedundancyScrub" command could not be executed. (The request failed because a volume service is currently running)
    Yes, have tried to verify RAID, only to get an error popup when trying to verify...
    Could not verify the RAID set "RS1". there was a problem communicating with the device.
    Since it is basically still running ok, but degraded, I need to do something, even if it means a wipe and re-install of the entire Server OS and RAID setup. I have had a few issues with Leopard anyway, and was thinking of doing a final clean install when .5 update is out that may tidy up a few more issues. But, any help in the meantime from an experienced RAID or X-Serve admin would be much appreciated.
    Thanks
    Chris

    I had to go on to Apple's list to find this, sorry about the length...it fixed my problem
    A client of mine accidentally removed a drive from his RAID-5
    array on an Intel XServe. He reinserted the drive and all three 1
    TB drives show up as good, but the set is showing as degraded.
    The log shows Degraded RAID Set R0-1 No Spare Available for
    Rebuild.
    When I attempt to run a verify, it says "Could not verify the
    RAIDset R0-1, There was a problem communicating with the device."
    After that the log shows "The redundancy Scrub command could not
    be executed. The request failed because a volume service is
    currently running."
    Googling this brings up a similar problem that two other people
    are having, but no one has a solution. Have any of you run into
    this?
    Hi, Mike,
    If you just replaced an Apple Drive Module, the new one will
    not yet be a spare drive for your RAID set. You have to tell the
    RAID card that a spare drive has arrived.
    You can do this with either the CLI or the GUI.
    CLI:
    - Use the 'raidutil' command:
    $ sudo raidutil modify drive --addglobalspare -d <DriveBayNumber>
    For more information, see raidutil(8).
    GUI:
    - Launch /Applications/Utilities/RAID Utility.app
    Select 'Make Spare...' from the 'RAID' menu after you have selected
    the new drive.
    Cheers,
    -takanori
    I think you misunderstand what happened, perhaps I should have been
    more specific.
    The person took a drive out of a working RAID-5 while it was
    running. Then he replaced it. He pulled the drive from the wrong
    XServe in a rack, intending to pull a bad drive from a different
    server.
    Now we are unable to clear the "Degraded" message that it has. My
    assumption would be that it should have simply gone into a degraded
    state and then rebuilt itself automatically.
    Perhaps the drive should be pulled again and formatted, and then
    inserted and added as a spare?
    Hi, Mike,
    OK, I understand your situation.
    I had the same situation a couple of weeks ago. I also thought that
    when a drive module belonging to a RAID set was put back, the Apple
    RAID Card would start rebuilding immediately. But it does not. So you
    have to mark as a spare the drive module that was pulled out
    accidentally and put back in the same bay of the Xserve. You don't
    have to format it again. You can mark it as a spare drive right away.
    In my case, the Apple RAID Card said the drive was part of another
    RAID set when I put the module back after my staff had pulled it out
    carelessly. So the Apple RAID Card never starts rebuilding with a
    drive that it sees as part of another RAID set. I think this is good
    as a foolproof mechanism.
    The way to mark the spare is the same as I wrote above. If you
    decide to use the CLI, you may have to use another raidutil
    subcommand before making it a spare. I repaired the RAID set on my
    Xserve with RAID Utility.app by just selecting 'Make Spare...' from
    the 'RAID' menu.
    Careful and cheers,
    -takanori
    Thank you Takanori, that was the thing to do. The RAID is rebuilding
    now.

  • Suggestion for fixing broken raid set

    Jesus. Again. Twice in 6 months. Raid set failure.
    2:58pm today : Drive 3:50014ee2aede46eb missing - Previous drive status was inuse
    2:59pm today : Degraded RAID set RS1
    2:59pm today : Degraded RAID set RS1 - No spare available for rebuild
    Jesus...
    After launching RAID Utility, I noticed that one drive is actually missing from the drive bays. It's just gone, and I have not done anything to it. This happened last December as well. Hard booting the drive (pull out, push back in) worked last time to get the drive online, but jesus. Twice? Should I maybe replace the drive? I am using the Apple RAID Card, and people say it is pretty strict about the drives' state, but why the **** does the drive keep disappearing from the system?!
    I was already on the phone with one Apple consultant about this, and I think everything is pretty OK. I have good backups, and gladly, the OS RAID set is ok. Only our accounts and work files were in there, and all is secured. But this is really stressful.. Feels like I can't trust these drives one bit. And they are good drives, standard hardware what comes with Mac Pro.
    Just when I was starting to think that everything is finally working smoothly..
    Any recommendations about how to act now. I know what I have to do but, it would be encouraging if I would get some steps to how to fix it. Working order etc.
    Everything works tho, taking one separate set of backups at the moment just in case. I just need to get my act together and fix it. God I am annoyed tho.
    Good weekend to everyone tho. Comments are appreciated.

    Yeah, I already ordered a new drive.
    I can't do any tests for the bad drive cos it just disappeared from the system totally. I guess it could come online if I hard boot it again (like when the raid set broke last december), but I don't feel like doing it before I get the new drive. Need to analyze it on another computer.
    The consultant I talked with earlier mentioned that the RAID card is pretty strict about the condition of the drive. But I would like to know if that is why the drive keeps disappearing, and whether the card really ejects it from the system completely. I heard that in some cases RAID Utility just shows a red light to indicate a problem with the drive state, but for me the drive is just gone totally.
    Hope that the new drive arrives soon. Will the Raid rebuild itself if I insert the new drive and mark it as global spare ? That's what I understood from reading the Raid Utility manual.

  • RWS.Proxy and ECP.Proxy health checks, localhost, and SSL

    RWS.Proxy and ECP.Proxy health sets are both failing. In both of the errors, I find the following:
    [000.000] Starting HTTP request task
    [000.000] Waiting 59000 ms
    [000.000] Issuing GET against https://localhost/ecp/
    [000.000] Awaiting GET response
    [000.000] Performing SSL validation
    [000.000] Performing SSL validation
    [000.000] Failed with exception: The underlying connection was closed: An unexpected error occurred on a receive.
    [000.000] Starting HTTP request task
    [000.000] Waiting 59000 ms
    [000.000] Issuing GET against https://localhost/ecp/ReportingWebService/
    [000.000] Awaiting GET response
    [000.000] Performing SSL validation
    [000.000] Performing SSL validation
    [000.000] Failed with exception: The underlying connection was closed: An unexpected error occurred on a receive.
    We require SSL on all connections. We use a third party certificate with multiple SANs. Since the probe is trying to use https://localhost, it fails because the name doesn't match.
    I figure I have a few options. First, is there a way to change the URL that the probe uses for its check? This seems to me to be the 'rightest' way I could fix this. Second, could I alter the binding of the site so that the localhost hostname uses a dedicated, self-signed, trusted cert? Last, is there any way to simply disable the specific probes? We're a single-server, low-volume setup and I'm not convinced that I need the probes anyway.
    Is this a common issue? Besides the warnings that SCOM throws at me, it is also causing a large volume of logs to be generated.
    Justin Cervero - MS Enterprise Admin - Appalachian State University
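
    The name mismatch described above can be illustrated with a minimal Python sketch. It is not how the Exchange probe or Windows validates certificates; it only shows the general rule that a requested host name must equal one of the certificate's subject alternative names (or fit a single-label wildcard). The SAN list and the helper names hostname_matches and cert_covers are hypothetical.

```python
def hostname_matches(pattern: str, hostname: str) -> bool:
    """Return True if one certificate name pattern covers the host name."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern.startswith("*."):
        # A wildcard stands in for exactly one left-most label.
        suffix = pattern[1:]  # e.g. ".exch.example.edu"
        head, sep, tail = hostname.partition(".")
        return bool(head) and sep == "." and "." + tail == suffix
    return pattern == hostname


def cert_covers(san_names, hostname):
    """Return True if any SAN entry covers the requested host name."""
    return any(hostname_matches(name, hostname) for name in san_names)


# Hypothetical SAN list for a third-party certificate (assumption, not from the post).
sans = ["mail.example.edu", "autodiscover.example.edu", "*.exch.example.edu"]

print(cert_covers(sans, "mail.example.edu"))  # public name listed in the SANs
print(cert_covers(sans, "localhost"))         # name used by the probe URL
```

    With a SAN list like this, a request addressed to localhost is never covered, whichever public names the certificate carries.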

    Hi,
    I am afraid it’s hard coded. Just like the “Test-OutlookWebServices” command, it will also try “localhost” and report errors about a certificate host name mismatch.
    We can safely ignore this report.
    Thanks,
    Simon Wu
    TechNet Community Support

  • Mailbox Transport health state unhealthy

    Hi team,
    I checked my Exchange 2013 Mailbox server, and the result is that the Mailbox Transport health set has become unhealthy.
    After I checked the Mailbox Transport details, there were some errors. Here are the details:
    [PS] C:\Windows\system32>get-healthreport -server BCEJKT-MBX2-SVR | where {$_.alertvalue -ne "healthy"} | ft -auto
    Server  State          HealthSet                        AlertValue  LastTransitionTime   MonitorCount
            NotApplicable  FIPS                             Unhealthy   8/6/2014 3:41:14 AM  22
            NotApplicable  Monitoring                       Unhealthy   8/6/2014 3:56:32 AM  9
            NotApplicable  MailboxTransport                 Unhealthy   8/6/2014 4:11:41 AM  56
            NotApplicable  MSExchangeCertificateDeployment  Disabled    1/1/0001 7:00:00 AM  2

    Server  State          Name                                                    TargetResource    HealthSetName     AlertValue  ServerComponent
            NotApplicable  Mapi.Submit.Monitor                                     MailboxTransport  MailboxTransport  Unhealthy   None
            NotApplicable  MailboxDeliveryAvailabilityMonitor                                        MailboxTransport  Unhealthy   None
            NotApplicable  TransportDeliveryFailuresDeliveryStoreDriver560Monitor                    MailboxTransport  Disabled    None
    What do these errors mean?
    And why is the state "NotApplicable"?
    Is there a problem with any of the services?
    Please give me details :)
    Thanks
    Regards

    Hi,
    There is no official document explaining the state "NotApplicable". In all the related articles about health sets that I searched, the state is always NotApplicable.
    From the output of the Get-HealthReport cmdlet, the FIPS, Monitoring and MailboxTransport health sets are unhealthy. Please use the Test-ServiceHealth cmdlet to check the result.
    Besides, please check the application log and system log for events related to this feature.
    Best regards,
    Belinda
    Belinda Ma
    TechNet Community Support

  • Need help with RAID Card and degraded Raid-5 errors

    Dear all,
    I recently purchased a used Apple RAID card for my 2008 Mac Pro 8-Core. The installation went smoothly, the card was immediately recognized and the battery reconditioned within one night.
    So I started setting up a RAID set with the 4 identical drives which I had already used before as a software RAID. But each time the RAID Level-5 volume is created, somewhat later the status turns red and the RAID is listed as "degraded"!
    A closer look at log reveals:
    +19:42:54 Drive carrier 00:01 inserted+
    +19:42:27 Background task aborted: Task=Init,Scope=DRVGRP,Group=RS1+
    +19:42:27 Degraded RAID set RS1 - No spare available for rebuild+
    +19:42:26 Degraded RAID set RS1+
    +19:42:22 Drive carrier 00:01 removed+
    +15:10:57 Created volume “R1V1” on RAID set “RS1”+
    So it seems that the drive from Bay 1 somehow gets lost (removed) a few hours after the volume is created and is "reinserted" soon afterwards...
    Of course, the drive is NOT removed, nobody touched the Mac Pro! Also, I did the same procedure 3 times and the result was always the same.
    I also tried setting up JBOD and different RAID levels, which all work without a problem. Only when choosing RAID5 (which is what I bought the card for) does the problem reappear.
    Does anyone have a solution or hint for me concerning this problem? Many thanks in advance!
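
    The remove/insert pattern in the log above can be picked out mechanically. The following is a minimal Python sketch, assuming the log is pasted as text in the format shown in this post, that all entries fall on the same day, and that a gap of up to 120 seconds counts as a "flap"; the function name find_flaps and that threshold are hypothetical choices, not part of RAID Utility.

```python
import re
from datetime import datetime

# Log excerpt copied from the post above (newest entry first).
LOG = """\
+19:42:54 Drive carrier 00:01 inserted+
+19:42:27 Background task aborted: Task=Init,Scope=DRVGRP,Group=RS1+
+19:42:27 Degraded RAID set RS1 - No spare available for rebuild+
+19:42:26 Degraded RAID set RS1+
+19:42:22 Drive carrier 00:01 removed+
"""

EVENT = re.compile(r"(\d{2}:\d{2}:\d{2}) Drive carrier (\S+) (inserted|removed)")


def find_flaps(log_text, max_gap_seconds=120):
    """Return (carrier, gap_in_seconds) for each removal quickly followed by an insertion."""
    events = []
    for line in log_text.splitlines():
        match = EVENT.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S")
            events.append((stamp, match.group(2), match.group(3)))
    events.sort(key=lambda event: event[0])  # process oldest entry first

    flaps = []
    pending = {}  # carrier -> time of its most recent removal
    for stamp, carrier, action in events:
        if action == "removed":
            pending[carrier] = stamp
        elif carrier in pending:
            gap = (stamp - pending.pop(carrier)).total_seconds()
            if gap <= max_gap_seconds:
                flaps.append((carrier, int(gap)))
    return flaps


print(find_flaps(LOG))
```

    A carrier that drops out and returns within seconds while nobody touched the machine points at the drive, the bay or the cabling rather than at a real removal.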

    One drive completely broke down later. I replaced that drive, and since then the problem is gone!
