Hyper-V Guest Cluster Node Failing Regularly

Hi,
We currently have a 4-node Server 2012 R2 cluster which hosts, among other things, a 3-node guest cluster running a single clustered file service.
Around once a week, the guest cluster node that is currently hosting the clustered file service fails. It's as if the VM is blue screening. That in itself is fairly annoying, and I'll be applying all the updates and checking the event log for clues as to the cause.
The problem then is that whichever physical cluster node is hosting the VM when it fails will not unlock some of the VM's files. The Virtual Machine Configuration is listed as Online Pending. This means that the failed VM cannot be restarted on any other cluster node. The only fix is to drain the physical host it failed on and reboot it.
I'm looking for suggestions on how to fix the following:
1. Crashing guest file cluster node
2. Failed VM with shared VHDX requiring a physical host reboot
Event messages from the physical host that was hosting the failed VM, in the order they occurred:
Hyper-V-Worker: Event ID 18590 - 'FS-03' has encountered a fatal error.  The guest operating system reported that it failed with the following error codes: ErrorCode0: 0x9E, ErrorCode1: 0x6C2A17C0, ErrorCode2: 0x3C, ErrorCode3: 0xA, ErrorCode4: 0x0.  If the problem persists, contact Product Support for the guest operating system.  (Virtual machine ID 36166B47-D003-4E51-AFB5-7B967A3EFD2D)
FailoverClustering: Event ID 1069 - Cluster resource 'Virtual Machine FS-03' of type 'Virtual Machine' in clustered role 'FS-03' failed.
Hyper-V-High-Availability: Event ID 21128 - 'Virtual Machine FS-03' failed to shutdown the virtual machine during the resource termination. The virtual machine will be forcefully stopped.
Hyper-V-High-Availability: Event ID 21110 - 'Virtual Machine FS-03' failed to terminate.
Hyper-V-VMMS: Event ID 20108 - The Virtual Machine Management Service failed to start the virtual machine '36166B47-D003-4E51-AFB5-7B967A3EFD2D': The group or resource is not in the correct state to perform the requested operation. (0x8007139F).
Hyper-V-High-Availability: Event ID 21107 - 'Virtual Machine FS-03' failed to start.
FailoverClustering: Event ID 1205 - The Cluster service failed to bring clustered role 'FS-03' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered role.
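
The fatal-error event above carries the guest's stop code. As a rough aid, the sketch below pulls the `ErrorCode` fields out of the Event ID 18590 text and labels them. It assumes (the post does not say this) that ErrorCode0 is the guest bug check code and ErrorCode1 to ErrorCode4 are its four parameters. Under that reading, 0x9E is USER_MODE_HEALTH_MONITOR and the second parameter (0x3C) is a health-monitoring timeout of 60 seconds, which would suggest a hung user-mode health check inside the guest rather than a hardware fault. The function names and the lookup table are illustrative only.

```python
import re

# Text of Hyper-V-Worker Event ID 18590 as quoted in the post (shortened).
EVENT_TEXT = (
    "'FS-03' has encountered a fatal error. The guest operating system "
    "reported that it failed with the following error codes: "
    "ErrorCode0: 0x9E, ErrorCode1: 0x6C2A17C0, ErrorCode2: 0x3C, "
    "ErrorCode3: 0xA, ErrorCode4: 0x0."
)

# Tiny illustrative lookup; only 0x9E appears in the post.
BUGCHECK_NAMES = {0x9E: "USER_MODE_HEALTH_MONITOR"}


def parse_guest_error_codes(message):
    """Return {index: value} for every 'ErrorCodeN: 0x...' field in message."""
    pattern = re.compile(r"ErrorCode(\d+):\s*(0x[0-9A-Fa-f]+)")
    return {int(i): int(v, 16) for i, v in pattern.findall(message)}


def describe(codes):
    """Assumption: index 0 is the bug check code, indexes 1-4 its parameters."""
    code = codes[0]
    name = BUGCHECK_NAMES.get(code, "unknown")
    lines = [f"bug check 0x{code:X} ({name})"]
    if code == 0x9E:
        # For 0x9E the second parameter is a timeout in seconds.
        lines.append(f"health monitoring timeout: {codes[2]} s")
    return lines


if __name__ == "__main__":
    for line in describe(parse_guest_error_codes(EVENT_TEXT)):
        print(line)
```

If that reading holds, the guest's own cluster log and System event log around the crash time are the natural next place to look.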

Hi,
I have not found a similar issue. Does your cluster pass cluster validation? Are all your Hyper-V hosts compatible with Server 2012 R2? Have you tried disabling all antivirus software and firewalls? Please rerun storage validation on the cluster during non-production hours; the cluster validation report should help locate the issue.
More information:
Cluster
http://technet.microsoft.com/en-us/library/dd581778(v=ws.10).aspx
Hope this helps.
We are trying to better understand customer views on social support experience, so your participation in this interview project would be greatly appreciated if you have time.
Thanks for helping make community forums a great place.

Similar Messages

  • Host server live migration causing Guest Cluster node goes down

    Hi 
    I have a two-node Hyper-V host cluster. I'm using a converged network for host management, live migration and the cluster network, and separate NICs for iSCSI multipathing. When I live migrate a guest node from one host to another, the node goes down within the guest cluster. I have increased the cluster threshold and cluster delay values. Guest nodes connect to the iSCSI network directly using the iSCSI initiator on Server 2012.
    The converged networks for management, cluster and live migration are built on top of a NIC team in switch-independent mode with Hyper-V Port load balancing.
    I have VMQ enabled on the converged fabric and jumbo frames enabled on iSCSI.
    Can anyone guess why live migration would cause a failure on the guest node?
    thanks
    mumtaz 

    Repost here: http://social.technet.microsoft.com/Forums/en-US/winserverhyperv/threads
    in the Hyper-V forum.  You'll get a lot more help there.
    This forum is for Virtual Server 2005.

  • Cluster node fails after testing removing both interconnects in a two node

    Hi,
    A cluster node panics and fails to rejoin the cluster after a test in which both interconnects were removed in a two-node cluster. The cluster is up on one node, but the panicked node fails to rejoin, reporting that there is not yet sufficient quorum and that both cluster interconnects (clinterconn) have failed, even after the interconnects were reconnected. The quorum device is a shared disk.
    Is this a bug?
    Any workaround or solution?
    Cluster is 3.2 SPARC
    Thanking you
    Ushas Symon

    Sounds like a networking problem to me. If the failed node genuinely can't communicate with the remaining node then it will not be allowed to join the cluster, hence the quorum message. I would suspect either:
    * Misconnected cables
    * A switch that has blocked or disabled the port
    * A failed auto-negotiation
    This is of course without knowing anything about what your network infrastructure actually is!
    Tim
    ---

  • Hyper-V Failover Cluster Node Corruption

    Dear All,
    Some of my nodes are showing abnormal behavior: they restart every now and then. I have updated the cluster nodes, but all the updates were OS specific; none were hardware updates.
    I have analyzed the crash dumps and found that the following is causing the crash:
    page_fault_in_nonpaged_area
    Does anyone have any idea about this?
    Thanks in advance.

    Hi ,
    What is the OS of the cluster nodes?
    Did you try removing the protection client for troubleshooting?
    If it is a 2008 R2 cluster, please refer to this thread:
    http://social.technet.microsoft.com/Forums/en-US/32ab6a85-6002-4c3c-97ea-27cb1091e9b3/windows-cluster-server-is-getting-restarted?forum=winservergen
    Hope it helps
    Best Regards
    Elton Ji

  • Guest Cluster error in Hyper-V Cluster

    Hello everybody,
    in my environment I do have an issue with failover clusters (Exchange, Fileserver) while performing a live migration of one virtual clusternode. The clustergroup is going offline.
    The environment is the following:
    2x Hyper-V Clusters: Hyper-V-Cluster1 and Hyper-V-Cluster2 (Windows Server 2012 R2) with 5 Nodes per Cluster
    1x Scaleout Fileserver (Windows Server 2012 R2) with 2 Nodes
    1x Exchange Cluster (Windows Server 2012 R2) with EX01 VM running on Hyper-V-Cluster1 and EX02 VM running on Hyper-V-Cluster2
    1x Fileserver Failover Cluster (Windows Server 2012 R2) with FS01 VM running on Hyper-V-Cluster1 and FS02 VM running on Hyper-V-Cluster2
    The physical networks on the Hyper-V Nodes are redundant with 2x 10Gb/s uplinks to 2x physical switches for VMs in a LBFO Team:
    New-NetLbfoTeam -Name 10Gbit_TEAM -TeamMembers 10Gbit_01,10Gbit_02 -TeamingMode SwitchIndependent -LoadBalancingAlgorithm HyperVPort
    The SMB 3 traffic runs on 2x 10Gb/s NIC without NIC-Teaming (SMB-Multichannel).
    SMB is used for livemigrations.
    The VMs for clustering were installed according to the technet guideline:
    http://technet.microsoft.com/en-us/library/dn265980.aspx
    Because my Hyper-V uplinks are already redundant, I am using one NIC inside the VM.
    As I understand it, there is no advantage to using two NICs inside the VM as long as they are connected to the same vSwitch.
    Now, when I want to perform a hardware maintenance, I have to livemigrate the EX01 VM from Hyper-V-Cluster1-Node-1 to Hyper-V-Cluster1-Node-2.
    EX02 VM still runs untouched on Hyper-V-Cluster2-Node-1.
    At the end of the livemigration I see error 1135 (source: FailoverClustering) on EX01 VM, which says that EX02 VM was removed from Failover Cluster and I have to check my network.
    The clustergroup of exchange is offline after that event and I have to bring it online again manually.
    Any ideas what can cause this behavior?
    Thanks.
    Greetings,
    torsten

    Hello again,
    I found the cause and the solution :-)
    In the article here: http://technet.microsoft.com/en-us/library/dn440540.aspx
    is the description of my cluster failure:
    ########## relevant part from article #######################
    Protect against short-term network interruptions
    Failover cluster nodes use the network to send heartbeat packets to other nodes of the cluster. If a node does not receive a response from another node for a specified period of time, the cluster removes the node from cluster membership. By default, a guest
    cluster node is considered down if it does not respond within 5 seconds. Other nodes that are members of the cluster will take over any clustered roles that were running on the removed node.
    Typically, during the live migration of a virtual machine there is a fast final transition when the virtual machine is stopped on the source node and is running on the destination node. However, if something causes the final transition to take longer than
    the configured heartbeat threshold settings, the guest cluster considers the node to be down even though the live migration eventually succeeds. If the live migration final transition is completed within the TCP time-out interval (typically around 20 seconds),
    clients that are connected through the network to the virtual machine seamlessly reconnect.
    To make the cluster heartbeat time-out more consistent with the TCP time-out interval, you can change the
    SameSubnetThreshold and CrossSubnetThreshold cluster properties from the default of 5 seconds to 20 seconds. By default, the cluster sends a heartbeat every 1 second. The threshold specifies how many heartbeats to miss in succession
    before the cluster considers the cluster node to be down.
    After changing both parameters in failover cluster as described the error is gone.
    Greetings,
    torsten
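
The arithmetic behind that fix can be sketched as follows. The quoted article gives a heartbeat every 1 second and a threshold counted in missed heartbeats, so the detection window is simply the interval multiplied by the threshold. The helpers below are hypothetical, not a cluster API, and the 8-second blackout is an invented example value. The 1000 ms interval corresponds to the cluster's `SameSubnetDelay` property, which the quoted text does not name.

```python
def heartbeat_timeout_seconds(delay_ms, threshold):
    """Seconds of missed heartbeats before a node is considered down.

    delay_ms:  interval between heartbeats, in milliseconds
    threshold: number of consecutive heartbeats that may be missed
    """
    return delay_ms * threshold / 1000


def survives_blackout(delay_ms, threshold, blackout_seconds):
    """True if a network blackout of the given length stays inside the window."""
    return blackout_seconds < heartbeat_timeout_seconds(delay_ms, threshold)


if __name__ == "__main__":
    blackout = 8  # hypothetical length of the final live-migration transition
    for threshold in (5, 20):  # the default and the value the article suggests
        window = heartbeat_timeout_seconds(1000, threshold)
        ok = survives_blackout(1000, threshold, blackout)
        print(f"threshold={threshold}: window={window:.0f}s, node stays in cluster: {ok}")
```

With the default threshold of 5 the hypothetical 8-second transition exceeds the window and the node is removed; with 20 it does not, which matches the behavior described in the thread.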

  • Cluster Quorum Disk failing inside Guest cluster VMs in Hyper-V Cluster using Virtual Disk Sharing Windows Server 2012 R2

    Hi, I'm having a problem in a VM guest cluster using Windows Server 2012 R2 with virtual disk sharing enabled.
    It's a SQL 2012 cluster, which has around 10 VHDX disks shared this way. All the VHDX files are inside LUNs on a SAN. These LUNs are presented to all members of the Windows Server 2012 R2 Hyper-V cluster via Cluster Shared Volumes.
    Yesterday a very strange problem occurred: both the quorum disk and the DTC disk had their contents completely erased. The VHDX files themselves were there, but the data inside was gone.
    The SQL admin had to recreate both disks, but now we don't know if this issue was related to the virtualization platform or to another event inside the cluster itself.
    Right now I'm seeing these errors on one of the VM guests:
    Log Name:      System
    Source:        Microsoft-Windows-FailoverClustering
    Date:          3/4/2014 11:54:55 AM
    Event ID:      1069
    Task Category: Resource Control Manager
    Level:         Error
    Keywords:      
    User:          SYSTEM
    Computer:      ServerDB02.domain.com
    Description:
    Cluster resource 'Quorum-HDD' of type 'Physical Disk' in clustered role 'Cluster Group' failed.
    Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster
    Manager or the Get-ClusterResource Windows PowerShell cmdlet.
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="Microsoft-Windows-FailoverClustering" Guid="{BAF908EA-3421-4CA9-9B84-6689B8C6F85F}" />
        <EventID>1069</EventID>
        <Version>1</Version>
        <Level>2</Level>
        <Task>3</Task>
        <Opcode>0</Opcode>
        <Keywords>0x8000000000000000</Keywords>
        <TimeCreated SystemTime="2014-03-04T17:54:55.498842300Z" />
        <EventRecordID>14140</EventRecordID>
        <Correlation />
        <Execution ProcessID="1684" ThreadID="2180" />
        <Channel>System</Channel>
        <Computer>ServerDB02.domain.com</Computer>
        <Security UserID="S-1-5-18" />
      </System>
      <EventData>
        <Data Name="ResourceName">Quorum-HDD</Data>
        <Data Name="ResourceGroup">Cluster Group</Data>
        <Data Name="ResTypeDll">Physical Disk</Data>
      </EventData>
    </Event>
    Log Name:      System
    Source:        Microsoft-Windows-FailoverClustering
    Date:          3/4/2014 11:54:55 AM
    Event ID:      1558
    Task Category: Quorum Manager
    Level:         Warning
    Keywords:      
    User:          SYSTEM
    Computer:      ServerDB02.domain.com
    Description:
    The cluster service detected a problem with the witness resource. The witness resource will be failed over to another node within the cluster in an attempt to reestablish access to cluster configuration data.
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="Microsoft-Windows-FailoverClustering" Guid="{BAF908EA-3421-4CA9-9B84-6689B8C6F85F}" />
        <EventID>1558</EventID>
        <Version>0</Version>
        <Level>3</Level>
        <Task>42</Task>
        <Opcode>0</Opcode>
        <Keywords>0x8000000000000000</Keywords>
        <TimeCreated SystemTime="2014-03-04T17:54:55.498842300Z" />
        <EventRecordID>14139</EventRecordID>
        <Correlation />
        <Execution ProcessID="1684" ThreadID="2180" />
        <Channel>System</Channel>
        <Computer>ServerDB02.domain.com</Computer>
        <Security UserID="S-1-5-18" />
      </System>
      <EventData>
        <Data Name="NodeName">ServerDB02</Data>
      </EventData>
    </Event>
    We don't know if this can happen again, what if this happens on disk with data?! We don't know if this is related to the virtual disk sharing technology or anything related to virtualization, but I'm asking here to find out if it is a possibility.
    Any ideas are appreciated.
    Thanks.
    Eduardo Rojas
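
When many such events have to be compared, the Event Xml form is easier to process than the rendered text. A minimal sketch, assuming the XML has been saved as text exactly as pasted above (the sample below is a shortened copy of the Event ID 1069 record, and `summarize_event` is an illustrative name, not part of any Windows tooling):

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML, as shown in the post.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Shortened copy of the Event ID 1069 record from the post.
EVENT_XML = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-FailoverClustering" Guid="{BAF908EA-3421-4CA9-9B84-6689B8C6F85F}" />
    <EventID>1069</EventID>
    <Level>2</Level>
    <TimeCreated SystemTime="2014-03-04T17:54:55.498842300Z" />
    <Computer>ServerDB02.domain.com</Computer>
  </System>
  <EventData>
    <Data Name="ResourceName">Quorum-HDD</Data>
    <Data Name="ResourceGroup">Cluster Group</Data>
    <Data Name="ResTypeDll">Physical Disk</Data>
  </EventData>
</Event>"""


def summarize_event(xml_text):
    """Extract the fields most useful for correlating cluster events."""
    root = ET.fromstring(xml_text)
    system = root.find("e:System", NS)
    return {
        "event_id": int(system.findtext("e:EventID", namespaces=NS)),
        "provider": system.find("e:Provider", NS).get("Name"),
        "time_utc": system.find("e:TimeCreated", NS).get("SystemTime"),
        "computer": system.findtext("e:Computer", namespaces=NS),
        # Named <Data> elements become a plain dict.
        "data": {d.get("Name"): d.text
                 for d in root.findall("e:EventData/e:Data", NS)},
    }


if __name__ == "__main__":
    print(summarize_event(EVENT_XML))
```

Sorting the summaries by `time_utc` makes it easier to see that the 1558 witness warning and the 1069 resource failure share the same timestamp.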

    Hi,
    Please refer to the following link:
    http://blogs.technet.com/b/keithmayer/archive/2013/03/21/virtual-machine-guest-clustering-with-windows-server-2012-become-a-virtualization-expert-in-20-days-part-14-of-20.aspx#.Ux172HnxtNA
    Best Regards,
    Vincent Wu
    Please remember to click “Mark as Answer” on the post that helps you, and to click “Unmark as Answer” if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread.

  • Guest VM failover cluster on Hyper-V 2012 Cluster does not work across hosts

    Hi all,
    We are evaluating Hyper-V on Windows Server 2012, and I have bumped in to this problem:
    I have an Exchange 2010 SP2 DAG installed on 2 VMs in our Hyper-V cluster (a DAG forms a failover cluster, but does not use any shared storage). As long as my VMs are on the same host, all is good. However, if I live migrate or shutdown-->move-->start one of the guest nodes on another physical host, it loses connectivity with the cluster. "Regular" networking is fine across hosts, and I can ping/browse one guest node from the other. I have tried looking for guidance for Exchange on Hyper-V clusters but have not been able to find anything.
    According to the Exchange documentation this configuration is supported, so I guess I'm asking for any tips and pointers on where to troubleshoot this.
    regards,
    Trond

    Hi All,
    so some updates...
    We have a ticket logged with Microsoft, more of a check box exercise to reassure the business we're doing the needful.  Anyway, they had us....
    Apply hotfix http://support.microsoft.com/kb/2789968?wa=wsignin1.0  to both guest DAG nodes, which seems pretty random, but they wanted to update the TCP/IP stack...
    There was no change in the error: move a guest to another Hyper-V node and the failover cluster, well, fails with the following event IDs on the node that fails...
    1564 -File share witness resource 'xxxx)' failed to arbitrate for the file share 'xxx'. Please ensure that file share '\xxx' exists and is accessible by the cluster..
    1069 - Cluster resource 'File Share Witness (xxxxx)' in clustered service or application 'Cluster Group' failed
    1573 - Node xxxx  failed to form a cluster. This was because the witness was not accessible. Please ensure that the witness resource is online and available
    The other node stays up, and the Exchange DBs mounted on that node stay up; the ones mounted on the node that fails fail over to the remaining node...
    So we then
    Removed 3 x Nic's in one of the 4 x NIC teams, so, leaving a single NIC in the team (no change)
    Removed one NIC from the LACP group on each Hyper-V host
    Created new Virtual Switch using this simple trunk port NIC on each Hyper-V host
    Moved the DAG nodes to this vSwitch
    Failover cluster works as expected, guest VM's running on separate Hyper-V hosts, when on this vswitch with single NIC
    So Microsoft were keen to close the call, as their scope was, I kid you not, to "consider this issue resolved once we are able to find the cause of the above mentioned issue", which we have now done, as in, teaming is the cause... argh.
    But after talking, they are now escalating internally.
    The other thing we are doing is building Server 2010 guests and installing Exchange 2010 SP3, to get an Exchange 2010 DAG running on Server 2010 and see if it has the same issue, as people indicate that it perhaps does not have the same problem.
    Cheers
    Ben
    Name                   : Virtual Machine Network 1
    Members                : {Ethernet, Ethernet 9, Ethernet 7, Ethernet 12}
    TeamNics               : Virtual Machine Network 1
    TeamingMode            : Lacp
    LoadBalancingAlgorithm : HyperVPort
    Status                 : Up

    Name                   : Parent Partition
    Members                : {Ethernet 8, Ethernet 6}
    TeamNics               : Parent Partition
    TeamingMode            : SwitchIndependent
    LoadBalancingAlgorithm : TransportPorts
    Status                 : Up

    Name                   : Heartbeat
    Members                : {Ethernet 3, Ethernet 11}
    TeamNics               : Heartbeat
    TeamingMode            : SwitchIndependent
    LoadBalancingAlgorithm : TransportPorts
    Status                 : Up

    Name                   : Virtual Machine Network 2
    Members                : {Ethernet 5, Ethernet 10, Ethernet 4}
    TeamNics               : Virtual Machine Network 2
    TeamingMode            : Lacp
    LoadBalancingAlgorithm : HyperVPort
    Status                 : Up
    A Cloud Mechanic.

  • Hyper-V guest SQL 2012 cluster live migration failure

    I have two IBM HX5 nodes connected to an IBM DS5300. A Hyper-V 2012 cluster was built on the blades. Six virtual machines were created in the HV cluster and connected to the DS5300 via HV Virtual SAN. These VMs form a guest SQL cluster. The database files are placed on DS5300 storage and are available through the VM Fibre Channel adapters. The IBM MPIO module is installed on all hosts and VMs.
    The SQL Server instances work without problems. But when I try to live migrate a SQL VM to another HV node, a SQL instance fails. In the SQL error log I see:
    2013-06-19 10:39:44.07 spid1s      Error: 17053, Severity: 16, State: 1.
    2013-06-19 10:39:44.07 spid1s      SQLServerLogMgr::LogWriter: Operating system error 170(The requested resource is in use.) encountered.
    2013-06-19 10:39:44.07 spid1s      Write error during log flush.
    2013-06-19 10:39:44.07 spid55      Error: 9001, Severity: 21, State: 4.
    2013-06-19 10:39:44.07 spid55      The log for database 'Admin' is not available. Check the event log for related error messages. Resolve any errors and restart the database.
    2013-06-19 10:39:44.07 spid55      Database Admin was shutdown due to error 9001 in routine 'XdesRMFull::CommitInternal'. Restart for non-snapshot databases will be attempted after all connections to the database are aborted.
    2013-06-19 10:39:44.31 spid36s     Error: 17053, Severity: 16, State: 1.
    2013-06-19 10:39:44.31 spid36s     fcb::close-flush: Operating system error (null) encountered.
    2013-06-19 10:39:44.31 spid36s     Error: 17053, Severity: 16, State: 1.
    2013-06-19 10:39:44.31 spid36s     fcb::close-flush: Operating system error (null) encountered.
    2013-06-19 10:39:44.32 spid36s     Error: 17053, Severity: 16, State: 1.
    2013-06-19 10:39:44.32 spid36s     fcb::close-flush: Operating system error (null) encountered.
    2013-06-19 10:39:44.32 spid36s     Error: 17053, Severity: 16, State: 1.
    2013-06-19 10:39:44.32 spid36s     fcb::close-flush: Operating system error (null) encountered.
    2013-06-19 10:39:44.33 spid36s     Starting up database 'Admin'.
    2013-06-19 10:39:44.58 spid36s     349 transactions rolled forward in database 'Admin' (6:0). This is an informational message only. No user action is required.
    2013-06-19 10:39:44.58 spid36s     SQLServerLogMgr::FixupLogTail (failure): alignBuf 0x000000001A75D000, writeSize 0x400, filePos 0x156adc00
    2013-06-19 10:39:44.58 spid36s     blankSize 0x3c0000, blkOffset 0x1056e, fileSeqNo 1313, totBytesWritten 0x0
    2013-06-19 10:39:44.58 spid36s     fcb status 0x42, handle 0x0000000000000BC0, size 262144 pages
    2013-06-19 10:39:44.58 spid36s     Error: 17053, Severity: 16, State: 1.
    2013-06-19 10:39:44.58 spid36s     SQLServerLogMgr::FixupLogTail: Operating system error 170(The requested resource is in use.) encountered.
    2013-06-19 10:39:44.58 spid36s     Error: 5159, Severity: 24, State: 13.
    2013-06-19 10:39:44.58 spid36s     Operating system error 170(The requested resource is in use.) on file "v:\MSSQL\log\Admin\Log.ldf" during FixupLogTail.
    2013-06-19 10:39:44.58 spid36s     Error: 3414, Severity: 21, State: 1.
    2013-06-19 10:39:44.58 spid36s     An error occurred during recovery, preventing the database 'Admin' (6:0) from restarting. Diagnose the recovery errors and fix them, or restore from a known good backup. If errors are not corrected or expected,
    contact Technical Support.
    In windows system log I see a lot of warnings like this:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="Microsoft-Windows-Ntfs" Guid="{3FF37A1C-A68D-4D6E-8C9B-F79E8B16C482}" />
        <EventID>140</EventID>
        <Version>0</Version>
        <Level>3</Level>
        <Task>0</Task>
        <Opcode>0</Opcode>
        <Keywords>0x8000000000000008</Keywords>
        <TimeCreated SystemTime="2013-06-19T06:39:44.314400200Z" />
        <EventRecordID>25239</EventRecordID>
        <Correlation />
        <Execution ProcessID="4620" ThreadID="4284" />
        <Channel>System</Channel>
        <Computer>sql-node-5.local.net</Computer>
        <Security UserID="S-1-5-21-796845957-515967899-725345543-17066" />
      </System>
      <EventData>
        <Data Name="VolumeId">\\?\Volume{752f0849-6201-48e9-8821-7db897a10305}</Data>
        <Data Name="DeviceName">\Device\HarddiskVolume70</Data>
        <Data Name="Error">0x80000011</Data>
      </EventData>
    </Event>
    The system failed to flush data to the transaction log. Corruption may occur in VolumeId: \\?\Volume{752f0849-6201-48e9-8821-7db897a10305}, DeviceName: \Device\HarddiskVolume70.
    ({Device Busy}
    The device is currently busy.)
    There are no errors or warnings on the HV hosts.

    Hello,
    I am trying to involve someone more familiar with this topic for a further look at this issue. Some delay might be expected while the case is transferred. Your patience is greatly appreciated.
    Thank you for your understanding and support.
    Regards,
    Fanny Liu
    TechNet Community Support

  • Hyper-V 2012 R2 Cluster Creation Fails

    I am trying to create a 2-node Hyper-V 2012 R2 cluster. Cluster validation passes with no errors or warnings, but the cluster creation fails.
    The error is similar to the one described here.
    In that case he solved it by joining the nodes to a Windows 2012 domain. We don't have that option in our environment.
    The System log shows Event 7024 (The cluster service terminated... The cluster join operation failed) and Event 7031 (Cluster Services terminated unexpectedly).
    Anyone have an idea?
    Todd

    Hi Todd,
    For troubleshooting, please try creating a new OU, moving the cluster nodes' computer accounts into that OU, blocking inheritance on it, and then restarting the nodes.
    After this, log on to the cluster nodes with a domain admin account and try to build the cluster again.
    Best Regards,
    Elton Ji
    Please remember to mark the replies as answers if they help and unmark them if they provide no help.

  • Second drives on Hyper-V guests failing suddenly

    In past 2 months or so, we have lost 3-4 Hyper-V guest systems that had a second drive attached via the virtual SCSI adapter. Up until this time, these servers have run flawlessly for 3 years+. The drives appear on the host, but if you try to access
    them, they tell you that the disk cannot be found. However, the .vhd is full size and sitting in the SharedCluster storage folder where they belong.
    Even if I create a new server or drive from scratch, in short order the second drive becomes unusable, even if I create it as an IDE device instead of SCSI.
    I have 2 host servers running 2008 R2 Enterprise connected to an EqualLogic SAN via iSCSI in a 2-node cluster.
    Oddly, the boot drives seem to not have any issues, on old or new servers. It's only the second drive. Very odd, and scary. Any ideas out there?

    Hi Kevin,
    Please check the event log on each cluster node .
    Have you restarted the cluster ?
    Just one CSV ? I mean that the second drive and boot drive are not on the same LUN ?
    Best Regards
    Elton JI

  • Hyper-V Failover Cluster virtual guests suddenly reboots

    The environment is Server 2012 R2 using dual clusters: a Hyper-V Failover Cluster running guest application virtual machines, and a Scale-Out File Server cluster using tiered Storage Spaces, which supplies SMB3 shares for quorum and CSV. Has anyone had this problem?

    Anything relevant in the host or guest event logs? I would also check the cluster event logs to see if there are any indications there as well.
    Does the guest go down hard or gracefully reboot?
    Need more info.
    Andy Syrewicze
    Come talk more about Hyper-V and the Microsoft Server Stack at
    Syrewiczeit.com and the Altaro Hyper-V Hub!
    Post are my own and in no way reflect the views of my employer or any other entity in which I produce technical content for.

  • WDRuntimeException: Failed to create J2EE cluster node in SLD

    Hello,
    I am getting the below error, but to my knowledge I have everything set up properly.  Let me briefly outline the logistics (I am running everything LOCALLY (will move to remote later)):
    WAS 6.4 SP12
    Set up JCo and tests fine
    Set up Visual Administrator / SLD Data Supplier / HTTP and CIM configured and seem to test fine
    Created SLD and it tests OK
    Created Technical Landscape
    I have noticed that in SP12, in the SLD config I actually have a NEW category called "System Landscape" above my "Technical Landscape" link.  I have not seen this option in previous versions SP9 or SP11.  Is it mandatory to configure this?
    Also, I created a model for Adaptive RFC and found the function I needed successfully.
    Anyway, here is the error when trying to deploy...
    com.sap.tc.webdynpro.services.exceptions.WDRuntimeException: Error while obtaining JCO connection.
         at com.sap.tc.webdynpro.services.datatypes.core.DataTypeBroker$1.fillSldConnection(DataTypeBroker.java:90)
    Caused by: com.sap.tc.webdynpro.services.sal.sl.api.WDSystemLandscapeException: Error while obtaining JCO connection.
    Caused by: com.sap.tc.webdynpro.services.exceptions.WDRuntimeException: Failed to create J2EE cluster node in SLD for 'J2E.SystemHome.bc347792': com.sap.lcr.api.cimclient.LcrException: CIM_ERR_NOT_FOUND: No such instance: SAP_J2EEEngineCluster.CreationClassName="SAP_J2EEEngineCluster",Name="J2E.SystemHome.bc347792"
    Any help will be appreciated!

    I figured it out for those that may have a similar problem.
    Although I had created and tested my JCo's properly and they were working fine, somehow, and I still don't know why, they went RED in the JCo Maintenance screen.
    I had to "create" again and it works fine now.

  • Windows Server 2008 R2 SP1 2-Node cluster - Replace failed node

    Hi - I have a two-node Windows Server 2008 R2 SP1 failover cluster (DHCP, File, Print) where one of the nodes has failed beyond recovery. What I would like to do is evict the failed cluster node, install a new machine with Windows Server 2008 R2 SP1, reuse the same name and IP address, and then join this machine as a node in the cluster.
    Are there any recommended steps for this? I'm mostly thinking about reusing the same name and IP address for the new node (e.g. is there any cleanup needed beyond evicting the node?).
    Enfo Zipper
    Christoffer Andersson – Principal Advisor
    http://blogs.chrisse.se - Directory Services Blog

    Hi,
    I agree with Noah Sparks: you can evict the corrupt node. If you reinstall and then add the new server, you can simply join it to the cluster; it should not produce any error.
    If the evicted node hits the “The cluster node is already a member of the cluster” error when rejoining the cluster, you may need to apply the KB2549472 hotfix.
    The related KB:
    How to Evict a Node from a Windows Server 2008 Failover Cluster
    http://technet.microsoft.com/en-us/library/bb676524(v=exchg.80).aspx
    Cluster node cannot rejoin the cluster after the node is restarted or removed from the cluster in Windows Server 2008 R2
    http://support.microsoft.com/kb/2549472
    Hope this helps.

  • Node failed to join the cluster because it could not send and receive failure detection network messages

    One of my customers has a Windows Server 2008 R2 cluster for an Exchange 2010 Mailbox Database Availability Group.  Lately, they've been having problems with one of their nodes (the one node that is on a different subnet in a different datacenter) where
    their Exchange databases aren't replicating.  While looking into this issue it seems that the problem is the Network Manager isn't started because the cluster service is failing.  Since the issue seems to be with the cluster service, and not Exchange,
    I'm asking here. 
    When the cluster service starts, it appears to start working, but within a few minutes the following is logged in the system event log.
    FailoverClustering
    1572
    Critical
    Cluster Virtual Adapter
    Node 'nodename' failed to join the cluster because it could not send and receive failure detection network messages with other cluster nodes. ...
    It seems that the problem is with the 169.254 address on the cluster virtual adapter.  An entry in the cluster.log file says: Aborting connection because NetFT route to node nodename on virtual IP 169.254.1.44:~3343~ has failed to come up. 
    In my experience, you never have to mess with the cluster virtual adapter.  I'm not sure what happened here, but I doubt it has been modified.  I need the cluster to communicate with its other nodes on our routed 10. network.  I've never experienced
    this before and found little in my searches on the subject.  Any idea how I can fix this?
    Thanks,
    Joe
    Joseph M. Durnal MCM: Exchange 2010 MCITP: Enterprise Messaging Administrator, Exchange 2010 MCITP: Enterprise Messaging Administrator, MCITP: Enterprise Administrator
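
    For anyone chasing the same symptom, the cluster.log entry quoted above can be pulled out of a larger log with a short script. This is only a rough sketch: the pattern is built from the single message quoted in the post, and the function name find_netft_failures is made up for illustration, so other log variants may not match.

```python
import re

# Pattern for the NetFT failure text quoted in the post. The ":~3343~" port
# notation is copied from that cluster.log entry; other formats are not covered.
NETFT_FAILURE = re.compile(
    r"NetFT route to node (?P<node>\S+) on virtual IP "
    r"(?P<ip>[0-9A-Fa-f.:%]+?):~(?P<port>\d+)~ has failed to come up"
)


def find_netft_failures(log_text):
    """Return (node, virtual_ip, port) for each NetFT route failure found."""
    return [
        (m.group("node"), m.group("ip"), int(m.group("port")))
        for m in NETFT_FAILURE.finditer(log_text)
    ]


if __name__ == "__main__":
    # Sample line taken from the post; a real run would read cluster.log instead.
    sample = (
        "Aborting connection because NetFT route to node nodename "
        "on virtual IP 169.254.1.44:~3343~ has failed to come up."
    )
    for node, ip, port in find_netft_failures(sample):
        print(f"{node}: route to {ip} port {port} did not come up")
```

    Listing the affected node names this way makes it easier to see whether only the node in the remote datacenter is failing or whether several nodes are.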

    Hi,
    I suspect an issue with communication on UDP port 3343. Please confirm that the firewall rules for port 3343 are in place on all the nodes and that the connections are allowed for all firewall profiles on every node, or otherwise confirm connectivity between all the nodes.
    Run ipconfig /flushdns on every node to clear the cached DNS entries, then confirm that the node records on your DNS server are correct.
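
    The DNS part of that check can be scripted if there are many nodes. The following is a minimal sketch, not a supported tool: check_node_dns is a hypothetical helper, the node names and addresses you pass in are your own, and it only compares what the local resolver returns with what you expect. It does not test UDP port 3343 or the firewall rules.

```python
import socket


def check_node_dns(expected, resolve=socket.gethostbyname):
    """Compare resolved IPv4 addresses with the expected ones.

    expected: dict mapping node name -> expected IPv4 address (your own values).
    resolve:  lookup function, injectable so the check can be tried offline.
    Returns a dict of node name -> (expected, actual) for every mismatch;
    actual is None when the name does not resolve.
    """
    mismatches = {}
    for name, want in expected.items():
        try:
            got = resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            got = None
        if got != want:
            mismatches[name] = (want, got)
    return mismatches
```

    Calling check_node_dns with a dictionary of the real node names and their expected 10.x addresses returns an empty dictionary when every name resolves as expected, and otherwise lists each node whose record is missing or wrong.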
    An article on a similar issue:
    Exchange 2010 DAG - NetworkManager has not yet been initialized
    https://blogs.technet.com/b/dblanch/archive/2012/03/05/exchange-2010-dag-networkmanager-has-not-yet-been-initialized.aspx?Redirected=true
    Hope this helps.

  • OUI failed to Select Cluster node on AIX- 9i RAC with HACMP

    Dear
    It seems HACMP cluster is working fine.
    rlogin, rcp are also working.
    But oracle installer failed to pop up the "Cluster Node Selection" window.
    [p650_cdr1][root]/> lssrc -a | grep -E "ES|svcs"
    clcomdES clcomdES 35888 active
    topsvcs topsvcs 34634 active
    grpsvcs grpsvcs 23258 active
    emsvcs emsvcs 26588 active
    emaixos emsvcs 25356 active
    clstrmgrES cluster 26090 active
    clsmuxpdES cluster 37556 active
    clinfoES cluster 26118 active
    grpglsm grpsvcs inoperative
    [p650_cdr1][root]/> lssrc -g cluster
    Subsystem Group PID Status
    clstrmgrES cluster 26090 active
    clsmuxpdES cluster 37556 active
    clinfoES cluster 26118 active
    [p650_cdr1][root]/>
    Would you please let me know what to do now?
    Oracle : 9.2.0.1
    $ oslevel -r
    5200-04
    $
    HACMP :5.2
    Regards
    Faruque

    The problem is resolved. A patch (IY73937) for the cluster was required.
    Thanks and regards
    Faruque
