VMs on Failover Cluster hanging in "Locked" state

I'm managing a Windows Server 2012 R2 2-node cluster that is backed up using a separate DPM 2012 R2 server. All VMs are on Cluster Shared Storage.
I've noticed that some VMs keep hanging in a "Running (Locked)" state in Failover Cluster Manager (Hyper-V Manager shows a "Backing up..." state). While the VMs are in this state:
- every subsequent DPM backup fails (it looks like DPM is the cause of this "Locked" state)
- the VM can't be moved to another node
- if the VM is backed up "Online", I can see AVHD files, and the VHD files have the date of the last successful DPM backup
- if the VM is backed up "Offline", I can't find any snapshots (the VHD files have the date of the last successful DPM backup, which is weird, since the data in the VM is actually changing)
The only way out of this situation is to shut down the VMs and reboot both cluster nodes. Of course, this isn't something that I like to do on a weekly basis.
My questions:
- What can I do to prevent this "Locked" problem? (the last 2 months I've experienced this problem with 5 different VMs)
- Is there another way to get out of the "Locked" situation? Preferably one that doesn't require a cluster reboot.
- Are there any logs I can check to get more information about this problem?
Thanks in advance!

I don't have a couple of those hotfixes installed, mainly because the hotfix page states that I should only install them if I experience the problems mentioned, which isn't always the case.
I have a service interval coming up and will install the relevant patches. I'll report back when I have new information.
Thanks so far!

Similar Messages

  • Install SQL Server 2012 SP1 on a Windows Server 2012 R2 Failover Cluster - hangs at "Running discovery on remote machine" on VMWare VSphere 5.5 Update 1

    Hi,
    I'm trying to install SQL Server 2012 SP1 on the first node of a Windows Server 2012 R2 failover cluster.
    The install hangs whilst displaying the "Please wait while Microsoft SQL Server 2012 Service Pack 1 Setup processes the current operation." message.
    The detail.txt log file shows as follows:
    (01) 2014-07-17 15:36:35 Slp: -- PidInfoProvider : Use cached PID
    (01) 2014-07-17 15:36:35 Slp: -- PidInfoProvider : NormalizePid is normalizing input pid
    (01) 2014-07-17 15:36:35 Slp: -- PidInfoProvider : NormalizePid found a pid containing dashes, assuming pid is normalized, output pid
    (01) 2014-07-17 15:36:35 Slp: -- PidInfoProvider : Use cached PID
    (01) 2014-07-17 15:36:35 Slp: Completed Action: FinalCalculateSettings, returned True
    (01) 2014-07-17 15:36:35 Slp: Completed Action: ExecuteBootstrapAfterExtensionsLoaded, returned True
    (01) 2014-07-17 15:36:35 Slp: ----------------------------------------------------------------------
    (01) 2014-07-17 15:36:35 Slp: Running Action: RunRemoteDiscoveryAction
    (01) 2014-07-17 15:36:36 Slp: Running discovery on local machine
    (01) 2014-07-17 15:36:36 Slp: Discovery on local machine is complete
    (01) 2014-07-17 15:36:36 Slp: Running discovery on remote machine: XXX-XXX-01
    After about 4 hours and 10 minutes, the step seems to time out and move on; however, it doesn't seem to have discovered what it needs to, and the setup subsequently fails.
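When a detail.txt log is long, it can help to find mechanically which setup action was started but never reported as completed. The following is a minimal Python sketch, assuming the log lines keep the "(01) date time Slp: message" shape quoted in the question. The helper names (`parse_detail_lines`, `last_started_action`) are hypothetical and are not part of SQL Server Setup.

```python
import re
from datetime import datetime

# Matches lines such as:
# (01) 2014-07-17 15:36:36 Slp: Running discovery on local machine
LINE_RE = re.compile(r"^\((\d+)\) (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) Slp: (.*)$")

RUNNING = "Running Action: "
COMPLETED = "Completed Action: "


def parse_detail_lines(lines):
    """Return (timestamp, message) tuples for lines in the assumed Slp format."""
    entries = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(2), "%Y-%m-%d %H:%M:%S")
            entries.append((stamp, match.group(3)))
    return entries


def last_started_action(entries):
    """Return the most recent action that was started but not reported complete."""
    open_action = None
    for _, message in entries:
        if message.startswith(RUNNING):
            open_action = message[len(RUNNING):]
        elif message.startswith(COMPLETED) and open_action:
            finished = message[len(COMPLETED):].split(",")[0]
            if finished == open_action:
                open_action = None
    return open_action


# Sample lines copied from the log excerpt in the question.
SAMPLE = [
    "(01) 2014-07-17 15:36:35 Slp: Completed Action: FinalCalculateSettings, returned True",
    "(01) 2014-07-17 15:36:35 Slp: Completed Action: ExecuteBootstrapAfterExtensionsLoaded, returned True",
    "(01) 2014-07-17 15:36:35 Slp: Running Action: RunRemoteDiscoveryAction",
    "(01) 2014-07-17 15:36:36 Slp: Running discovery on local machine",
    "(01) 2014-07-17 15:36:36 Slp: Discovery on local machine is complete",
    "(01) 2014-07-17 15:36:36 Slp: Running discovery on remote machine: XXX-XXX-01",
]

entries = parse_detail_lines(SAMPLE)
print(last_started_action(entries))
```

For the excerpt above, the still-open action is the remote discovery step, which matches where the poster sees the install hang.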

    Hi,
    Sorry, the information you provided did not help. Can you post the contents of both the summary file and the detail.txt file to a shared location for analysis?
    Can you download the service pack again and try once more?

  • Exchange 2013 MBX in DAG along with Hyper-V and Failover Cluster

    Hi Guys! I've tried to find an answer to my question or some kind of solution, but with no luck, which is why I am writing here. The case is as follows. I have two powerful server boxes and iSCSI storage, and I have to design a high-availability solution that includes SCOM 2012, SC DPM 2012 and Exchange 2013 (two CAS+HUB servers and two MBX servers).
    Let me tell you how I plan to do that, and please correct me if the proposed solution is wrong.
    1. On both hosts - add Hyper-V role.
    2. On both hosts - add failover clustering role.
    3. Create 2 VMs through Failover Cluster Manager, stored on an iSCSI LUN: the first VM for SCOM 2012 and the second for SCDPM 2012. Both VMs will be added as failover resources.
    4. Create 4 VMs - 2 for the CAS+HUB role and 2 for the MBX role. These VMs will be stored on an iSCSI LUN as well.
    5. Create a DAG between the two MBX servers.
    In general, that's all. What I wonder is whether I can use failover clustering to achieve high availability for 2 VMs and at the same time create a DAG between the MBX servers and NLB between the CAS servers?
    Excuse me for this question, but I am not proficient in this matter.

    Hi,
    As far as I know, creating a DAG for Mailbox servers that run as Hyper-V virtual machines is supported.
    And since load balancing has changed for CAS 2013, DNS round robin is usually a better fit than NLB. However, you can still use NLB in Exchange 2013.
    For more information, you can refer to the following article:
    http://technet.microsoft.com/en-us/library/jj898588(v=exchg.150).aspx
    If you have any question, please feel free to let me know.
    Thanks,
    Angela Shi
    TechNet Community Support

  • HyperV 2012 R2 Failover cluster, HV problem, all VMs restart

    Hello, I have a 2-node failover cluster running Hyper-V 2012 with multipath SAS storage (MSA2000). One node (node2) has a hardware problem and shuts down unexpectedly. When that happens, NODE1 restarts all VMs. Is that normal? The cluster was configured using the cluster validation tool. There is no witness. I don't clearly understand what happens if one node crashes. KR.

    As Eric has said, it will start the VMs in a crash-consistent state on the non-crashed host.
    But from your example I take it you're seeing your guests on the non-crashed host restart. If this is the case, I would say yes! I have seen this happen before. It can happen if you're not using a quorum witness, because only one node has a vote. I would recommend you create a witness: on your MSA 2000, carve out 1 GB and create a disk witness. Or, if you have a server that is not in the VM cluster, you could create a file share witness; file share is my preference. Once you have a witness in play, you will see all of your hosts having a vote. Look in the cluster manager at the Nodes section. You should see a vote column. Currently it will say 1/0; once the witness is created it will show 1/1.
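The reason a witness matters can be shown with simple vote arithmetic. The Python sketch below uses a deliberately simplified static majority model: a partition survives only if it holds more than half of all configured votes. It is an illustration under that assumption, not a description of how the Windows cluster service computes quorum (dynamic quorum in Windows Server 2012 and later adjusts votes at runtime), and the function name `has_quorum` is hypothetical.

```python
def has_quorum(total_votes: int, votes_online: int) -> bool:
    """Return True when the surviving partition holds a strict majority of all votes."""
    return votes_online > total_votes // 2


# Two nodes, no witness: 2 votes in total. If one node fails, 1 of 2 remains.
two_nodes_no_witness = has_quorum(total_votes=2, votes_online=1)

# Two nodes plus a witness: 3 votes in total. If one node fails, the surviving
# node and the witness together hold 2 of 3.
two_nodes_with_witness = has_quorum(total_votes=3, votes_online=2)

print(two_nodes_no_witness)    # False
print(two_nodes_with_witness)  # True
```

In this model, adding the witness turns an even vote count into an odd one, so the loss of a single node no longer costs the survivor its majority.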

  • Failover Cluster - GHOST VMS / ROLES

    I mean Ghost as in mysterious non-existent machine, not the old Norton program.  I've periodically had random cluster crashes, mainly due to my own negligence.  99% of the time everything comes back up normally.  However, periodically a machine will have very strange symptoms that I'm unable to resolve.  The only resolution I've found is to create a new VM and link it to the old VHD.  A description of the machines with this issue:
    - Shown in the Failover Cluster role as Running, but I cannot connect, turn off, shut down, etc.
    - When I log in to the host machine for the VM and open Hyper-V Manager, the machine does not exist.  The only place this machine seems to exist is in Failover Cluster Manager.
    - No details are available on the Summary tab, and the machine doesn't actually appear to be running despite what the console says.
    - Under the Resources tab for that machine, it shows the VM as running but the VM Configuration as failed.
    - Unable to bring the configuration back online.  The error is "The group or resource is not in the correct state to perform the requested operation".
    I've seen other vague mentions of null context pointers or something along those lines.  I've tried researching other users' methods to no avail.  How can I fix these? Or at least remove them once I've recreated the machine?

    Hi,
    Unfortunately, the available information is not enough to give a clear view of the behavior. Could you provide more information about your environment? The server version the problem occurs on, the System log entries recorded when the problem occurs, and screenshots would be the most helpful.
    If you are using Server 2012R2 failover cluster please install the following update:
    Recommended hotfixes and updates for Windows Server 2012 R2-based failover clusters
    http://support.microsoft.com/kb/2920151
    More information:
    Event Logs
    http://technet.microsoft.com/en-us/library/cc722404.aspx
    Thanks.

  • Event Logs of VMs Migration in Failover Cluster of Hyper-V Hosts

    Hello All,
    We're running Failover Cluster of Hyper-V hosts of Windows Server 2012 R2. Using SCVMM 2012 R2 with UR5 for management.
    If any host goes down unexpectedly (for any reason: power, bugcheck, hardware failure or whatsoever), then the VMs on that host, of course, get migrated (either quick or live) to some other host within the cluster.
    I want to have logs/events of this VM migration. I want to know which VMs were residing on that host at the time of failure. Of course, we can't get this info from the Cluster Events in Failover Cluster Manager. I am unable to find this info anywhere. I have searched in Event Viewer --> Administrative Roles --> Hyper-V. I have searched a lot in SCVMM, but with no success.
    Please help me find the exact location of these logs/events. I would also like to know whether the VM was quick migrated or live migrated, and to which host the VM was migrated.
    I'd be highly grateful.
    Thanks in anticipation.
    Regards,
    Hasan Bin Hasib

    This post was cross-posted in the clustering forum.  As noted in that forum, a failure of a host does not initiate a quick or live migration.  Migration requires both the source and destination nodes be operational during the entire migration
    process.  Should a host fail, it is impossible for that host to participate in a migration.  In the case of a host failure, the VM is restarted on another node of the cluster.  You can still use the information provided by Elton for viewing
    events in the event log.  If you want to see the exact sequence of log entries, perform quick/live migrations in a lab and note the changes in the event log.  You can also fail a host and see the sequence of log entries.
    tim

  • Moving VMs from Standalone Hyper-V to Hyper-v Failover Cluster

    Hello All. I need to move virtual machines from a standalone Hyper-V host based on Windows Server 2012 R2 to a Hyper-V failover cluster based on Windows Server 2012 R2.
    -  The VMs OS is Windows Server 2012 R2
    -  Applications installed on VMs are SCCM, SCOM, SharePoint, etc.
    1.  I am looking for a Checklist that I can run through and move them.  To make sure I do not miss out anything.
    2.  Would a downtime be required in this scenario while the VM is being moved?
    3.  What requires special attention in this scenario?

    Thanks for the reply.
    So I do not need to do anything special: I just add both the cluster and the standalone host to VMM and then use the Move option.  Please confirm whether I have understood correctly.
    Yes. But be careful. First shut down your VMs if possible and configure them so that they don't start automatically. When you install the SCVMM agent, some Hyper-V-related services are restarted; if VMs start during that process, your system may stall. I have seen this multiple times.
    There is another, dirty option that worked perfectly for us. Export the VMs. Then import those VMs to a local disk on one Hyper-V server. Refresh the Hyper-V server and the virtual machines on that Hyper-V server. Give it a few minutes. Then do a live migration, at the same time making the VM "Highly Available" by moving it to your CSV. Works flawlessly. DO NOT IMPORT THE VMS TO YOUR CSV, otherwise you can't change them to "Highly Available" unless you have two CSVs. Importing straight to a CSV does not make them highly available.
    Boudewijn Plomp | BPMi Infrastructure & Security

  • SCVMM created VMs not displayed in Failover Cluster Manager

    I have a 2012 Hyper-V failover cluster set up and recently added SCVMM 2012 SP1 to the mix so I could perform some P2V migrations and familiarize myself with its many other capabilities. I noticed that if I create a VM inside SCVMM, it doesn't show up in the FCM UI with the other VMs I created from FCM. VMs that you create in FCM do get picked up by SCVMM, however. Is this by design?
    Thanks,
    Greg

    For my issue above, this was because I'd not noticed and thus not ticked the box on the Live Migrate wizard that says "Make this VM highly available".
    I moved the VM out of the cluster, manually deleted the failed "SCVMM <VMName> Resource", then moved the VM back onto the cluster again but this time ticking the box to make the VM highly available. All looked fine in failover cluster manager.
    I do rather wonder why the SCVMM designers think I might want to migrate a VM onto a Hyper-V cluster and NOT want it to be highly available...? Likewise, to be able to move the VM back out again to a standalone host once it's correctly in the cluster, you have to untick the "Make this VM highly available" box. Surely this should just be done automatically in the background?

  • Create failover cluster to host Windows 2012 DC, Exchange 2013 and SQL as VMs

    One of our clients has been running Windows Essential 2012, SQL and Exchange 2007 as VMs on VMware for 4 years without major issues. However, the physical server is getting old and has had some hardware issues recently. They have the budget to buy two Dell servers, an EqualLogic SAN, Windows Server 2012 Datacenter and Exchange 2013. Is it possible for them to create a failover cluster to host a Windows 2012 DC, Exchange 2013 and SQL as VMs?
    Bob Lin, MCSE & CNE Networking, Internet, Routing, VPN Troubleshooting on
    http://www.ChicagoTech.net
    How to Setup Windows, Network, VPN & Remote Access on
    http://www.howtonetworking.com

    We will move all VMs from VMware to Hyper-V. Thank you.

  • HyperV Failover Cluster - twice some vms lost network

    So I run a 4-node Hyper-V failover cluster, and twice now, in months of operation, a portion of the VMs on a node have just lost network access out of the blue (this has happened on two different nodes). I can just pause the node and everything migrates off, and then it's back up and going. Give it some time and I can unpause and move back. I am looking for ideas on what could be causing this.
    The servers are Dell PowerEdge 620s with the latest MS patches and Dell drivers and firmware. On my public side I have a 2-NIC team using MS software teaming.

    What are the NICs?
    If they are Broadcom NetXtreme, disable VMQ, as there's a known issue with them that results in network loss.
    They are broadcom :/ 
    Thanks for the tip.
    I will google around but do you have any links for this information?
    *EDIT*
    http://support2.microsoft.com/kb/2986895

  • Unable to create cluster, hangs on forming cluster

     
    Hi all,
    I am trying to create a 2-node cluster on two x64 Windows Server 2008 Enterprise edition servers. I am running the setup from the failover cluster MMC and it seems to run OK right up to the point where the snap-in says creating cluster. Then it seems to hang on "forming cluster" and a message pops up saying "The operation is taking longer than expected". A counter comes up, and when it hits 2 minutes the wizard cancels and another message comes up: "Unable to successfully cleanup".
    The validation runs successfully before I start trying to create the cluster. The hardware involved is an HP EVA 6000 and two Dell 2950s.
    I have included the report generated by the create cluster wizard below and the error from the event log on one of the machines (the error is the same on both machines).
    Is there anything I can do to give me a better indication of what is happening, so I can resolve this issue or does anyone have any suggestions for me?
    Thanks in advance.
    Anthony
    Create Cluster Log
    ==================
    Beginning to configure the cluster <cluster>.
    Initializing Cluster <cluster>.
    Validating cluster state on node <Node1>
    Searching the domain for computer object 'cluster'.
    Creating a new computer object for 'cluster' in the domain.
    Configuring computer object 'cluster' as cluster name object.
    Validating installation of the Network FT Driver on node <Node1>
    Validating installation of the Cluster Disk Driver on node <Node1>
    Configuring Cluster Service on node <Node1>
    Validating installation of the Network FT Driver on node <Node2>
    Validating installation of the Cluster Disk Driver on node <Node2>
    Configuring Cluster Service on node <Node2>
    Waiting for notification that Cluster service on node <Node2>
    Forming cluster '<cluster>'.
    Unable to successfully cleanup.
    To troubleshoot cluster creation problems, run the Validate a Configuration wizard on the servers you want to cluster.
    Event Log
    =========
    Log Name:      System
    Source:        Microsoft-Windows-FailoverClustering
    Date:          29/08/2008 19:43:14
    Event ID:      1570
    Task Category: None
    Level:         Critical
    Keywords:     
    User:          SYSTEM
    Computer:      <NODE 2>
    Description:
    Node 'NODE2' failed to establish a communication session while joining the cluster. This was due to an authentication failure. Please verify that the nodes are running compatible versions of the cluster service software.
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="Microsoft-Windows-FailoverClustering" Guid="{baf908ea-3421-4ca9-9b84-6689b8c6f85f}" />
        <EventID>1570</EventID>
        <Version>0</Version>
        <Level>1</Level>
        <Task>0</Task>
        <Opcode>0</Opcode>
        <Keywords>0x8000000000000000</Keywords>
        <TimeCreated SystemTime="2008-08-29T18:43:14.294Z" />
        <EventRecordID>4481</EventRecordID>
        <Correlation />
        <Execution ProcessID="2412" ThreadID="3416" />
        <Channel>System</Channel>
        <Computer>NODE2</Computer>
        <Security UserID="S-1-5-18" />
      </System>
      <EventData>
        <Data Name="NodeName">node2</Data>
      </EventData>
    </Event>
    ====
    I have also since tried creating the cluster with the firewall turned off, with no success.
    I have tried creating the cluster from the other node, and this did not work either.
    I tried creating a cluster with just  a single node and this did create a cluster. I could not join the other node and the network name resource did not come online either. The below is from the event logs.
    Log Name:      System
    Source:        Microsoft-Windows-FailoverClustering
    Date:          01/09/2008 12:42:44
    Event ID:      1207
    Task Category: Network Name Resource
    Level:         Error
    Keywords:     
    User:          SYSTEM
    Computer:      Node1.Domain
    Description:
    Cluster network name resource 'Cluster Name' cannot be brought online. The computer object associated with the resource could not be updated in domain 'Domain' for the following reason:
    Unable to obtain the Primary Cluster Name Identity token.
    The text for the associated error code is: An attempt has been made to operate on an impersonation token by a thread that is not currently impersonating a client.
    The cluster identity 'CLUSTER$' may lack permissions required to update the object. Please work with your domain administrator to ensure that the cluster identity can update computer objects in the domain.
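When comparing events across nodes, the key fields can be pulled out of the exported event XML rather than read by eye. The following is a minimal Python sketch using only the standard library. The XML is a shortened copy of the event quoted in the post above, and the function name `summarize_event` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML, as shown in the quoted event.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Shortened copy of the event quoted in the post (some System fields omitted).
EVENT_XML = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-FailoverClustering" Guid="{baf908ea-3421-4ca9-9b84-6689b8c6f85f}" />
    <EventID>1570</EventID>
    <Level>1</Level>
    <TimeCreated SystemTime="2008-08-29T18:43:14.294Z" />
    <Computer>NODE2</Computer>
  </System>
  <EventData>
    <Data Name="NodeName">node2</Data>
  </EventData>
</Event>"""


def summarize_event(xml_text):
    """Extract provider, event ID, level, time, computer and named data values."""
    root = ET.fromstring(xml_text)
    system = root.find("e:System", NS)
    # Collect <Data Name="...">value</Data> pairs into a dictionary.
    data = {d.get("Name"): d.text for d in root.findall("e:EventData/e:Data", NS)}
    return {
        "provider": system.find("e:Provider", NS).get("Name"),
        "event_id": int(system.findtext("e:EventID", namespaces=NS)),
        "level": int(system.findtext("e:Level", namespaces=NS)),
        "time": system.find("e:TimeCreated", NS).get("SystemTime"),
        "computer": system.findtext("e:Computer", namespaces=NS),
        "data": data,
    }


summary = summarize_event(EVENT_XML)
print(summary["event_id"], summary["computer"], summary["data"])
```

Running the same function over events saved from each node makes it easier to confirm that both nodes log the same event ID at the same point in cluster creation.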

    I am having the exact same issue... but these are on freshly created virtual machines... no group policy or anything...
    I am 100% unable to create a virtual Windows Server 2012 failover cluster using two virtual Fibre Channel adapters to connect to the shared storage.
    I've tried using GUI and powershell, I've tried adding all available storage, or not adding it, I've tried renaming the server and changing all the IP addresses....
    To reproduce:
    1. Create two identical Server 2012 virtual machines
    (My Config: 4 CPU's, 4gb-8gb dynamic memory, 40gb HDD, two network cards (one for private, one for mgmt), two fiber cards to connect one to each vsan.)
    2. Update both VM's to current windows updates
    3. Add Failover Clustering role, Reboot, and try to create cluster.
    The cluster passed all validation tests perfectly, but then it gets to "forming cluster" and times out =/
    Any assistance would be greatly appreciated.

  • Failover Cluster Network Name Failed and Can't be Repaired

    I have an issue that seems to be different from the problems others have encountered.
    I've scoured everything I can find and nothing has fixed my problem.
    The problem starts with the common problem of the cluster network name failing on my 2 node server 2012 file server cluster.  The computer object was still in AD and appeared to be fine so it was not the common problem of the object
    getting deleted somehow.  At the time, there was no other object with that name in the recycling bin, so I don't think it was mistakenly deleted and quickly recreated to cover any tracks, so to speak.
    Following one guide, I tried to find the registry key that corresponded with the GUID of the object, but neither node in the cluster had it in its registry (which may be part of the problem).
    Since it was in the failed state, I tried to do the repair on the object to no avail.
    We run a "locked down" DC environment so all computer objects have to be pre-provisioned.  They were all pre-provisioned successfully and successfully assigned during cluster creation.  The cluster was running with no issues for a month
    or so before this problem came up.
    When I do a repair on the object while taking diagnostic logs the following 4609 error appears:
    The action 'Repair' did not complete. - System.ApplicationException: An error occurred resetting the password for 'Cluster Name'. ---> System.ComponentModel.Win32Exception: Unknown error (0x80005000)
    There appears to be a corresponding 4771 error with a failure code 0x18 that comes from the security log of the DC that states there was a Kerberos pre-authentication failure for the cluster network name object (Domain\Clustername$)
    I believe this is what is causing the repair failure.  All the information I found related to security error 4771 was either a bad credentials given for a user account or the fix was to reconnect the computer to the domain.  I can't seem to find
    a way to do this with the cluster network name.  If there's a way please let me know.
    I've tried a number of things, like resetting the object, disabling it, deleting and creating a new object with the same name, deleting that new object and recovering the original, etc...
    Can anyone shed some light on what is going on and, hopefully, how to fix it other than rebuilding the cluster?  I'm quite close to just tearing it down and building it back up, but am hesitant because this cluster is currently in production...
    Any help would be appreciated

    Hi,
    I couldn't find an issue similar to yours. Based on my experience, the 4096 error is often caused by a CSV disk issue, and the 0x80005000 error is sometimes caused by a repetitive computer object in the OU. Please check those areas, or run the validation test and then post the error information.
    Although I do have a CSV, there don't seem to be any problems with it, and it was running just fine for a month or so before the problem started.  I double-checked and there are no duplicate computer objects. Maybe I don't understand what you mean by repetitive; could you explain further?
    The cluster validates successfully with a few warnings:
    Validating cluster resource Name: DT-FileCluster.
    This resource is marked with a state of 'Failed' instead of 'Online'. This failed state indicates that the resource had a problem either coming online or had a failure while it was online. The event logs and cluster logs may have information that is helpful in identifying the cause of the failure.
    - This is because the cluster name is in the failed state.
    Validating the service principal names for Name: DT-FileCluster.
    The network name Name: DT-FileCluster does not have a valid value for the read-only property 'ObjectGUID'. To validate the service principal name the read-only private property 'ObjectGuid' must have a valid value. To correct this issue make sure that the network name has been brought online at least once. If this does not correct this issue you will need to delete the network name and re-create it.
    - This is definitely related to the problem, and the GUID probably got removed when we attempted a fix by resetting the object and trying the repair from Failover Cluster Manager.
    The user running validate, does not have permissions to create computer objects in the 'ad.unlv.edu' domain.
    - This is correct; we run a restricted domain.  I have a delegated OU that I can pre-provision accounts in.  The account was pre-provisioned successfully and was at one point set up and working just fine.
    There are no other errors or warnings.

  • How to assign SMB storage to CSV in HV failover cluster?

    I have a Hyper-V Cluster that looks like this:
    Clustered-Hyper-V-Diagram
    2012 R2 Failover Cluster
    2 Hyper-V nodes
    iSCSI Disk Witness on isolated "Cluster Only" Network
    "Cluster and Client" Network with nic-team connectivity to 2012 R2 File Server
    Share configured using: server manager > file and storage services > shares > tasks > new share > SMB Share - Applications > my RAID 1 volume.
    My question is this: how do I configure a Cluster Shared Volume?  How do I present the shared folder to the cluster?
    I can create/add VMs from Cluster Manager > Roles > Virtual Machines using \\SMB\Share for the location of the vhd...  but how do I use a CSV with this config?  Am I missing something?

    right click one of the disks that you assigned to cluster as available storage
    I don't yet have any disks assigned to the cluster as available storage.
    Just for grins, I added an 8Gb iSCSI lun and added it to a CSV:
    PS C:\> get-clusterresource
    Name                State   OwnerGroup     ResourceType
    Cluster IP Address  Online  Cluster Group  IP Address
    Cluster Name        Online  Cluster Group  Network Name
    witness             Online  Cluster Group  Physical Disk
    PS C:\> Get-ClusterSharedVolume
    Name     State   Node
    test8Gb  Online  CLUSTERNODE01
    All well and good, but from what I've read elsewhere...
    SMB 3.0 via a 2012 File server can only be added to a Hyper-V CSV cluster using the VMM component of System Center 2012.  That is the only way to import an SMB 3 share for CSV storage usage.
    http://community.spiceworks.com/topic/439383-hyper-v-2012-and-smb-in-a-csv
    http://technet.microsoft.com/en-us/library/jj614620.aspx

  • Failover Cluster Hyper-V Storage Choice

    I am trying to deploy a 2-node Hyper-V failover cluster in a closed environment.  My current setup is 2 servers as hypervisors and 1 server as AD DC + storage server.  All 3 are running Windows Server 2012 R2.
    Since everything is running on Ethernet, my choice of storage is between iSCSI and SMB 3.0.
    I am more inclined to use SMB 3.0, and I did find some instructions online about setting up a Hyper-V cluster connecting to an SMB 3.0 file server cluster.  However, I only have the budget for 1 storage server.  Is it a good idea to choose SMB over iSCSI in this scenario (where there is only 1 storage server for the Hyper-V cluster)?
    What do I need to pay attention to in this setup, apart from some unavoidable single points of failure?
    In the SMB 3.0 file server cluster scenario that I mentioned above, they had to use SAS drives for the file server cluster (CSV).  I am guessing that in my scenario SATA drives should be fine, right?

    "I suspect that Starwind solution achieves FT by running shadow copies of VMs on the partner Hypervisor"
    No, it does not run shadow VMs on the partner hypervisor.  Starwind is a product in a family known as 'software defined storage'.  There are a number of solutions on the market.  They all provide a similar service in that they allow for the
    use of local storage, also known as Direct Attached Storage (DAS), instead of external, shared storage for clustering.  Each of these products provides some method to mirror or 'RAID' the storage among the nodes in the software defined storage. 
    So, yes, there is some overhead to ensure data redundancy, but none of this class of product will 'shadow' VMs on another node.  Products like Starwind, Datacore, and others are nice entry points to HA without the expense of purchasing an external storage
    shelf/array of some sort because DAS is used instead.
    1) "Software Defined Storage" is a VERY wide term. Many companies use it for solutions that DO require actual hardware to run on. Say, Nexenta claims they do SDS, and they need separate physical servers running Solaris and their (Nexenta) storage app. Microsoft, whom we all love so much because they give us the infrastructure we use to make our living, also has Clustered Storage Spaces, which MSFT calls "Software Defined Storage", but it needs physical SAS JBODs, SAS controllers and fabric to operate. These are hybrid software-hardware solutions. Purer ones don't need any dedicated hardware, but they still share the actual server hardware with the hypervisor (HP VSA, VMware Virtual SAN; oh, BTW, it does require flash to operate, so it's again not a pure software thing).
    2) Yes, there are a number of solutions, but the devil is in the details. Technically, the whole virtualization world is sliding away from the ancient way of running storage virtualization stacks inside VMs toward stacks that are part of the hypervisor (VMware Virtual Storage Appliance being replaced with VMware Virtual SAN is an excellent example). So, talking about Hyper-V, there are not many companies that have implemented VM-less solutions. Besides the ones you've named, there's also SteelEye, and that's probably all (Double-Take cannot replicate running VMs effectively, so it cannot be counted). Running a storage virtualization stack as part of Hyper-V has many benefits compared to VM-running stuff:
    - Performance. Obviously kernel-space running DMA engines (StarWind) and polling driver model (DataCore) are faster in terms of latency and IOPS compared to VM-running I/O all routed over VMBus and emulated storage and network hardware.
    - Simplicity. With native apps it's click and install. With VMs it's the UNIX management burden (BTW, who will update the forked-out Solaris the VSA is running on top of? Sun? Out of business. Oracle? You did not get your ZFS VSA from Oracle. Who?) and always the "chicken and egg" issue. The cluster starts and needs access to shared storage to spawn VMs, but the VMs are stored inside a VSA VM that itself needs to be spawned. So first you start the storage VMs, then make them sync (a few terabytes, maybe a couple of hours to check access bitmaps for volumes), and only after that can you start your other production VMs. Very nice!
    - Scenario limitations. You want to implement a CSV for Scale-Out File Servers? You cannot use HP VSA or StorMagic, because the SoFS and Hyper-V roles cannot mix on the same hardware. To surf the SMB 3.0 tide you need native apps or physical hardware behind.
    That's why the current virtualization leader, VMware, has clearly pointed out where these types of things need to run: side by side with the hypervisor kernel.
    3) DAS is not only cheaper but also faster than SAN and NAS (obviously). Sure, there's no "one size fits all", but unless somebody needs a) very high LUN density (Oracle, a huge SQL database, or maybe SAP) and b) very strict SLAs (a friendly telecom company
    we provide Tier 2 infrastructure for runs its cell phone stats on EMC, $1M for a few terabytes. Reason? The EMC people keep FOUR units like that marked as "spare" and are required to replace a failed one in less than 15 minutes), there's no point in deploying a
    hardware SAN / NAS for shared storage. SAN / NAS is a sustaining innovation and Virtual SAN is a disruptive one. The disruptive one comes to replace the sustaining one in 80-90% of business cases, leaving the sustaining one to live on in niche deployments.
    Clayton Christensen's "The Innovator's Dilemma". Classic. More here:
    Disruptive Innovation
    http://en.wikipedia.org/wiki/Disruptive_innovation
    So I would not consider Software Defined Storage a poor man's HA, or usable only for Test & Development. It has been ready for prime time for a long time. Talk to hardware SAN VARs if you have connections: how many stand-alone units did they sell to SMBs
    & ROBO deployments last year?
    StarWind VSAN [Virtual SAN] clusters Hyper-V without SAS, Fibre Channel, SMB 3.0 or iSCSI, uses Ethernet to mirror internally mounted SATA disks between hosts.

  • Missing a VM in Failover Cluster Manager

    We did eventually cut the power when shutting down the second of the two hosts in the cluster, because it got stuck at some stage during shutdown. The reason we shut down was that there were problems piling up while we moved VMs between the two nodes.
    After boot, we no longer see one of the virtual machines in the Failover Cluster Manager. In Hyper-V Manager we see it, and it runs just fine.
    How can we add the missing VM back in Failover Cluster Manager? Is there a simple way?
    All the servers are Win 2008 R2.
    Maybe a simple question - I don't know - but we just don't have enough knowledge right now, nor has searching the net resulted in anything useful. Appreciate a bit of help holding our heads above water while we learn to swim.
    Bent Tranberg

    You mean what I see in Cluster Events in Failover Cluster Manager? I saved it, opened it in Event Viewer, and saved it as tab delimited text. Is that ok?
    To shorten my post, I have removed the text after an event ID whenever it was exactly identical to the text of the preceding event with the same event ID. I also cut away a long list of repetitive events where you see the ellipses.
    Level     Date and Time        Source                                Event ID  Task Category
    Error     05.01.2012 10:25:15  Microsoft-Windows-FailoverClustering  1205      Resource Control Manager
        The Cluster service failed to bring clustered service or application 'Cluster Group' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.
    Error     05.01.2012 10:25:15  Microsoft-Windows-FailoverClustering  1069      Resource Control Manager
        Cluster resource 'Cluster IP Address' in clustered service or application 'Cluster Group' failed.
    Error     05.01.2012 10:25:15  Microsoft-Windows-FailoverClustering  1049      IP Address Resource
        Cluster IP address resource 'Cluster IP Address' cannot be brought online because a duplicate IP address '10.10.1.16' was detected on the network.  Please ensure all IP addresses are unique.
    Error     05.01.2012 10:25:07  Microsoft-Windows-FailoverClustering  1069
    Error     05.01.2012 09:24:43  Microsoft-Windows-FailoverClustering  1205
    Error     05.01.2012 09:24:43  Microsoft-Windows-FailoverClustering  1069
    Error     05.01.2012 09:24:43  Microsoft-Windows-FailoverClustering  1049
    Error     05.01.2012 09:24:35  Microsoft-Windows-FailoverClustering  1069
    Error     04.01.2012 14:04:33  Microsoft-Windows-FailoverClustering  1205
    Error     04.01.2012 14:04:33  Microsoft-Windows-FailoverClustering  1069
    Error     04.01.2012 14:04:33  Microsoft-Windows-FailoverClustering  1049
    Error     04.01.2012 14:04:25  Microsoft-Windows-FailoverClustering  1069
    Error     04.01.2012 14:04:25  Microsoft-Windows-FailoverClustering  1049
    Critical  04.01.2012 13:59:40  Microsoft-Windows-FailoverClustering  1135      Node Mgr
        Cluster node 'VSH2' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a
        Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected
        such as hubs, switches, or bridges.
    Error     04.01.2012 13:06:52  Microsoft-Windows-FailoverClustering  1205
    Error     04.01.2012 12:00:38  Microsoft-Windows-FailoverClustering  1205
    Error     04.01.2012 12:00:38  Microsoft-Windows-FailoverClustering  1069
    Error     04.01.2012 12:00:37  Microsoft-Windows-FailoverClustering  1049
    Error     04.01.2012 12:00:25  Microsoft-Windows-FailoverClustering  1069
    Error     04.01.2012 12:00:25  Microsoft-Windows-FailoverClustering  1049
    Critical  04.01.2012 11:54:36  Microsoft-Windows-FailoverClustering  1146      Resource Control Manager
        The cluster resource host subsystem (RHS) stopped unexpectedly. An attempt will be made to restart it. This is usually due to a problem in a resource DLL. Please determine which resource DLL is causing the issue and report the problem to the resource
        vendor.
    Error     04.01.2012 11:54:36  Microsoft-Windows-FailoverClustering  1230      Resource Control Manager
        Cluster resource 'SCVMM AP Configuration' (resource type '', DLL 'vmclusres.dll') either crashed or deadlocked. The Resource Hosting Subsystem (RHS) process will now attempt to terminate, and the resource will be marked to run in a separate monitor.
    Critical  04.01.2012 11:49:35  Microsoft-Windows-FailoverClustering  1146
    Error     04.01.2012 11:49:35  Microsoft-Windows-FailoverClustering  1230
    Critical  04.01.2012 11:44:33  Microsoft-Windows-FailoverClustering  1146
    Error     04.01.2012 11:34:23  Microsoft-Windows-FailoverClustering  1205
    Error     04.01.2012 11:34:23  Microsoft-Windows-FailoverClustering  1069
    Error     04.01.2012 11:34:23  Microsoft-Windows-FailoverClustering  1049
    Error     04.01.2012 11:24:22  Microsoft-Windows-FailoverClustering  1069
    Critical  04.01.2012 11:24:22  Microsoft-Windows-FailoverClustering  1564      File Share Witness Resource
        File share witness resource 'File Share Witness' failed to arbitrate for the file share '\\sm\QuorumFolder_Do_Not_Delete'. Please ensure that file share '\\sm\QuorumFolder_Do_Not_Delete' exists and is accessible by the cluster.
    Error     04.01.2012 11:24:21  Microsoft-Windows-FailoverClustering  1069
    Warning   04.01.2012 11:24:21  Microsoft-Windows-FailoverClustering  1562      File Share Witness Resource
        File share witness resource 'File Share Witness' failed a periodic health check on file share '\\sm\QuorumFolder_Do_Not_Delete'. Please ensure that file share '\\sm\QuorumFolder_Do_Not_Delete' exists and is accessible by the cluster.
    Error     04.01.2012 11:23:39  Microsoft-Windows-FailoverClustering  1069      Resource Control Manager
        Cluster resource 'SCVMM vs2005Bent' in clustered service or application 'SCVMM vs2005Bent Resources' failed.
    Error     04.01.2012 11:19:39  Microsoft-Windows-FailoverClustering  1069
    Error     04.01.2012 11:19:19  Microsoft-Windows-FailoverClustering  1069
    Critical  04.01.2012 11:05:36  Microsoft-Windows-FailoverClustering  1135      Node Mgr
        Cluster node 'VSH2' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a
        Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected
        such as hubs, switches, or bridges.
    Critical  04.01.2012 10:48:51  Microsoft-Windows-FailoverClustering  1135
    Critical  04.01.2012 10:41:47  Microsoft-Windows-FailoverClustering  1135
    Error     04.01.2012 10:33:51  Microsoft-Windows-FailoverClustering  1205
    Error     04.01.2012 10:33:51  Microsoft-Windows-FailoverClustering  1069
    Error     04.01.2012 10:33:51  Microsoft-Windows-FailoverClustering  1049
    Error     04.01.2012 08:33:10  Microsoft-Windows-FailoverClustering  1069
    Error     04.01.2012 08:33:10  Microsoft-Windows-FailoverClustering  1049
    Bent Tranberg
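
    For anyone sifting through a similar export, here is a minimal Python sketch that tallies a tab-delimited Event Viewer export per level and event ID, so the dominant errors (here the duplicate-IP 1049 and the resulting 1069 failures) stand out. The column order (Level, Date and Time, Source, Event ID, Task Category, Message) is an assumption based on the listing above, and the function name `summarize_events` and the sample rows are purely illustrative. Rows with an empty message are counted as-is, and the last message seen for each event ID is remembered, which matches how the listing above was shortened.

```python
import csv
import io
from collections import Counter

SOURCE = "Microsoft-Windows-FailoverClustering"


def summarize_events(tsv_text):
    """Count events per (level, event ID) in a tab-delimited export.

    Assumed columns: Level, Date and Time, Source, Event ID,
    Task Category, Message. Returns (counts, last_message), where
    last_message maps an event ID to the most recent non-empty message.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the header row
    counts = Counter()
    last_message = {}
    for row in reader:
        if len(row) < 4:
            continue  # not an event row
        row = row + [""] * (6 - len(row))  # pad abbreviated rows
        level, event_id, message = row[0], row[3], row[5]
        if message:
            last_message[event_id] = message
        counts[(level, event_id)] += 1
    return counts, last_message


# Hypothetical sample built from a few rows of the listing above.
SAMPLE_ROWS = [
    ("Level", "Date and Time", "Source", "Event ID", "Task Category", "Message"),
    ("Error", "05.01.2012 10:25:15", SOURCE, "1069", "Resource Control Manager",
     "Cluster resource 'Cluster IP Address' in clustered service or "
     "application 'Cluster Group' failed."),
    ("Error", "05.01.2012 10:25:15", SOURCE, "1049", "IP Address Resource",
     "Cluster IP address resource 'Cluster IP Address' cannot be brought "
     "online because a duplicate IP address '10.10.1.16' was detected on "
     "the network."),
    ("Error", "05.01.2012 10:25:07", SOURCE, "1069", "", ""),
    ("Error", "05.01.2012 09:24:43", SOURCE, "1049", "", ""),
]
SAMPLE = "\n".join("\t".join(r) for r in SAMPLE_ROWS)

counts, messages = summarize_events(SAMPLE)
for (level, event_id), n in counts.most_common():
    print(level, event_id, n, messages.get(event_id, ""))
```

    Keying the counts on both level and event ID keeps a Critical 1135 separate from the Error entries, which makes it easier to see which failures are root causes and which are follow-on noise.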
