"Optimize" a CSV volume

I have a Hyper-V 2012 cluster with 3 nodes. I use a CSV volume to store the VMs. I have all the latest patches installed. I have an EqualLogic SAN (PS4000) with the latest firmware on it providing the LUN for the CSV. Everything in my environment is
supposed to support re-thinning (or unmapping, or whatever the right term is) the LUN. I have about 500 GB of unused space on the 2TB volume, and the volume was thin provisioned. A restore of some very large vhd files from backup caused the
thin provisioned volume to grow and to use almost the entire volume at one point but the corrupt VHDs have since been deleted. Now I have "dirty" blocks in the LUN that I want to reclaim into free space on the SAN. This all happens, apparently,
when Server 2012 performs an "Optimize" on the disks. In my environment this is scheduled to happen once a week. It did apparently do something this last week, because my volume utilization on the SAN went from 96% to 91%. Not even
close to reclaiming all dirty blocks, but it's a start, I guess. So I went into the "Defragment and Optimize Drives" utility and told it to start a manual optimization. Nothing happens, and Event Viewer gives me this error:
The volume VMStorage1 (C:\ClusterStorage\Volume1) was not optimized because an error was encountered: CSVFS failed operation as volume is not in redirected mode. (0x8007174F)
So my questions are these:
Shouldn't it put the CSV in redirect mode if it needs to do this in order to optimize the drive automatically?
If it can't do this automatically, how did it return 5% of the CSV SAN volume to free space last week?
Can I put the volume in redirected mode manually and run the optimize manually? Redirected mode is not supposed to be necessary for CSV in 2012 any more, at least not for backup. Why here?
Will my environment re-thin, unmap, or whatever the term is? It appears it MIGHT. Does it take several iterations (i.e., weeks)?
Can anyone explain this incredibly vague and cloaked process from a Windows server 2012 perspective?
Thank you for any help!
DML
DLovitt
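
As a rough check on the numbers in the question, the arithmetic can be worked through in a few lines of Python. This is only a sketch under two assumptions that the post does not state: that 2 TB means 2048 GB, and that the SAN's utilization percentage is measured against the full volume size.

```python
# Worked arithmetic for the figures quoted in the question.
# Assumptions (not stated in the post): 2 TB is treated as 2048 GB, and the
# SAN utilization percentage is relative to the full volume size.

VOLUME_GB = 2048          # 2 TB thin-provisioned volume
UNUSED_IN_NTFS_GB = 500   # free space reported inside the file system


def gb_at(percent: float) -> float:
    """Convert a SAN utilization percentage into allocated gigabytes."""
    return VOLUME_GB * percent / 100


reclaimed_gb = gb_at(96) - gb_at(91)                 # freed by last week's run
used_by_files_gb = VOLUME_GB - UNUSED_IN_NTFS_GB     # what the files actually need
still_reclaimable_gb = gb_at(91) - used_by_files_gb  # allocated but no longer used

print(f"Reclaimed last week:    {reclaimed_gb:.1f} GB")
print(f"Still allocated on SAN: {gb_at(91):.1f} GB")
print(f"Used by files:          {used_by_files_gb} GB")
print(f"Not yet reclaimed:      {still_reclaimable_gb:.1f} GB")
```

Under those assumptions the weekly run returned roughly 100 GB, and a bit over 300 GB of allocated-but-unused blocks would still be left on the SAN.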

Hi,
In CSVv2.0, every effort was made to expand the number of scenarios that would use Direct I/O over Redirected I/O.
Direct I/O delivers faster performance with lower network overhead. Emphasis is on using Direct I/O for all types of file open actions.
Direct I/O uses buffered reads and writes which means it can take advantage of the Windows Cache Manager. As an example, Direct I/O results in better virtual machine creation times and improved copy performance. In CSVv1.0, to get the highest performance
during a copy operation, the destination node had to be the Coordinator node for the destination CSV volume.
CSVv2.0 uses a new algorithm to determine which types of I/O are redirected.
Oplocks are used as a distributed locking mechanism to determine if I/O can go via a direct path.
All of the optimizations and performance improvements are for naught if the file system cannot remain available for the applications to use.
The new file system check and repair capability goes a long way towards ensuring file system availability.
The new file system health-checking model coupled with new functionality in chkdsk helps in this area.
In addition, CSV volumes can take advantage of these new capabilities.
Thanks.
Kevin Ni

Similar Messages

Unable to delete the files from CSV volumes on HyperV Cluster

Hello There,
I have a Hyper-V failover cluster with CSV volumes. Recently I moved some of the VMs to another cluster.
The VMs have moved, but their files are still on the CSV volumes and occupy disk space. I tried to delete the VHD files and VM folders of the moved VMs, but the files are not deleted. Please suggest a fix.
When I browse to a file from a server and delete it, it disappears, but when I revisit the folder the files are still on the disk. I also tried deleting the files directly
from the server at the command line, since it is running Server Core.
Regards,
Maqsood
Maqsood Mohammed Senior Systems Engineer MCITP-Enterprise Admin & ITILv3 Foundation Certified

Hyper-V is good about not allowing you to delete files while they are still in use. You can try rebooting the host, and make sure all parts of your VM moved to the new location. If that VHD is associated with a VM on any host, you will not be able to delete
it. Delete the VMs that may have links to it. Not knowing your configuration, could it be a parent disk? Be careful: if you can't delete it, it's likely in use. I've also seen a VHD merge start after a VM is deleted, which prevents you from deleting the files.
You may just want to wait a day or so and see if it frees up. If it is doing a merge, a reboot will pause and restart it, so you won't be able to remove the files until the merge is done. Once you delete a VM and a merge starts, there is no direct way to tell that it's merging;
watch the size and timestamp of the VHD. If they are changing, something is using it.
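
The "watch the size and timestamp" advice can be automated with a small polling script. The following is a minimal Python sketch, not a Hyper-V tool: the function names and the polling interval are made up for illustration, and it only compares what the file system reports for the file between two samples.

```python
import os
import tempfile
import time
from typing import NamedTuple


class FileState(NamedTuple):
    """Size and modification time of a file at one point in time."""
    size: int
    mtime_ns: int


def snapshot(path: str) -> FileState:
    """Read the current size and last-modified timestamp of a file."""
    st = os.stat(path)
    return FileState(st.st_size, st.st_mtime_ns)


def has_changed(before: FileState, after: FileState) -> bool:
    """True if either the size or the timestamp differs between two samples."""
    return before != after


def watch(path: str, interval_s: float = 60.0, samples: int = 5) -> bool:
    """Poll the file; return True as soon as a change is seen, else False."""
    previous = snapshot(path)
    for _ in range(samples):
        time.sleep(interval_s)
        current = snapshot(path)
        if has_changed(previous, current):
            return True
        previous = current
    return False


if __name__ == "__main__":
    # Demonstration on a temporary file standing in for a VHD.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 1024)
    try:
        before = snapshot(tmp.name)
        with open(tmp.name, "ab") as fh:
            fh.write(b"y" * 1024)   # simulate something writing to the file
        print("changed:", has_changed(before, snapshot(tmp.name)))
    finally:
        os.remove(tmp.name)
```

Pointing `watch` at the VHD path for a few minutes gives a yes/no answer; a result of False only means no change was observed during the sampling window.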

Attach CSV volumes from filter driver during system startup

Hi,
We have written a filter driver to track Hyper-V CSV volumes in order to track the modifications in those volumes for backup purpose. When the Hyper-V host is running, we are able to attach the CSV volumes from the driver without any issues. But during Hyper-V
host startup, our driver failed to attach the csv volumes.
We suspect that the filter driver failed to attach to the CSV volumes because the Cluster service had not yet started at that point during system startup. If we attach to the CSV volume later, it works. However, for continuous tracking we want our driver
to track modifications from system startup onward. I believe we need to load the filter driver after the CSV service has started.
The filter driver's configuration (INF file) is as follows:
DisplayName = %ServiceName%
Description = %ServiceDescription%
ServiceBinary = %12%\%DriverName%.sys ;%windir%\system32\drivers\
Dependencies = FltMgr
ServiceType = 2 ;SERVICE_FILE_SYSTEM_DRIVER
StartType = 2 ;SERVICE_AUTO_START
ErrorControl = 1 ;SERVICE_ERROR_NORMAL
LoadOrderGroup = "FSFilter Activity Monitor"
AddReg = Minispy.AddRegistry
;Instances specific information.
DefaultInstance = "Minispy - Top Instance"
Instance1.Name = "Minispy - Middle Instance"
Instance1.Altitude = "370000"
Instance1.Flags = 0x1 ; Suppress automatic attachments
Instance2.Name = "Minispy - Bottom Instance"
Instance2.Altitude = "361000"
Instance2.Flags = 0x1 ; Suppress automatic attachments
Instance3.Name = "Minispy - Top Instance"
Instance3.Altitude = "385100"
Instance3.Flags = 0x1 ; Suppress automatic attachment
Can you please let us know how to fix this issue? Do we need to change anything in the INF file?


Error 2927 when creating VM from template to remote SMB storage or CSV volume.

Hi everybody! When I try to create a VM from a template, I get error 2927. This occurs only when I create the VM on remote SMB storage or a CSV volume; if I create the VM from the template on a local disk of the Hyper-V server (C:\ for example),
there is no problem and everything goes well. I have been unable to solve this for a week, and I have tried many methods without success. Please help me solve this problem. OS: 2012 R2 with the latest updates; VMM 2012 R2 with the latest updates.
Thx!

Yes I have installed UR2 for VMM and also applied the T-SQL script, I created new test environment, added Hyper-V hosts in my environment.

Strange behavior in the occupation of a 10 TB CSV volume...

Hi all;
I have a CSV volume (10 TB in size) that hosts VMs from two clustered Hyper-V machines. When I sum the actual sizes of the VMs, the OS reports 4.1 TB used, but Failover Cluster Manager reports only 150 GB free!
As you know, a CSV appears only as a mount point in Windows Explorer. How can I find out what is filling the space?
Thanks

Hi,
I suggest listing every file with its size to check the result. Please try:
Get-ChildItem C:\ClusterStorage\VolumeX -Recurse -Force | Select-Object FullName, Length | Export-Csv C:\xxx.csv -NoTypeInformation
Some files may not be listed if the account running the command does not have permission on some files or folders. This is a common cause of missing disk space (though usually not such a big difference), but it is still worth a try.
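
The same idea, including the permission caveat, can be sketched in Python: walk the tree, total the file sizes, and keep a separate list of anything that could not be read, since that is exactly the space such a listing cannot account for. This is an illustrative sketch; `scan_tree` is a hypothetical helper name, and `VolumeX` is the same placeholder used in the command above.

```python
import os


def scan_tree(root):
    """Walk root recursively.

    Returns (total_bytes, files, unreadable) where files is a list of
    (path, size) tuples and unreadable lists paths that raised an error.
    """
    files = []
    unreadable = []

    def on_error(err):
        # os.walk calls this with the OSError raised while listing a directory.
        unreadable.append(err.filename)

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_error):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(full)
            except OSError:
                unreadable.append(full)
                continue
            files.append((full, size))

    total = sum(size for _, size in files)
    return total, files, unreadable


if __name__ == "__main__":
    total, files, unreadable = scan_tree(r"C:\ClusterStorage\VolumeX")
    print(f"{len(files)} files, {total / 1024**3:.1f} GiB visible")
    for path in unreadable:
        print("could not read:", path)
```

If the visible total is far below what the volume reports as used, the unreadable paths are the first place to look.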

CSV Volume low on disk space warning alltough the space isn't used (VSS?)

HI Guys,
We are running a Hyper-V 2012 cluster. Most of the VMs are backed up by DPM 2012 R2.
Recently I noticed some "CSV low disk space" warnings within SCOM.
Today I looked at one of the CSVs.
The size is 1800 GB, and the folders on the volume add up to 1250 GB in Explorer.
After that I looked at Failover Cluster Manager -> Storage -> Disk: this console states that there are only 36 GB free on the volume.
My guess is that the VSS snapshots use up the disk space? But if I run vssadmin list shadowstorage,
only drive C is displayed.
Could this end up as a problem, or is the VSS space freed automatically?
Or is there something else wrong?
Does anyone have a clue about this?
regards
Stefan

First thing is that you're running out of space and need to move quickly to make space while you figure this out. When you run out of space, the VMs will shut down.
How many VMs do you have on that CSV, and how many VHD(X) disks are on it? Another question that matters here: how fast is your backup for each VM?
Regardless of the backup software being DPM or any other VSS based backup, it works by creating a checkpoint of the VM, which creates a snapshot of its disks. A volume snapshot turns the VM disk into a read only disk and starts a change log disk (.avhdx)
file. The crash consistent read-only .vhdx file is then copied by the backup software agent from the host to the backup server. Upon successful completion, the agent deletes the VM checkpoint, which merges the disk snapshot, merging the .avhdx and .vhdx files
back into one file.
So, if your backup of this VM takes hours, the VM's disk space usage can grow a lot more during that time than it usually would. If you run out of disk space then, the backup will fail, the VM will be suspended, and you cannot bring it online,
since the merge needs even more disk space.
Sam Boutros, Senior Consultant, Software Logic, KOP, PA http://superwidgets.wordpress.com

Losing Access to Cluster Shared Volumes: Cluster Shared Volume 'Volume1' ('CSV Disk1') has entered a paused state because of '(c0000435)'

Hi,
Just built a Server 2012 R2 Hyper-V failover cluster connected to Equallogic 4110 storage arrays with latest firmware and HIT kits.
When creating a clone or vm from a template we see that the cluster loses access to the storage csv volume that is hosted on the equallogic storage with the following errors:
Cluster Shared Volume 'Volume1' ('CSV Disk1') has entered a paused state because of '(c0000435)'. All I/O will temporarily be queued until a path to the volume is reestablished.
Can anyone shed any light onto this issue?
Full details below:
Log Name: System
Source: Microsoft-Windows-FailoverClustering
Date: 06/08/2014 09:31:17
Event ID: 5120
Task Category: Cluster Shared Volume
Level: Error
Keywords:
User: SYSTEM
Computer: SVR1
Description:
Cluster Shared Volume 'Volume1' ('CSV Disk1') has entered a paused state because of '(c0000435)'. All I/O will temporarily be queued until a path to the volume is reestablished.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-Windows-FailoverClustering" Guid="{BAF908EA-3421-4CA9-9B84-6689B8C6F85F}" />
<EventID>5120</EventID>
<Version>0</Version>
<Level>2</Level>
<Task>38</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000000</Keywords>
<TimeCreated SystemTime="2014-08-06T08:31:17.330643100Z" />
<EventRecordID>36230</EventRecordID>
<Correlation />
<Execution ProcessID="2336" ThreadID="3524" />
<Channel>System</Channel>
<Computer>SVR1</Computer>
<Security UserID="S-1-5-18" />
</System>
<EventData>
<Data Name="VolumeName">Volume1</Data>
<Data Name="ResourceName">CSV Disk1</Data>
<Data Name="ErrorCode">(c0000435)</Data>
</EventData>
</Event>
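
When many of these events have to be compared, the fields can be pulled out of the Event XML programmatically. The sketch below is one way to do it in Python with the standard library; `parse_event` is a hypothetical helper, and the sample is a trimmed copy of the event above rather than a full record.

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML, copied from the xmlns attribute above.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed copy of the event shown above, kept short for illustration.
SAMPLE = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<EventID>5120</EventID>
<Computer>SVR1</Computer>
</System>
<EventData>
<Data Name="VolumeName">Volume1</Data>
<Data Name="ResourceName">CSV Disk1</Data>
<Data Name="ErrorCode">(c0000435)</Data>
</EventData>
</Event>"""


def parse_event(xml_text):
    """Return the event ID, computer name and all EventData fields as a dict."""
    root = ET.fromstring(xml_text)
    fields = {
        "EventID": int(root.findtext("e:System/e:EventID", namespaces=NS)),
        "Computer": root.findtext("e:System/e:Computer", namespaces=NS),
    }
    for data in root.findall("e:EventData/e:Data", NS):
        # The error code is logged wrapped in parentheses; strip them.
        fields[data.get("Name")] = (data.text or "").strip("()")
    return fields


if __name__ == "__main__":
    print(parse_event(SAMPLE))
```

Grouping the resulting dictionaries by `ErrorCode` and `VolumeName` makes it easier to see whether every pause carries the same status code.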
Microsoft Partner

Hi rEMOTE_eVENT,
Could you tell us how you clone a VM (you wrote “When creating a clone or vm from a template”)? Does your cluster pass the cluster validation test? Do the copied VMs have the same
BIOSGUID information? Please try building the failover cluster from a normally installed system rather than a cloned one.
More information:
How to use uniquely identify a virtual machine in Hyper-V
http://blogs.technet.com/b/jhoward/archive/2008/09/16/how-to-use-uniquely-identify-a-virtual-machine-in-hyper-v.aspx
The similar thread:
How to Clone VMs in Hyper-V
http://social.technet.microsoft.com/Forums/windowsserver/en-US/67c4c555-14fd-4164-bf5b-59ce883c8b18/how-to-clone-vms-in-hyperv?forum=winserverhyperv
I’m glad to be of help to you!

Windows 2012 Nodes - Slow CSV Performance - Need help to resolve my iSCSI issue configuration

I spent weeks going over the forums and the net for any publications and advice on how to optimize iSCSI connections and i'm about to give up. I really need some help in determining if its something i'm not configuring right or maybe its an equipment
issue.
Hardware:
2x Windows 2012 Hosts with 10 Nics (same NIC configuration) in a Failover Cluster sharing a CSV LUN.
3x NICs Teamed for Host/Live Migration (192.168.0.x)
2x NICS teamed for Hyper-V Switch 1 (192.168.0.x)
1x NIC teamed for Hyper-V Switch 2 (192.168.10.x)
4x NICs for iSCSI traffic (192.168.0.x, 192.168.10.x, 192.168.20.x 192.168.30.x)
Jumbo frames and flow control turned on all the NICs on the host. IpV6 disabled. Client for Microsoft Network, File/Printing Sharing Disabled on iSCSI NICs.
MPIO Least Queue selected. Round Robin gives me an error message saying "The parameter is incorrect. The round robin policy attempts to evenly distribute incoming requests to all processing paths. "
Netgear ReadyNas 3200
4x NICs for iSCSI traffic (192.168.0.x, 192.168.10.x, 192.168.20.x, 192.168.30.x)
Network Hardware:
Cisco 2960S managed switch - Flow control on, Spanning Tree on, Jumbo Frames at 9k - this is for the .0 subnet
Netgear unmanaged switch - Flow control on, Jumbo Frames at 9k - this is for .10 subnet
Netgear unmanaged switch - Flow control on, Jumbo Frames at 9k - this is for .20 subnet
Netgear unmanaged switch - Flow control on, Jumbo Frames at 9k - this is for .30 subnet
Host Configuration (things I tried turning on and off):
Autotuning
RSS
Chimney Offload
I have 8 VMs stored in the CSV. When I try to start all 8 at the same time, they bog down. Each VM loads very slowly and, when they eventually come up, most of the important services have not started. I have to load
them up 1 or 2 at a time. Even then the performance is nothing like if they were loading up on the Host itself (VHD stored on the host's hdd). This is what prompted me to add in more iSCSI connections to see if I can improve the VM's
performance.  Even with 4 iSCSI connections, I feel nothing has changed. The VMs still start up slowly and services do not load right. If I distribute the load with 4 VMs on Host 1 and 4 VMs on Host 2, the load up
times do not change.
As a manual test of file copy speed, I moved the cluster resources to Host 1 and copied a VM from the CSV onto the host. The speed would start out around 250 megs/sec and then eventually drop to about 50-60 megs/sec. If I turn
off all iSCSI connections except one, I get the same speed. I can verify from the Performance tab in Task Manager that all the NICs are distributing traffic evenly, but something is limiting the flow. As I said above,
I played around with autotuning, RSS, and chimney offload, and none of it makes a difference.
The VMs have been converted to VHDx and to fixed size. That did not help.
Is there something I'm not doing right? I am working with Netgear support and they are puzzled as well. The ReadyNas device should easily be able to handle it.
Please help! I pulled my hair out over this for the past two months and I'm about to give up and just ditch clustering all together and just run the VMs off the hosts themselves.
George

A few things...
1. For starters, I recommend opening a case with Microsoft support. They will be able to dig in and help you...
2. Turn on the CSV Cache; it will boost your performance:
http://blogs.msdn.com/b/clustering/archive/2012/03/22/10286676.aspx
3. A file copy bears no resemblance to the unbuffered I/O a VM does... so don't use that as a comparison, as you are comparing apples to oranges.
4. Do you see any I/O performance difference between the coordinator node and the non-coordinator nodes? Basically, see which node owns the cluster Physical Disk resource... measure the performance. Then move the Physical Disk resource for the
CSV volume to another node, and repeat the same measurement... then compare them.
5. Your IP addressing seems odd... you show multiple networks on 192.168.0.x and also on 192.168.10.x. Remember that clustering only recognizes and uses 1 logical interface per IP subnet. I would triple-check all your IP schemes...
to ensure they are all different logical networks.
6. Check your binding order.
7. Make sure your NIC drivers and NIC firmware are updated.
8. Make sure you don't have IPsec enabled; that will significantly impact your network performance.
9. For the iSCSI Software Initiator, when you made your connection... make sure you didn't do a 'Quick Connect'... that will do a wildcard and connect over any network. You want to specify your dedicated iSCSI network.
10. No idea what the performance capabilities of the ReadyNAS are... this could all be associated with the shared storage.
11. What speed NICs are you using? I hope at least 10 Gb...
Hope that helps...
Elden
Hi Elden,
2. The CSV Cache is turned on; I have 4 GB dedicated to it from each host. With IOmeter running within the VMs, I do see the read speed jump 4-5x, but the write speed stays the same (which, according to the doc, it should). But even with the read
speed that high, the VMs are not starting up quickly.
4. I do not see any difference in I/O between coordinator and non-coordinator nodes.
5. I'm not 100% sure what you're saying about my IPs. Maybe if I list them out, you can help explain further.
Host 1 - 192.168.0.241 (Host/LM IP), Undefined IP on the 192.168.0.x network (Hyper-V Port 1), Undefined IP on the 192.168.10.x network (Hyper- V port 2), 192.168.0.220 (iSCSI 1), 192.168.10.10 (iSCSI2), 192.168.20.10(iSCSI 3), 192.168.30.10 (iSCSI 4)
The Hyper-V ports are undefined because the VMs themselves have static ips.
0.220 host NIC connects with the .231 NIC of the NAS
10.10 host NIC connects with the 10.100 NIC of the NAS
20.10 host NIC connects with the 20.100 NIC of the NAS
30.10 host NIC connects with the 30.100 NIC of the NAS
Host 2 - 192.168.0.245 (Host/LM IP), Undefined IP on the 192.168.0.x network (Hyper-V Port 1), Undefined IP on the 192.168.10.x network (Hyper- V port 2), 192.168.0.221 (iSCSI 1), 192.168.10.20 (iSCSI2), 192.168.20.20(iSCSI 3), 192.168.30.20 (iSCSI 4)
The Hyper-V ports are undefined because the VMs themselves have static ips.
0.221 host NIC connects with the .231 NIC of the NAS
10.20 host NIC connects with the 10.100 NIC of the NAS
20.20 host NIC connects with the 20.100 NIC of the NAS
30.20 host NIC connects with the 30.100 NIC of the NAS
6. Binding orders are all correct.
7. Nic drivers are all updated. Didn't check the firmware.
8. I do not know about IPSec...let me look into it.
9. I did not do quick connect, each iscsi connection is defined using a specific source ip and specific target ip.
These are all 1gigabit nics, which is the reason why I have so many NICs...otherwise there would be no reason for me to have 4 iscsi connections.
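
Elden's point about one logical interface per IP subnet can be checked mechanically. The sketch below groups Host 1's addresses from the list above by subnet and reports any subnet that carries more than one interface. The /24 prefix is an assumption (no subnet masks are given in the thread), and `shared_subnets` is a hypothetical helper; the Hyper-V switch ports are left out because they have no host IP.

```python
import ipaddress
from collections import defaultdict

# Host 1 addresses as listed in the post. The /24 prefix length used below
# is an assumption; the thread does not state the subnet masks.
HOST1 = {
    "Host/LM team": "192.168.0.241",
    "iSCSI 1": "192.168.0.220",
    "iSCSI 2": "192.168.10.10",
    "iSCSI 3": "192.168.20.10",
    "iSCSI 4": "192.168.30.10",
}


def shared_subnets(nics, prefix=24):
    """Return {subnet: [nic names]} for every subnet used by more than one NIC."""
    by_net = defaultdict(list)
    for name, addr in nics.items():
        net = ipaddress.ip_interface(f"{addr}/{prefix}").network
        by_net[str(net)].append(name)
    return {net: names for net, names in by_net.items() if len(names) > 1}


if __name__ == "__main__":
    for net, names in shared_subnets(HOST1).items():
        print(f"{net} is shared by: {', '.join(names)}")
```

With a /24 mask, the host/live-migration team and the first iSCSI NIC both land in 192.168.0.0/24, which is the overlap Elden is pointing at.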

Windows Server 2012 - Hyper-V - Cluster Shared Storage - VHDX unexpectedly gets copied to System Volume Information by "System", Virtual Machines stop responding

We have a problem with one of our deployments of Windows Server 2012 Hyper-V with a 2 node cluster connected to a iSCSI SAN.
Our setup:
Hosts - Both run Windows Server 2012 Standard and are clustered.
HP ProLiant G7, 24 GB RAM. This is the primary host and normally all VMs run on this host.
HP ProLiant G5, 20 GB RAM. This is the secondary host and is intended to be used in case of failure of the primary host.
We have no antivirus on the hosts and the scheduled ShadowCopy (previous version of files) is switched off.
iSCSI SAN:
QNAP NAS TS-869 Pro, 8 INTEL SSDSA2CW160G3 160 GB in a RAID 5 with a hot spare. 2 teamed NICs.
Switch:
DLINK DGS-1210-16 - Both the network cards of the Hosts that are dedicated to the Storage and the Storage itself are connected to the same switch and nothing else is connected to this switch.
Virtual Machines:
3 Windows Server 2012 Standard - 1 DC, 1 FileServer, 1 Application Server.
1 Windows Server 2008 Standard Exchange Server.
All VMs are using dynamic disks (as recommended by Microsoft).
Updates
We applied the most recent updates to the hosts, VMs and iSCSI SAN about 3 weeks ago with no change in our problem, and we continually update the setup.
Normal operation:
Normally this setup works just fine and we see no real difference in speed in startup, file copy and processing speed in LoB applications of this setup compared to a single host with two 10000 RPM Disks. Normal network speed is 10-200 Mbit, but occasionally
we see speeds up to 400 Mbit/s of combined read/write for instance during file repair.
Our Problem:
Our problem is that for some reason a random VHDX gets copied to System Volume Information by "System" on the Cluster Shared Volume (i.e. C:\ClusterStorage\Volume1\System Volume Information).
All VMs stop responding or respond very slowly during this copy process; for instance, you cannot send CTRL-ALT-DEL to a VM in the Hyper-V console, or start Task Manager when already logged in.
This happens at random, not every day, and different VHDX files from different VMs get copied each time. Sometimes it happens during the daytime, which causes a lot of problems, especially when a 200 GB file gets copied (which takes a lot of time).
What it is not:
We thought that this was connected to the backup, but the backup had finished 3 hours before the last time this happened, and the backup never uses any of the files in System Volume Information, so it is not the backup.
An observation:
When this happened today I switched on ShadowCopy (previous versions of files) and set it to use only 320 MB of storage; the copy process then stopped and the virtual machines started responding again. This could be unrelated, since there is no way to see
how much of the VHDX is left to be copied, so it might have finished at the same time as I enabled ShadowCopy (previous versions of files).
Our question:
Why is a VHDX copied to System Volume Information when scheduled ShadowCopy (previous versions of files) is switched off? As far as I know, nothing should be copied to this folder when this function is switched off.
List of VSS Writers:
vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001-2012 Microsoft Corp.
Writer name: 'Task Scheduler Writer'
   Writer Id: {d61d61c8-d73a-4eee-8cdd-f6f9786b7124}
   Writer Instance Id: {1bddd48e-5052-49db-9b07-b96f96727e6b}
   State: [1] Stable
   Last error: No error
Writer name: 'VSS Metadata Store Writer'
   Writer Id: {75dfb225-e2e4-4d39-9ac9-ffaff65ddf06}
   Writer Instance Id: {088e7a7d-09a8-4cc6-a609-ad90e75ddc93}
   State: [1] Stable
   Last error: No error
Writer name: 'Performance Counters Writer'
   Writer Id: {0bada1de-01a9-4625-8278-69e735f39dd2}
   Writer Instance Id: {f0086dda-9efc-47c5-8eb6-a944c3d09381}
   State: [1] Stable
   Last error: No error
Writer name: 'System Writer'
   Writer Id: {e8132975-6f93-4464-a53e-1050253ae220}
   Writer Instance Id: {7848396d-00b1-47cd-8ba9-769b7ce402d2}
   State: [1] Stable
   Last error: No error
Writer name: 'Microsoft Hyper-V VSS Writer'
   Writer Id: {66841cd4-6ded-4f4b-8f17-fd23f8ddc3de}
   Writer Instance Id: {8b6c534a-18dd-4fff-b14e-1d4aebd1db74}
   State: [5] Waiting for completion
   Last error: No error
Writer name: 'Cluster Shared Volume VSS Writer'
   Writer Id: {1072ae1c-e5a7-4ea1-9e4a-6f7964656570}
   Writer Instance Id: {d46c6a69-8b4a-4307-afcf-ca3611c7f680}
   State: [1] Stable
   Last error: No error
Writer name: 'ASR Writer'
   Writer Id: {be000cbe-11fe-4426-9c58-531aa6355fc4}
   Writer Instance Id: {fc530484-71db-48c3-af5f-ef398070373e}
   State: [1] Stable
   Last error: No error
Writer name: 'WMI Writer'
   Writer Id: {a6ad56c2-b509-4e6c-bb19-49d8f43532f0}
   Writer Instance Id: {3792e26e-c0d0-4901-b799-2e8d9ffe2085}
   State: [1] Stable
   Last error: No error
Writer name: 'Registry Writer'
   Writer Id: {afbab4a2-367d-4d15-a586-71dbb18f8485}
   Writer Instance Id: {6ea65f92-e3fd-4a23-9e5f-b23de43bc756}
   State: [1] Stable
   Last error: No error
Writer name: 'BITS Writer'
   Writer Id: {4969d978-be47-48b0-b100-f328f07ac1e0}
   Writer Instance Id: {71dc7876-2089-472c-8fed-4b8862037528}
   State: [1] Stable
   Last error: No error
Writer name: 'Shadow Copy Optimization Writer'
   Writer Id: {4dc3bdd4-ab48-4d07-adb0-3bee2926fd7f}
   Writer Instance Id: {cb0c7fd8-1f5c-41bb-b2cc-82fabbdc466e}
   State: [1] Stable
   Last error: No error
Writer name: 'Cluster Database'
   Writer Id: {41e12264-35d8-479b-8e5c-9b23d1dad37e}
   Writer Instance Id: {23320f7e-f165-409d-8456-5d7d8fbaefed}
   State: [1] Stable
   Last error: No error
Writer name: 'COM+ REGDB Writer'
   Writer Id: {542da469-d3e1-473c-9f4f-7847f01fc64f}
   Writer Instance Id: {f23d0208-e569-48b0-ad30-1addb1a044af}
   State: [1] Stable
   Last error: No error
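
A listing this long is easy to misread, so here is a small Python sketch that scans `vssadmin list writers` output and reports every writer whose state is not "[1] Stable". The regular expression is written against the layout shown above and is an assumption about the format, not a documented contract; the sample holds two writers copied from the listing.

```python
import re

# Matches one writer entry: the name line, then the first State line after it.
WRITER_RE = re.compile(
    r"Writer name: '(?P<name>[^']+)'.*?State: \[(?P<code>\d+)\] (?P<state>[^\r\n]+)",
    re.DOTALL,
)

# Two entries copied from the listing above.
SAMPLE = """Writer name: 'System Writer'
   Writer Id: {e8132975-6f93-4464-a53e-1050253ae220}
   Writer Instance Id: {7848396d-00b1-47cd-8ba9-769b7ce402d2}
   State: [1] Stable
   Last error: No error
Writer name: 'Microsoft Hyper-V VSS Writer'
   Writer Id: {66841cd4-6ded-4f4b-8f17-fd23f8ddc3de}
   Writer Instance Id: {8b6c534a-18dd-4fff-b14e-1d4aebd1db74}
   State: [5] Waiting for completion
   Last error: No error
"""


def unstable_writers(text):
    """Return (name, state_code, state_text) for every writer not in state 1."""
    return [
        (m["name"], int(m["code"]), m["state"].strip())
        for m in WRITER_RE.finditer(text)
        if int(m["code"]) != 1
    ]


if __name__ == "__main__":
    for name, code, state in unstable_writers(SAMPLE):
        print(f"{name}: [{code}] {state}")
```

Applied to the full listing above, the only writer that is not stable is the Microsoft Hyper-V VSS Writer, in state [5] Waiting for completion. That may mean a snapshot or backup was in progress when the list was captured, but treat that reading as an assumption.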
Please note:
Please only answer our question and do not offer general optimization tips that do not directly address the issue! We want the problem to go away, not to finish a bit faster!

Hallo Lawrence!
Thank you for your reply. Some comments to help you and others who read this thread:
First of all, we use Windows Server 2012 and VHDX, as I wrote in the headline and in the text of my post. We have not had this problem in similar setups with Windows Server 2008 R2, so the problem seems to have been introduced in Windows Server 2012.
These posts that you refer to seem to be outdated and/or do not apply to our configuration:
The post about Dynamic Disks:
http://technet.microsoft.com/en-us/library/ee941151(v=WS.10).aspx is only a recommendation for Windows Server 2008 R2 and the VHD format. Dynamic VHDX is indeed recommended by Microsoft when using Windows Server 2012 (please look in the optimization guide
for Windows Server 2012).
In fact, if we used fixed VHDX we would have a bigger problem, since fixed VHDX files are generally larger than dynamic disks, i.e. more data would be copied and that would take longer = the VMs would be unresponsive for a longer time.
The post "What's the deal with the System Volume Information folder"
http://blogs.msdn.com/b/oldnewthing/archive/2003/11/20/55764.aspx is for Windows XP / Windows Server 2003, and some things have changed since then. For instance, in Windows Server 2012, Shadow Copies cannot be controlled by going to Control Panel -> System.
Instead you right-click on a Drive (i.e. a Volume, for instance the C drive/Volume) in Computer and then click "Configure Shadow Copies".
Windows Server 2008 R2 Backup problem
http://social.technet.microsoft.com/Forums/en/windowsbackup/thread/0fc53adb-477d-425b-8c99-ad006e132336 - This post is about the Antivirus software trying to scan files used during backup that exists in the System Volume Information folder and we do not
have any antivirus software installed on our hosts as I stated in my post.
Comment that might help us:
So according to “System Volume Information” definition, the operation you mentioned is Volume Shadow Copy. Check event viewer to find Volume Shadow Copy related event logs and post them.
Why?
Further investigation suggests that a volume shadow copy is somehow created even though the schedule for Shadow Copies is turned off for all drives. This happens at random and we have not found any pattern. Yesterday this operation took almost all available
disk space (over 200 GB), but all the disk space was released when I turned on scheduled Shadow Copies for the CSV.
I therefore draw these conclusions:
The CSV Volume has about 600 GB of disk space and since Volume Shadows Copy used 200 GB, or about 33% of the disk space, and the default limit is 10% then I conclude that for some reason the unscheduled Volume Shadow Copy did not have any limit (or ignored
the limit).
When I turned on the schedule I also changed the limit to the minimum amount, which is 320 MB, and this is probably what released the disk space. That is, the unscheduled Volume Shadow Copy operation was aborted, and it adhered to the limit and deleted the
Volume Shadow Copy it had taken.
I have also set the limit for Volume Shadow Copies for all other volumes to 320 MB by using the "Configure Shadow Copies" Window that you open by right clicking on a drive (volume) in Computer and then selecting "Configure Shadow Copies...".
It is important to note that setting a limit for Shadow Copy storage and disabling the schedule are two different things! It is possible to have unlimited storage for Shadow Copies when the schedule is disabled; however, I do not know if this was the case
before I enabled Shadow Copies on the CSV, since I did not look for this.
I have now defined a Shadow Copy storage limit of 320 MB on all drives, so no VHDX should be copied to System Volume Information, since they are all larger than 320 MB.
Does this sound about right or am I drawing the wrong conclusions?
Limits for Shadow Copies:
Below we list the limits for our two hosts:
"Primary Host":
C:\>vssadmin list shadowstorage
vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001-2012 Microsoft Corp.
Shadow Copy Storage association
   For volume: (\\?\Volume{e3ad7feb-178b-11e2-93e8-806e6f6e6963}\)\\?\Volume{e3ad7feb-178b-11e2-93e8-806e6f6e6963}\
   Shadow Copy Storage volume: (\\?\Volume{e3ad7feb-178b-11e2-93e8-806e6f6e6963}\)\\?\Volume{e3ad7feb-178b-11e2-93e8-806e6f6e6963}\
   Used Shadow Copy Storage space: 0 bytes (0%)
   Allocated Shadow Copy Storage space: 0 bytes (0%)
   Maximum Shadow Copy Storage space: 320 MB (91%)
Shadow Copy Storage association
   For volume: (E:)\\?\Volume{dc0a177b-ab03-44c2-8ff6-499b29c3d5cc}\
   Shadow Copy Storage volume: (E:)\\?\Volume{dc0a177b-ab03-44c2-8ff6-499b29c3d5cc}\
   Used Shadow Copy Storage space: 0 bytes (0%)
   Allocated Shadow Copy Storage space: 0 bytes (0%)
   Maximum Shadow Copy Storage space: 320 MB (0%)
Shadow Copy Storage association
   For volume: (G:)\\?\Volume{f58dc334-17be-11e2-93ee-9c8e991b7c20}\
   Shadow Copy Storage volume: (G:)\\?\Volume{f58dc334-17be-11e2-93ee-9c8e991b7c20}\
   Used Shadow Copy Storage space: 0 bytes (0%)
   Allocated Shadow Copy Storage space: 0 bytes (0%)
   Maximum Shadow Copy Storage space: 320 MB (3%)
Shadow Copy Storage association
   For volume: (C:)\\?\Volume{e3ad7fec-178b-11e2-93e8-806e6f6e6963}\
   Shadow Copy Storage volume: (C:)\\?\Volume{e3ad7fec-178b-11e2-93e8-806e6f6e6963}\
   Used Shadow Copy Storage space: 0 bytes (0%)
   Allocated Shadow Copy Storage space: 0 bytes (0%)
   Maximum Shadow Copy Storage space: 320 MB (0%)
C:\>cd \ClusterStorage\Volume1
Secondary host:
C:\>vssadmin list shadowstorage
vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001-2012 Microsoft Corp.
Shadow Copy Storage association
   For volume: (\\?\Volume{b2951138-f01e-11e1-93e8-806e6f6e6963}\)\\?\Volume{b2951138-f01e-11e1-93e8-806e6f6e6963}\
   Shadow Copy Storage volume: (\\?\Volume{b2951138-f01e-11e1-93e8-806e6f6e6963}\)\\?\Volume{b2951138-f01e-11e1-93e8-806e6f6e6963}\
   Used Shadow Copy Storage space: 0 bytes (0%)
   Allocated Shadow Copy Storage space: 0 bytes (0%)
   Maximum Shadow Copy Storage space: 35,0 MB (10%)
Shadow Copy Storage association
   For volume: (D:)\\?\Volume{5228437e-9a01-4690-bc40-1df85a0e6736}\
   Shadow Copy Storage volume: (D:)\\?\Volume{5228437e-9a01-4690-bc40-1df85a0e6736}\
   Used Shadow Copy Storage space: 0 bytes (0%)
   Allocated Shadow Copy Storage space: 0 bytes (0%)
   Maximum Shadow Copy Storage space: 27,3 GB (10%)
Shadow Copy Storage association
   For volume: (C:)\\?\Volume{b2951139-f01e-11e1-93e8-806e6f6e6963}\
   Shadow Copy Storage volume: (C:)\\?\Volume{b2951139-f01e-11e1-93e8-806e6f6e6963}\
   Used Shadow Copy Storage space: 0 bytes (0%)
   Allocated Shadow Copy Storage space: 0 bytes (0%)
   Maximum Shadow Copy Storage space: 6,80 GB (10%)
C:\>
There is something strange about the limits on the Secondary host!
I have not in any way changed the settings on the Secondary host and, as you can see, the Secondary host has a maximum limit of only 35 MB of storage on the CSV, but it also shows that this is 10% of the volume. This is clearly not the case, since 10% of 600
GB = 60 GB!
The question is, why does it by default set a limit that is too small (i.e. < 320 MB) on the CSV, and is this the cause of the problem? I.e. is the limit ignored since it is smaller than the smallest amount you can provide using the GUI?
Is the default 35 MB maximum Shadow Copy limit a bug, or is there any logical reason for setting a limit that according to the GUI is too small?
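One way to sanity-check those percentages is to work backwards from each "Maximum Shadow Copy Storage space" line to the volume size it implies. The sketch below is only an arithmetic aid in Python. The function name is made up for illustration, and it assumes the percentage is relative to the volume being reported on and that a comma in the number is a decimal separator (as in the Secondary host's output).

```python
import re

# Matches lines such as "Maximum Shadow Copy Storage space: 35,0 MB (10%)".
_MAX_LINE = re.compile(
    r"Maximum Shadow Copy Storage space:\s*([\d.,]+)\s*(MB|GB)\s*\((\d+)%\)"
)
_UNIT_TO_MB = {"MB": 1.0, "GB": 1024.0}


def implied_volume_mb(line):
    """Return the volume size (in MB) implied by a vssadmin 'Maximum' line.

    Returns None when the percentage is shown as 0%, because the rounding
    makes the volume size impossible to infer.
    """
    match = _MAX_LINE.search(line)
    if match is None:
        raise ValueError("not a 'Maximum Shadow Copy Storage space' line")
    # Assumption: a comma is a decimal separator, not a thousands separator.
    size = float(match.group(1).replace(",", "."))
    percent = int(match.group(3))
    if percent == 0:
        return None
    return size * _UNIT_TO_MB[match.group(2)] * 100 / percent


for sample in (
    "Maximum Shadow Copy Storage space: 35,0 MB (10%)",  # Secondary host, no drive letter
    "Maximum Shadow Copy Storage space: 320 MB (91%)",   # Primary host, no drive letter
):
    print(sample, "->", round(implied_volume_mb(sample)), "MB")
```

Both of the entries without a drive letter work out to a volume of roughly 350 MB, not 600 GB. That may mean those entries describe a different, much smaller volume than the CSV, but that is an inference from the arithmetic only and is not confirmed anywhere in this thread.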

File copy speeds to CSV vs non-CSV

I'm working on bringing up a 2012 R2 cluster and doing a basic test. In this cluster, I have two adapters for iSCSI traffic, one for network traffic, and one for the heartbeat. The cluster node has all the current updates on it. Everything
is set up correctly as far as I can see. I'm taking a folder with 1GB of random files in it and copying it from the C: drive of a node to an iSCSI LUN. If I have the LUN set up as a non-CSV disk, the copy happens about three times faster than if
I have it set up as a CSV disk. All I'm doing is using FCM to change the disk from CSV to non-CSV (right-click, Remove from CSV, right-click, Add to CSV). I can swap it back and forth and each time the copy process is about three times slower when
it's a CSV. Am I missing something here? I've been through all the usual stuff with regard to the iSCSI adapters, MPIO, drivers, etc. But I don't think that would have anything to do with this anyway. The disk is accessed the same way with
regard to all that whether it's CSV or not, unless I'm missing something. Right now, I only have a single node configured in the cluster, so it's definitely not anything to do with the CSV being in redirected mode.
I'm not trying to establish any particular transfer speed; I know file transfers are different from actual workloads and performance tools like Iometer when it comes to actual numbers. But it seems to me like the transfers should be close
to the same whether the disk is a CSV or not, since I'm not changing anything else.
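For anyone wanting to repeat this comparison with numbers rather than by eye, here is a minimal Python sketch that times a folder copy and reports the file count alongside the byte count. The function name and any paths you pass in are placeholders. Results will also be affected by the OS file cache, so treat it as a rough comparison tool, not a benchmark.

```python
import shutil
import time
from pathlib import Path


def timed_copy(src_dir, dst_dir):
    """Copy src_dir to dst_dir (dst_dir must not exist yet).

    Returns (elapsed_seconds, total_bytes, file_count) so that runs against
    a CSV and a non-CSV disk can be compared on the same source folder.
    """
    src = Path(src_dir)
    files = [p for p in src.rglob("*") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    start = time.perf_counter()
    shutil.copytree(src, dst_dir)
    elapsed = time.perf_counter() - start
    return elapsed, total_bytes, len(files)
```

Running it once with the destination on the disk as a CSV and once as a non-CSV disk, with the same source folder each time, gives two timings plus the number of files involved, which matters for the metadata discussion in the replies.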

Which system owns the CSV? If the system from which you are copying does not own the CSV then all the metadata updates have to go across the network to be handled by the node that does own the CSV. If you are copying a lot of little
files, there is more metadata.
Actually, metadata updates always happen in redirected IO from what I'm reading, that has been the part that I was missing. This explains it.
https://technet.microsoft.com/en-us/library/jj612868.aspx?f=255&MSPPError=-2147217396 "When certain small changes occur in the file system on a CSV volume, this metadata must be synchronized on each of the physical nodes that access the
LUN, not only on the single coordinator node... These metadata update operations occur in parallel across the cluster networks by using SMB 3.0. "
So a file copy, even when done on the coordinator node, does the metadata updates in redirected mode. Other articles seem to say the same thing, though not always clearly. So it's still accurate to say that a file copy isn't the best way to measure
CSV performance, but not many sources point out the (I think) important distinction regarding how the metadata updates work. From what I can see, that distinction probably outweighs anything else, such as which node is the
coordinator, CSV cache, etc. For me anyway, it makes a 3X performance difference, so I think that's pretty significant.
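To see why per-file metadata work can dominate a copy of many small files, a toy cost model helps: total time is the streaming time plus a fixed cost per file. This is only an illustration in Python. The throughput and per-file overhead below are made-up numbers, not measurements from this cluster.

```python
def estimated_copy_seconds(total_bytes, file_count,
                           throughput_bytes_per_s, per_file_overhead_s):
    """Toy model: time to stream the data plus a fixed metadata cost per file."""
    return total_bytes / throughput_bytes_per_s + file_count * per_file_overhead_s


ONE_GIB = 1024 ** 3
THROUGHPUT = 100 * 1024 ** 2   # hypothetical 100 MiB/s to the LUN

# Hypothetical: same 1 GiB of data as 20,000 small files, with 0 ms vs 1 ms
# of extra metadata latency per file.
without_overhead = estimated_copy_seconds(ONE_GIB, 20000, THROUGHPUT, 0.0)
with_overhead = estimated_copy_seconds(ONE_GIB, 20000, THROUGHPUT, 0.001)
print(without_overhead, with_overhead)
```

With these invented figures the same data takes roughly three times as long once each file carries a millisecond of extra metadata latency, which is the shape of the effect described above. The real per-file cost on a CSV would have to be measured.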

Hyper-V cluster Backup causes virtual machine reboots for common Cluster Shared Volumes members.

I am having a problem where my VMs are rebooting while other VMs that share the same CSV are being backed up. I have provided all the information that I have gathered to this point below. If I have missed anything, please let me know.
My HyperV Cluster configuration:
5 Node Cluster running 2008R2 Core DataCenter w/SP1. All updates as released by WSUS that will install on a Core installation
Each Node has 8 NICs configured as follows:
NIC1 - Management/Campus access (26.x VLAN)
NIC2 - iSCSI dedicated (22.x VLAN)
NIC3 - Live Migration (28.x VLAN)
NIC4 - Heartbeat (20.x VLAN)
NIC5 - VSwitch (26.x VLAN)
NIC6 - VSwitch (18.x VLAN)
NIC7 - VSwitch (27.x VLAN)
NIC8 - VSwitch (22.x VLAN)
The following additional hotfixes were installed on MS guidance (either during the build or when troubleshooting a stability issue in Jan 2013):
KB2531907 - Was installed during original building of cluster
KB2705759 - Installed during troubleshooting in early Jan2013
KB2684681 - Installed during troubleshooting in early Jan2013
KB2685891 - Installed during troubleshooting in early Jan2013
KB2639032 - Installed during troubleshooting in early Jan2013
Original cluster build was two hosts with quorum drive. Initial two hosts were HST1 and HST5
Next host added was HST3, then HST6 and finally HST2.
NOTE: HST4 hardware was used in different project and HST6 will eventually become HST4
Validation of cluster comes with warning for following things:
Updates inconsistent across hosts
  I have tried to manually install "missing" updates and they were not applicable
  Most likely cause is different build times for each machine in cluster
   HST1 and HST5 are both the same level because they were built at same time
   HST3 was not rebuilt from scratch due to time constraints and it actually goes back to Pre-SP1 and has a larger list of updates that others are lacking and hence the inconsistency
   HST6 was built from scratch but has more updates missing than 1 or 5 (10 missing instead of 7)
   HST2 was most recently built and it has the most missing updates (15)
Storage - List Potential Cluster Disks
  It says there are Persistent Reservations on all 14 of my CSV volumes and thinks they are from another cluster.
  They are removed from the validation set for this reason. These iSCSI volumes/disks were all created new for
  this cluster and have never been a part of any other cluster.
When I run the Cluster Validation wizard, I get a slew of Event ID 5120 from FailoverClustering. Wording of error:
  Cluster Shared Volume 'Volume12' ('Cluster Disk 13') is no longer available on this node because of
  'STATUS_MEDIA_WRITE_PROTECTED(c00000a2)'. All I/O will temporarily be queued until a path to the
  volume is reestablished.
Under Storage and Cluster Shared Volumes in Failover Cluster Manager, all disks show online and there is no negative effect from the errors.
Cluster Shared Volumes
We have 14 CSVs that are all iSCSI attached to all 5 hosts. They are housed on an HP P4500G2 (LeftHand) SAN.
I have limited the number of VMs to no more than 7 per CSV as per best practices documentation from HP/Lefthand
VMs in each CSV are spread out amongst all 5 hosts (as you would expect)
Backup software we use is BackupChain from BackupChain.com.
Problem we are having:
When backup kicks off for a VM, all VMs on same CSV reboot without warning. This normally happens within seconds of the backup starting
What we have done to troubleshoot this:
We have tried rebalancing our backups
  Originally, I had backup jobs scheduled to kick off on Friday or Saturday evening after 9pm
  2 or 3 hosts would be backing up VMs (Serially; one VM per host at a time) each night.
  I changed my backup schedule so that, of my 90 VMs, only one per CSV is backing up at the same time
   I mapped out my Hosts and CSVs and scheduled my backups to run on week nights where each night, there
   is only one VM backed up per CSV. All VMs can be backed up over 5 nights (there are some VMs that don't
   get backed up). I also staggered the start times for each Host so that only one Host would be starting
   in the same timeframe. There was some overlap for Hosts that had backups that ran longer than 1 hour.
  Testing this new schedule did not fix my problem. It only made it more clear. As each backup timeframe
  started, whichever CSV the first VM to start was on would have all of their VMs reboot and come back up.
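The "one VM per CSV per night" mapping described above can be sketched in a few lines of Python. This is not the schedule actually used. The function name and VM/CSV names are placeholders, and it ignores hosts and start-time staggering. It only shows the constraint: no night contains two VMs from the same CSV, and anything that does not fit in the available nights is reported as unscheduled.

```python
from collections import defaultdict


def schedule_backups(vm_to_csv, nights=5):
    """Assign VMs to nights so that no night has two VMs from the same CSV.

    vm_to_csv maps a VM name to the CSV it lives on.
    Returns (schedule, unscheduled), where schedule[n] lists the VMs for
    night n and unscheduled lists VMs that did not fit.
    """
    by_csv = defaultdict(list)
    for vm, csv in sorted(vm_to_csv.items()):
        by_csv[csv].append(vm)

    schedule = [[] for _ in range(nights)]
    unscheduled = []
    for csv_index, vms in enumerate(by_csv.values()):
        for i, vm in enumerate(vms):
            if i < nights:
                # Offset by the CSV index so the load spreads across nights.
                schedule[(csv_index + i) % nights].append(vm)
            else:
                unscheduled.append(vm)
    return schedule, unscheduled
```

With 5 nights and up to 7 VMs per CSV, up to two VMs per CSV fall into the unscheduled list under this rule.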
I then thought maybe I was overloading the network still so I decided to disable all of the scheduled backup
and run it manually. Kicking off a backup on a single VM, in most cases, will cause the reboot of common
CSV members.
Ok, maybe there is something wrong with my backup software.
  Downloaded a demo of Veeam and installed it onto my cluster.
  Did a test backup of one VM and I had no problems.
  Did a test backup of a second VM and I had the same problem. All VMs on the same CSV rebooted.
Ok, it is not my backup software. Apparently it is VSS. I have looked through various websites. The best troubleshooting
site I have found for VSS in one place is on BackupChain.com (http://backupchain.com/hyper-v-backup/Troubleshooting.html)
I have tested almost every process on their list and I will lay out the results below:
  1. I have rebooted HST6 and problems still persist
  2. When I run VSSADMIN delete shadows /all, I have no shadows to delete on any of my 5 nodes
   When I run VSSADMIN list writers, I have no error messages on any writers on any node...
  3. When I check the listed registry key, I only have the built-in MS VSS writer listed (I am using software VSS)
  4. When I run the VSSADMIN Resize ShadowStorage command, there is no shadow storage on any node
  5. I have completed the registration and service cycling on HST6 as laid out here and most of the stuff "errors"
   Only a few of the DLL's actually register.
  6. HyperV Integration Services were reconciled when I worked with MS in early January and I have no indication of
   further issue here.
  7. I did not complete the step to delete the Subscriptions because, again, I have no error messages when I list writers
  8. I removed the Veeam software that I had installed to test (it hadn't added any VSS Writer anyway though)
  9. I can't realistically uninstall my HyperV and test VSS
  10. Already have latest SPs and Updates
  11. This is part of step 5 so I already did this. This seems to be a rehash of various other strategies
I have used the VSS Troubleshooter that is part of BackupChain (Ctrl-T) and I get the following error:
  ERROR: Selected writer 'Microsoft Hyper-V VSS Writer' is in failed state!
  - Status: 8 (VSS_WS_FAILED_AT_PREPARE_SNAPSHOT)
  - Writer Failure code: 0x800423f0 (<Unknown error code>)
  - Writer ID: {66841cd4-6ded-4f4b-8f17-fd23f8ddc3de}
  - Instance ID: {d55b6934-1c8d-46ab-a43f-4f997f18dc71}
  VSS snapshot creation failed with result: 8000FFFF
VSS errors in event viewer. Below are representative errors I have received from various Nodes of my cluster:
I have various of the below spread out over all hosts except for HST6
Source: VolSnap, Event ID 10, The shadow copy of volume took too long to install
Source: VolSnap, Event ID 16, The shadow copies of volume x were aborted because volume y, which contains shadow copy storage for this shadow copy, was force dismounted.
Source: VolSnap, Event ID 27, The shadow copies of volume x were aborted during detection because a critical control file could not be opened.
I only have one instance of each of these and both of the below are from HST3
Source: VSS, Event ID 12293, Volume Shadow Copy Service error: Error calling a routine on a Shadow Copy Provider {b5946137-7b9f-4925-af80-51abd60b20d5}. Routine details RevertToSnashot [hr = 0x80042302, A Volume Shadow Copy Service component encountered an unexpected error.]
Source: VSS, Event ID 8193, Volume Shadow Copy Service error: Unexpected error calling routine GetOverlappedResult. hr = 0x80070057, The parameter is incorrect.
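The hex codes in these messages are HRESULT values, and splitting them into their fields makes them easier to search for. The Python sketch below decodes the standard HRESULT layout (failure bit, facility, code). The name table is a small hand-written lookup from my recollection of the Windows SDK headers, not an official list, so verify any name against the headers before relying on it.

```python
# Hand-written lookup; names are assumptions to be checked against the SDK.
KNOWN_HRESULTS = {
    0x8000FFFF: "E_UNEXPECTED",
    0x80070057: "E_INVALIDARG",
    0x80042302: "VSS_E_UNEXPECTED",
    0x800423F0: "VSS_E_WRITERERROR_INCONSISTENTSNAPSHOT",
}


def decode_hresult(hr):
    """Split a 32-bit HRESULT into its failure bit, facility and code."""
    return {
        "failed": bool((hr >> 31) & 0x1),   # bit 31: severity
        "facility": (hr >> 16) & 0x7FF,     # bits 16-26: facility
        "code": hr & 0xFFFF,                # bits 0-15: code
        "name": KNOWN_HRESULTS.get(hr, "unknown"),
    }


for value in (0x800423F0, 0x8000FFFF, 0x80042302, 0x80070057):
    print(hex(value), decode_hresult(value))
```

For example, 0x80070057 has facility 7 (Win32) and code 87, which matches the "The parameter is incorrect" text in event 8193, while 0x800423F0 and 0x80042302 are in facility 4, where the VSS-specific codes live.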
So, basically, everything I have tried has resulted in no success towards solving this problem.
I would appreciate any assistance that can be provided.
Thanks,
Charles J. Palmer
Wright Flood

Tim,
Thanks for the reply. I ran the first two commands and got this:
Name                               Role  Metric
Cluster Network 1                  3     10000
Cluster Network 2 - HeartBeat      1     1300
Cluster Network 3 - iSCSI          0     10100
Cluster Network 4 - LiveMigration  1     1200
When you look at the properties of each network, this is how I have it configured:
Cluster Network 1 - Allow cluster network communications on this network and Allow clients to connect through this network (26.x subnet)
Cluster Network 2 - Allow cluster network communications on this network. New network added while working with Microsoft support last month. (28.x subnet)
Cluster Network 3 - Do not allow cluster network communications on this network. (22.x subnet)
Cluster Network 4 - Allow cluster network communications on this network. Existing but not configured to be used by VMs for Live Migration until MS corrected. (20.x subnet)
Should I modify my metrics further, or are the current values sufficient?
I worked with an MS support rep because my cluster (once I added the 5th host) stopped being able to live migrate VMs and I had VMs host jumping on startup. It was a mess for a couple of days. They had me add the Heartbeat network as part of the solution
to my problem. There doesn't seem to be anywhere to configure a network specifically for CSV so I would assume it would use (based on my metrics above) Cluster Network 4 and then Cluster Network 2 for CSV communications and would fail back to the Cluster Network
1 if both 2 and 4 were down/inaccessible.
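That assumption can be written down as a small Python sketch: take the networks that allow cluster communication (Role 1 or 3), sort them by metric, and the lowest metric comes first. The values are copied from the output above. The selection rule itself is my understanding of how the cluster picks a network for CSV traffic, so treat it as an assumption rather than a statement of how the cluster is implemented.

```python
# Role and Metric values as reported above.
networks = [
    {"name": "Cluster Network 1", "role": 3, "metric": 10000},
    {"name": "Cluster Network 2 - HeartBeat", "role": 1, "metric": 1300},
    {"name": "Cluster Network 3 - iSCSI", "role": 0, "metric": 10100},
    {"name": "Cluster Network 4 - LiveMigration", "role": 1, "metric": 1200},
]


def csv_network_preference(nets):
    """Order cluster-enabled networks by metric, lowest (preferred) first.

    Role 0 means cluster communication is not allowed, so those networks
    are excluded entirely.
    """
    usable = [n for n in nets if n["role"] != 0]
    return [n["name"] for n in sorted(usable, key=lambda n: n["metric"])]


print(csv_network_preference(networks))
```

Under that rule the order is Cluster Network 4, then Cluster Network 2, then Cluster Network 1, with the iSCSI network never used for cluster traffic.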
As to the iSCSI getting a second NIC, I would love to but management wants separation of our VMs by subnet and role and hence why I need the 4 VSwitch NICs. I would have to look at adding an additional quad port NIC to my servers and I would be having to
use half height cards for 2 of my 5 servers for that to work.
But, on that note, it doesn't appear to actually be a bandwidth issue. I can run a backup for a single VM and get nothing on the network card (It caused the reboots before any real data has even started to pass apparently) and still the problem occurs.
As to BackupChain, I have been working with the vendor and they are telling me the issue is with VSS. They also say they support CSVs (see http://backupchain.com/Hyper-V-Backup-Software.html).
Their tech support has been very helpful but, unfortunately, nothing has fixed the problem.
What is annoying is that not every backup causes a problem. I have a daily backup of one of our machines that runs fine without initiating any additional reboots. But almost every other backup job will trigger the VMs on the common CSV to reboot.
I understood about the updates, but I had to "prove" it to the MS tech I was on the phone with, and hence I brought it up. I understand about the storage as well. Why give a warning for something that is working, though... I think it is a poor indicator,
since the report doesn't explain that.
At a loss for what else I can do,
Charles J. Palmer

Access is denied messages in Win2012 R2 Failover Cluster validation report and CSV entering a paused state

I've been having some issues with nodes basically dropping out of the cluster's config.
The error shown was
"Cluster Shared Volume 'Volume1' ('Data') has entered a paused state because of '(c000020c)'. All I/O will temporarily be queued until a path to the volume is reestablished."
All nodes (PowerEdge 420) are connected to a Dell MD3200 shared SAS storage array.
Nodes point to virtual 2012 R2 DCs.
Upon running validation with just two nodes, I get the same errors over and over again.
Bemused!
List Software Updates
Description: List software updates that have been applied on each node.
An error occurred while executing the test.
An error occurred while getting information about the software updates installed on the nodes.
One or more errors occurred.
Creating an instance of the COM component with CLSID {4142DD5D-3472-4370-8641-DE7856431FB0} from the IClassFactory failed due to the following error: 80070005 Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED)).
and
List Disks
Description: List all disks visible to one or more nodes. If a subset of disks is specified for validation, list only disks in the subset.
An error occurred while executing the test.
Storage cannot be validated at this time. Node 'zhyperv2.KISLNET.LOCAL' could not be initialized for validation testing. Possible causes for this are that another validation test is being run from another management client, or a previous validation test was
unexpectedly terminated. If a previous validation test was unexpectedly terminated, the best corrective action is to restart the node and try again.
Access is denied
The event viewer on one of the hosts shows
Cluster node 'zhyperv2' lost communication with cluster node 'zhyperv1'. Network communication was reestablished. This could be due to communication temporarily being blocked by a firewall or connection security policy update. If the problem persists
and network communication are not reestablished, the cluster service on one or more nodes will stop. If that happens, run the Validate a Configuration wizard to check your network configuration. Additionally, check for hardware or software errors related
to the network adapters on this node, and check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.
The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk.
Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected
such as hubs, switches, or bridges.
The only other warning is because the 4 NIC ports in each node server are teamed on one IP address split over two switches. I am not concerned about this and could, if required, split them into pairs. I think this is a red herring?
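The paused-state message above only gives a raw code, (c000020c), without a name. These are NTSTATUS values, and the Python sketch below splits one into severity, facility and code. The name for c00000a2 is the one that appears in event 5120 messages. The name given for c000020c is from my recollection of ntstatus.h and should be verified there before drawing conclusions from it.

```python
SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}

# Second entry is an assumption; check it against ntstatus.h.
KNOWN_NTSTATUS = {
    0xC00000A2: "STATUS_MEDIA_WRITE_PROTECTED",
    0xC000020C: "STATUS_CONNECTION_DISCONNECTED",
}


def decode_ntstatus(status):
    """Split a 32-bit NTSTATUS into severity, facility and code."""
    return {
        "severity": SEVERITY[(status >> 30) & 0x3],  # bits 30-31
        "facility": (status >> 16) & 0xFFF,          # bits 16-27
        "code": status & 0xFFFF,                     # bits 0-15
        "name": KNOWN_NTSTATUS.get(status, "unknown"),
    }


print(decode_ntstatus(0xC000020C))
```

If the name is right, it points at a dropped connection between nodes rather than at the disk itself, which would be consistent with the "lost communication with cluster node" events, but that reading is an inference.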

Hi,
Such events can happen for the following reasons:
1- Client for Microsoft Networks and File and Printer Sharing for Microsoft Networks are not enabled on all network interfaces. Check this KB article (http://support.microsoft.com/kb/2008795) and please make sure these two protocols are enabled on all cluster networks.
2- A network connectivity issue can cause this event as well. Please make sure the network cabling/cards/switches are correctly configured and working as expected.
3- A connectivity issue with the storage can also cause this event. Please make sure all the nodes are connected to storage. Check HBA/cabling connectivity to the SAN. Make sure that the SAN drivers are up to date.
4- Antivirus may interrupt network communication and cause this failure. Please exclude CSV volumes from being scanned by AV: http://social.technet.microsoft.com/wiki/contents/articles/953.microsoft-anti-virus-exclusion-list.aspx
5- Disable TCP Chimney related settings on all cluster nodes. http://support.microsoft.com/kb/951037
6- Please check the Network Binding Order (http://social.technet.microsoft.com/Forums/windowsserver/en-US/2535c73a-a347-4152-be7a-ea7b24159520/hyperv-r2-csv-cluster-recommended-binding-order?forum=windowsserver2008r2highavailability)
7- Check the firewall rules for all inbound and outbound cluster and Hyper-V traffic, for all profiles
8- Update NIC Driver/Firmware.
9- Check Compatibility of the NIC with Windows Server 2012R2
10- Set-NetAdapterRss - Resources and Tools for IT Professionals | TechNet : http://technet.microsoft.com/en-us/library/jj130863.aspx
11- Check the Following Article http://social.technet.microsoft.com/Forums/windowsserver/en-US/e06fede9-931c-4dee-8379-4fd985e20f0a/hypervvmswitch-eventid-106
12- General Updates to be applied on the nodes :
Windows RT 8.1, Windows 8.1, and Windows Server 2012 R2 update rollup: November 2013 : http://support.microsoft.com/kb/2887595
Windows 8.1 and Windows Server 2012 R2 General Availability Update Rollup :
http://support.microsoft.com/kb/2883200
Hope this helps.

Cluster shared volume disappear... STATUS_MEDIA_WRITE_PROTECTED(c00000a2)

Hi all, I am having an issue hopefully someone can help me with. I have recently inherited a 2 node cluster; both nodes are one half of an ASUS RS702D-E6/PS8, so both nodes should be near identical. They are both running Hyper-V Server 2008 R2, hosting some 14 VMs.
Each node is hooked up via Cat5e to a Promise VessRAID 1830i via iSCSI, each using one of the server's onboard NICs, whose cluster network is set up as Disabled for cluster use (the way I think it is supposed to be, not the way I had originally inherited it), on its own Class A subnet and on its own
private physical switch...
The SAN hosts a 30GB CSV Witness Disk and 2 2TB CSV Volumes, one for each node, labeled Volume1 and Volume2. Some VHDs on each.
The cluster clients connect to the rest of the company via the virtual external NIC adapters created in Hyper-V Manager, which physically are on Intel ET Dual Gigabit adapters wired into our main core switch, which is set up with Class C subnets.
I also have a crossover cable wired up between the other ports on the Intel ET Dual Port NICs, using yet a third (Class B) subnet, which is configured in Failover Cluster Manager as internal, so there are 3 IPv4 cluster networks total.
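Since the layout relies on three separate subnets, one quick check is that none of them overlap. The Python sketch below uses the standard ipaddress module for that. The CIDR ranges in the example are placeholders, because the actual addresses are not given in this post.

```python
import ipaddress
from itertools import combinations


def overlapping_networks(named_cidrs):
    """Return pairs of network names whose address ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in named_cidrs.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    ]


# Hypothetical addressing for the three cluster networks described above.
example = {
    "iSCSI": "10.0.0.0/8",
    "internal": "172.16.0.0/16",
    "external": "192.168.1.0/24",
}
print(overlapping_networks(example))
```

An empty list means every cluster network sits in its own range. Any pair that is returned would be worth a closer look.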
Even though the cluster passes the validation tests with flying colors, I am not convinced all is well. With hyperv1 (node 1), I can move the CSVs and machines over to hyperv2 (node 2), stop the cluster service on node 1 and perform maintenance such as a reboot or installing patches if needed. When it reboots or I restart the cluster service to bring it back online,
it is well behaved, leaving hyperv2 the owner of all 3 disks: Witness, Volume 1 and Volume 2. I can then pass them back or split them up any which way and at no point is the cluster service interrupted or noticed by users. Duh, I know this is how it is SUPPOSED to work,
but...
If I try the same thing with node 2, that is, move the witness and volumes to node 1 as owner and migrate all VMs over, stop the cluster service on node 2, do whatever I have to do and reboot, then as soon as node 2 tries to go back online, it tries to snatch Volume 2 back, but it never succeeds, and then the following error is logged in the cluster event log:
Hyperv1
Event ID: 5120
Source: Microsoft-Windows-FailoverClustering
Task Category: Cluster Shared Volume
The listed message is:
Cluster Shared Volume 'Volume2' ('HyperV1 Disk') is no longer available on this node because of 'STATUS_MEDIA_WRITE_PROTECTED(c00000a2)'. All I/O will temporarily be queued until a path to the volume is reestablished.
Followed 4 seconds later by:
Hyperv1
Event ID: 1069
Source: Microsoft-Windows-FailoverClustering
Task Category: Resource Control Manager
Message: Cluster resource 'Hyperv1 Disk' in clustered service or application '75d88aa3-8ecf-47c7-98e7-6099e56a097d' failed.
- AND -
2 of the following:
Hyperv1
Event ID: 1038
Source: Microsoft-Windows-FailoverClustering
Task Category: Physical Disk Resource
Message: Ownership of cluster disk 'HyperV1 Disk' has been unexpectedly lost by this node. Run the Validate a Configuration wizard to check your storage configuration.
Followed 1 second later by another 1069 and then various "machines are failing" messages.
If you browse to \\hyperv-1\c$\clusterstorage\ or \\hyperv-2\c$\Clusterstorage\, Volume 2 is indeed missing!!
This has caused me to panic a few times, as the first time I saw this I thought everything was lost, but I can get it back by stopping the service on node 1 or shutting it down, restarting node 2 or the service on node 2, and waiting forever for the disk to list as failed; shortly thereafter it comes back online. I can then boot node 1 back up and let it start servicing the cluster again. It doesn’t pull the same
craziness node 2 does when it comes online; it leaves all ownership with node 2 unless I tell it to move.
I am very new to clusters, and all I know at this point is that this is pretty cool stuff. Basically, "if it is running, don’t mess with it" is the attitude I have taken with it, but there is a significant amount of money tied up in this hardware and we should be able to leverage it as needed, not wonder if it is going to act up again.
To me it seems that a ‘failover’ cluster should be way more robust than this...
I can go into way more detail if needed, but I didn’t see any other posts on this specific issue no matter what forum I scoured. I’m obviously looking for advice on how to get this resolved, as well as advice on whether or not I wired the cluster networks correctly. I am also not sure anymore which protocols are bound to which NICs and what the binding order should be; could this be what is causing my issue?
I have NVSPBIND and NVSPSCRUB on both boxes if needed.
Thanks!
-LW

Hello Ravikumar,
Thanks for your attention!
All disks are Online (see the status of the disks below), but the problem continues. Any ideas?
PS.: For your information, all disks are presented to the hosts via SAN/HBA, and all Cluster Validation tests passed.
PS C:\Users\hyperv_admin> Get-ClusterSharedVolume
Name                 State   Node
hyperv-04_vol1_fc    Online  vmserver27
hyperv-04_vol2_fc    Online  vmserver26
hyperv-04_vol3_sata  Online  vmserver25
hyperv-04_vol4_sata  Online  vmserver27
See below the patches applied on my hosts:
KB2263829 http://support.microsoft.com/?kbid=2263829
KB2425227 http://support.microsoft.com/?kbid=2425227
KB2484033 http://support.microsoft.com/?kbid=2484033
KB2488113 http://support.microsoft.com/?kbid=2488113
KB2492386 http://support.microsoft.com/?kbid=2492386
KB2494016 http://support.microsoft.com/?kbid=2494016
KB2494162 http://support.microsoft.com/?kbid=2494162
KB2505438 http://support.microsoft.com/?kbid=2505438
KB2506014 http://support.microsoft.com/?kbid=2506014
KB2506212 http://support.microsoft.com/?kbid=2506212
KB2506928 http://support.microsoft.com/?kbid=2506928
KB2507618 http://support.microsoft.com/?kbid=2507618
KB2509553 http://support.microsoft.com/?kbid=2509553
KB2510531 http://support.microsoft.com/?kbid=2510531
KB2511250 http://support.microsoft.com/?kbid=2511250
KB2511455 http://support.microsoft.com/?kbid=2511455
KB2512715 http://support.microsoft.com/?kbid=2512715
KB2515325 http://support.microsoft.com/?kbid=2515325
KB2518869 http://support.microsoft.com/?kbid=2518869
KB2520235 http://support.microsoft.com/?kbid=2520235
KB2522422 http://support.microsoft.com/?kbid=2522422
KB2525835 http://support.microsoft.com/?kbid=2525835
KB2529073 http://support.microsoft.com/?kbid=2529073
KB2531907 http://support.microsoft.com/?kbid=2531907
KB2533552 http://support.microsoft.com/?kbid=2533552
KB2536275 http://support.microsoft.com/?kbid=2536275
KB2536276 http://support.microsoft.com/?kbid=2536276
KB2541014 http://support.microsoft.com/?kbid=2541014
KB2544521 http://support.microsoft.com/?kbid=2544521
KB2544893 http://support.microsoft.com/?kbid=2544893
KB2545698 http://support.microsoft.com/?kbid=2545698
KB2547666 http://support.microsoft.com/?kbid=2547666
KB2550886 http://support.microsoft.com/?kbid=2550886
KB2552040 http://support.microsoft.com/?kbid=2552040
KB2552343 http://support.microsoft.com/?kbid=2552343
KB2556532 http://support.microsoft.com/?kbid=2556532
KB2560656 http://support.microsoft.com/?kbid=2560656
KB2563227 http://support.microsoft.com/?kbid=2563227
KB2564958 http://support.microsoft.com/?kbid=2564958
KB2567680 http://support.microsoft.com/?kbid=2567680
KB2570947 http://support.microsoft.com/?kbid=2570947
KB2572077 http://support.microsoft.com/?kbid=2572077
KB2584146 http://support.microsoft.com/?kbid=2584146
KB2585542 http://support.microsoft.com/?kbid=2585542
KB2588516 http://support.microsoft.com/?kbid=2588516
KB2598845 http://support.microsoft.com/?kbid=2598845
KB2603229 http://support.microsoft.com/?kbid=2603229
KB2607047 http://support.microsoft.com/?kbid=2607047
KB2608658 http://support.microsoft.com/?kbid=2608658
KB2618451 http://support.microsoft.com/?kbid=2618451
KB2620704 http://support.microsoft.com/?kbid=2620704
KB2620712 http://support.microsoft.com/?kbid=2620712
KB2621440 http://support.microsoft.com/?kbid=2621440
KB2631813 http://support.microsoft.com/?kbid=2631813
KB2632503 http://support.microsoft.com/?kbid=2632503
KB2633873 http://support.microsoft.com/?kbid=2633873
KB2633952 http://support.microsoft.com/?kbid=2633952
KB2636573 http://support.microsoft.com/?kbid=2636573
KB2639308 http://support.microsoft.com/?kbid=2639308
KB2640148 http://support.microsoft.com/?kbid=2640148
KB2641653 http://support.microsoft.com/?kbid=2641653
KB2641690 http://support.microsoft.com/?kbid=2641690
KB2643719 http://support.microsoft.com/?kbid=2643719
KB2644615 http://support.microsoft.com/?kbid=2644615
KB2645640 http://support.microsoft.com/?kbid=2645640
KB2647516 http://support.microsoft.com/?kbid=2647516
KB2647518 http://support.microsoft.com/?kbid=2647518
KB2654428 http://support.microsoft.com/?kbid=2654428
KB2656356 http://support.microsoft.com/?kbid=2656356
KB2660075 http://support.microsoft.com/?kbid=2660075
KB2665364 http://support.microsoft.com/?kbid=2665364
KB2667402 http://support.microsoft.com/?kbid=2667402
KB976902 http://support.microsoft.com/?kbid=976902
KB982018 http://support.microsoft.com/?kbid=982018
Thanks
Ricardo

SOFS, CSV available space changes daily, strange behaviour

Hi, I have an SOFS cluster which is working nicely, with around 120 VMs running from 8 CSV volumes.
But I have noticed that the disk usage goes up and down during the day; one CSV had only 5% free when I checked this morning.
Overnight it drops back to 20-30% free. If I do a volume refresh during the day, I can see the available space dropping by 1 or 2 GB every hour or so.
I thought this was AVHD snapshots growing, but I have checked and that is not causing the issue. I also thought it could be DPM, but no backups are running.
Could it be the memory of the VMs using extra disk space in the VM config?
I haven't noticed this behaviour before, and I am pretty interested in what's going on.
Cheers
Mark

Can only bring some CSVs online on one specific host

Hi.
I have a three node Hyper-V (2012 R2) cluster.
It has 12 CSVs connected. Two of the CSVs can only be brought online on one of the hosts. They seem to be locked to that host in some way.
The end result is that when I take that host down for maintenance, the two CSVs go offline and all VMs on them crash.
I suspect a particular VM of causing this but I don't know how to fix the problem.
Anyone?
Thanks in advance.
/Michael

What size are the CSVs, and is the storage connected via iSCSI, Fibre Channel or SMB?
If Fibre Channel or iSCSI, check that the SAN is configured to allow all three hosts to connect to the storage group and that each host has access to all LUNs from a SAN point of view. If iSCSI, also check the initiator and MPIO settings on all three hosts.
With regard to the Access Denied error, have a scan through this link, as it may be of help, and check whether the VM worker process (VMMS) has the relevant permissions on the CSV volume.
http://blogs.technet.com/b/askcore/archive/2014/10/01/virtual-machine-checkpoint-fails-with-access-denied-when-running-on-a-clustered-shared-volume.aspx
If possible, I think I would attempt to move the VMs off the storage, then remove and re-present it, or increase the size of the other CSVs. The number of CSVs is down to design, the connections the SAN can support and various other things. Although 12 is a lot of CSVs for three nodes, as mentioned above it should work, and there are many clusters with a lot more.
I know it's a pain, but I would really try to run the validation wizard to check that Windows updates match, etc. Also, have you checked all firmware updates for the HBAs, NICs, etc.?
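If the validation wizard can't be run straight away, a rough offline comparison of installed updates is possible. The following is only a minimal Python sketch, assuming each node's update list has been exported as plain text (for example from `Get-HotFix` output); the node names, the sample KB lists and the helper names `parse_kbs` and `missing_per_node` are hypothetical and only there for illustration. It is not a substitute for cluster validation.

```python
import re


def parse_kbs(text):
    """Return the set of KB identifiers (e.g. 'KB2511455') found in free text."""
    return {m.upper() for m in re.findall(r"\bKB\d+\b", text, flags=re.IGNORECASE)}


def missing_per_node(hotfixes_by_node):
    """Map each node to the sorted KBs that at least one other node has but it lacks."""
    all_kbs = set().union(*hotfixes_by_node.values())
    return {node: sorted(all_kbs - kbs) for node, kbs in hotfixes_by_node.items()}


# Hypothetical node names and sample update lists, for illustration only.
nodes = {
    "node1": parse_kbs("KB2511455 KB2512715 KB976902"),
    "node2": parse_kbs("KB2511455 KB976902"),
    "node3": parse_kbs("kb2511455 KB2512715 KB976902"),
}

report = missing_per_node(nodes)
for node, missing in report.items():
    print(node, "missing:", ", ".join(missing) or "none")
```

Any node that reports a non-empty list is a candidate for patching before re-testing CSV failover.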
Kind Regards
Michael Coutanche
Note: Posts are provided “AS IS” without warranty of any kind, either expressed or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.

