DPM 2010 fails to back up virtual guests on a Hyper-V Cluster Shared Volume using hardware VSS provider

We have been battling this problem for 2 years now and have raised calls with both Microsoft and EMC, neither of which has resulted in a resolution.
We have an 8-node CSV with about 40 virtual servers on it.  The Hyper-V hosts are Windows 2008 R2 SP1 servers with the Hyper-V role installed, the SAN is a CLARiiON CX3-10c, and we are using the EMC 4.7.1 hardware provider for snapshots. The problem
we have is that snapshots are not always getting created on the SAN, and the recovery points in DPM fail.  It often takes several attempts at re-running the job before it succeeds.
The MaxParallelBackups registry key is set to 1 on the DPM server, so we are running jobs serially on each node, and we have aligned all the VMs so that the Hyper-V owner of each VM is also the owner of the CSV that holds that VM's resources. We have done
this to avoid any ownership changes and so reduce the risk of failures.  This has had some success, and it is not always the same servers that fail, which gives this problem an unfortunate degree of intermittency!
There is no DataSourceGroups.xml anymore; I tried implementing one previously (its effectiveness was unproven, though). We sometimes get a few days' grace where backups run with no problems, but the longest run of success we have had is about 2-3
weeks. That was our most recent run, and its cessation has led me to post this on the forum. Since Wednesday last week we have had regular failures, with between 9 and 12 failures each night.  The annoying part of this is that we hadn't made any changes to
either DPM or the cluster, which makes no real sense.
One thing I could do is split the Protection Groups so all the VMs on Hyper-V host1 are in one group, all the ones on Hyper-V host2 in another, and so forth. However, if anyone could advise me whether this will help before I do it, that would be much
appreciated, as I don't want to do this unless I have to in case I hit capacity problems (most VMs are in one large protection group at present).
Any help would be much appreciated (I have logs I can attach if anyone would like to see a sample of when the failures occur).
Thanks
Chris
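
The two ideas in the question (keeping each VM on the node that owns its CSV, and splitting protection groups per host) can be checked on paper before changing anything in DPM. The following is only a rough sketch: the VM names, host names, volume names and both dictionaries are hypothetical placeholders, not data from this cluster, and nothing here talks to DPM or the cluster itself.

```python
from collections import defaultdict

# Hypothetical inventory: VM name -> (Hyper-V owner node, CSV holding its files).
vm_inventory = {
    "vm01": ("host1", "Volume1"),
    "vm02": ("host1", "Volume1"),
    "vm03": ("host2", "Volume2"),
    "vm04": ("host3", "Volume2"),  # deliberately misaligned for the example
}

# Hypothetical owner (coordinator) node for each CSV.
csv_owner = {"Volume1": "host1", "Volume2": "host2"}


def misaligned_vms(inventory, owners):
    """Return VMs whose owner node differs from the owner of their CSV."""
    return sorted(
        vm for vm, (node, csv) in inventory.items() if owners.get(csv) != node
    )


def groups_by_host(inventory):
    """Group VM names by owner node: one candidate protection group per host."""
    groups = defaultdict(list)
    for vm, (node, _csv) in sorted(inventory.items()):
        groups[node].append(vm)
    return dict(groups)


print(misaligned_vms(vm_inventory, csv_owner))
print(groups_by_host(vm_inventory))
```

Filling the two dictionaries from the real cluster would show how many VMs each per-host protection group would contain, which may help judge the capacity concern before committing to the split.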

Hi,
I'm not on the Windows team, so I don't know about the inner workings of the fix or what to expect.  However, you can try using diskshadow.exe to delete all the snapshots for the given volume and see if that clears them up.
DISKSHADOW> delete shadows /?
DELETE SHADOWS { ALL | VOLUME <volume> | OLDEST <volume> | SET <setID> | ID <shadowID> | EXPOSED <drive letter, mountPoint or share> }
        Delete shadow copies, both persistent and non-persistent.
        ALL                     All shadow copies.
        VOLUME <volume>         Delete all shadow copies of the given volume.
        OLDEST <volume>         Delete the oldest shadow copy of the given volume.
        SET <setID>             Delete the shadow copies in the shadow copy set specified by the setId parameter.
        ID <shadowID>           Delete the shadow copy specified by the shadowId parameter.
        EXPOSED <exposeName>    Delete the shadow copy that is exposed at the specified drive letter, mount point or share.
        Examples: DELETE SHADOWS ALL
                  DELETE SHADOWS EXPOSED p:
                  DELETE SHADOWS EXPOSED ShareName
So something like:  DELETE SHADOWS VOLUME E:   or  DELETE SHADOWS ALL
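
If the cleanup has to be repeated on several nodes, the same commands can be put into a DiskShadow script file instead of being typed interactively. A minimal sketch follows, assuming the E: volume from the example above; the helper name and the cleanup.txt file name are made up for illustration. It only builds the script text and does not run DiskShadow.

```python
def build_diskshadow_script(volume=None):
    """Return DiskShadow script text that lists, then deletes, shadow copies.

    volume: a drive letter such as "E:" to limit the delete to one volume,
    or None to delete every shadow copy on the machine.
    """
    lines = ["list shadows all"]  # record what exists before deleting anything
    if volume is None:
        lines.append("delete shadows all")
    else:
        lines.append(f"delete shadows volume {volume}")
    return "\n".join(lines) + "\n"


script = build_diskshadow_script("E:")
print(script, end="")
```

Saving the printed text to a file (for example cleanup.txt) and running `diskshadow /s cleanup.txt` from an elevated prompt should execute the same commands non-interactively. Deleting shadow copies is destructive, so check the listing first.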
Regards, Mike J. [MSFT]

Similar Messages

  • Hyper-V cluster Backup causes virtual machine reboots for common Cluster Shared Volumes members.

    I am having a problem where my VMs are rebooting while other VMs that share the same CSV are being backed up. I have provided all the information that I have gathered to this point below. If I have missed anything, please let me know.
    My HyperV Cluster configuration:
    5 Node Cluster running 2008R2 Core DataCenter w/SP1. All updates as released by WSUS that will install on a Core installation
    Each Node has 8 NICs configured as follows:
     NIC1 - Management/Campus access (26.x VLAN)
     NIC2 - iSCSI dedicated (22.x VLAN)
     NIC3 - Live Migration (28.x VLAN)
     NIC4 - Heartbeat (20.x VLAN)
     NIC5 - VSwitch (26.x VLAN)
     NIC6 - VSwitch (18.x VLAN)
     NIC7 - VSwitch (27.x VLAN)
     NIC8 - VSwitch (22.x VLAN)
    The following hotfixes were additionally installed on MS guidance (either during the build or when troubleshooting a stability issue in Jan 2013):
     KB2531907 - Was installed during original building of cluster
     KB2705759 - Installed during troubleshooting in early Jan2013
     KB2684681 - Installed during troubleshooting in early Jan2013
     KB2685891 - Installed during troubleshooting in early Jan2013
     KB2639032 - Installed during troubleshooting in early Jan2013
    Original cluster build was two hosts with quorum drive. Initial two hosts were HST1 and HST5
    Next host added was HST3, then HST6 and finally HST2.
    NOTE: HST4 hardware was used in different project and HST6 will eventually become HST4
    Validation of the cluster comes back with warnings for the following things:
     Updates inconsistent across hosts
      I have tried to manually install "missing" updates and they were not applicable
      Most likely cause is different build times for each machine in cluster
       HST1 and HST5 are both the same level because they were built at same time
       HST3 was not rebuilt from scratch due to time constraints and it actually goes back to Pre-SP1 and has a larger list of updates that others are lacking and hence the inconsistency
       HST6 was built from scratch but has more updates missing than 1 or 5 (10 missing instead of 7)
       HST2 was most recently built and it has the most missing updates (15)
     Storage - List Potential Cluster Disks
      It says there are Persistent Reservations on all 14 of my CSV volumes and thinks they are from another cluster.
      They are removed from the validation set for this reason. These iSCSI volumes/disks were all created new for
      this cluster and have never been a part of any other cluster.
     When I run the Cluster Validation wizard, I get a slew of Event ID 5120 from FailoverClustering. Wording of error:
      Cluster Shared Volume 'Volume12' ('Cluster Disk 13') is no longer available on this node because of
      'STATUS_MEDIA_WRITE_PROTECTED(c00000a2)'. All I/O will temporarily be queued until a path to the
      volume is reestablished.
     Under Storage and Cluster Shared Volumes in Failover Cluster Manager, all disks show online and there is no negative effect of the errors.
    Cluster Shared Volumes
     We have 14 CSVs that are all iSCSI attached to all 5 hosts. They are housed on an HP P4500G2 (LeftHand) SAN.
     I have limited the number of VMs to no more than 7 per CSV as per best practices documentation from HP/Lefthand
     VMs in each CSV are spread out amongst all 5 hosts (as you would expect)
    Backup software we use is BackupChain from BackupChain.com.
    Problem we are having:
     When backup kicks off for a VM, all VMs on same CSV reboot without warning. This normally happens within seconds of the backup starting
    What we have done to troubleshoot this:
     We have tried rebalancing our backups
      Originally, I had backup jobs scheduled to kick off on Friday or Saturday evening after 9pm
      2 or 3 hosts would be backing up VMs (Serially; one VM per host at a time) each night.
      I changed my backup schedule so that of my 90 VMs, only one per CSV is backing up at the same time
       I mapped out my Hosts and CSVs and scheduled my backups to run on week nights where each night, there
       is only one VM backed up per CSV. All VMs can be backed up over 5 nights (there are some VMs that don't
       get backed up). I also staggered the start times for each Host so that only one Host would be starting
       in the same timeframe. There was some overlap for Hosts that had backups that ran longer than 1 hour.
      Testing this new schedule did not fix my problem. It only made it more clear. As each backup timeframe
      started, whichever CSV the first VM to start was on would have all of their VMs reboot and come back up.
     I then thought maybe I was still overloading the network, so I decided to disable all of the scheduled backups
     and run them manually. Kicking off a backup on a single VM will, in most cases, cause the reboot of common
     CSV members.
     Ok, maybe there is something wrong with my backup software.
      Downloaded a Demo of Veeam and installed it onto my cluster.
      Did a test backup of one VM and I had no problems.
      Did a test backup of a second VM and I had the same problem. All VMs on same CSV rebooted
     Ok, it is not my backup software. Apparently it is VSS. I have looked through various websites. The best troubleshooting
     site I have found for VSS in one place is on BackupChain.com (http://backupchain.com/hyper-v-backup/Troubleshooting.html)
     I have tested almost every process on their list and I will lay out results below:
      1. I have rebooted HST6 and problems still persist
      2. When I run VSSADMIN delete shadows /all, I have no shadows to delete on any of my 5 nodes
       When I run VSSADMIN list writers, I have no error messages on any writers on any node...
      3. When I check the listed registry key, I only have the built-in MS VSS writer listed (I am using software VSS)
      4. When I run the VSSADMIN Resize ShadowStorage command, there is no shadow storage on any node
      5. I have completed the registration and service cycling on HST6 as laid out here and most of the stuff "errors"
       Only a few of the DLL's actually register.
      6. HyperV Integration Services were reconciled when I worked with MS in early January and I have no indication of
       further issue here.
      7. I did not complete the step to delete the Subscriptions because, again, I have no error messages when I list writers
      8. I removed the Veeam software that I had installed to test (it hadn't added any VSS Writer anyway though)
      9. I can't realistically uninstall my HyperV and test VSS
      10. Already have latest SPs and Updates
      11. This is part of step 5 so I already did this. This seems to be a rehash of various other strategies
     I have used the VSS Troubleshooter that is part of BackupChain (Ctrl-T) and I get the following error:
      ERROR: Selected writer 'Microsoft Hyper-V VSS Writer' is in failed state!
      - Status: 8 (VSS_WS_FAILED_AT_PREPARE_SNAPSHOT)
      - Writer Failure code: 0x800423f0 (<Unknown error code>)
      - Writer ID: {66841cd4-6ded-4f4b-8f17-fd23f8ddc3de}
      - Instance ID: {d55b6934-1c8d-46ab-a43f-4f997f18dc71}
      VSS snapshot creation failed with result: 8000FFFF
    VSS errors in event viewer. Below are representative errors I have received from various Nodes of my cluster:
    I have various of the below spread out over all hosts except for HST6
    Source: VolSnap, Event ID 10, The shadow copy of volume took too long to install
    Source: VolSnap, Event ID 16, The shadow copies of volume x were aborted because volume y, which contains shadow copy storage for this shadow copy, was force dismounted.
    Source: VolSnap, Event ID 27, The shadow copies of volume x were aborted during detection because a critical control file could not be opened.
    I only have one instance of each of these and both of the below are from HST3
    Source: VSS, Event ID 12293, Volume Shadow Copy Service error: Error calling a routine on a Shadow Copy Provider {b5946137-7b9f-4925-af80-51abd60b20d5}. Routine details RevertToSnapshot [hr = 0x80042302, A Volume Shadow Copy Service component encountered an
    unexpected error.
    Source: VSS, Event ID 8193, Volume Shadow Copy Service error: Unexpected error calling routine GetOverlappedResult.  hr = 0x80070057, The parameter is incorrect.
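
The troubleshooter above printed "<Unknown error code>" for 0x800423f0, but the codes quoted in this post do have symbolic names in the Windows SDK headers. The lookup below is a small sketch covering only the four codes that appear in this post; the dictionary and the function name are illustrative, and a name alone does not identify the root cause.

```python
# Symbolic names for the HRESULT values quoted in this post.
KNOWN_HRESULTS = {
    0x8000FFFF: "E_UNEXPECTED",
    0x80070057: "E_INVALIDARG",
    0x80042302: "VSS_E_UNEXPECTED",
    0x800423F0: "VSS_E_WRITERERROR_INCONSISTENTSNAPSHOT",
}


def describe_hresult(text):
    """Map a hex string such as '0x800423f0' or '8000FFFF' to a known name."""
    code = int(text, 16)  # int() accepts the optional 0x prefix when base is 16
    return KNOWN_HRESULTS.get(code, "unknown")


for value in ("0x800423f0", "8000FFFF", "0x80042302", "0x80070057"):
    print(value, "->", describe_hresult(value))
```

The writer failure code 0x800423f0 is the one tied directly to the Hyper-V VSS writer state reported by the troubleshooter, so it is probably the most useful term to search on.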
    So, basically, everything I have tried has resulted in no success towards solving this problem.
    I would appreciate any assistance that can be provided.
    Thanks,
    Charles J. Palmer
    Wright Flood

    Tim,
    Thanks for the reply. I ran the first two commands and got this:
    Name                                Role  Metric
    Cluster Network 1                   3     10000
    Cluster Network 2 - HeartBeat       1     1300
    Cluster Network 3 - iSCSI           0     10100
    Cluster Network 4 - LiveMigration   1     1200
    When you look at the properties of each network, this is how I have it configured:
    Cluster Network 1 - Allow cluster network communications on this network and Allow clients to connect through this network (26.x subnet)
    Cluster Network 2 - Allow cluster network communications on this network. New network added while working with Microsoft support last month. (28.x subnet)
    Cluster Network 3 - Do not allow cluster network communications on this network. (22.x subnet)
    Cluster Network 4 - Allow cluster network communications on this network. Existing but not configured to be used by VMs for Live Migration until MS corrected. (20.x subnet)
    Should I modify my metrics further, or are the current values sufficient?
    I worked with an MS support rep because my cluster (once I added the 5th host) stopped being able to live migrate VMs and I had VMs host jumping on startup. It was a mess for a couple of days. They had me add the Heartbeat network as part of the solution
    to my problem. There doesn't seem to be anywhere to configure a network specifically for CSV so I would assume it would use (based on my metrics above) Cluster Network 4 and then Cluster Network 2 for CSV communications and would fail back to the Cluster Network
    1 if both 2 and 4 were down/inaccessible.
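
The ordering assumed above can be reproduced from the Role and Metric columns. This sketch relies on the commonly documented behaviour that CSV traffic prefers the cluster-enabled network with the lowest metric and that networks with Role 0 are not used for cluster communication; the function name is made up, and the rows are copied from the output earlier in this post.

```python
# (name, role, metric) rows copied from the cluster network output in this post.
networks = [
    ("Cluster Network 1", 3, 10000),
    ("Cluster Network 2 - HeartBeat", 1, 1300),
    ("Cluster Network 3 - iSCSI", 0, 10100),
    ("Cluster Network 4 - LiveMigration", 1, 1200),
]


def csv_network_preference(rows):
    """Order cluster-enabled networks (role != 0) from lowest to highest metric."""
    eligible = [row for row in rows if row[1] != 0]
    return [name for name, _role, _metric in sorted(eligible, key=lambda r: r[2])]


for rank, name in enumerate(csv_network_preference(networks), start=1):
    print(rank, name)
```

With the values shown, the order comes out as Cluster Network 4, then Cluster Network 2, then Cluster Network 1, which matches the assumption in the post.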
    As to the iSCSI getting a second NIC, I would love to but management wants separation of our VMs by subnet and role and hence why I need the 4 VSwitch NICs. I would have to look at adding an additional quad port NIC to my servers and I would be having to
    use half height cards for 2 of my 5 servers for that to work.
    But, on that note, it doesn't appear to actually be a bandwidth issue. I can run a backup for a single VM and get nothing on the network card (It caused the reboots before any real data has even started to pass apparently) and still the problem occurs.
    As to Backup Chain, I have been working with the vendor and they are telling me the issue is with VSS. They also say they support CSV as well. If you go to this page (http://backupchain.com/Hyper-V-Backup-Software.html)
    they say they support CSVs. Their tech support has been very helpful but unfortunately, nothing has fixed the problem.
    What is annoying is that not every backup causes a problem. I have a daily backup of one of our machines that runs fine without initiating any additional reboots. But almost every other backup job will trigger the VMs on the common CSV to reboot.
    I understood about the updates but I had to "prove" it to the MS tech I was on the phone with and hence I brought it up. I understand on the storage as well. Why give a warning for something that is working though... I think that is just a poor indicator
    that it doesn't explain that in the report.
    At a loss for what else I can do,
    Charles J. Palmer

  • Cluster Quorum Disk failing inside Guest cluster VMs in Hyper-V Cluster using Virtual Disk Sharing Windows Server 2012 R2

    Hi, I'm having a problem in a VM Guest cluster using Windows Server 2012 R2 and virtual disk sharing enabled. 
    It's a SQL 2012 cluster, which has around 10 VHDX disks shared this way. All the VHDX files are inside LUNs on a SAN. These LUNs are presented to all clustered members of the Windows Server 2012 R2 Hyper-V cluster via Cluster Shared Volumes.
    Yesterday a very strange problem happened: both the Quorum Disk and the DTC disks had their contents completely erased. The VHDX files themselves were there, but the data inside was gone.
    The SQL admin had to recreate both disks, but now we don't know if this issue was related to the virtualization platform or to another event inside the cluster itself.
    Right now I'm seeing these errors on one of the guest VMs:
     Log Name:      System
    Source:        Microsoft-Windows-FailoverClustering
    Date:          3/4/2014 11:54:55 AM
    Event ID:      1069
    Task Category: Resource Control Manager
    Level:         Error
    Keywords:      
    User:          SYSTEM
    Computer:      ServerDB02.domain.com
    Description:
    Cluster resource 'Quorum-HDD' of type 'Physical Disk' in clustered role 'Cluster Group' failed.
    Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster
    Manager or the Get-ClusterResource Windows PowerShell cmdlet.
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="Microsoft-Windows-FailoverClustering" Guid="{BAF908EA-3421-4CA9-9B84-6689B8C6F85F}" />
        <EventID>1069</EventID>
        <Version>1</Version>
        <Level>2</Level>
        <Task>3</Task>
        <Opcode>0</Opcode>
        <Keywords>0x8000000000000000</Keywords>
        <TimeCreated SystemTime="2014-03-04T17:54:55.498842300Z" />
        <EventRecordID>14140</EventRecordID>
        <Correlation />
        <Execution ProcessID="1684" ThreadID="2180" />
        <Channel>System</Channel>
        <Computer>ServerDB02.domain.com</Computer>
        <Security UserID="S-1-5-18" />
      </System>
      <EventData>
        <Data Name="ResourceName">Quorum-HDD</Data>
        <Data Name="ResourceGroup">Cluster Group</Data>
        <Data Name="ResTypeDll">Physical Disk</Data>
      </EventData>
    </Event>
    Log Name:      System
    Source:        Microsoft-Windows-FailoverClustering
    Date:          3/4/2014 11:54:55 AM
    Event ID:      1558
    Task Category: Quorum Manager
    Level:         Warning
    Keywords:      
    User:          SYSTEM
    Computer:      ServerDB02.domain.com
    Description:
    The cluster service detected a problem with the witness resource. The witness resource will be failed over to another node within the cluster in an attempt to reestablish access to cluster configuration data.
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="Microsoft-Windows-FailoverClustering" Guid="{BAF908EA-3421-4CA9-9B84-6689B8C6F85F}" />
        <EventID>1558</EventID>
        <Version>0</Version>
        <Level>3</Level>
        <Task>42</Task>
        <Opcode>0</Opcode>
        <Keywords>0x8000000000000000</Keywords>
        <TimeCreated SystemTime="2014-03-04T17:54:55.498842300Z" />
        <EventRecordID>14139</EventRecordID>
        <Correlation />
        <Execution ProcessID="1684" ThreadID="2180" />
        <Channel>System</Channel>
        <Computer>ServerDB02.domain.com</Computer>
        <Security UserID="S-1-5-18" />
      </System>
      <EventData>
        <Data Name="NodeName">ServerDB02</Data>
      </EventData>
    </Event>
    We don't know if this can happen again, what if this happens on disk with data?! We don't know if this is related to the virtual disk sharing technology or anything related to virtualization, but I'm asking here to find out if it is a possibility.
    Any ideas are appreciated.
    Thanks.
    Eduardo Rojas

    Hi,
    Please refer to the following link:
    http://blogs.technet.com/b/keithmayer/archive/2013/03/21/virtual-machine-guest-clustering-with-windows-server-2012-become-a-virtualization-expert-in-20-days-part-14-of-20.aspx#.Ux172HnxtNA
    Best Regards,
    Vincent Wu

  • DPM 2012 R2 cannot backup VM online

    Hi Everyone,
    I'm having a problem trying to backup a VM with DPM, as it shows "offline" backup only (not online).
    Our environment:
    DPM 2012 R2 server with UR5, running on Windows 2012 R2. This server backs up all Virtual Machines running on a Windows 2012 R2 Hyper-V cluster. All the VMs are running Windows 2008 R2, with the latest Hyper-V Integration Tool (v6.3.9600.16384).
    Only a few VMs on that cluster cannot be backed up "online".
    I already checked the following:
     - The backup (volume snapshot) integration service is installed and enabled in the VM configuration
     - The VM does not have any dynamic disks
     - All Disks on the VM are NTFS
     - The VM Cluster resource is online
     - The VM is running
     - The shadow storage assignment inside the VM for each volume is set to the respective volume (not a different one)
     - The VM has a SCSI controller
     - Although meant for another version of DPM and OS, I tried to add the following registry key:
    HKLM\Software\Microsoft\Windows NT\CurrentVersion\SystemRestore - REG_DWORD ScopeSnapshots 0x0
    (solution found here: https://mvanvliet.wordpress.com/2013/04/10/issue-windows-server-2012-virtual-machines-can-only-be-backed-up-using-saved-state/)
     - Also tried to remove all VHDX from the VM configuration, save the config and re-add the VHDX
    (solution found here: http://blog.metasplo.it/2014/02/dpm-2012-r2-only-allows-offline-backups.html)
     - All the disks within the VM have plenty of free space (more than 30% on each volume)
     - Tried moving the VM to different hosts in the Hyper-V cluster
    I've run out of troubleshooting steps...
    Any ideas?
    Thanks,
    Stephane

    Hello Stephane.
    It seems that we have the same problem.
    Since we installed UR5, we have some VMs that can no longer be backed up online. Last Sunday we had a crash on a host. After investigating the issue, we think it has to do with the backup and UR5.
    We found some errors showing that the host lost the iSCSI target when the backup began.
    We have 2 infrastructures with the same configuration: HP StoreVirtual and the hardware VSS provider. We can see that no snapshots are created on the storage.
    As a workaround, we are now trying to add a small disk, configured as a dynamic disk, to these VMs, in the hope of getting an offline backup.
    I think we will have to open a case.
    Andreas

  • DPM 2012 R2 backup Causes Redirected CSV IO on SOFS Cluster.

    Hi, I have a Scale out Storage Spaces Server with 2 nodes, and a 10 node 2012 R2, Hyper-V cluster using this via SMB3.0
    I also have installed a DPM2012 R2 backup server.
    The DPM agent is installed on all nodes of all servers, and I have followed the prerequisites from Microsoft for setting up DPM backup of Hyper-V machines over SMB.
    The DPM backups all work fine, but occasionally I get these errors on the SOFS cluster:
    Cluster Shared Volume 'Volume3' ('Cluster Disk 4') has entered a paused state because of '(c0130021)'. All I/O will temporarily be queued until a path to the volume is reestablished.
    I really thought this issue had been resolved in this revision. It doesn't seem to cause any issues with my VMs that I can notice, and all DPM backups are working fine, but it still causes me concern.
    Has anyone else seen this, or have any suggestions for what I can try to resolve it?
    Regards
    Mark Green

    We also encounter this issue. We use Windows Server 2012 R2 and SCVMM 2012 R2 (with RU1). Be careful with this issue, because it can cause serious problems. Btw, note that Windows Server 2012 R2 uses Direct I/O instead of Redirected I/O.
    If you can't find a full fix, as we can't right now, there are two things that might offer a workaround for you:
    Disable ODX (if your storage system does not support it):
    Deploy Windows Offloaded Data Transfers
    http://technet.microsoft.com/en-us/library/jj200627.aspx
    Serialize virtual machine backups per node
    Migrate to a hardware VSS provider
    http://technet.microsoft.com/en-us/library/hh758027.aspx
    The second option works best, because this issue mostly occurs when you run a backup of many VMs at once. It is not a full fix and makes your backup windows much longer, but it can help you avoid other problems. Also keep a close eye on this link:
    Recommended hotfixes and updates for Windows Server 2012 R2-based failover clusters
    http://support.microsoft.com/kb/2920151
    Boudewijn Plomp, BPMi Infrastructure & Security

  • Windows Server 2012 - Hyper-V - Cluster Shared Storage - VHDX unexpectedly gets copied to System Volume Information by "System", Virtual Machines stop responding

    We have a problem with one of our deployments of Windows Server 2012 Hyper-V with a 2 node cluster connected to an iSCSI SAN.
    Our setup:
    Hosts - Both run Windows Server 2012 Standard and are clustered.
    HP ProLiant G7, 24 GB RAM. This is the primary host and normally all VMs run on this host.
    HP ProLiant G5, 20 GB RAM. This is the secondary host, intended to be used in case of failure of the primary host.
    We have no antivirus on the hosts and the scheduled ShadowCopy (previous version of files) is switched off.
    iSCSI SAN:
    QNAP NAS TS-869 Pro, 8 INTEL SSDSA2CW160G3 160 GB in a RAID 5 with a hot spare. 2 teamed NICs.
    Switch:
    DLINK DGS-1210-16 - Both the network cards of the Hosts that are dedicated to the Storage and the Storage itself are connected to the same switch and nothing else is connected to this switch.
    Virtual Machines:
    3 Windows Server 2012 Standard - 1 DC, 1 FileServer, 1 Application Server.
    1 Windows Server 2008 Standard Exchange Server.
    All VMs are using dynamic disks (as recommended by Microsoft).
    Updates
    We applied the most recent updates to the Hosts, VMs and iSCSI SAN about 3 weeks ago with no change in our problem, and we continually update the setup.
    Normal operation:
    Normally this setup works just fine and we see no real difference in speed in startup, file copy and processing speed in LoB applications of this setup compared to a single host with two 10000 RPM Disks. Normal network speed is 10-200 Mbit, but occasionally
    we see speeds up to 400 Mbit/s of combined read/write for instance during file repair.
    Our Problem:
    Our problem is that for some reason a random VHDX gets copied to System Volume Information by "System" on the Clustered Shared Storage (i.e. C:\ClusterStorage\Volume1\System Volume Information).
    All VMs stop responding or respond very slowly during this copy process; for instance, you cannot send CTRL-ALT-DEL to a VM in the Hyper-V console or start Task Manager when already logged in.
    This happens at random and not every day, and different VHDX files from different VMs get copied each time. Sometimes it happens during the daytime, which causes a lot of problems, especially when a 200 GB file gets copied (which takes a lot of time).
    What it is not:
    We thought that this was connected to the backup, but the backup had finished 3 hours before the last time this happened, and the backup never uses any of the files in System Volume Information, so it is not the backup.
    An observation:
    When this happened today I switched on ShadowCopy (previous files) and set it to use only 320 MB of storage, and then the copy process stopped and the virtual machines started responding again. This could be unrelated, since there is no way to see
    how much of the VHDX is left to be copied, so it might have finished at the same time as I enabled ShadowCopy (previous files).
    Our question:
    Why is a VHDX copied to System Volume Information when scheduled ShadowCopy (previous version of files) is switched off? As far as I know, nothing should be copied to this folder when this function is switched off?
    List of VSS Writers:
    vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
    (C) Copyright 2001-2012 Microsoft Corp.
    Writer name: 'Task Scheduler Writer'
       Writer Id: {d61d61c8-d73a-4eee-8cdd-f6f9786b7124}
       Writer Instance Id: {1bddd48e-5052-49db-9b07-b96f96727e6b}
       State: [1] Stable
       Last error: No error
    Writer name: 'VSS Metadata Store Writer'
       Writer Id: {75dfb225-e2e4-4d39-9ac9-ffaff65ddf06}
       Writer Instance Id: {088e7a7d-09a8-4cc6-a609-ad90e75ddc93}
       State: [1] Stable
       Last error: No error
    Writer name: 'Performance Counters Writer'
       Writer Id: {0bada1de-01a9-4625-8278-69e735f39dd2}
       Writer Instance Id: {f0086dda-9efc-47c5-8eb6-a944c3d09381}
       State: [1] Stable
       Last error: No error
    Writer name: 'System Writer'
       Writer Id: {e8132975-6f93-4464-a53e-1050253ae220}
       Writer Instance Id: {7848396d-00b1-47cd-8ba9-769b7ce402d2}
       State: [1] Stable
       Last error: No error
    Writer name: 'Microsoft Hyper-V VSS Writer'
       Writer Id: {66841cd4-6ded-4f4b-8f17-fd23f8ddc3de}
       Writer Instance Id: {8b6c534a-18dd-4fff-b14e-1d4aebd1db74}
       State: [5] Waiting for completion
       Last error: No error
    Writer name: 'Cluster Shared Volume VSS Writer'
       Writer Id: {1072ae1c-e5a7-4ea1-9e4a-6f7964656570}
       Writer Instance Id: {d46c6a69-8b4a-4307-afcf-ca3611c7f680}
       State: [1] Stable
       Last error: No error
    Writer name: 'ASR Writer'
       Writer Id: {be000cbe-11fe-4426-9c58-531aa6355fc4}
       Writer Instance Id: {fc530484-71db-48c3-af5f-ef398070373e}
       State: [1] Stable
       Last error: No error
    Writer name: 'WMI Writer'
       Writer Id: {a6ad56c2-b509-4e6c-bb19-49d8f43532f0}
       Writer Instance Id: {3792e26e-c0d0-4901-b799-2e8d9ffe2085}
       State: [1] Stable
       Last error: No error
    Writer name: 'Registry Writer'
       Writer Id: {afbab4a2-367d-4d15-a586-71dbb18f8485}
       Writer Instance Id: {6ea65f92-e3fd-4a23-9e5f-b23de43bc756}
       State: [1] Stable
       Last error: No error
    Writer name: 'BITS Writer'
       Writer Id: {4969d978-be47-48b0-b100-f328f07ac1e0}
       Writer Instance Id: {71dc7876-2089-472c-8fed-4b8862037528}
       State: [1] Stable
       Last error: No error
    Writer name: 'Shadow Copy Optimization Writer'
       Writer Id: {4dc3bdd4-ab48-4d07-adb0-3bee2926fd7f}
       Writer Instance Id: {cb0c7fd8-1f5c-41bb-b2cc-82fabbdc466e}
       State: [1] Stable
       Last error: No error
    Writer name: 'Cluster Database'
       Writer Id: {41e12264-35d8-479b-8e5c-9b23d1dad37e}
       Writer Instance Id: {23320f7e-f165-409d-8456-5d7d8fbaefed}
       State: [1] Stable
       Last error: No error
    Writer name: 'COM+ REGDB Writer'
       Writer Id: {542da469-d3e1-473c-9f4f-7847f01fc64f}
       Writer Instance Id: {f23d0208-e569-48b0-ad30-1addb1a044af}
       State: [1] Stable
       Last error: No error
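
In a listing this long it is easy to miss that one writer, 'Microsoft Hyper-V VSS Writer', is in state [5] Waiting for completion rather than [1] Stable. A small parser can pick such entries out automatically. This is a sketch that assumes the output layout shown above; the function name is made up, and the sample text is a two-writer excerpt of the same listing.

```python
import re


def unstable_writers(vssadmin_output):
    """Return (name, state, last error) for writers that are not Stable
    or that report a last error other than 'No error'."""
    results = []
    name = state = None
    for raw in vssadmin_output.splitlines():
        line = raw.strip()
        m = re.match(r"Writer name: '(.+)'$", line)
        if m:
            name, state = m.group(1), None
            continue
        m = re.match(r"State: \[(\d+)\] (.+)$", line)
        if m:
            state = (int(m.group(1)), m.group(2))
            continue
        m = re.match(r"Last error: (.+)$", line)
        if m and name and state:
            if state[0] != 1 or m.group(1) != "No error":
                results.append((name, f"[{state[0]}] {state[1]}", m.group(1)))
            name = state = None
    return results


sample = """\
Writer name: 'System Writer'
   Writer Id: {e8132975-6f93-4464-a53e-1050253ae220}
   State: [1] Stable
   Last error: No error
Writer name: 'Microsoft Hyper-V VSS Writer'
   Writer Id: {66841cd4-6ded-4f4b-8f17-fd23f8ddc3de}
   State: [5] Waiting for completion
   Last error: No error
"""
print(unstable_writers(sample))
```

A writer waiting for completion usually means a snapshot operation was in progress or had not finished cleanly when the list was taken, so it may be worth capturing this listing at the moment the copy into System Volume Information starts.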
    Please note:
    Please only answer our question and do not offer any general optimization tips that do not directly address the issue! We want the problem to go away, not to finish a bit faster!

    Hello Lawrence!
    Thank you for your reply. Some comments to help you and others who read this thread:
    First of all, we use Windows Server 2012 and the VHDX as I wrote in the headline and in the text in my post. We have not had this problem in similar setups with Windows Server 2008 R2, so the problem seem to be introduced in Windows Server 2012.
    These posts that you refer to seem to be outdated and/or do not apply to our configuration:
    The post about Dynamic Disks:
    http://technet.microsoft.com/en-us/library/ee941151(v=WS.10).aspx is only a recommendation for Windows Server 2008 R2 and the VHD format. Dynamic VHDX is indeed recommended by Microsoft when using Windows Server 2012 (please look in the optimization guide
    for Windows Server 2012).
    In fact, if we used fixed VHDX we would have a bigger problem, since fixed VHDX files are generally larger than dynamic disks, i.e. more data would be copied and that would take longer, so the VMs would be unresponsive for a longer time.
    The post "What's the deal with the System Volume Information folder"
    http://blogs.msdn.com/b/oldnewthing/archive/2003/11/20/55764.aspx is for Windows XP / Windows Server 2003 and some things have changed since then. For instance, in Windows Server 2012, Shadow Copies cannot be controlled by going to Control Panel -> System.
    Instead you right-click on a drive (i.e. a volume, for instance the C drive/volume) in Computer and then click "Configure Shadow Copies".
    Windows Server 2008 R2 Backup problem
    http://social.technet.microsoft.com/Forums/en/windowsbackup/thread/0fc53adb-477d-425b-8c99-ad006e132336 - This post is about antivirus software trying to scan files that are used during backup and exist in the System Volume Information folder, and we do not
    have any antivirus software installed on our hosts, as I stated in my post.
    Comment that might help us:
    So according to “System Volume Information” definition, the operation you mentioned is Volume Shadow Copy. Check event viewer to find Volume Shadow Copy related event logs and post them.
    Why?
    Further investigation suggests that a volume shadow copy is somehow created even though the schedule for Shadow Copies is turned off for all drives. This happens at random and we have not found any pattern. Yesterday this operation took almost all available
    disk space (over 200 GB), but all the disk space was released when I turned on scheduled Shadow Copies for the CSV.
    I therefore draw these conclusions:
    The CSV volume has about 600 GB of disk space. Since Volume Shadow Copy used 200 GB, or about 33% of the disk space, and the default limit is 10%, I conclude that for some reason the unscheduled Volume Shadow Copy did not have any limit (or ignored
    the limit).
    When I turned on the schedule I also changed the limit to the minimum amount, which is 320 MB, and this is probably what released the disk space. That is, the unscheduled Volume Shadow Copy operation was aborted, adhered to the limit, and deleted the
    Volume Shadow Copy it had taken.
    I have also set the limit for Volume Shadow Copies for all other volumes to 320 MB by using the "Configure Shadow Copies" window that you open by right-clicking on a drive (volume) in Computer and then selecting "Configure Shadow Copies...".
    It is important to note that setting a limit for Shadow Copy storage and disabling the schedule are two different things! It is possible to have unlimited storage for Shadow Copies when the schedule is disabled; however, I do not know if this was the case
    before I enabled Shadow Copies on the CSV, since I did not look for this.
    I have now defined a limit of 320 MB for Shadow Copy storage on all drives, so no VHDX should be copied to System Volume Information, since they are all larger than 320 MB.
    Does this sound about right or am I drawing the wrong conclusions?
    Limits for Shadow Copies:
    Below we list the limits for our two hosts:
    "Primary Host":
    C:\>vssadmin list shadowstorage
    vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
    (C) Copyright 2001-2012 Microsoft Corp.
    Shadow Copy Storage association
       For volume: (\\?\Volume{e3ad7feb-178b-11e2-93e8-806e6f6e6963}\)\\?\Volume{e3ad7feb-178b-11e2-93e8-806e6f6e6963}\
       Shadow Copy Storage volume: (\\?\Volume{e3ad7feb-178b-11e2-93e8-806e6f6e6963}\)\\?\Volume{e3ad7feb-178b-11e2-93e8-806e6f6e6963}\
       Used Shadow Copy Storage space: 0 bytes (0%)
       Allocated Shadow Copy Storage space: 0 bytes (0%)
       Maximum Shadow Copy Storage space: 320 MB (91%)
    Shadow Copy Storage association
       For volume: (E:)\\?\Volume{dc0a177b-ab03-44c2-8ff6-499b29c3d5cc}\
       Shadow Copy Storage volume: (E:)\\?\Volume{dc0a177b-ab03-44c2-8ff6-499b29c3d5cc}\
       Used Shadow Copy Storage space: 0 bytes (0%)
       Allocated Shadow Copy Storage space: 0 bytes (0%)
       Maximum Shadow Copy Storage space: 320 MB (0%)
    Shadow Copy Storage association
       For volume: (G:)\\?\Volume{f58dc334-17be-11e2-93ee-9c8e991b7c20}\
       Shadow Copy Storage volume: (G:)\\?\Volume{f58dc334-17be-11e2-93ee-9c8e991b7c20}\
       Used Shadow Copy Storage space: 0 bytes (0%)
       Allocated Shadow Copy Storage space: 0 bytes (0%)
       Maximum Shadow Copy Storage space: 320 MB (3%)
    Shadow Copy Storage association
       For volume: (C:)\\?\Volume{e3ad7fec-178b-11e2-93e8-806e6f6e6963}\
       Shadow Copy Storage volume: (C:)\\?\Volume{e3ad7fec-178b-11e2-93e8-806e6f6e6963}\
       Used Shadow Copy Storage space: 0 bytes (0%)
       Allocated Shadow Copy Storage space: 0 bytes (0%)
       Maximum Shadow Copy Storage space: 320 MB (0%)
    C:\>cd \ClusterStorage\Volume1
    Secondary host:
    C:\>vssadmin list shadowstorage
    vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
    (C) Copyright 2001-2012 Microsoft Corp.
    Shadow Copy Storage association
       For volume: (\\?\Volume{b2951138-f01e-11e1-93e8-806e6f6e6963}\)\\?\Volume{b2951138-f01e-11e1-93e8-806e6f6e6963}\
       Shadow Copy Storage volume: (\\?\Volume{b2951138-f01e-11e1-93e8-806e6f6e6963}\)\\?\Volume{b2951138-f01e-11e1-93e8-806e6f6e6963}\
       Used Shadow Copy Storage space: 0 bytes (0%)
       Allocated Shadow Copy Storage space: 0 bytes (0%)
       Maximum Shadow Copy Storage space: 35,0 MB (10%)
    Shadow Copy Storage association
       For volume: (D:)\\?\Volume{5228437e-9a01-4690-bc40-1df85a0e6736}\
       Shadow Copy Storage volume: (D:)\\?\Volume{5228437e-9a01-4690-bc40-1df85a0e6736}\
       Used Shadow Copy Storage space: 0 bytes (0%)
       Allocated Shadow Copy Storage space: 0 bytes (0%)
       Maximum Shadow Copy Storage space: 27,3 GB (10%)
    Shadow Copy Storage association
       For volume: (C:)\\?\Volume{b2951139-f01e-11e1-93e8-806e6f6e6963}\
       Shadow Copy Storage volume: (C:)\\?\Volume{b2951139-f01e-11e1-93e8-806e6f6e6963}\
       Used Shadow Copy Storage space: 0 bytes (0%)
       Allocated Shadow Copy Storage space: 0 bytes (0%)
       Maximum Shadow Copy Storage space: 6,80 GB (10%)
    C:\>
    There is something strange about the limits on the Secondary host!
    I have not in any way changed the settings on the Secondary host and, as you can see, the Secondary host has a maximum limit of only 35 MB storage on the CSV, but it also shows that this is 10% of the volume. This is clearly not the case, since 10% of 600
    GB = 60 GB!
    The question is, why does it by default set a too small limit (i.e. < 320 MB) on the CSV and is this the cause of the problem? I.e. is the limit ignored since it is smaller than the smallest amount you can provide using the GUI?
    Is the default 35 MB maximum Shadow Copy limit a bug, or is there any logical reason for setting a limit that according to the GUI is too small?
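
    One way to sanity-check the percentages in the vssadmin output above is to work backwards from each "Maximum Shadow Copy Storage space" line to the volume size it implies. The sketch below is only that arithmetic; the regular expression and the function name are my own and not part of vssadmin. For what it is worth, both 35,0 MB (10%) and 320 MB (91%) point to a volume of roughly 350 MB, which would suggest that the first entry on each host is a small volume without a drive letter rather than the 600 GB CSV. That is an inference from the numbers only and the output alone does not confirm it.

```python
import re

# Matches lines such as "Maximum Shadow Copy Storage space: 35,0 MB (10%)".
# The decimal comma comes from the regional settings of the host in the post.
MAX_LINE = re.compile(
    r"Maximum Shadow Copy Storage space:\s*([\d.,]+)\s*(MB|GB)\s*\((\d+)%\)"
)


def implied_volume_size_mb(line):
    """Return the volume size in MB implied by a 'Maximum ...' line, or None."""
    match = MAX_LINE.search(line)
    if match is None:
        return None
    amount = float(match.group(1).replace(",", "."))
    unit = match.group(2)
    percent = int(match.group(3))
    if percent == 0:
        return None  # percentage was rounded down to 0, nothing to infer
    amount_mb = amount * 1024 if unit == "GB" else amount
    return amount_mb * 100 / percent


for line in (
    "Maximum Shadow Copy Storage space: 35,0 MB (10%)",
    "Maximum Shadow Copy Storage space: 320 MB (91%)",
):
    print(line, "->", round(implied_volume_size_mb(line)), "MB")
```

    If the implied size is nowhere near the size of the volume you think you are looking at, it is probably worth confirming which volume GUID belongs to the CSV before drawing conclusions about the limit.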

  • VSS snapshots for DPM 2010 Hyper-V backup conflict with SQL backup on a virtual SQL server

    We currently use DPM 2010 to back up our virtual servers, which reside on a 5-node Hyper-V Cluster Shared Volume.  DPM uses the hardware VSS provider to back up the Hyper-V guests.  Several of these Hyper-V guests are SQL servers (SQL 2008) and they
    are all configured to run point-in-time SQL backups using SQL Maintenance Plans.
    The SQL backups are scheduled to run a full database backup on a Friday and differential backups on the other days of the week.  Transaction log backups are scheduled to run several times throughout the day.
    However, we have recently discovered a conflict between these two methods: when a restore is required using a differential SQL backup, it fails because the snapshot created by DPM makes SQL believe that a full backup has been
    carried out outside the Maintenance Plan, and SQL is therefore unable to perform the restore.
    DPM backs up the Hyper-V guests on a daily basis from 8pm.
    Can anyone provide any advice or guidance on this, as we need both types of backup to run successfully?  We are required to back up SQL with point-in-time backups and we also need to protect the Hyper-V guests in their entirety.
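
    To make the conflict described above concrete, here is a minimal model of how a differential backup picks its base: it depends on the most recent full backup that was not taken as copy-only. This is a toy illustration, not SQL Server code. The class, the function and the exact times are hypothetical (only "full on Friday" and "DPM at 8pm" come from the post), and it assumes the VM-level snapshot is recorded inside the guest as an ordinary full backup.

```python
from dataclasses import dataclass


@dataclass
class Backup:
    when: str            # label only, e.g. "Fri 22:00" (hypothetical times)
    kind: str            # "full" or "diff"
    source: str          # who took the backup
    copy_only: bool = False


def differential_base(history, diff_index):
    """Return the full backup that the differential at diff_index depends on."""
    for backup in reversed(history[:diff_index]):
        if backup.kind == "full" and not backup.copy_only:
            return backup
    return None


history = [
    Backup("Fri 22:00", "full", "maintenance plan"),
    Backup("Sat 20:00", "full", "DPM VM snapshot"),
    Backup("Sat 22:00", "diff", "maintenance plan"),
]

base = differential_base(history, 2)
print("Saturday differential is based on:", base.source)
```

    In this model the Saturday differential is based on the DPM snapshot, so restoring it on top of the Friday full backup cannot work. If the snapshot were recorded as copy-only instead, the base would stay with the Friday full backup.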

    Thanks Mike,
    I have tried this but unfortunately it has no effect.  The VM has Oracle installed (although not the Oracle VSS Writer).  It is the Oracle application server, not the database server, and the customer has a script that is used to stop and start
    the Oracle application when required.  Through troubleshooting this with them I have noticed that after the WLS_Reports service/process is stopped the backups run successfully but when it is running the backups fail.
    I have also noticed that when I stop the Hyper-V Volume Shadow Copy Requestor the backup runs successfully, which I guess is as expected.
    When the backups fail I get 2 errors in the application log:
    Event Id 12293, VSS - Error calling a routine on a shadow copy provider {GUID for the Hyper-V IC Software Shadow Copy Provider}.  Routine details PreFinalCommitSnapshots ({GUID}, 5) [hr = 0x800705b4, This operation returned because
    the timeout period expired.]
    Event Id 19, vmicvss - Not all the shadow volumes arrived in the guest operating system.
    This is also part of the same problem I have posted here: Backup fails for a Hyper-V guest with VSS Writer failures using DPM 2012 R2 - Hyper-V guest has Oracle application installed
    Regards
    Chris
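
    For anyone reading the hr value in event 12293 above: an HRESULT can be split into a failure bit, a facility and a 16-bit code. The sketch below does that split; the function name is my own. For 0x800705b4 it gives facility 7 (Win32) and code 1460, which is the Win32 timeout error, consistent with the message text in the event.

```python
def decode_hresult(hr):
    """Split a 32-bit HRESULT into (failed, facility, code)."""
    hr &= 0xFFFFFFFF                  # also accept negative (signed) input
    failed = bool(hr >> 31)           # top bit set means failure
    facility = (hr >> 16) & 0x1FFF    # same mask as the HRESULT_FACILITY macro
    code = hr & 0xFFFF                # low 16 bits, the Win32 code when facility is 7
    return failed, facility, code


failed, facility, code = decode_hresult(0x800705B4)
print(failed, facility, code)
```

    When the facility is 7, the code can be passed to net helpmsg on any Windows machine to get the message text.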

  • Dpm 2010 exchange 2010 backups are failing

    Hi, I'm getting an error that does not seem to match any of the DPM 2010 error IDs. Randomly, backups of an Exchange 2010 DAG are failing. The backup will run for a short time (5 to 10 minutes) and then fail with the following error:
    Type: Recovery point
    Status: Failed
    Description: Backup failed as another copy of 'user' database is currently being backed up. (ID 32628 Details: Internal error code: 0x80990D51)
     More information
    End time: 4/25/2011 10:37:27 PM
    Start time: 4/25/2011 10:26:55 PM
    Time elapsed: 00:10:31
    Data transferred: 0 MB
    Cluster node -
    Recovery Point Type Incremental Sync
    Source details: 06001
    Protection group: EXCH2010 - 06001
    I've tried restarting the DPM server to troubleshoot. I say this happens randomly because I've had some backups complete successfully against the same Exchange server. The only other item that seems odd is that when creating protection groups against the DAG,
    the initial lookup seems to take forever. Not sure what to try next.

    Hello,
    DPM relies on the Exchange writer to take the snapshot for the backup. If the backupinprogress flag is set to true and does not seem to leave this state even when there is no actual backup in progress, then at this point DPM is the victim.
    As far as I am aware, the only way to clear that flag is to:
    a.) Reboot the exchange server
    or
    b.) Restart the information store
    In some cases even:
    1.) Either Activate all other databases on another Node except for the Problem Database, and Dismount the Problem database temporarily
    Or:
    1.)Dismount all Databases on this Node including the problem database
    2.)Restart the Exchange Information Store Service.
    3.)Mount any dismounted databases
    You may want to redirect this question below to the exchange 2010 forum of:
    "if the backupinprogress is set to true when it should be set to false as there really is no backup being taken, then how can I reset it without having to perform:
    a.) reboot of exchange
    b.) restart of the information store"
    http://social.technet.microsoft.com/Forums/en-US/exchange2010/threads
    Regards, Shane. Please remember to click “Mark as Answer” on the post that helps you, and to click “Unmark as Answer” if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread. This posting
    is provided "AS IS" with no warranties, and confers no rights.

  • DPM failed to communicate with the protection agent on DPM 2010 SERVER because the agent is not responding. (ID 43 Details: Internal error code: 0x8099090E)

    Hi everyone,
    Backup jobs for protected members are intermittently failing with the following error on the DPM server:
    DPM failed to communicate with the protection agent on <DPM 2010 SERVER> because the agent is not responding. (ID 43 Details: Internal error code: 0x8099090E)
    Why does the DPM server failing to see its own DPM agent cause the backup job for another server to fail? One day a backup will work fine, the next it may fail, and the next it is back to normal again...
    The following error is recorded in the Service Control Manager event log on the DPM server just prior to the above error:
    A timeout was reached (30000 milliseconds) while waiting for the DPMRA service to connect.
    Thank you.
    With regards,
    Rob

    Hello,
    I have read these entire postings and see that my problems match most of the above problems. 
    I believe we have tried all of the ideas in this blog and lots of other ideas from other forums and internet searches.
    We have about 80 small databases protected with 15-minute incremental backups. Most work but some fail.  When they fail, the most common (but not the only) error is something like "DPM
    failed to communicate with the protection agent on <DPM 2010 SERVER> because the agent is not responding. (ID 43 Details: Internal error code: 0x8099090E)". The alert is inactivated in the DPM Console, and the backups resume as normal.
    Since later jobs are successful I thought all was well.  All was well until I went to restore from incremental backups.  We worked for two days (day and night work) to restore
    from a corrupted virtual disk on our SQL Server 2008 R2.  I suspect DPM had something to do with the corrupted virtual disk.  All I know is that we never had this problem until installing DPM.  Here is what we encountered when we went to restore
    from the protection points: 
    *  Restore jobs take a minimum of 15 minutes, whether the job is 45 MB or 2 GB.
    *  If you pick a backup from the list of recovery points that is not valid, the job runs for 15 minutes and then shows "Failed".
    *  You cannot rerun the job because SQL Server 2008 shows the table being recovered as <tablename> (recovering) and a retry will not work.  Of course time is wasted while waiting
    to see if it worked.
    * Eventually you realize that, even after dropping the table in (recovering) mode in SQL, the restore point must be bad or possibly one of the failed recovery points.
    * So begins the quest to start restoring recovery points one by one and 15 minutes by 15 minutes until you find one that actually restores to a SQL Instance.
    *  If you have 80 of these to do and you average trying three recovery points and each takes 15 minutes, not to mention the time to drop the table in SQL, well that time adds up to 3,600
    minutes of trial and error.  60 hours of trial and error, wow not much of a savings using DPM over a SQL backup plan.
    * And then you have to explain to your customers that their databases were restored but you do not know at what point the DB was restored. 
    All in all, the DPM concept seems great, but as with many backup products the plan looks good on paper while actually restoring a backup is quite a different matter.
    I don't know if anyone has ever solved the problems presented in this forum, but if they have then I wish they would post, and if no one has solved the problem then shame on DPM.
    Good luck everyone, but I for one have spent about two months trying to protect and restore consistently.  I have never had one day of consistent and reliable restore points.  I
    am going back to SQL maintenance plans for my backups.  I have never, in 10 years, had a SQL-generated backup fail me.  Never.
    gbl

  • DPM failing SQL backups due to error: "the SQL Server instance refused a connection to the protection agent. (ID 30172 Details: Internal error code: 0x80990F85)

    I ran across this error starting on 6/4/2011 and have been unable to find the root of the problem.  In our environment, we have a DPM 2010 server dedicated to backing up our entire SQL environment (about 45 SQL Servers total).  All of the SQL
    environment is backing up fine except for one SQL cluster application.  This particular SQL instance is part of a 6-node failover cluster with 6 SQL instances distributed amongst the nodes.  The other 5 SQL instances in the cluster are backing
    up fine; only one instance is failing.  The DPM Alerts section shows this error when attempting to do a SQL backup of one of the databases on this SQL instance:
    Affected area: KEN-PROD-VDB001\POSREPL1\master
    Occurred since: 6/11/2011 11:00:56 PM
    Description: Recovery point creation jobs for SQL Server 2008 database KEN-PROD-VDB001\POSREPL1\master on SQL Server (POSREPL1) - Store Settings.ken-prod-cl004.aarons.aaronrents.com have been failing. The number of failed recovery point creation jobs =
    4.
     If the datasource protected is SharePoint, then click on the Error Details to view the list of databases for which recovery point creation failed. (ID 3114)
     The DPM job failed for SQL Server 2008 database KEN-PROD-VDB001\POSREPL1\master on SQL Server (POSREPL1) - Store Settings.ken-prod-cl004.aarons.aaronrents.com because the SQL Server instance refused a connection to the protection agent. (ID 30172 Details:
    Internal error code: 0x80990F85)
     More information
    Recommended action: This can happen if the SQL Server process is overloaded, or running short of memory. Please ensure that you are able to successfully run transactions against the SQL database in question and then retry the failed job.
     Create a recovery point...
    Resolution: To dismiss the alert, click below
     Inactivate alert
    I have checked the cluster node this particular SQL instance is running on using Perfmon and the machine is nowhere near capacity on CPU, memory, network, or Disk I/O.  I have failed this SQL Application to another node in the cluster and
    receive the same error (this other node has another clustered SQL application on it that is actively running as well as backing up fine).  The only thing that I am aware of that has changed is that we installed SP2 for SQL 2008 about 2 weeks prior
    to when the failures started to occur.  However, we updated all six clustered SQL Instances at the same time and only this one is having this issue so I don't believe that caused the problem.  We are running SQL 2008 SP2 (version 10.0.4000.0)
    on all clustered instances along with DPM 2010 (version 3.0.7696.0) on this particular DPM server that has the issue.
    One last thing, I have also noticed errors in the event log pertaining to the same SQL backups that are failing (but the time stamps are not concurrent with each backup attempt):
    Log Name:      Application
    Source:        MSDPM
    Date:          6/13/2011 1:09:12 AM
    Event ID:      4223
    Task Category: None
    Level:         Error
    Keywords:      Classic
    User:          N/A
    Computer:      KEN-PROD-BS002.aarons.aaronrents.com
    Description:
    The description for Event ID 4223 from source MSDPM cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
    If the event originated on another computer, the display information had to be saved with the event.
    The following information was included with the event:
    DPM writer was unable to snapshot the replica of KEN-PROD-VDB001\POSREPL1\model. This may be due to:
    1) No valid recovery points present on the replica.
    2) Failure of the last express full backup job for the datasource.
    3) Failure while deleting the invalid incremental recovery points on the replica.
    Problem Details:
    <DpmWriterEvent><__System><ID>30</ID><Seq>1833</Seq><TimeCreated>6/13/2011 5:09:12 AM</TimeCreated><Source>f:\dpmv3_rtm\private\product\tapebackup\dpswriter\vssfunctionality.cpp</Source><Line>815</Line><HasError>True</HasError></__System><DetailedCode>-2147212300</DetailedCode></DpmWriterEvent>
    the message resource is present but the message is not found in the string/message table
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="MSDPM" />
        <EventID Qualifiers="0">4223</EventID>
        <Level>2</Level>
        <Task>0</Task>
        <Keywords>0x80000000000000</Keywords>
        <TimeCreated SystemTime="2011-06-13T05:09:12.000000000Z" />
        <EventRecordID>68785</EventRecordID>
        <Channel>Application</Channel>
        <Computer>KEN-PROD-BS002.aarons.aaronrents.com</Computer>
        <Security />
      </System>
      <EventData>
        <Data>DPM writer was unable to snapshot the replica of KEN-PROD-VDB001\POSREPL1\model. This may be due to:
    1) No valid recovery points present on the replica.
    2) Failure of the last express full backup job for the datasource.
    3) Failure while deleting the invalid incremental recovery points on the replica.
    Problem Details:
    &lt;DpmWriterEvent&gt;&lt;__System&gt;&lt;ID&gt;30&lt;/ID&gt;&lt;Seq&gt;1833&lt;/Seq&gt;&lt;TimeCreated&gt;6/13/2011 5:09:12 AM&lt;/TimeCreated&gt;&lt;Source&gt;f:\dpmv3_rtm\private\product\tapebackup\dpswriter\vssfunctionality.cpp&lt;/Source&gt;&lt;Line&gt;815&lt;/Line&gt;&lt;HasError&gt;True&lt;/HasError&gt;&lt;/__System&gt;&lt;DetailedCode&gt;-2147212300&lt;/DetailedCode&gt;&lt;/DpmWriterEvent&gt;
    </Data>
        <Binary>3C00440070006D005700720069007400650072004500760065006E0074003E003C005F005F00530079007300740065006D003E003C00490044003E00330030003C002F00490044003E003C005300650071003E0031003800330033003C002F005300650071003E003C00540069006D00650043007200650061007400650064003E0036002F00310033002F003200300031003100200035003A00300039003A0031003200200041004D003C002F00540069006D00650043007200650061007400650064003E003C0053006F0075007200630065003E0066003A005C00640070006D00760033005F00720074006D005C0070007200690076006100740065005C00700072006F0064007500630074005C0074006100700065006200610063006B00750070005C006400700073007700720069007400650072005C00760073007300660075006E006300740069006F006E0061006C006900740079002E006300700070003C002F0053006F0075007200630065003E003C004C0069006E0065003E003800310035003C002F004C0069006E0065003E003C004800610073004500720072006F0072003E0054007200750065003C002F004800610073004500720072006F0072003E003C002F005F005F00530079007300740065006D003E003C00440065007400610069006C006500640043006F00640065003E002D0032003100340037003200310032003300300030003C002F00440065007400610069006C006500640043006F00640065003E003C002F00440070006D005700720069007400650072004500760065006E0074003E00</Binary>
      </EventData>
    </Event>
    Any help would be greatly appreciated!
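
    Two fields in the event above are easier to read after a small conversion, so here is a minimal sketch (helper name is mine). DetailedCode is a signed 32-bit number, so masking it to 32 bits gives the usual hexadecimal form to search for. The Binary element is just the Problem Details XML again, encoded as UTF-16LE hex; only its first 64 hex characters are decoded here. As far as I know the resulting value 0x800423F4 is the non-retryable VSS writer error, but please verify that against the VSS headers rather than taking my word for it.

```python
def to_unsigned_hex(code):
    """Render a signed 32-bit error code as 0xXXXXXXXX."""
    return f"0x{code & 0xFFFFFFFF:08X}"


# DetailedCode from the DpmWriterEvent in the event log entry
print(to_unsigned_hex(-2147212300))

# First 64 hex characters of the <Binary> element, copied from the event XML
binary_prefix = "3C00440070006D005700720069007400650072004500760065006E0074003E00"
print(bytes.fromhex(binary_prefix).decode("utf-16-le"))
```

    The same decode call works on the whole Binary string if you want to confirm that it carries no information beyond the Problem Details text.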

    Don't know if this helps or not, but I also noticed another peculiar issue that is derived from this problem.  If I go to "Modify protection group", then expand the cluster, then expand all six nodes in the cluster, five of them show "All SQL Servers"
    and allow me to expand the SQL Instance and show all databases; the one that is having a problem backing up, when I expand the node, doesn't even show that SQL exists on the node, when in fact, it does.
    I would also like to add that the databases on this node that will not back up are running fine.  They run hundreds of transactions daily so we know SQL itself is OK.  Even though it is a busy SQL Server, there are plenty of available resources, as
    the SQL buffer and memory counters show the node is not under duress.

  • Protection Agent failure error during backup to tape with DPM 2010.

    DPM TechNet Forum,
    We are experiencing an intermittent error when backing up to tape using DPM 2010 (with latest QFE roll-up applied).
    "The operation failed because of a protection agent failure. (ID 998 Details: The device is not connected (0x8007048F))"
    Any idea what could be causing this error?
    Thanks in advance,
    Joe

    Hi,
    You can try to reproduce the problem outside of DPM using some external utilities.  If you have more than one drive in the library, run the test against both drives simultaneously to simulate multiple backup jobs running.  If you get an error
    before the tape fills you can use net helpmsg errorcode to see what the error was.
    Download the DPMerasetape.zip file from the following link and extract to c:\temp folder.
    https://onedrive.live.com/?cid=b03306b628ab886f&id=B03306B628AB886F%21524&sc=documents
    The utilities are not that user friendly, but here are the basics.
    Always Stop DPMLA Service prior to running MCT.EXE Commands.
      NET STOP DPMLA
    C:\> mct-x64.exe -p
    Opening changer \\.\Changer0
         ********** Changer Parameters **********
             Number of Transport Elements : 1
             Number of Storage Elements : 50
             Number of Cleaner Slots : 0
             Number of of IE Elements : 0
             Number of NumberDataTransferElements : 6
             Number of Doors : 0
             First Slot Number : 0
             First Drive Number : 0
             First Transport Number : 0
             First IEPort number : 0
             First Cleaner Slot Address : 0
             Magazine Size : 0
             Drive Clean Timeout : 600
      Flags set for the changer :
             CHANGER_BAR_CODE_SCANNER_INSTALLED
             CHANGER_POSITION_TO_ELEMENT
             CHANGER_STORAGE_DRIVE
             CHANGER_STORAGE_SLOT
             CHANGER_DRIVE_CLEANING_REQUIRED
             CHANGER_VOLUME_IDENTIFICATION
             CHANGER_VOLUME_SEARCH
             CHANGER_SERIAL_NUMBER_VALID
     Changer can move from Slot to :
                     Slot
                     Drive
     Changer can move from Drive to :
                     Slot
                     Drive
     Changer is Capable of positioning transport to Slot.
     Changer is Capable of positioning transport to Drive.
    C:\> mct-x64.exe -d
    Opening changer \\.\Changer0
    Product Data for Medium Changer device :
      Vendor Id    : STK
      Product Id   : L180
      Revision     : 030
      SerialNumber : 3077520000
    For MCT utility we have the  -m [MOVE] command to move media around inside the library.
    -m [ElemType-T] Transport# [ElemType-Source] S_lot#/D_rive# [ElemType-Destination] S_lot#/D_rive#
    Get / view command syntax for –m (move) command for changer 0
    C:\>mct-x64 0 -m
    Opening changer \\.\Changer0
    MoveMedium : mct -m t N s\d N s\d N   [Where s/d means Slot or Drive and N is ZERO based].
    Some Examples:
    mct-x64 -m t 0 s 0 d 0    (Using transport-0, move media from slot-0  to drive-0)
    mct-x64 -m t 0 d 0 s 0    (Using transport-0, move media from drive-0 to slot-0)
    mct-x64 -m t 0 s 0 s 100  (Using transport-0, move media from slot-0  to slot-100)
    mct-x64 -m t 0 d 0 d 1    (Using transport-0, move media from drive-0 to drive-1)
    mct-x64 -m t 0 s 0 ie 0   (Using transport-0, move media from slot-0  to IEPort 0)
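
    If you need to run several of these moves in a row, a small helper that assembles the arguments can prevent mixing up slots and drives. This is only a sketch built from the syntax and examples above: the function and dictionary names are hypothetical, it only constructs the command line and does not run mct, and it assumes the element types are exactly t, s, d and ie as shown.

```python
# Element type letters taken from the mct examples in this post
ELEMENT_TYPES = {"s": "slot", "d": "drive", "ie": "IEPort"}


def mct_move_args(source, destination, transport=0, exe="mct-x64"):
    """Build the argument list for an mct move.

    source and destination are (element_type, zero_based_index) tuples.
    """
    args = [exe, "-m", "t", str(transport)]
    for elem_type, index in (source, destination):
        if elem_type not in ELEMENT_TYPES:
            raise ValueError(f"unknown element type: {elem_type!r}")
        if index < 0:
            raise ValueError("element numbers are zero based and cannot be negative")
        args += [elem_type, str(index)]
    return args


# Using transport 0, move media from slot 0 to drive 0
print(" ".join(mct_move_args(("s", 0), ("d", 0))))
```

    Remember that the DPMLA service still has to be stopped before any of the resulting commands are actually run.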
    Once you have moved a tape into a drive, use the mytape commands: loadtape, taperewind, locktape, disable hardware compression, set the block size to 65536 (64K), then writeforspanning.
    You need the symbolic name for the tape drive you loaded media into. In the DPM console, click the tape drive and look in the details for
    \\.\tape########.  Use that in the following command.
    Mytape.exe \\.\Tape2147483638
    Status: Getting the handle for \\.\Tape2147483638...Success
    \\.\Tape2147483638>TapeConsole_1.0>taperewind
    Status: Rewinding Tape ...Success
    \\.\Tape2147483638>TapeConsole_1.0>setdriveinfo
    Hardware error correction  [y]-Enable / [n] Disable : y
    Hardware data compression  [y]-Enable / [n] Disable : N   (BE SURE TO DISABLE)
    Data padding  [y]-Enable / [n] Disable : n
    Setmark reporting   [y]-Enable / [n] Disable : n
    Number of bytes between the end-of-tape warning and the physical end of the tape: 0
    Status: Setting Drive Information...Success
    \\.\Tape2147483638>TapeConsole_1.0>writeforspanning
    Status: Writing onto tape...Failed !!!
    Error_ID reported: 1100                 (net helpmsg 1100 = The physical end of the tape has been reached.)
    Number of bytes written: 983040     (Ignore bytes written, we'll get physical tape position later)
    Giving up
    Time taken: 15788ms
    \\.\Tape2147483638>TapeConsole_1.0>taperewind
    Status: Rewinding Tape ...Success
    REPEAT
    \\.\Tape2147483638>TapeConsole_1.0>erasetape s
    Short erase / Long Erase [s/l]:Status: Erasing the tape...Success
    Regards, Mike J. [MSFT]

  • Server 2008 sp2 freezes during DPM 2010 volume shadow backup

    Hi All,
    We have a Hyper-V guest running Server 2008 SP2 that freezes during a DPM 2010 volume shadow backup.
    I presume this happens when backing up SQL databases. There are no errors in the event logs.
    The sequence of entries in the System event log up until the server freezes is as follows:
    1) The DPMRA service entered the running state.
    2) The Volume Shadow Copy service entered the running state. 
    3) DCOM  started the service swprv with arguments "" in order to run the server:
    {65EE1DBA-8FF4-4A58-AC1C-3470EE2F376A}
    4) The Microsoft Software Shadow Copy Provider service entered the running state.
    After this the entries stop and the next entries are from after the reboot.
    You cannot send Ctrl-Alt-Delete or connect to the server in any way.
    Only a hard reboot gets it going again. This is the only server this is happening to.
    Please advise if anybody has experienced this and how they resolved it.
    Maybe I require a hotfix.

    This looks similar to what I'm seeing.
    With DPM 2010, there's one backup set (for me a file server disk) where every time I try to run the initial replica the server hangs and needs to be rebooted via iLO. It doesn't just die suddenly: first the data stream on the backup stops, then the OS becomes
    less responsive, but there is no resource issue. Trying to open Event Viewer causes a few things to lock up, then over a few minutes the server freezes completely, as if the disk drives have been locked.
    Suspecting McAfee, I added all the exclusions. That didn't help, so I added the process exclusions, which are done by setting dpmra and csc to low risk, and that didn't help either. I can reproduce it just by kicking off a backup for this one file server's
    drive, so it's easy to test with.
    Tonight I had permissions in ePO that let me stop the scanning completely and disable the on-access scan, and for the first time it worked!
    There is definitely an issue between DPM and McAfee beyond what is on MS's web page for AV checks.
    I don't have a workaround yet other than stopping the AV completely... Something to follow up on next week. For the moment I have made some progress though.

  • DPM 2010 Cancel Jobs -Database Backup

    Hi All,
    We have an Exchange 2010 Environment with DPM 2010 Backup Solution.
    I am having trouble cancelling a single database job. We have three mailbox servers, each configured with 8 database backups.
    If I try to cancel one database job, the associated database backup jobs are cancelled as well.
    Please let me know whether there is a DPM Shell command to cancel one particular job on the DPM server.
    Regards
    Manoj

    Hi Manoj,
    You could use the DPM Shell command Stop-Job - see below for a description:
    NAME
        Stop-Job
    SYNOPSIS
        Stops a Windows PowerShell background job.
    SYNTAX
        Stop-Job [[-InstanceId] <Guid[]>] [-PassThru] [-Confirm] [-WhatIf] [<CommonParameters>]
        Stop-Job [-Job] <Job[]> [-PassThru] [-Confirm] [-WhatIf] [<CommonParameters>]
        Stop-Job [[-Name] <string[]>] [-PassThru] [-Confirm] [-WhatIf] [<CommonParameters>]
        Stop-Job [-Id] <Int32[]> [-PassThru] [-Confirm] [-WhatIf] [<CommonParameters>]
        Stop-Job [-State {NotStarted | Running | Completed | Failed | Stopped | Blocked}] [-PassThru] [-Confirm] [-WhatIf] [<CommonParameters>]
    DESCRIPTION
        The Stop-Job cmdlet stops Windows PowerShell background jobs that are in progress. You can use this cmdlet to stop all jobs or stop selected jobs based on their name, ID, instance ID, or state, or by passing a job object to Stop-Job.
        You can use Stop-Job to stop jobs that were started by using Start-Job or the AsJob parameter of Invoke-Command. When you stop a background job, Windows PowerShell completes all tasks that are pending in that job queue and then ends the job. No new tasks are added to the queue after this command is submitted.
        This cmdlet does not delete background jobs. To delete a job, use Remove-Job.
    RELATED LINKS
        Online version: http://go.microsoft.com/fwlink/?LinkID=113413
        about_Jobs
        about_Job_Details
        about_Remote_Jobs
        Start-Job
        Get-Job
        Receive-Job
        Wait-Job
        Remove-Job
        Invoke-Command
    REMARKS
        To see the examples, type: "get-help Stop-Job -examples".
        For more information, type: "get-help Stop-Job -detailed".
        For technical information, type: "get-help Stop-Job -full".
    Within the DPM Shell, if you type Get-Command you will see all of the available commands that are provided by the DPM Shell module. For further information on a particular command, simply type the name of the command and put
    -? after it, e.g. Stop-Job -?
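    As a rough illustration of what the help text above describes, here is a minimal sketch that starts a throwaway background job, stops it by name, and then removes it. The job name 'demo-job' is hypothetical and used only for this example. Note that the help text describes Windows PowerShell background jobs; whether the same cmdlet cancels a single DPM database backup job is not confirmed by anything in the help output, so treat this only as a way to get familiar with the syntax.

    ```powershell
    # Start a throwaway background job so there is something to stop.
    # 'demo-job' is a hypothetical name chosen for this sketch.
    Start-Job -Name 'demo-job' -ScriptBlock { Start-Sleep -Seconds 300 } | Out-Null

    # Stop only the job with that name; other background jobs are not touched.
    Stop-Job -Name 'demo-job'

    # Inspect the job; the State column shows what happened to it.
    Get-Job -Name 'demo-job' | Select-Object Name, State

    # Stop-Job does not delete the job, so remove it explicitly.
    Remove-Job -Name 'demo-job'

    # Show the built-in examples for the cmdlet.
    Get-Help Stop-Job -Examples
    ```

    Selecting by -Name (or -Id / -InstanceId) rather than by -State is what limits the stop to one job.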
    Hope this helps!
    Kevin.

  • Restore from DPM 2010 backup in DPM 2012

    Hi,
    I am planning to move from current DPM 2010 to DPM 2012 R2.
    An in-place upgrade to DPM 2012 does not seem to be possible because of upgrade errors, so I will have to wipe the physical server and install DPM 2012 from scratch. I use a virtual tape drive (Firestreamer) and backup to disk for long-term protection.
    The problem is how I will read my existing DPM 2010 backups in DPM 2012.
    Is there a way to attach the old virtual tapes and restore data from them?
    Thanks for all your feedback.
    Tech Farmer

    Hi Eugene,
    Thanks for your reply.
    Your solution will work for an in-place upgrade. My scenario is building DPM 2012 R2 from scratch, which means I will not be restoring the DPM 2010 DB. Because of this, DPM will have no record of the tape catalog or contents.
    Firestreamer will be able to mount the vTape but not identify what's on it.
    With this in mind, I am trying to figure out whether there is a way to read the old virtual tapes (files) in the new DPM 2012 R2.

  • What's the best way to backup DFS using DPM 2010?

    We set up DFS on two Windows 2012 R2 servers, and they are hosting files. What's the best way to back up DFS using DPM 2010?
    1. When backing up the data, do we back up both dfs01 and dfs02, or just one of them?
    2. Besides the data, what else do we need to back up?
    Bob Lin, MCSE & CNE. Networking, Internet, Routing, VPN Troubleshooting on http://www.ChicagoTech.net. How to Install and Configure Windows, VMware, Virtualization and Cisco on http://www.HowToNetworking.com

    Protecting Data in DFS Namespaces
    Back up the files and BRM of dfs01 and dfs02 to minimize traffic.
    If dfs01 and dfs02 have good bandwidth between them, then use DPM to protect only a single "copy" of the data, located on a server-specific local path.
    Data Protection Manager 2010 Protection Best Practices
    Have a nice day !!!
