  1. #1
    Gongalong

    DPM Frequently Failing

    Hi folks,

    I'm getting frequent backup failures with DPM.

    We use it to back up shares, system states, and complete VMs.

    The DPM server is an HP ProLiant DL120 G6 running 2008 R2 (fully updated) and DPM 2012. It's connected via iSCSI to a Buffalo Terastation NAS.

    The daily failures typically affect only the complete VM backups (often the same VMs), and the errors are usually the same too (example below), with a little variation.

    Affected area: \Backup Using Child Partition Snapshot\VC-DC1
    Occurred since: 05/10/2012 12:28:04
    Description: Recovery point creation jobs for Microsoft Hyper-V \Backup Using Child Partition Snapshot\VC-DC1 on SCVMM VC-DC1 Resources.HyperV.domain.local have been failing. The number of failed recovery point creation jobs = 1.
    If the data source protected has some dependent data sources (like a SharePoint Farm), then click on the Error Details to view the list of dependent data sources for which recovery point creation failed. (ID 3114)
    DPM failed to access the volume \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy473\ on HyperV-2.domain.local. This could be due to
    1) Cluster failover during backup or
    2) Inadequate disk space on the volume.
    (ID 2040 Details: The device is not ready (0x80070015))
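The hex code at the end of the alert above can be unpacked by hand. As a rough sketch (this is not part of DPM, and the helper name `decode_hresult` is made up for illustration), an HRESULT splits into a severity bit, a facility and a 16-bit code. For 0x80070015 the facility is 7 (FACILITY_WIN32) and the code is 21, which is the Win32 error ERROR_NOT_READY and matches the "The device is not ready" text.

```python
def decode_hresult(hresult: int) -> dict:
    """Split a 32-bit HRESULT into its standard fields (hypothetical helper)."""
    return {
        "failure": bool((hresult >> 31) & 0x1),  # severity bit: 1 means failure
        "facility": (hresult >> 16) & 0x7FF,     # 11-bit facility field
        "code": hresult & 0xFFFF,                # low 16 bits: the error code
    }


# The value reported in the DPM alert (ID 2040).
fields = decode_hresult(0x80070015)
print(fields)
```

A facility of 7 means the low 16 bits are an ordinary Win32 error number, so the message says the snapshot volume was not ready when it was accessed. It does not say why.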

    There aren't any space issues on the SAN or NAS. Both Hyper-V hosts (ProLiant DL385 G7s) have the latest Support Packs and the latest Windows Updates installed. The DPM server has had its network drivers updated, but there doesn't seem to be a Support Pack for it.

    If I keep manually retrying the backup jobs, they eventually complete.

    Anyone got any ideas on where to continue troubleshooting this?

    TIA

  2. #2

    DPM can be prone to this sort of thing.

    A few things.

    You say that your DPM server is connected via iSCSI to a Buffalo NAS, which I assume is used as the Storage Pool? Is this via a dedicated NIC on the server? Or does it share the NIC that is used for pulling the data from the protected servers? If you are going to use iSCSI to a NAS, I would strongly recommend you dedicate a NIC to it (and use the other NIC to connect to the network and pull backup data from protected servers). To work effectively, iSCSI really needs a dedicated 10Gbit connection (or 1Gbit at minimum).
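To put the link-speed point in numbers, here is a back-of-the-envelope sketch. The 500 GB figure is purely hypothetical (the thread gives no VM sizes), and the function `transfer_hours` is an illustrative helper that ignores protocol overhead unless an `efficiency` below 1.0 is passed in.

```python
def transfer_hours(size_gb: float, link_gbit: float, efficiency: float = 1.0) -> float:
    """Hours needed to move size_gb gigabytes over a link of link_gbit Gbit/s.

    efficiency is the assumed fraction of line rate actually achieved.
    """
    bytes_per_second = link_gbit * 1e9 / 8 * efficiency
    return size_gb * 1e9 / bytes_per_second / 3600


# Hypothetical 500 GB of backup data on each link speed.
for gbit in (1, 10):
    print(f"{gbit:>2} Gbit/s: {transfer_hours(500, gbit):.2f} h")
```

Even at full line rate a 1Gbit link tops out at about 125 MB/s, so any other traffic sharing that NIC eats directly into the backup window.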

    I could be wrong, but the error message appears to show the DPM server having trouble connecting to its Storage Pool, which could be due to the NIC being overloaded by the data it is pulling from the protected servers.

    Alternatively, if the error is referring to problems connecting to the protected server, then the issue could be the flip side of this: the iSCSI traffic (perhaps from a concurrent backup job) could be causing the DPM server to lose connectivity with the protected server.

    I notice that you are backing up System States. These (and BMR) can be very network intensive (as I don't believe they are incremental) and could potentially interfere with other concurrent DPM backups. Looking at the logs, does it look like the failed VM backups could coincide with BMR/System State backups? What happens if you manually restart a failed VM backup when you know that there are no other backups happening?

    Could you schedule the BMR/System State backups so they don't run concurrently with the VM backups?

    Finally, for each protected source, DPM sets up two volumes in the Storage Pool (one for the main replica and one to store any changes). Check that they are both big enough (right-click on the protected source).

    Good luck,

    Bruce.

  3. Thanks to Bruce123 from:

    Gongalong (30th October 2012)

  4. #3

    Also, I would check the event logs on both the DPM server and, probably more importantly, the Hyper-V host of the VM that it failed on.

    Thanks,

    Bruce.

  5. Thanks to Bruce123 from:

    Gongalong (30th October 2012)

  6. #4
    Gongalong
    Quote Originally Posted by Bruce123 View Post
    You say that your DPM server is connected via iSCSI to a Buffalo NAS, which I assume is used as the Storage Pool?
    Yes.
    Quote Originally Posted by Bruce123 View Post
    Is this via a dedicated NIC on the server?
    Dedicated. 1Gb/s.
    Quote Originally Posted by Bruce123 View Post
    Looking at the logs, does it look like the failed VM backups could coincide with BMR/System State backups?
    Complete VM backups start at 6pm every day. There are 12 of those, and typically 2 or 3 of them fail.

    File shares are done twice a day at 2:30pm and 10:30pm. These have yet to fail, as far as I can recall.

    VM and DC system state recovery points are taken at 5am, and there's a full backup every day at 8pm. I don't remember these failing either.
    Quote Originally Posted by Bruce123 View Post
    What happens if you manually re-start a failed VM backup when you know that there are no other backups happening?
    Often it fails again, but if I persist it will eventually work.
    Quote Originally Posted by Bruce123 View Post
    Could you schedule the BMR/System State backups so they don't run concurrently with the VM backups?
    I think that's the case currently. This was one of the things I checked, and I tried to arrange the various backups so that they don't coincide with each other.
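Whether the jobs really avoid each other depends on how long each one runs, not only on when it starts. The sketch below takes the start times mentioned in this thread and checks them for overlap. Every duration is a made-up placeholder (the thread does not give any), the job labels are informal, and jobs that run past midnight are not handled.

```python
from datetime import datetime, timedelta
from itertools import combinations

DAY = datetime(2012, 10, 5)  # arbitrary reference date

# Start times are from the thread; all durations are hypothetical placeholders.
jobs = {
    "File shares (14:30)": (DAY.replace(hour=14, minute=30), timedelta(hours=1)),
    "File shares (22:30)": (DAY.replace(hour=22, minute=30), timedelta(hours=1)),
    "VM backups (18:00)": (DAY.replace(hour=18), timedelta(hours=3)),
    "System state (05:00)": (DAY.replace(hour=5), timedelta(hours=1)),
    "Full backup (20:00)": (DAY.replace(hour=20), timedelta(hours=1)),
}


def overlaps(a, b):
    """True if two (start, duration) windows share any time."""
    (start_a, len_a), (start_b, len_b) = a, b
    return start_a < start_b + len_b and start_b < start_a + len_a


def find_overlaps(schedule):
    """Return every pair of job names whose windows overlap."""
    return [
        (name_a, name_b)
        for (name_a, win_a), (name_b, win_b) in combinations(schedule.items(), 2)
        if overlaps(win_a, win_b)
    ]


for pair in find_overlaps(jobs):
    print("overlap:", pair)
```

With these placeholder durations the only clash is between the 6pm VM backups and the 8pm full backup. Swapping in the real run times from the DPM job history would show whether that happens in practice.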
    Quote Originally Posted by Bruce123 View Post
    Finally, for each protected source, DPM sets up two volumes in the Storage Pool (one for the main replica and one to store any changes). Check that they are both big enough (right-click on the protected source).
    Would DPM have done that already?

    The only other oddity is that the Management section shows some (but not all) of the protected servers. It lists one server as unprotected even though that server has the protection agent installed and is in the protection group. Coincidentally or not, this is the one that fails most often, but it will eventually back up if I persist with manual retries.
