Windows Server 2016 backups are completing with the warning “There was a failure compacting the virtual hard disk on the backup location. Detailed error: A device attached to the system is not functioning.” I have this on one virtual machine and one physical machine.
In both cases, the backups target a local Windows partition with a 16KB cluster size. On the virtual machine, the backup partition is a 150GB virtual hard drive with 45GB free (30%); the drive hosting this VHD is encrypted with BitLocker. On the physical machine, the backup partition is 500GB with 198GB free (39.6%); the drive is directly encrypted with BitLocker.
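The free-space percentages quoted above are simple ratios. As a minimal sketch (the helper names `free_percent` and `volume_free_percent` are made up for illustration and are not part of any Windows Backup tooling), the same figures can be reproduced, or read live from a volume, like this:

```python
import shutil


def free_percent(free_bytes, total_bytes):
    """Return free space as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * free_bytes / total_bytes


def volume_free_percent(path):
    """Read the free-space percentage of the volume that contains `path`."""
    usage = shutil.disk_usage(path)
    return free_percent(usage.free, usage.total)


if __name__ == "__main__":
    # Figures from the two machines described above (units cancel out).
    print(round(free_percent(45, 150), 1))   # 30.0
    print(round(free_percent(198, 500), 1))  # 39.6
```

On the servers themselves, `volume_free_percent("Q:\\")` would report the current figure for the backup target, assuming Q: is the drive letter in use.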
On the virtual machine, changing the backup VHD type from dynamic to fixed did not help. But if I simply wipe the VHD and recreate the backup job, the backups will succeed for a while.
On the physical machine, I created an empty 400GB partition as a temporary backup target. Backups to that partition complete without issue.
Fragmentation as the Root Cause?
In this thread on September 3, 2015, Darren Blanchard commented, “I’ve seen this what I typically do is wipe the existing backups, because the windows backups are essentially large vhd files I’ve heard theories that if the file gets to fragmented or low on disk space you will have this problem.” When I checked the fragmentation with Sysinternals’ Contig, sure enough, the T: temporary target had very few fragments but the original Q: target was heavily fragmented.
My hunch is that the compaction (which happens after the backup) somehow needs to move large chunks of data around, and there simply isn’t enough contiguous free space to allow it. I tried using Contig to defragment the big VHDX files, but I only got the message, “<uuid>.vhdx is as contiguous as possible.” I even installed MyDefrag and ran it against the Q: drive. MyDefrag did manage to combine some of the fragments, but the backup still failed, so apparently there still wasn’t enough contiguous free space.
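To make the hunch concrete, here is a toy model (not how Windows Backup or VHD compaction actually allocates space, which I don’t know) showing that two volumes with the same amount of free space can differ greatly in their largest contiguous free run. The cluster-map strings and the `summarize` helper are invented for illustration: `#` is a used cluster and `.` is a free one.

```python
from itertools import groupby


def free_runs(cluster_map):
    """Return the length of each run of consecutive free ('.') clusters."""
    return [
        len(list(group))
        for is_free, group in groupby(cluster_map, key=lambda c: c == ".")
        if is_free
    ]


def summarize(cluster_map):
    """Summarize total free space versus the largest contiguous free run."""
    runs = free_runs(cluster_map)
    return {
        "total_free": sum(runs),
        "largest_run": max(runs, default=0),
        "free_fragments": len(runs),
    }


if __name__ == "__main__":
    fragmented = "#.#.#.#.#."    # 50% free, but scattered
    consolidated = "#####....."  # 50% free, in one piece
    print(summarize(fragmented))    # {'total_free': 5, 'largest_run': 1, 'free_fragments': 5}
    print(summarize(consolidated))  # {'total_free': 5, 'largest_run': 5, 'free_fragments': 1}
```

If compaction needs a single large free extent, the first layout would fail even though half the volume is free, which would be consistent with a fresh partition working and a well-used one not.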
I opened a Microsoft support case and they spent two long sessions on the physical machine trying various things, including a clean system boot (stop all non-Microsoft services). The issue still persists. They are willing to continue working on it, but since I have a workaround (use a new partition), I’m not sure it’s worth it. For now, I’ve re-formatted the partitions; I’m trying the new ReFS file system (with 4KB clusters) to see if that makes a difference. (Here’s a worthwhile Microsoft article on cluster sizes and backups. There’s no mention of the old rule that the minimum recommended cluster size for a volume containing shadow copies was 16KB; see my related post from 2010.)
It’s too bad that Microsoft’s core backup product is so flaky about disk space management. Even with an entire partition dedicated to Windows Backup, and even with almost 40% free disk space, it is still unable to complete backups until a sysadmin manually frees up space.
Update December 6, 2019
On the virtual machine, I gave up on Windows Backup and switched to Veeam.
On the physical machine, backups to a ReFS partition dedicated to Windows Backup have been working.
Now seeing this about six weeks after a new install of Essentials 2016 on bare metal. Of six backup sources, only drive D: fails to compact. It failed initially when targeting a shared NTFS partition. Re-targeting the backup to an external NTFS drive with over 50% free still didn’t solve it. Re-targeting to a ReFS drive dedicated to Windows Backup, deleting old backups first, seems to have solved it for now. Since the issue occurs only on D: with different targets, I’m wondering if it’s really an issue with the source drive even though the message says it’s an issue compacting on the target drive. Or maybe it’s just because the backup of D: is so big (741GB) and the server isn’t powerful enough to do the compaction within whatever timeout it uses. Fortunately this machine is already using Veeam for its primary backup.
Update January 11, 2020
On the physical machine with the ReFS backup partition, I started getting “There was a failure while compacting…” this week. I renamed the target folder to WindowsImageBackup.old. The next backup re-created the WindowsImageBackup folder and ran successfully. Two more backups to the new folder have completed without errors. I still don’t understand the cause of this error, but deleting or renaming the WindowsImageBackup folder and letting Windows Backup create a new folder on a ReFS partition seems to solve it.
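The rename workaround could be scripted along these lines. This is only a sketch: the function name `set_aside_backup_folder` is made up, the timestamp suffix for an already existing `.old` folder is my own assumption, and it presumes no backup is running and that the script has permission to rename the folder.

```python
from datetime import datetime
from pathlib import Path


def set_aside_backup_folder(target_root, folder_name="WindowsImageBackup"):
    """Rename <target_root>/WindowsImageBackup so the next backup starts fresh.

    Returns the new path, or None if there was no folder to rename.
    """
    current = Path(target_root) / folder_name
    if not current.is_dir():
        return None

    renamed = current.with_name(folder_name + ".old")
    if renamed.exists():
        # Assumption: keep earlier ".old" copies by adding a timestamp
        # instead of overwriting them.
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
        renamed = current.with_name(f"{folder_name}.old-{stamp}")

    current.rename(renamed)
    return renamed


# Example (hypothetical drive letter for the backup target):
# set_aside_backup_folder("Q:\\")
```

The old folder still occupies space on the target, so it has to be deleted by hand once the new backups are known to be good.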