Snapshot/VSS common issues

Issue

Backups regularly fail with a snapshot (VSS) error.

Cause

Snapshot issues are probably the most common cause of backup failures. Common causes of snapshot issues are:

  1. Issues with the VSS (Volume Shadow Copy Service) subsystem: these can be identified by checking the output of vssadmin list writers for writers that are not in a stable state or that report an error.
  2. Overloaded disk/storage performance on the machine: Windows events that log VSS timeouts are often due to an overloaded disk subsystem. Machines with too little free RAM can experience significant swapping, which in turn overloads the disks.
  3. An underlying disk/storage fault: failing storage hardware can often cascade problems into upper levels such as the VSS subsystem. Incorrectly configured VSS hardware providers can also cause problems with VSS-aware backup software.

Resolution

  1. Check the storage system for any faults: predictive failures, bad blocks, failed drives, etc.
  2. Check that the storage system has current firmware and drivers.
  3. Check that the storage system is not overloaded and has enough IOPS and bandwidth to support its workloads.
  4. Check that the VSS writers are in a stable state, and restart the services for any writers that are shown in a failed state.
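
The writer check in step 4 can be scripted. The following is a minimal Python sketch, not a definitive tool. It assumes the usual layout of vssadmin list writers output (a "Writer name:" line followed by indented "State:" and "Last error:" lines). The sample text, the writer named 'Example Writer', and the function names are illustrative assumptions, not taken from a real machine. Other states can appear while a backup is running, so a non-stable state does not always mean a fault.

```python
import re
import subprocess

# Illustrative sample in the shape vssadmin usually prints.
# The IDs and 'Example Writer' are made up for this sketch.
SAMPLE_OUTPUT = """\
Writer name: 'System Writer'
   Writer Id: {00000000-0000-0000-0000-000000000001}
   State: [1] Stable
   Last error: No error

Writer name: 'Example Writer'
   Writer Id: {00000000-0000-0000-0000-000000000002}
   State: [7] Failed
   Last error: Timed out
"""


def parse_writers(text):
    """Parse 'vssadmin list writers' text into a list of dicts."""
    writers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"Writer name:\s*'(.*)'$", line)
        if m:
            # A new writer block starts here.
            current = {"name": m.group(1), "state": None, "last_error": None}
            writers.append(current)
            continue
        if current is None:
            continue  # skip banner lines before the first writer
        m = re.match(r"State:\s*\[\d+\]\s*(.*)$", line)
        if m:
            current["state"] = m.group(1).strip()
            continue
        m = re.match(r"Last error:\s*(.*)$", line)
        if m:
            current["last_error"] = m.group(1).strip()
    return writers


def unhealthy_writers(writers):
    """Return writers that are not Stable or that report an error."""
    return [
        w for w in writers
        if w["state"] != "Stable" or w["last_error"] != "No error"
    ]


def list_writers_live():
    """Run vssadmin on a Windows machine (needs an elevated prompt)."""
    result = subprocess.run(
        ["vssadmin", "list", "writers"],
        capture_output=True, text=True, check=True,
    )
    return parse_writers(result.stdout)


# Demonstrate on the sample text so the sketch runs on any platform.
for w in unhealthy_writers(parse_writers(SAMPLE_OUTPUT)):
    print(f"{w['name']}: state={w['state']}, last error={w['last_error']}")
```

On a Windows machine, replace parse_writers(SAMPLE_OUTPUT) with list_writers_live() to check the real writers. Which service to restart depends on the writer reported, so the sketch only lists the writers that need attention and does not restart anything itself.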