Why can't I just snapshot the database data directory directly with restic?

A running database like PostgreSQL writes files in a non-atomic way, so a file-level snapshot taken mid-write produces an inconsistent state that may fail to start on restore. The correct approach is to dump first — using pg_dump or pg_dumpall for PostgreSQL, or mysqldump for MySQL — and then let restic back up the resulting dump file. The dump is the actual backup; restic is the transport and archive layer.

What does the 3-2-1 rule require, and how does it map to a typical VPS setup?

The 3-2-1 rule requires three copies of your data, stored on two different media or systems, with one copy kept off-site in a different building, provider, or region. For a VPS fleet, this means the live production data on the server counts as copy one, a nightly restic snapshot in object storage at a different provider is copy two, and a second restic copy synced to a disk at home or the office is copy three. Off-site means a different failure domain — not just a different folder on the same account.

How does the setup verify that backups are actually intact, not just present?

The nightly script runs restic check with the --read-data-subset flag, which downloads and cryptographically verifies a random 5 percent of repository data every night. This ensures silent corruption is caught within weeks rather than discovered only at restore time. An Uptime Kuma heartbeat at the end of the script inverts the alerting model: instead of notifying on failure, it pages when the success signal stops arriving, which also catches cases where cron itself has died.

How does deduplication affect the storage cost of keeping many months of snapshots?

Content-defined deduplication means unchanged data blocks are stored only once across all snapshots, so the marginal cost of adding more retention history is nearly zero. The post notes that months of nightly snapshots of a 40 GB dataset fit in roughly 55 GB of storage. This makes keeping six months of history instead of one month essentially free, and is one reason the advice is to keep the history rather than prune it aggressively.

Why is heartbeat monitoring recommended over failure-alert emails for backup jobs?

Failure alert emails depend on the failing system being healthy enough to send them — if cron dies, a script crashes before the alert fires, or the server is down, no alert is sent. Heartbeat monitoring flips this: the backup job sends a success signal on completion, and you are paged only when that signal stops arriving. This approach is strictly more reliable and is trivially implemented with Uptime Kuma push monitors.

Practical 3-2-1 Backups for VPSs with Restic and Object Storage

VPS providers are reliable right up until they are not. Disks die, accounts get suspended over billing hiccups, datacenters have fires — the 2021 OVH Strasbourg fire turned a lot of single-copy backup strategies into postmortems overnight. The uncomfortable question for anyone running production workloads on rented servers is simple: if this VPS evaporated right now, how many hours of data would I lose, and how long until I am serving traffic again?

My answer to that question is the 3-2-1 rule implemented with restic and S3-compatible object storage, and it costs me only a few dollars a month across multiple servers. This post is the full setup: what the rule actually demands, the exact scripts I run nightly, how databases need special handling, what it costs, and the restore drill that separates a backup system from a feeling of safety.

What 3-2-1 Actually Requires

The rule, popularized by Backblaze and photographer Peter Krogh before them, is a minimum bar, not a gold standard:

Three copies of your data — the live production data plus two backups.

Two different storage media or systems, so one class of failure cannot take both backups.

One copy off-site — a different building, provider, or region than production.

For a VPS fleet, my translation is: the live data on the server, a nightly restic snapshot in object storage at a different provider, and a second restic copy synced to a disk at home or office. Ransomware-era refinements like 3-2-1-1-0 add an offline or immutable copy and zero verification errors — and the verification part, at least, is non-negotiable in my setup, as you will see in the script below.

Deciding What to Back Up (and What Not To)

Backing up an entire VPS image is the lazy default and it is mostly waste — the OS and packages are reproducible from your Ansible playbooks in minutes. I back up only what cannot be rebuilt:

Application state: uploaded files, generated documents, anything under /srv that users created and would notice missing.
Database dumps — as dump files, never as raw data directories (more on this below).
Configuration that drifted from automation: /etc, certificates, the crontabs and env files that somehow never made it into git.
The restic password and repository config themselves, stored separately in a password manager — an encrypted backup you cannot decrypt is a very expensive random number generator.

Never snapshot a running database's data directory with a file-level tool. PostgreSQL files copied mid-write are inconsistent and may not start, and the failure is silent until restore day. Dump first — pg_dump or pg_dumpall for Postgres, mysqldump or mariadb-dump for MySQL — then let restic back up the dump. The dump is the backup; restic is the transport and archive.
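One gap in a plain "dump, then snapshot" sequence is a dump that dies halfway and leaves a truncated file for restic to archive. The sketch below is one way to guard against that, not the script from this post: the helper name dump_atomically is hypothetical. It writes to a temporary file, tests the gzip archive, and only then replaces the previous dump.

```shell
#!/usr/bin/env bash
set -euo pipefail   # pipefail makes a failing dump fail the whole pipeline

# dump_atomically DEST CMD [ARGS...]
# Runs CMD, gzips its stdout into a temp file next to DEST, tests the archive,
# and only then renames it over DEST. A failed dump keeps the old file intact.
dump_atomically() {
  local dest=$1; shift
  local tmp
  tmp=$(mktemp "${dest}.XXXXXX")
  if "$@" | gzip > "$tmp" && gzip -t "$tmp"; then
    mv -f "$tmp" "$dest"
  else
    rm -f "$tmp"
    echo "dump failed, keeping previous ${dest}" >&2
    return 1
  fi
}
```

In the nightly job this would be called as dump_atomically /var/backups/pg/all.sql.gz pg_dumpall -U postgres, so restic only ever sees a complete dump.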

The Restic Setup: Encrypted, Deduplicated, Cheap

Restic is my tool of choice because it checks every box at once: client-side encryption by default, content-defined deduplication that keeps incremental snapshots tiny, and native support for S3-compatible backends — AWS S3, Google Cloud Storage, Backblaze B2, MinIO, Wasabi — plus SFTP and rclone for everything else. Setup is a one-time init:

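# one-time setup: encrypted repo on S3-compatible storage
# (for Google Cloud Storage's S3-compatible endpoint, the key pair below must be HMAC interoperability keys)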
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export RESTIC_REPOSITORY=s3:https://storage.googleapis.com/my-backup-bucket
export RESTIC_PASSWORD_FILE=/root/.restic-password

restic init
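If the setup is driven by automation that may run more than once, a guarded init avoids an error on the second run. This is a small sketch under the assumption that the environment variables above are already exported; ensure_repo is a hypothetical helper name, and it relies on restic cat config succeeding only for an existing, readable repository.

```shell
# ensure_repo: initialise the repository only when it does not exist yet.
# `restic cat config` succeeds only if the repo is reachable and the password is correct.
ensure_repo() {
  if restic cat config > /dev/null 2>&1; then
    echo "repository already initialised"
  else
    restic init
  fi
}
```

If the check fails for another reason, such as a wrong password, restic init refuses to overwrite an existing repository, so the fallback should be harmless.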

The nightly job is where the actual strategy lives. Mine is a short shell script driven by a systemd timer, and every line earns its place:

#!/usr/bin/env bash
# /usr/local/bin/backup.sh — runs nightly via systemd timer
set -euo pipefail

# 1. dump databases to files restic can snapshot
pg_dumpall -U postgres | gzip > /var/backups/pg/all.sql.gz

# 2. snapshot app data + dumps (deduplicated, encrypted)
restic backup /var/backups/pg /srv/app/uploads /etc \
  --exclude="*.tmp" --tag nightly

# 3. enforce retention, prune unreferenced data
restic forget --keep-daily 7 --keep-weekly 4 \
  --keep-monthly 6 --prune

# 4. verify the integrity of a random 5% sample of repository data
restic check --read-data-subset=5%

# 5. heartbeat to Uptime Kuma — silence means broken
curl -fsS https://status.example.com/api/push/abc123 > /dev/null

Steps four and five are the ones most setups skip. The check with read-data-subset downloads and cryptographically verifies a random 5 percent of repository data every night, so silent corruption gets caught within weeks, not at restore time. The Uptime Kuma heartbeat at the end inverts the alerting: I do not get notified when backups succeed, I get paged when the success signal stops arriving — which also catches the failure mode where cron itself died.
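A random sample makes coverage probabilistic: assuming each night's 5 percent is drawn independently, a given pack file has roughly a 78 percent chance of having been read after 30 nights. restic check also accepts a deterministic n/t form of --read-data-subset, which reads group n out of t. The sketch below is an alternative, not the setup described above: the helper name subset_for_day and the choice of 20 groups are assumptions. It rotates through the groups by day of year, so every pack is read once per 20 nights.

```shell
# subset_for_day DAY_OF_YEAR [GROUPS]
# Maps a day number (e.g. from `date +%j`, zero-padded) onto restic's "n/t"
# subset syntax, cycling through the t groups one night at a time.
subset_for_day() {
  local day=$((10#$1)) groups=${2:-20}   # 10# forces base 10 for values like "089"
  echo "$(( (day - 1) % groups + 1 ))/${groups}"
}

# In the nightly script this could stand in for the random sample:
#   restic check --read-data-subset="$(subset_for_day "$(date +%j)")"
```

The rotation restarts at the turn of the year, so a few groups may wait slightly longer than 20 nights around January 1.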

Mapping the Copies to Real Infrastructure

Here is how the three copies land across providers for a typical client setup, with monthly cost for roughly 50 GB of deduplicated backup data:

Copy	Where it lives	Monthly cost (approx.)
Copy 1 (live)	The production VPS itself: PostgreSQL, uploads, configs.	Included in the server you already pay for.
Copy 2 (off-site, different medium)	Restic repository in object storage at a different provider and region than the VPS.	About $0.25 for 50 GB at typical object-storage pricing of around half a US cent per GB per month; it only reaches a few dollars across several servers.
Copy 3 (second medium, second location)	restic copy of the same repo to a disk at the office, synced weekly.	Hardware you own; effectively free after purchase.
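For the third copy, the weekly sync can be sketched with restic's copy command. This assumes a recent restic release that supports --from-repo (older releases used --repo2), and the destination path is a hypothetical mount point. The destination password comes from RESTIC_PASSWORD_FILE as usual, and the source password from RESTIC_FROM_PASSWORD_FILE.

```shell
# Illustrative locations; both can be overridden from the environment.
SRC_REPO=${SRC_REPO:-s3:https://storage.googleapis.com/my-backup-bucket}  # repo from the setup section
DST_REPO=${DST_REPO:-/mnt/office-disk/restic}                             # hypothetical disk mount

# One-time: create the second repo with the same chunker parameters,
# so copied snapshots deduplicate against each other as they do in the source.
init_second_copy() {
  restic -r "$DST_REPO" init --from-repo "$SRC_REPO" --copy-chunker-params
}

# Weekly: copy snapshots the destination does not have yet.
sync_second_copy() {
  restic -r "$DST_REPO" copy --from-repo "$SRC_REPO"
}
```

Running sync_second_copy from the office machine rather than from the VPS keeps the second repository's credentials off the production server.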

Deduplication changes the economics more than people expect: my repositories hold months of nightly snapshots of a 40 GB dataset in roughly 55 GB of storage, because unchanged blocks are stored once. The marginal cost of keeping six months of history instead of one month is nearly zero — so keep the history.
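The arithmetic is easy to check. The sketch below uses the figures from this post (55 GB stored, 40 GB dataset, about half a US cent per GB per month). The comparison case is hypothetical: it assumes every snapshot kept by the retention policy (at most 7 + 4 + 6 = 17) were a full copy. The function names are illustrative.

```shell
# monthly_cost GB PRICE_PER_GB -> storage cost per month in USD
monthly_cost() {
  awk -v gb="$1" -v price="$2" 'BEGIN { printf "%.3f\n", gb * price }'
}

# naive_size DATASET_GB SNAPSHOTS -> GB needed if every snapshot were a full copy
naive_size() {
  echo $(( $1 * $2 ))
}

monthly_cost 55 0.005                        # deduplicated repository: 0.275
naive_size 40 17                             # 17 full copies: 680
monthly_cost "$(naive_size 40 17)" 0.005     # the same history without dedup: 3.400
```

Request and egress fees are not modelled here; they vary by provider and mostly matter on restore day.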

The Restore Drill: Where Backup Systems Become Real

An untested backup is a hypothesis. Twice a year, per server, I run the full drill against a throwaway VPS and time it:

Provision a clean VPS from the base Ansible playbook — this also re-validates that provisioning still works from nothing.
Install restic, restore the latest snapshot with restic restore latest --target pointing at an empty directory, and load the database dump into a fresh PostgreSQL.
Start the application stack against the restored data and run the same smoke tests the deploy pipeline uses.
Record total wall time and the snapshot timestamp gap. Those two numbers are your real recovery time and recovery point — put them in the runbook, not in your optimism.

Write the restore commands into the same repo as the backup script, as a runnable restore.sh. During an actual incident you will be stressed and possibly not the person doing the restore. A script that worked at the last drill beats documentation every time.
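As a starting point, a restore.sh might look like the sketch below. It is an illustration under stated assumptions, not the script I run: the function name restore_drill is hypothetical, the dump path mirrors the backup script above, and restic and PostgreSQL credentials are assumed to be configured already. restic restores files under their original absolute paths inside the target directory.

```shell
#!/usr/bin/env bash
# restore.sh: sketch of the restore half of the drill
set -euo pipefail

# restore_drill TARGET_DIR
# Restores the latest snapshot into TARGET_DIR, loads the pg_dumpall output
# into PostgreSQL, and prints the elapsed wall time in seconds.
restore_drill() {
  local target=$1 start=$SECONDS
  restic restore latest --target "$target"
  gunzip -c "$target/var/backups/pg/all.sql.gz" | psql -U postgres -d postgres -q
  echo "restore took $(( SECONDS - start ))s"
}
```

Calling restore_drill /restore on the throwaway VPS gives the wall-time number for the runbook. The smoke tests and the snapshot timestamp gap still need to be recorded separately.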

The Four Mistakes I See Most

Backups stored at the same provider as production. One suspended account or regional outage takes both. Off-site means a different failure domain, not a different folder.
No retention policy. Disk fills or bills grow until someone deletes backups by hand, usually the oldest ones, usually right before they are needed. The forget/prune line is policy as code.
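Credentials with delete permission on the backup bucket sitting on the production server. A compromised VPS can then destroy its own backups, so use append-only credentials or bucket-level object lock where the provider supports it. The trade-off is that forget --prune needs delete rights, so with locked-down credentials the retention step has to run from a separate, trusted machine.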
Alerting on failure emails. Failure alerts depend on the failing system being healthy enough to send them. Heartbeat monitoring — alert on absence of success — is strictly more reliable and trivial with Uptime Kuma push monitors.
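The heartbeat pattern from the last point can also be expressed as a small wrapper around any job. This is a sketch with assumptions: run_with_heartbeat is a hypothetical helper, the push URL is the placeholder used earlier in this post, and the status and msg query parameters follow the format Uptime Kuma displays for push monitors.

```shell
# Placeholder push URL; Uptime Kuma shows the real one when a push monitor is created.
PUSH_URL=${PUSH_URL:-https://status.example.com/api/push/abc123}

# run_with_heartbeat CMD [ARGS...]
# Runs the job and pings the monitor only if it exited successfully.
# On failure nothing is sent, so the monitor alerts once its interval expires.
run_with_heartbeat() {
  if "$@"; then
    curl -fsS --retry 3 --max-time 10 "${PUSH_URL}?status=up&msg=OK" > /dev/null
  else
    local rc=$?
    echo "job failed with exit code ${rc}; no heartbeat sent" >&2
    return "$rc"
  fi
}
```

Used as run_with_heartbeat /usr/local/bin/backup.sh, this would replace the curl line at the end of the nightly script rather than add a second ping.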

The Takeaway

A real 3-2-1 setup for a VPS is one evening of work: restic init against an object storage bucket, a twenty-line nightly script with dump, snapshot, retention, verification, and heartbeat, plus a second copy somewhere you physically control. The cost rounds to a coffee per month, and the payoff is that the worst infrastructure day of your year — provider fire, ransomware, fat-fingered rm — becomes a documented, rehearsed, two-hour restore instead of a career event. Back up like the disk is already failing, because somewhere in your fleet, it is.
