Of course I hope I never need a backup, but just in case...
This is probably the most boring topic for everyone. Backups are meant never to be used, they are a bother to set up, and there are lots of things to think about. That last reason in particular is why I'd better document how I set mine up.
The purpose of making a backup
I would like to protect myself against these cases:
- Hardware failure, e.g. a disk crash. That has not happened to me yet.
- Human error: when I delete something by accident, or make a mistake while editing something.
Currently I do not protect against a site failure, where all computers are no longer available.
Principle
The basic principle is that all valuable data should be stored on two disks, preferably on two computers.
Laptops
The most precious data is stored on a few laptops.
They're all running Linux, so a simple daily rsync of /home
to my home server suffices to make sure that the data is at least available
on two disks.
In the past I also had Windows computers, where automating this daily
copy was more bothersome. Probably due to my lack of skills.
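For the Linux laptops, the daily copy could be as small as the following sketch. The host name homeserver, the destination path /data/backup/current/laptops, and the function name backup_home are placeholders chosen for illustration, not the actual names in my setup.

```shell
#!/bin/bash
# Sketch: copy /home to a per-machine directory on the home server.
# "homeserver" and the destination path are placeholders (assumptions).
backup_home() {
    local server="${1:-homeserver}"
    local dest="${2:-/data/backup/current/laptops}"
    # -a keeps permissions and timestamps, -x stays on one file system.
    # The trailing slash on /home/ copies its contents, not the directory itself.
    rsync -ax /home/ "${server}:${dest}/$(uname -n)/"
}
```

Calling backup_home from a daily cron job is then enough. The sketch assumes that the laptop can log in to the server over SSH without a password prompt.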
Home server
My home server has multiple disks, all formatted with btrfs. I have set up each disk to always mount a subvolume for "normal" use, and also the whole disk, so that I can easily make snapshots separately.
The relevant part of /etc/fstab looks as follows:
LABEL=SAMSSSD500 / btrfs subvol=rootfs,auto,noatime 0 1
LABEL=SAMSSSD500 /mnt/SAMSSSD500 btrfs auto,noatime 0 0
LABEL="HGST1000" /mnt/HGST1000 btrfs auto,noatime 0 0
LABEL="HGST1000" /data btrfs subvol=data,auto,noatime 0 0
And mounted it looks like this:
# ls /mnt/SAMSSSD500
rootfs snaps
# ls /mnt/HGST1000
data snaps
So there are two disks:
- labeled 'SAMSSSD500', with a subvolume 'rootfs', used as the Linux root file system.
- labeled 'HGST1000', with a subvolume 'data', under which there are directories for backup.
Both disks have a directory 'snaps' under the root.
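A layout like this could be created roughly as follows, once the whole disk is mounted. This is a sketch and not a record of the commands I originally ran; the function name make_layout is made up for the example.

```shell
#!/bin/bash
# Sketch: create a working subvolume plus a 'snaps' directory on a
# btrfs file system that is already mounted at the given mount point.
make_layout() {
    local mountpoint="$1" subvol="$2"
    # The subvolume is what fstab later mounts with subvol=<name>.
    btrfs subvolume create "${mountpoint}/${subvol}" || return 1
    # Snapshots live next to it, outside the subvolume itself.
    mkdir -p "${mountpoint}/snaps"
}
# Example (as root): make_layout /mnt/HGST1000 data
```

Keeping 'snaps' outside the working subvolume means that a snapshot of 'data' never contains older snapshots.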
Safeguarding the home server's root disk
This can again be done with a simple rsync script:
#!/bin/bash
# Back up selected root-disk directories to the data disk with rsync.
SEMAPHORE_FILE="/tmp/backup_running.sema"
MAX_TIME=3600   # seconds after which a semaphore file counts as stale
echo "Start of backup"
# Check for stale semaphore file
if [ -f "${SEMAPHORE_FILE}" ]; then
    read -r SEM_TIME < "${SEMAPHORE_FILE}"
    RUNNING_TIME=$(( $(date "+%s") - SEM_TIME ))
    if [[ ${RUNNING_TIME} -gt ${MAX_TIME} ]]; then
        echo "Removing stale semaphore file ${SEMAPHORE_FILE}."
        rm "${SEMAPHORE_FILE}"
    else
        echo "Other backup still running, ${RUNNING_TIME} seconds, leaving semaphore file ${SEMAPHORE_FILE} in place."
    fi
fi
# Make an rsync copy when there is no other rsync running
if [ ! -f "${SEMAPHORE_FILE}" ]; then
    # Record the start time so that a later run can detect a stale file
    date "+%s" > "${SEMAPHORE_FILE}"
    for d in boot etc home var
    do
        /usr/bin/rsync -avx --exclude-from="${HOME}/bin/backup-exclude.lst" --delete-excluded --delete "/${d}" /data/backup/current/hf/local
    done
    rm "${SEMAPHORE_FILE}"
fi
echo "Backup done"
The script protects against running multiple times at once by using a semaphore file.
It also cleans up stale semaphore files.
The main function is of course to rsync from one disk to the other, and then
only the relevant directories /boot, /etc, /home, and /var.
It uses an 'exclude file' so that not too much is copied.
Sample entries of the exclude file are:
home/hf/.cache
home/hf/.config/chromium
home/hf/.kodi/userdata/Thumbnails
home/hf/Downloads
home/hf/downloads
var/cache/
var/db/
var/lib/
var/log/
var/spool/
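Coming back to the semaphore handling in the script: the stale check can be isolated in a small function, which makes it easier to try out on its own. This is a sketch of the same idea, not the code I run; the function name semaphore_is_stale and the optional third argument (a fixed "now", handy for testing) are made up for the example.

```shell
#!/bin/bash
# Sketch: succeed (exit 0) when the start time stored in a semaphore
# file is more than max_age seconds in the past.
semaphore_is_stale() {
    local file="$1" max_age="$2" now="${3:-$(date +%s)}"
    local started=""
    # No semaphore file means there is nothing to clean up.
    [ -f "${file}" ] || return 1
    read -r started < "${file}" || true
    # Treat an empty or non-numeric file as stale.
    case "${started}" in
        ''|*[!0-9]*) return 0 ;;
    esac
    [ $(( now - started )) -gt "${max_age}" ]
}
# Example:
#   semaphore_is_stale /tmp/backup_running.sema 3600 && rm /tmp/backup_running.sema
```

Treating an unreadable time stamp as stale is a design choice of the sketch: a corrupt semaphore file would otherwise block all future backups.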
Making snapshots
Just making copies on multiple disks is of course not good enough. We also need to be able to go back in time. And we need to do some housekeeping: we cannot keep copies of everything forever.
There are many utilities that provide such functionality, and they all have their own constraints and peculiarities. Many only work with a specific disk setup.
Subsnap, a small script that I developed, aims to do away with that and to be really simple to use. It does not assume any disk layout, and lets sysadmins easily configure the number of daily, weekly, monthly, or yearly backups that they want to retain. It does assume a naming convention for the backups: basename + label + date.
It can be called from a cronjob:
#Mins Hours Days Months Day-of-the-week
10 6 * * * /root/bin/snapfs /mnt/SAMSSSD500/rootfs /mnt/SAMSSSD500/snaps daily 8
#11 6 * * 1 /root/bin/snapfs /mnt/SAMSSSD500/rootfs /mnt/SAMSSSD500/snaps weekly 14
#12 6 1 * * /root/bin/snapfs /mnt/SAMSSSD500/rootfs /mnt/SAMSSSD500/snaps monthly 13
20 5 * * * /root/bin/snapfs /mnt/HGST1000/data /mnt/HGST1000/snaps daily 15
21 5 * * 1 /root/bin/snapfs /mnt/HGST1000/data /mnt/HGST1000/snaps weekly 14
22 5 1 * * /root/bin/snapfs /mnt/HGST1000/data /mnt/HGST1000/snaps monthly 13
23 5 1 1 * /root/bin/snapfs /mnt/HGST1000/data /mnt/HGST1000/snaps yearly 5
The top entry specifies that every day at 6:10 AM a snapshot labeled 'daily'
is made from /mnt/SAMSSSD500/rootfs, which will be stored in
/mnt/SAMSSSD500/snaps, and that it retains just eight copies.
For today the result would be a snapshot at
/mnt/SAMSSSD500/snaps/rootfs-daily-2023-02-02T06:10+01:00.
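The naming convention (basename + label + date) and the snapshot itself can be sketched as follows. This is an illustration and not the Subsnap source: the function names are made up, the date format relies on the -I option of GNU date, and making the snapshot read-only with -r is an assumption of the sketch.

```shell
#!/bin/bash
# Sketch: build a snapshot name as basename + label + date.
snapshot_name() {
    local src="$1" label="$2"
    # date -Iminutes prints e.g. 2023-02-02T06:10+01:00
    local stamp="${3:-$(date -Iminutes)}"
    printf '%s-%s-%s\n' "$(basename "${src}")" "${label}" "${stamp}"
}

# Sketch: create a read-only btrfs snapshot under that name.
make_snapshot() {
    local src="$1" target_dir="$2" label="$3"
    # -r makes the snapshot read-only (an assumption of this sketch).
    btrfs subvolume snapshot -r "${src}" \
        "${target_dir}/$(snapshot_name "${src}" "${label}")"
}
# Example (as root):
#   make_snapshot /mnt/SAMSSSD500/rootfs /mnt/SAMSSSD500/snaps daily
```

Because the date is part of the name, the snapshots of one label sort chronologically in a plain directory listing.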
I currently do not see the need to retain weekly or monthly copies of the
root file system, because they are kept on the backup disk already.
For the data / backup disk I retain 15 daily (two weeks' worth plus one), 14 weekly (three months plus one), 13 monthly (one year plus one), and five yearly copies.
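The retention step boils down to "keep the newest N, drop the rest". A minimal sketch of that selection, assuming the naming convention above and again not taken from the Subsnap source (the function name snapshots_to_prune is made up):

```shell
#!/bin/bash
# Sketch: read snapshot names on stdin and print the ones that fall
# outside the newest <keep>; these are the candidates for deletion.
snapshots_to_prune() {
    local keep="$1"
    # For names that share basename and label, the date suffix sorts
    # chronologically as plain text, so a reverse sort puts the newest first.
    sort -r | tail -n "+$(( keep + 1 ))"
}
# Example (as root), removing all but the eight newest daily snapshots:
#   ls -d /mnt/SAMSSSD500/snaps/rootfs-daily-* | snapshots_to_prune 8 |
#       xargs -r -n 1 btrfs subvolume delete
```

Printing the candidates first, instead of deleting them right away, makes it easy to check the list by eye before piping it into the delete command.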