I recently moved my personal server over to a VPS and decided to automate and refine the process with Ansible. The end experience turned out much better than I thought it would.
In this post I won't go into the details of setting up the entire Ansible project or running the plays. I'm mainly going to show the core plays/roles I used to achieve a really seamless backup and restore experience.
This post should be treated as high-level inspiration, while also containing snippets you can plug into your own Ansible repo.
Each of my services is deployed as a Docker container with a mounted volume. The server is set up with Ansible. The Restic role creates each service's volume and, if the volume didn't already exist, attempts to restore it from a snapshot.
Each service then mounts its volume. Airflow is set up to mount all the volumes and back them up periodically by sending a snapshot to S3 via Restic.
I'll start off by showing my play, so you can get a high-level idea of what is going to happen and what the end result is. I'll then dive into the details of each role relevant to the backup/restore functionality. Finally, I'll show the role for my Ghost service as an example of something using a volume that gets backed up.
In each section I'll show you a play/role and its dependent files (defaults, templates, etc.) and briefly explain the important bits.
There are a bunch of Traefik tags on each container that I won't get into, as they'd be a bit distracting here.
Server Setup Play
I won't dive into the first two roles. I may write another post about installing Docker, as it's a bit tricky, but roles/docker is pretty generic and there's plenty of info online about it.
The roles/common role is tiny and just installs some packages I want on the machine like curl.
After that, you can see the roles/restic role run to set up and restore the volumes; Postgres starts up using its volume, Airflow mounts both volumes in order to back them up, and Ghost starts up with its volume mounted.
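As a rough sketch, the play is structured along these lines (host group and role names here follow the ones discussed above, so adjust them to your own inventory):

```yaml
# site.yml — run the roles in dependency order:
# common packages, then Docker, then Restic (volume create/restore),
# then the services that mount those volumes.
- hosts: server
  roles:
    - common
    - docker
    - restic
    - postgres
    - airflow
    - ghost
```

The key ordering constraint is that roles/restic runs before any service role, since the services expect their volumes to already exist.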
This role handles our Docker volume creation (meaning it needs to run first) and restoration.
In order, it:
1. Collects info on the Docker volumes listed in restore_volumes
2. Loops over the volumes and creates any that don't exist
3. Loops over the volumes again and, for each one that didn't exist, restores data into it from the latest snapshot
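The steps above could be sketched as tasks like the following (module choices, the restore_volumes variable, and the restore command are assumptions, not my exact role):

```yaml
# Hypothetical roles/restic/tasks/main.yml sketch
- name: Gather info on managed volumes
  community.docker.docker_volume_info:
    name: "{{ item }}"
  loop: "{{ restore_volumes }}"
  register: volume_info

- name: Create any volumes that don't exist
  community.docker.docker_volume:
    volume_name: "{{ item.item }}"
  loop: "{{ volume_info.results }}"
  when: not item.exists

- name: Restore the latest snapshot into newly created volumes
  ansible.builtin.command: >
    docker run --rm
    -v {{ item.item }}:/restore
    -e RESTIC_REPOSITORY -e RESTIC_PASSWORD
    -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY
    restic/restic restore latest --target /restore
  loop: "{{ volume_info.results }}"
  when: not item.exists
```

Registering the volume-info results up front is what lets the restore step distinguish "volume already existed" from "volume was just created and needs data".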
One thing to keep in mind here is that I've chosen to ingest secrets from my local environment, so this role relies on the following environment variables at runtime:
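With Restic's S3 backend, these are typically the repository location, the repository password, and the AWS credentials. The values below are placeholders, not real ones:

```shell
# Where the Restic repository lives (an S3 bucket here) and how to unlock it.
export RESTIC_REPOSITORY="s3:s3.amazonaws.com/your-bucket"
export RESTIC_PASSWORD="your-restic-password"

# Credentials Restic uses to talk to S3.
export AWS_ACCESS_KEY_ID="your-key-id"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
```

Because Ansible inherits the environment of the shell it runs in, exporting these before running the play is enough for the role to pick them up.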
Note that this script template relies on the same environment variables I mentioned in the Restic role.
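For reference, a minimal sketch of what such a backup script might look like (the mount path and retention policy are assumptions; the script expects the Restic/AWS variables above to already be in its environment):

```bash
#!/usr/bin/env bash
# Hypothetical backup script run periodically by Airflow.
set -euo pipefail

# Snapshot every service volume mounted under /volumes into the S3 repo.
restic backup /volumes

# Keep a few weeks of history and reclaim space from expired snapshots.
restic forget --keep-weekly 4 --prune
```

Running forget with --prune after each backup keeps the repository from growing without bound.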
Adding other services!
These services will look very similar to our PG service. In fact, I've been thinking of creating a generic Docker container role for them, since they rarely need anything else.
I left the Traefik config in just for fun here, but as you can see, the service simply mounts its volume, which Ansible creates, restores, and generally manages via Restic. Anything stored there will be backed up by Airflow every week!
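The core of such a service role boils down to a single container task along these lines (the volume name and image tag are assumptions; /var/lib/ghost/content is where the official Ghost image keeps its data; Traefik labels omitted as discussed):

```yaml
# Hypothetical roles/ghost/tasks/main.yml sketch
- name: Start Ghost with its backed-up volume mounted
  community.docker.docker_container:
    name: ghost
    image: ghost:5
    restart_policy: unless-stopped
    volumes:
      # ghost_data is created (and restored, if needed) by roles/restic.
      - ghost_data:/var/lib/ghost/content
```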
All of my other services follow this same pattern, and it's super nice.