Cover photo by Grant Hussey @gthussey_art

I recently moved my personal server over to a VPS and decided to automate and refine the process with Ansible. The end experience turned out much better than I thought it would.

In this post I won't go into the details of setting up the entire Ansible project or running the plays. I'm mainly going to show the core plays/roles I used to achieve a really seamless backup and restore experience.

Treat this post as high-level inspiration that also contains snippets you can plug into your own Ansible repo.

Overview

Each of my services is deployed as a Docker container with a mounted volume. The server is set up with Ansible. If a service's volume doesn't exist, the Restic role creates it and attempts to restore it from a snapshot.

Each service then mounts its volume. Airflow is set up to mount all the volumes and back them up periodically by sending a snapshot to S3 via Restic.

I'll start off by showing my play so you can get a high-level idea of what is going to happen and what the end result is. I'll then dive into the details of each role relevant to the backup/restore functionality. Finally, I'll show the role for my Ghost service as an example of something using a volume that gets backed up.

In each section I'll show you a play/role and its dependent files (defaults, templates, etc.) and briefly explain the important bits.

Each container also has a bunch of Traefik labels that I won't get into, as they'd be a bit distracting here.

Server Setup Play

  - hosts:
      - myhost.com
    vars:
      # these assume the init role has been applied
      ansible_port: 22222
      ansible_user: zac
      # host is for traefik settings in each service to specify subdomain
      host: myhost.com
      # Both restic and airflow use this
      restic_repo: s3:https://myresticrepo.com
    roles:
      - role: roles/docker
        become: true
        docker_users:
          - zac

      # just installs helpful packages like curl
      - role: roles/common
        become: true
        
      - role: roles/restic
        become: true
        restore_volumes:
          - volume: ghost
            path: /toback/personal_server/docker/ghost
          - volume: postgres
            path: /toback/personal_server/docker/postgres
            
      - role: roles/postgres
        become: true
        volumes:
          - postgres:/var/lib/postgresql/data
        
      - role: roles/airflow
        become: true
        volumes:
          - /opt/airflow/dags:/usr/local/airflow/dags
          # everything in /toback will be backed up and restored
          - postgres:/toback/personal_server/docker/postgres
          - ghost:/toback/personal_server/docker/ghost
          
      - role: roles/ghost
        become: true
        volumes:
          - ghost:/var/lib/ghost/content
plays/personal_server/setup.yaml

I won't dive into the first two roles. I may write another post about installing Docker since it's a bit tricky, but roles/docker is pretty generic and there's plenty of info about it online.

The roles/common role is tiny and just installs some packages I want on the machine, like curl.
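For completeness, a tasks file like that can be a single apt task. Here's a rough sketch (the package list beyond curl is just an example, not my actual list):

---
# install a few quality-of-life packages; anything beyond curl is illustrative
- name: Install common packages
  apt:
    name:
      - curl
      - htop
      - vim
    state: present
    update_cache: true
roles/common/tasks/main.yml (sketch)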

After that, the roles/restic role runs to set up and restore the volumes, Postgres starts up using its volume, Airflow mounts both volumes so they can be backed up, and Ghost starts up with its volume mounted.

roles/restic

This role is what handles our docker volume creation (meaning it needs to be run first) and restoration.

---
- name: Check volume info
  docker_volume_info:
    name: "{{ item.volume }}"
  register: volume_infos
  with_items: "{{ restore_volumes }}"

- name: Setup volumes to restore
  docker_volume:
    name: "{{ item.1.volume }}"
    state: present
  when: not volume_infos.results[item.0].exists
  with_indexed_items: "{{ restore_volumes }}"

- name: Start Restic Restore
  docker_container:
    name: restic-restore
    image: "{{ image }}"
    volumes:
      - "{{ item.1.volume }}:/data{{ item.1.path }}"
    env:
      RESTIC_REPOSITORY: "{{ restic_repo }}"
      RESTIC_PASSWORD: "{{ lookup('env', 'RESTIC_PASSWORD') }}"
      AWS_ACCESS_KEY_ID: "{{ lookup('env', 'RESTIC_AWS_ACCESS_KEY_ID') }}"
      AWS_SECRET_ACCESS_KEY: "{{ lookup('env', 'RESTIC_AWS_SECRET_ACCESS_KEY') }}"
    command: restore latest --target /data --include "{{ item.1.path }}"
    detach: false
  when: not volume_infos.results[item.0].exists
  with_indexed_items: "{{ restore_volumes }}"
roles/restic/tasks/main.yml

In order, it:

  1. Collects info on each Docker volume listed in restore_volumes
  2. Loops over the volumes and creates any that don't already exist
  3. Loops over the volumes again and, for each one that didn't exist, restores data into it from the latest snapshot
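The tasks above also reference an image variable and expect restore_volumes to be passed in, so the role's defaults just need to supply those two. A sketch of what that might look like (the image name here is an assumption; any image whose entrypoint is the restic binary works, which is why the restore command above starts with restore rather than restic restore):

---
# image is an assumption; use any image whose entrypoint is the restic binary
image: restic/restic
restore_volumes: []
roles/restic/defaults/main.yml (sketch)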

One thing to keep in mind here is that I've chosen to pull secrets from my local environment, so this role relies on the following environment variables being set when the play runs (the env lookups resolve on the machine running Ansible, not on the server):

RESTIC_PASSWORD=
RESTIC_AWS_ACCESS_KEY_ID=
RESTIC_AWS_SECRET_ACCESS_KEY=

If you ever want to restore a service to its latest snapshot, just remove the corresponding Docker volume and re-run!
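That can be done by hand or with a quick one-off play like the sketch below (the file name is made up, and the ghost volume is just the example). Note that the container mounting the volume has to be removed first, or Docker will refuse to delete the volume.

---
# one-off play to force a restore of the ghost volume on the next setup run
- hosts:
    - myhost.com
  vars:
    ansible_port: 22222
    ansible_user: zac
  become: true
  tasks:
    - name: Remove the Ghost container so its volume can be deleted
      docker_container:
        name: ghost
        state: absent

    - name: Remove the ghost volume to trigger a restore on the next run
      docker_volume:
        name: ghost
        state: absent
plays/personal_server/force-restore.yaml (hypothetical)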

roles/postgres

Airflow relies on a running SQL database (and I use Postgres for other services like Nextcloud), so it comes next. It's also one of the simpler roles.

---
- name: Ensure volume exists
  docker_volume:
    name: postgres
    state: present

- name: Start Postgres
  docker_container:
    name: "{{ container_name }}"
    image: "{{ image }}"
    restart_policy: unless-stopped
    volumes: "{{ volumes }}"
    networks_cli_compatible: true
    networks: "{{ networks }}"
    ports: "{{ ports }}"
    env: "{{ env }}"
    labels: "{{ labels }}"
roles/postgres/tasks/main.yml
---
image: postgres:12
container_name: "postgres"
volumes: []
ports: []
networks:
  - name: main-net
env:
  POSTGRES_PASSWORD: docker
host: REPLACE.ME
labels: {}
roles/postgres/defaults/main.yml
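Since everything joins the shared main-net network, other services can reach the database simply by using the container name as the hostname. For example, a service that needs Postgres might get env vars along these lines (the variable names are whatever that service expects; these particular ones are made up for illustration):

env:
  DB_HOST: postgres      # Docker's embedded DNS resolves the container name on main-net
  DB_PORT: "5432"
  DB_USER: postgres
  DB_PASSWORD: docker    # matches POSTGRES_PASSWORD in the defaults above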

roles/airflow

Airflow might be overkill here, but I like the UI and ability to explore previous runs and view logs.

---
- name: Ensure volume exists
  docker_volume:
    name: airflow
    state: present

- name: Creates directories for config
  file:
    path: /opt/airflow/dags/scripts
    state: directory
    mode: u=rwx,g=r,o=r
    recurse: yes

- name: Add restic backup dag
  template:
    src: dags/restic-backup.py.j2
    dest: /opt/airflow/dags/restic-backup.py
    mode: u=rwx,g=r,o=r

- name: Add restic backup script
  template:
    src: scripts/restic-backup.sh.j2
    dest: /opt/airflow/dags/scripts/restic-backup.sh
    mode: u=rwx,g=r,o=r

- name: Start Airflow
  docker_container:
    name: airflow
    command: webserver
    # not good, but so we can apt-get install restic in the dag
    # would love a better way to do this...
    user: root
    image: "{{ image }}"
    restart_policy: unless-stopped
    volumes: "{{ volumes }}"
    networks_cli_compatible: true
    networks: "{{ networks }}"
    ports: "{{ ports }}"
    exposed_ports: "{{ exposed_ports }}"
    env: "{{ env }}"
    labels: "{{ labels }}"
roles/airflow/tasks/main.yml

In order, it:

  1. Ensures the airflow Docker volume exists, in case this role is run on its own
  2. Makes sure our DAG directory exists on the remote machine
  3. Adds the DAG
  4. Adds the backup script
  5. Starts Airflow
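Like the other roles, the task file leans on variables (image, volumes, ports, and so on) that come from the role's defaults plus the play. The defaults look roughly like this; the image tag and exposed port are assumptions, and the important part is that the image's DAG folder matches the /usr/local/airflow/dags paths used above:

---
# image and exposed_ports are assumptions; any Airflow image works
# as long as its DAG folder matches the paths used in the tasks above
image: puckel/docker-airflow:1.10.9
volumes: []
networks:
  - name: main-net
ports: []
exposed_ports:
  - "8080"
env: {}
host: REPLACE.ME
labels: {}
roles/airflow/defaults/main.yml (sketch)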

Here are the templates. If you aren't familiar with Airflow, don't worry about the DAG details; all it does is run the bash script once a week.

from datetime import timedelta, datetime
from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.utils.dates import days_ago

default_args = {
    'owner': 'zac',
    'depends_on_past': False,
    'start_date': datetime(2020, 7, 1),
    'email': ['zac@mydomain.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}
dag = DAG(
    'restic-backup',
    default_args=default_args,
    description='Backs up home directory and docker volumes mounted to /toback in this container',
    schedule_interval='@weekly',
    catchup=False
)

t1 = BashOperator(
    task_id='install_restic',
    bash_command='apt-get update; apt-get -y install restic',
    dag=dag,
)

t2 = BashOperator(
    task_id='backup',
    depends_on_past=False,
    # https://stackoverflow.com/questions/42147514/templatenotfound-error-when-running-simple-airflow-bashoperator
    bash_command='/usr/local/airflow/dags/scripts/restic-backup.sh ',
    retries=3,
    dag=dag,
)
dag.doc_md = __doc__

t1 >> t2
roles/airflow/templates/dags/restic-backup.py.j2
AWS_ACCESS_KEY_ID="{{ lookup('env', 'RESTIC_AWS_ACCESS_KEY_ID') }}" AWS_SECRET_ACCESS_KEY="{{ lookup('env', 'RESTIC_AWS_SECRET_ACCESS_KEY') }}" RESTIC_PASSWORD="{{ lookup('env', 'RESTIC_PASSWORD') }}" restic -r {{ restic_repo }} --verbose backup /toback
roles/airflow/templates/scripts/restic-backup.sh.j2

Note that this script template relies on the same environment variables I mentioned in the Restic role.

Adding other services!

These services all look very similar to the Postgres one. In fact, I've been thinking of creating a generic Docker container role for them, since they rarely need anything more.
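If I ever get around to it, that generic role would probably look something like the sketch below, with the service name passed in as a variable (this is hypothetical, not something I'm running yet).

---
# hypothetical generic role; service_name, image, etc. would be set by the play
- name: Ensure volume exists
  docker_volume:
    name: "{{ service_name }}"
    state: present

- name: Start {{ service_name }}
  docker_container:
    name: "{{ service_name }}"
    image: "{{ image }}"
    restart_policy: unless-stopped
    volumes: "{{ volumes }}"
    networks_cli_compatible: true
    networks: "{{ networks }}"
    ports: "{{ ports }}"
    exposed_ports: "{{ exposed_ports }}"
    env: "{{ env }}"
    labels: "{{ labels }}"
roles/docker_service/tasks/main.yml (hypothetical)

For now, here's roles/ghost as it actually exists: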

---
- name: Ensure volume exists
  docker_volume:
    name: ghost
    state: present

# tasks file for ghost
- name: Start Ghost
  docker_container:
    name: ghost
    image: "{{ image }}"
    restart_policy: unless-stopped
    volumes: "{{ volumes }}"
    networks_cli_compatible: true
    networks: "{{ networks }}"
    ports: "{{ ports }}"
    exposed_ports: "{{ exposed_ports }}"
    env: "{{ env }}"
    labels: "{{ labels }}"
roles/ghost/tasks/main.yml
---
# defaults file for ghost
image: ghost:3-alpine
volumes: []
networks:
  - name: main-net
env:
  database__client: sqlite3
  url: https://{{ host }}
ports: []
exposed_ports: []
host: REPLACE.ME
labels:
  traefik.enable: "true"
  traefik.http.routers.ghost.rule: "Host(`{{ host }}`) || Host(`www.{{ host }}`)"
  traefik.http.routers.ghost.entrypoints: "websecure"
  traefik.http.routers.ghost.tls.certresolver: "myresolver"
  traefik.http.services.ghost.loadbalancer.server.port: "2368"
roles/ghost/defaults/main.yml

I left the Traefik config in just for fun here, but as you can see, the service simply mounts the volume that the Restic role created and restored. Anything stored there gets backed up to S3 by Airflow every week!

All of my other services follow this same pattern, and it's super nice.