EPIC: automated backup/restore of db and keystores

Value

be able to recover the service if it is completely borked or a server is lost/damaged
maintain confidentiality of user lists by not keeping incremental backups: if an admin deletes a channel, it will be gone from the backups within a day

(aka: "definitions of done"):

every night a cron job runs that:
- makes a copy of the signal_data volume and pg_dumps the db
- writes encrypted versions of both to the filesystem (encrypted to 2 maintainer keys)
- scps those files to a backup server
- destroys old backups on the backup server
implementation note: all files necessary to do this (crontab file, scripts, ssh keys) should live in the repo (so a user logged into prod can set things up by pulling the repo and cd'ing into the correct dir)
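
A minimal sketch of what the nightly job above might run, written as a Python function that only builds the command lines (it does not execute them). The volume path, db name, remote dir, working dir, and file naming are all hypothetical placeholders, not values from the repo. It also assumes the two maintainer pubkeys are already imported and trusted in the keyring.

```python
from datetime import date

# Hypothetical placeholders for illustration; adjust to the real deployment.
VOLUME_DIR = "/var/lib/docker/volumes/signal_data/_data"
REMOTE_DIR = "/srv/sb_backups"
SSH_KEY = "/home/sb_user/.ssh/id_sb_user"  # key path named in this issue


def backup_commands(day, recipients, backup_host,
                    db_name="signalboost", workdir="/tmp/sb_backup"):
    """Return the argv lists for one nightly backup run, in order."""
    stamp = day.isoformat()
    keystore_tar = f"{workdir}/keystore_{stamp}.tar.gz"
    db_dump = f"{workdir}/db_{stamp}.dump"

    # One --recipient flag per maintainer key.
    recipient_args = []
    for key_id in recipients:
        recipient_args += ["--recipient", key_id]

    cmds = [
        # Copy the signal_data volume into a tarball.
        ["tar", "-czf", keystore_tar, "-C", VOLUME_DIR, "."],
        # Custom-format dump, which is what pg_restore expects.
        ["pg_dump", "--format=custom", f"--file={db_dump}", db_name],
    ]
    for plain in (keystore_tar, db_dump):
        cmds.append(["gpg", "--batch", "--yes", *recipient_args,
                     "--output", f"{plain}.gpg", "--encrypt", plain])
    # Ship both encrypted files to the backup server.
    cmds.append(["scp", "-i", SSH_KEY,
                 f"{keystore_tar}.gpg", f"{db_dump}.gpg",
                 f"sb_user@{backup_host}:{REMOTE_DIR}/"])
    return cmds


def stale_backups(filenames, day):
    """Return the backup files that do not carry today's date stamp."""
    stamp = day.isoformat()
    return [name for name in filenames if stamp not in name]
```

Each list can be handed to `subprocess.run(cmd, check=True)` in sequence. `stale_backups` covers the "destroys old backups" step: anything it returns would be deleted on the backup server, so only the latest night's files survive.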

there is an ansible script that can be run with make ansible.restore that:
- reads an sb_backup host from the inventory
- runs an scp script that pulls backups from sb_backup to the (new) signalboost host using -i /home/sb_user/.ssh/id_sb_user (which assumes that the pubkey & secret key are on signalboost and that the pubkey is in authorized_keys on sb_backup)
- runs a restore job on signalboost (which must run after provision and deploy) that restores the keystore volume and runs pg_restore on the db backup
- optionally: deletes signalboost files from any borked machine (?)
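
The restore steps above could be sketched the same way, as a Python function that only builds the commands the ansible tasks would run. Everything except the ssh key path is a hypothetical placeholder (volume path, remote dir, db name, file names). Decryption also assumes a maintainer secret key is available on the host doing the restore, which this issue does not settle.

```python
# Hypothetical placeholders for illustration; adjust to the real deployment.
VOLUME_DIR = "/var/lib/docker/volumes/signal_data/_data"
REMOTE_DIR = "/srv/sb_backups"
SSH_KEY = "/home/sb_user/.ssh/id_sb_user"  # key path named in this issue


def restore_commands(stamp, backup_host,
                     db_name="signalboost", workdir="/tmp/sb_restore"):
    """Return the argv lists for pulling and restoring one dated backup."""
    keystore = f"keystore_{stamp}.tar.gz"
    db_dump = f"db_{stamp}.dump"

    cmds = []
    for name in (keystore, db_dump):
        # Pull the encrypted file from sb_backup to the new host.
        cmds.append(["scp", "-i", SSH_KEY,
                     f"sb_user@{backup_host}:{REMOTE_DIR}/{name}.gpg",
                     f"{workdir}/"])
        # Decrypt it next to the downloaded copy.
        cmds.append(["gpg", "--batch", "--yes",
                     "--output", f"{workdir}/{name}",
                     "--decrypt", f"{workdir}/{name}.gpg"])

    # Unpack the keystore into the volume, then load the db dump.
    cmds.append(["tar", "-xzf", f"{workdir}/{keystore}", "-C", VOLUME_DIR])
    cmds.append(["pg_restore", "--clean", "--if-exists",
                 f"--dbname={db_name}", f"{workdir}/{db_dump}"])
    return cmds
```

`--clean --if-exists` drops existing objects before recreating them, which suits a restore that runs after provision and deploy have already created an empty schema. Whether that is the desired behavior here is a judgment call.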

some way of putting inventory (and hence which backup and prod hosts, users, and keys are in use) under version control, without blasting away the ansible_user etc. values that any given dev might be using (maybe put inventory.tmpl.gpg under version control?)
pub key for sb_user must go on sb_backup host
sb_user and its pub/priv ssh keys must go on all prod instances
gpg keys should be imported into keyring as part of provision
cron job for running backup script must be put into all prod instances as part of provision.yml
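
For the cron requirement, a small sketch of rendering the crontab line that provision could install for sb_user. The script path and the 03:00 default are hypothetical. The format is the per-user crontab one (five time fields, then the command); a file under /etc/cron.d would need an extra user field.

```python
def crontab_entry(script_path, hour=3, minute=0):
    """Render a per-user crontab line that runs script_path once a night."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # Fields: minute hour day-of-month month day-of-week command
    return f"{minute} {hour} * * * {script_path}"


# Hypothetical script location inside the pulled repo.
print(crontab_entry("/home/sb_user/signalboost/bin/backup.sh"))
```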

Edited Jan 13, 2020 by aguestuser
