EPIC: automated backup/restore of db and keystores
Value
- be able to recover the service if it is completely borked or a server is lost/damaged
- maintain confidentiality of user lists by not keeping incremental backups: if an admin deletes a channel, it will be gone from all backups within a day
Behavior
(aka: "definitions of done"):
automated backup
- every night a cron job runs that:
  - makes a copy of the signal_data volume and pg_dumps the db
  - writes encrypted versions of both to the filesystem (encrypted to 2 maintainer keys)
  - scps those files to a backup server
  - destroys old backups on the backup server
- implementation note: all files necessary to do this (crontab file, scripts, ssh keys) should live in the repo (so a user logged into prod can set things up by pulling the repo and cd'ing into the correct dir)
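
a rough sketch of the nightly job's steps, written in Python only for illustration (the real job could just as well be a shell script). Everything not named in this issue is an assumption: the db name, the scratch dir, the volume mount path, the recipient key ids, the remote destination, and the one-day retention window (inferred from the "within a day" goal above). `stale_backups` only picks which files are old; actually deleting them on the backup server is left out.

```python
import subprocess
from datetime import datetime, timedelta


def backup_commands(
    stamp: str,
    workdir: str = "/tmp/sb_backup",  # hypothetical scratch dir
    db_name: str = "signalboost",  # hypothetical db name
    volume_dir: str = "/var/lib/docker/volumes/signal_data/_data",  # hypothetical mount
    recipients: tuple = ("maintainer1@example.org", "maintainer2@example.org"),  # placeholders
    remote: str = "sb_user@sb_backup:backups/",  # hypothetical destination
    ssh_key: str = "/home/sb_user/.ssh/id_sb_user",
) -> list:
    """Return the nightly backup steps as argv lists, in execution order."""
    db_dump = f"{workdir}/db-{stamp}.dump"
    keystore = f"{workdir}/keystore-{stamp}.tar.gz"
    recipient_args = [arg for r in recipients for arg in ("--recipient", r)]

    cmds = [
        # custom-format dump, so pg_restore can read it later
        ["pg_dump", "--format=custom", "--file", db_dump, db_name],
        # archive the contents of the keystore volume
        ["tar", "-czf", keystore, "-C", volume_dir, "."],
    ]
    # encrypt each artifact to every maintainer key
    for plain in (db_dump, keystore):
        cmds.append(["gpg", "--batch", "--yes", *recipient_args,
                     "--output", plain + ".gpg", "--encrypt", plain])
    # ship only the encrypted files to the backup server
    cmds.append(["scp", "-i", ssh_key, db_dump + ".gpg", keystore + ".gpg", remote])
    return cmds


def stale_backups(mtimes: dict, now: datetime,
                  max_age: timedelta = timedelta(days=1)) -> list:
    """Return the names (sorted) of backups older than max_age."""
    return sorted(name for name, mtime in mtimes.items() if now - mtime > max_age)


def run_all(cmds: list) -> None:
    """Run each step, stopping at the first failure."""
    for cmd in cmds:
        subprocess.run(cmd, check=True)
```

building the commands as plain lists keeps the plan inspectable without touching the db or the network; `run_all(backup_commands("2020-01-01"))` would be the part cron actually invokes.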
automated restore
- there is an ansible script that can be run with make ansible.restore that:
  - reads an sb_backup host from the inventory
  - runs an scp script that pulls backups from sb_backup to the (new) signalboost host using -i /home/sb_user/.ssh/id_sb_user (which assumes that the pubkey & secret key are on signalboost and that the pubkey is in authorized_keys on sb_backup)
  - runs a restore job on signalboost (which must run after provision and deploy) that restores the keystore volume and runs pg_restore on the db backup
  - optionally: deletes signalboost files from any borked machine (?)
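
a hedged sketch of what the pull-and-restore steps might run on the signalboost host, again in Python for illustration only. Assumptions not stated in this issue: the file naming scheme, the scratch dir, the db name, the volume mount path, the remote directory, and that a secret key able to decrypt the backups is available in the keyring on the host doing the restore.

```python
def restore_commands(
    stamp: str,
    workdir: str = "/tmp/sb_restore",  # hypothetical scratch dir (must exist)
    db_name: str = "signalboost",  # hypothetical db name
    volume_dir: str = "/var/lib/docker/volumes/signal_data/_data",  # hypothetical mount
    remote_dir: str = "sb_user@sb_backup:backups",  # hypothetical source
    ssh_key: str = "/home/sb_user/.ssh/id_sb_user",
) -> list:
    """Return the restore steps as argv lists, in execution order."""
    db_plain = f"{workdir}/db-{stamp}.dump"
    ks_plain = f"{workdir}/keystore-{stamp}.tar.gz"

    # pull both encrypted files from the backup host into workdir
    cmds = [[
        "scp", "-i", ssh_key,
        f"{remote_dir}/db-{stamp}.dump.gpg",
        f"{remote_dir}/keystore-{stamp}.tar.gz.gpg",
        workdir,
    ]]
    # decrypt each file next to its encrypted copy
    for plain in (db_plain, ks_plain):
        cmds.append(["gpg", "--batch", "--yes",
                     "--output", plain, "--decrypt", plain + ".gpg"])
    # unpack the keystore archive into the volume, then load the db dump
    cmds.append(["tar", "-xzf", ks_plain, "-C", volume_dir])
    cmds.append(["pg_restore", "--clean", "--if-exists",
                 "--dbname", db_name, db_plain])
    return cmds
```

pg_restore only reads non-plain-text dumps, so this assumes the backup side used a custom-format pg_dump.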
ansible requirements
- some way of putting inventory (hence what backup and prod hosts and what users and keys are) under version control, without blasting away the ansible_user etc. values that any given dev might be using. (maybe put inventory.tmpl.gpg under version control?)
- pub key for sb_user must go on sb_backup host
- sb_user and its pub/priv ssh keys must go on all prod instances
- gpg keys should be imported into keyring as part of provision
- cron job for running backup script must be put into all prod instances as part of provision.yml
Edited by aguestuser