Restore from offsite backups
We use Duplicity to perform offsite backups. Some backups are encrypted with GPG before being shipped to an Amazon S3 bucket.
You will find the fingerprint of the key in
the govuk-puppet repository. The key and passphrase are both
stored in encrypted hieradata in the govuk-secrets repository.
The same private key is used for all offsite backups.
Prerequisites for restoring backups
For the backup and restore drill, you will restore and unpack a MySQL database on a Vagrant VM.
You can use either your dev VM or, if you have the space, create a new MySQL server VM with the following command:
vagrant up mysql-master-1.backend
This needs to be run from the root of the
Access the new VM using:
vagrant ssh mysql-master-1.backend
On the machine where you want to restore the backup, install the packages needed for this exercise. A fresh VM may not have them:
sudo apt-get install duplicity python-pip python-boto mysql-server
Then install s3cmd with pip:
sudo pip install s3cmd
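Before going further, it can be useful to confirm that the tools used in the rest of this page are actually on the PATH. The following is a minimal sketch, not part of the original procedure; the function name `missing_tools` is hypothetical.

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
# An empty result means all of the named tools were found.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage (the tool list mirrors the commands used on this page):
# missing_tools duplicity s3cmd gpg mysql
```

Anything printed by the usage line still needs installing before the restore steps will work.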
Set up GPG keys to decrypt backups
You will need access to production hieradata credentials to retrieve the AWS credentials and GPG key to decrypt the backups.
You are looking for:
backup::offsite::job::aws_access_key_id
backup::offsite::job::aws_secret_access_key
backup::assets::backup_private_gpg_key
backup::assets::backup_private_gpg_key_passphrase
If you are performing the 2nd line backup drill, use the production credentials.
Ensure that you can connect to the S3 bucket:
export AWS_ACCESS_KEY_ID=<access_key_id>
export AWS_SECRET_ACCESS_KEY=<secret_access_key>
s3cmd ls s3://s3-eu-west-1.amazonaws.com/govuk-offsite-backups-production/govuk-datastores/
If you receive a
s3cmd ls s3://govuk-offsite-backups-production/govuk-datastores/
If you can view objects inside the bucket you now have access.
Even if the short form works for s3cmd, you will still need to use the full URL shown above when running the duplicity commands below.
Now you can see the status of duplicity:
duplicity collection-status s3://s3-eu-west-1.amazonaws.com/govuk-offsite-backups-production/govuk-datastores/
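The commands on this page use two URL shapes for the same bucket: a short `s3://<bucket>/<path>/` form for s3cmd and a longer form that includes the regional endpoint for duplicity. The sketch below builds both from a bucket name and path so they stay consistent. The function names and the `S3_ENDPOINT` variable are hypothetical; the default endpoint is simply the one that appears in the commands above.

```shell
#!/bin/sh
# Build the endpoint-style URL used by the duplicity commands on this page.
# Arguments: bucket name, path inside the bucket.
duplicity_url() {
    bucket=$1
    path=$2
    # S3_ENDPOINT is an assumed override; the default matches the URLs above.
    endpoint=${S3_ENDPOINT:-s3-eu-west-1.amazonaws.com}
    printf 's3://%s/%s/%s/\n' "$endpoint" "$bucket" "${path%/}"
}

# Build the short bucket-style URL used with s3cmd.
s3cmd_url() {
    printf 's3://%s/%s/\n' "$1" "${2%/}"
}

# Usage:
# duplicity collection-status "$(duplicity_url govuk-offsite-backups-production govuk-datastores)"
# s3cmd ls "$(s3cmd_url govuk-offsite-backups-production govuk-datastores)"
```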
Import key on machine
On the machine where you’ll be running the restore:
Create a file containing the
Import it with:
gpg --allow-secret-key-import --import <path to GPG key file>
Confirm the key has been imported correctly with:
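A common way to check this, assuming the standard GnuPG command line, is to run `gpg --list-secret-keys --fingerprint` and compare the printed fingerprint with the one recorded in the govuk-puppet repository. Because gpg may print fingerprints in space-separated groups, the sketch below normalises both values before comparing them. The helper names are hypothetical.

```shell
#!/bin/sh
# Strip whitespace and upper-case a fingerprint so that
# "abcd 1234" and "ABCD1234" compare equal.
normalise_fpr() {
    printf '%s' "$1" | tr -d '[:space:]' | tr '[:lower:]' '[:upper:]'
}

# Succeed when two fingerprints match after normalisation.
same_fpr() {
    [ "$(normalise_fpr "$1")" = "$(normalise_fpr "$2")" ]
}

# Usage (requires gpg; the expected value comes from govuk-puppet):
# gpg --list-secret-keys --fingerprint
# same_fpr "<fingerprint printed by gpg>" "<expected fingerprint>" && echo "key matches"
```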
Once the key is imported, you’ll be able to list files:
duplicity list-current-files s3://s3-eu-west-1.amazonaws.com/govuk-offsite-backups-production/govuk-datastores/
This will take a long time to complete and you will need to enter the password found in
Restore datastore from offsite backups
Download a backup
Download the latest backup with:
duplicity restore --file-to-restore data/backups/whitehall-mysql-backup-1.backend.publishing.service.gov.uk/var/lib/automysqlbackup/latest.tbz2 s3://s3-eu-west-1.amazonaws.com/govuk-offsite-backups-production/govuk-datastores/ /tmp/latest.tbz2
If you are running out of space in your VM, you could run the same command, but replacing
--tempdir /var/govuk/tmp /var/govuk/tmp/latest.tbz2
When this completes you may see the following ‘error’:
Error '[Errno 1] Operation not permitted: '/tmp/latest.tbz2'' processing .
This doesn’t seem to have any significant consequences and can be ignored.
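For the low-disk-space case mentioned above, it may help to check free space before starting the download rather than after it fails. This is a minimal sketch using POSIX `df`; the function names and the 10GB threshold in the usage comment are illustrative assumptions, not part of the documented procedure.

```shell
#!/bin/sh
# Print the free space, in kilobytes, on the filesystem holding a directory.
# df -P guarantees one line per filesystem with "Available" in column 4.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed when the directory has at least the requested kilobytes free.
has_space() {
    [ "$(free_kb "$1")" -ge "$2" ]
}

# Usage: choose a download location with roughly 10GB free.
# if has_space /tmp 10485760; then target=/tmp; else target=/var/govuk/tmp; fi
# duplicity restore --file-to-restore <path in backup> --tempdir "$target" <backup URL> "$target/latest.tbz2"
```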
Restore a backup
Note: If performing this as part of the 2nd line drill with the whitehall backup above, this may require a lot of free disk space, as the whitehall database is large (~10GB as of Sept 2017).
To make space, first drop your dev VM’s
whitehall_development database. Note
after you import the sql, you will end up with a
Extract the downloaded backup:
cd /tmp
tar xvjf latest.tbz2
Extract the dump that you want to restore, then load it into MySQL:
sudo mysql < foo.sql
You will need to provide the password for
mysql_root from the hieradata if running the mysql VM.
This will restore the contents of file
foo.sql to the database name that the
dump was taken from, creating it if it doesn’t exist.
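Since the archive can be large, it may be worth listing its contents and extracting only the dump you need instead of unpacking everything. The sketch below assumes the archive is a bzip2-compressed tar (as the `tar xvjf` command above implies) and that dump file names contain `.sql`; the layout inside the archive and the function names are assumptions.

```shell
#!/bin/sh
# List the SQL dumps inside a bzip2-compressed tar archive without extracting it.
list_dumps() {
    tar -tjf "$1" | grep '\.sql'
}

# Extract one named member of the archive into the current directory.
extract_dump() {
    tar -xjf "$1" "$2"
}

# Usage (foo.sql is the placeholder name used in the text above):
# list_dumps /tmp/latest.tbz2
# extract_dump /tmp/latest.tbz2 <path of foo.sql as printed by list_dumps>
```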
Restore assets from offsite backups
This shows the example process of restoring files for Whitehall attachments.
Note: Ensure that you can connect to the S3 bucket using the supplied access keys. To do this, follow the Prerequisites for restoring backups section.
SSH to the machine where you want to restore the backup, for example
ls the destination bucket:
export AWS_ACCESS_KEY_ID=<access_key_id>
export AWS_SECRET_ACCESS_KEY=<secret_access_key>
s3cmd ls s3://govuk-offsite-backups-production/assets-whitehall/
If you can view objects inside the bucket you should have access.
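If the listing fails, one thing worth ruling out is that the credentials were never set in the current shell. The following sketch reports any named variable that is unset or empty; the function name `require_env` is hypothetical, and it only checks that the variables are set, not that the keys are valid.

```shell
#!/bin/sh
# Report any of the named environment variables that are unset or empty.
# Returns non-zero if at least one is missing. Variable names are passed
# by the caller and are expected to be trusted identifiers.
require_env() {
    status=0
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            printf 'missing: %s\n' "$name" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage:
# require_env AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY \
#     && s3cmd ls s3://govuk-offsite-backups-production/assets-whitehall/
```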
The buckets are as described in
hieradata/production.yaml in the govuk-puppet repo.
Now you’ll be able to see the status of duplicity:
duplicity collection-status s3://s3-eu-west-1.amazonaws.com/govuk-offsite-backups-production/assets-whitehall/
Import the GPG secret key from the credentials store as described in the Set up GPG keys to decrypt backups section.
Once the key is imported, you can list files:
duplicity list-current-files s3://s3-eu-west-1.amazonaws.com/govuk-offsite-backups-production/assets-whitehall/
In order to restore the files, you may need to change the owner of the
/mnt/uploads/whitehall directory to your user temporarily, and remove any files that already exist in that directory.
Run a restore:
duplicity restore --file-to-restore mnt/uploads/whitehall/ s3://s3-eu-west-1.amazonaws.com/govuk-offsite-backups-production/assets-whitehall/ /mnt/uploads/whitehall
Once the backup has restored correctly, make sure you revert all the manual actions you’ve taken. These may include:
Changing the owner of the assets files.
Removing the secret key from the GPG keyring (gpg --delete-secret-key 12345678).
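Reverting the ownership change is easier if the original owner is recorded before it is changed. The sketch below is one possible way to do that, assuming GNU coreutils `stat` (as found on Ubuntu); the function name `owner_of` is hypothetical, and the commented steps should be adapted to what was actually changed.

```shell
#!/bin/sh
# Print the owner and group of a path as "user:group".
# Uses the GNU coreutils form of stat; BSD/macOS stat takes different flags.
owner_of() {
    stat -c '%U:%G' "$1"
}

# Usage sketch:
# original=$(owner_of /mnt/uploads/whitehall)
# sudo chown "$USER" /mnt/uploads/whitehall
# ... run the duplicity restore ...
# sudo chown -R "$original" /mnt/uploads/whitehall
# gpg --delete-secret-key <key ID>
```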
Rotating offsite backups GPG keys
Please see Rotating offsite backup GPG keys.