This is how PostgreSQL backups have traditionally been taken on the GOV.UK Infrastructure.
A third-party script called autopostgresqlbackup
pg_dump every night and stores them on disk on a dedicated mount point on the PostgreSQL primary machines.
The onsite backup machine (backup-1.management) pulls the latest backup and stores it on disk. Duplicity runs each night to send encrypted backups to an offsite backup machine (the location of this machine is dependent on the environment).
To restore from this method:
- Fetch a backup from either the dedicated mount point, the onsite machine or the offsite machine (to decrypt you may need a password kept
in the password store)
- Unzip the file
- Import into a PostgreSQL primary using
psql <dbname> < <file>
WAL-E Backups to S3
We use WAL-E to do continuous archiving to an Amazon S3 bucket. PostgreSQL uses a method of archiving transaction logs: this means that transaction logs are copied to a place on disk, and are rotated by a set amount of time or disk usage.
WAL-E uses this function and makes it easy to upload the archived logs to a different location, using environmental variables to specify where the logs should go. It also specifies encryption details.
The archived transactions logs are based upon a “base” backup, which is taken every night. By default we rotate a new transaction log every 5 minutes, so in theory we can recover point in time backups to specific time and dates in this timeframe.
To restore we can use the commands specified in the documentation:
wal-fetch. To make this easier, a script has been written to automate the
restore of the very latest backup available(
/usr/local/bin/wal-e_restore). The environmental variables which define the AWS credentials will need to be present.