Replicate application data locally for development
Dumps are generated from production data in the early hours of each day, and can then be downloaded from integration (AWS). The process is managed by the replicate-data-local.sh script within the govuk-puppet repository.
The Licensify and Signon databases aren’t synced from production because of security concerns. Mapit’s database is downloaded in the Mapit repo, so won’t be in the backups folder.
Prerequisites for importing data
To get production data on to your local VM, you’ll need to have either:
- access to Integration via AWS; or
- database exports from someone who does.
Follow the AWS setup guide to get your user set up in AWS. You’ll need at least the Integration environment set up.
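As a quick sanity check that your AWS access works, you can confirm which identity the AWS CLI authenticates you as. The profile name below is an assumption; use whichever profile your AWS setup created for integration:

mac$ aws sts get-caller-identity --profile govuk-integration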
When you have integration access, you can download and import the latest data by running:
mac$ cd ~/govuk/govuk-puppet/development-vm/replication
mac$ ./replicate-data-local.sh -u $USERNAME -F ../ssh_config -n
You may be able to skip the -u and -F flags, depending on your setup.
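For example, if your SSH config and username already match what integration expects, the download can be as simple as:

mac$ ./replicate-data-local.sh -n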
The data will download to a folder named with today’s date in ./backups, for example ./backups/2017-06-08. Then, import the data from within the VM:

dev$ cd /var/govuk/govuk-puppet/development-vm/replication
dev$ ./replicate-data-local.sh -d path/to/dir -s
You can skip the -d flag if you do this on the same day as the download.
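For example, when importing on the same day as the download:

dev$ ./replicate-data-local.sh -s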
Databases take a long time to download and use a lot of disk space (up to ~30GB uncompressed). Importing the data also uses a lot of compute resources.
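It’s worth checking how much free disk space the VM has before you start; a general check (not specific to the replication script) is:

dev$ df -h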
The downloaded backups will automatically be deleted after import (whether successful or not) unless the -k flag is specified.
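For example, to import and keep the downloaded backups for re-use later (combining the flags already shown above):

dev$ ./replicate-data-local.sh -d path/to/dir -s -k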
If you don’t have integration access
If you don’t have integration access, ask someone to give you a copy of their dump. Then, from the replication directory in the VM (/var/govuk/govuk-puppet/development-vm/replication), run:

dev$ ./replicate-data-local.sh -d path/to/dir -s
If you’re running out of disk space
See the Fix low disk space in development guide.
If you get a curl error when restoring Elasticsearch data
Check the service is running:
dev$ sudo service elasticsearch-development.development start
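If the service is already running, you can also check that it responds over HTTP. This assumes Elasticsearch is listening on its default port of 9200:

dev$ curl localhost:9200/_cluster/health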
Can’t take a write lock while out of disk space (in MongoDB)
You may see an error message like this, which prevents you from creating or even dropping collections, so you won’t be able to replicate the latest data.
You will need to delete large Mongo collections to free up space before they can be re-imported. Follow this guide on how to delete them, and ensure that Mongo honours their removal.
Find your biggest Mongo collections by running:
dev$ sudo ncdu /var/lib/mongodb
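As a sketch of what deleting a collection looks like from the mongo shell, using placeholder database and collection names (pick the largest ones that ncdu reported); dropping a collection alone may not return space to the operating system, so repairing the database afterwards is one way to make Mongo honour the removal, although it needs some free space to rebuild:

dev$ mongo
> use some_large_database
> db.some_large_collection.drop()
> db.repairDatabase()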
You can re-run the replication but skip non-Mongo imports like MySQL if it has already been imported successfully. Check the script’s usage output to see the available options.
For example, to run an import but skip MySQL and Elasticsearch:
dev$ replicate-data-local.sh -q -e -d backups/2017-06-08 -s