Migration to AWS
At the moment, all GOV.UK applications are running in AWS except for Signon and most of the publisher apps.
Please refer to the Release app to determine where an app is currently located.
Most services run on Amazon EC2, but there are some differences in the infrastructure that you should be aware of.
Hostnames and DNS
Traditionally we hardcoded hostnames and IPs on each instance in /etc/hosts. In AWS, we use Auto Scaling Groups (ASG) and Elastic Load Balancing (ELB) to connect to instances, and internal DNS in Amazon Route 53 for service resolution.
Traditionally you would see hostnames similar to:
backend-1.backend
frontend-1.frontend
puppetmaster-1.management
Hostnames are now automatically generated by DHCP, and refer to the IP address and region that the instance belongs to:
Please see the documentation about accessing the environment.
Please note: during the migration to AWS, the presence and value of the app_domain, app_domain_internal and Plek service environment variables depend on the migration state of an application and of its dependencies, and can be non-intuitive. If in doubt, please talk to RE:GOV.UK (for example in Slack #re-govuk) to make sure your configuration change does what you intend.
Traditionally, resolving a service name to an IP was handled by hardcoding names and IPs in /etc/hosts.
To make use of the dynamic environment in AWS, we use Amazon Route 53 to resolve service names to their appropriate ELB. Each node group (a set of instances within an autoscaling group) resolves a main service name, along with any application service names that belong to that group. For example, the calculators-frontend node group resolves calculators-frontend as the service name:
lauramartin@ec2-integration-blue-backend-ip-10-1-5-53:~$ host calculators-frontend
calculators-frontend.integration.govuk-internal.digital is an alias for calculators-frontend.blue.integration.govuk-internal.digital.
calculators-frontend.blue.integration.govuk-internal.digital has address 10.1.6.27
calculators-frontend.blue.integration.govuk-internal.digital has address 10.1.5.238
It will also resolve an application service name, such as calendars:
lauramartin@ec2-integration-blue-backend-ip-10-1-5-53:~$ host calendars
calendars.integration.govuk-internal.digital is an alias for calculators-frontend.integration.govuk-internal.digital.
calculators-frontend.integration.govuk-internal.digital is an alias for calculators-frontend.blue.integration.govuk-internal.digital.
calculators-frontend.blue.integration.govuk-internal.digital has address 10.1.5.238
calculators-frontend.blue.integration.govuk-internal.digital has address 10.1.6.27
The service name first resolves to the top-level environment domain name (integration.govuk-internal.digital), which is a CNAME record pointing to a stack-specific DNS record. Please see the documentation about the concept of stacks in the infrastructure.
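The alias chain described above can be illustrated with a small in-memory model. This is only a sketch: the records are copied from the `host calendars` output on this page, and the `RECORDS` table and `resolve` helper are hypothetical names introduced for illustration. A real lookup is answered by Route 53, not by a Ruby hash.

```ruby
# Records copied from the `host calendars` output on this page.
# In the real environment these live in Route 53.
RECORDS = {
  "calendars.integration.govuk-internal.digital" =>
    { type: :cname, value: "calculators-frontend.integration.govuk-internal.digital" },
  "calculators-frontend.integration.govuk-internal.digital" =>
    { type: :cname, value: "calculators-frontend.blue.integration.govuk-internal.digital" },
  "calculators-frontend.blue.integration.govuk-internal.digital" =>
    { type: :a, value: ["10.1.5.238", "10.1.6.27"] },
}.freeze

# Hypothetical helper: follow CNAME records until an A record is found.
# Returns the chain of names visited and the final addresses
# (an empty list if a name has no record).
def resolve(name, records, max_hops: 10)
  chain = [name]
  max_hops.times do
    record = records[chain.last]
    return { chain: chain, addresses: [] } if record.nil?
    return { chain: chain, addresses: record[:value] } if record[:type] == :a

    chain << record[:value]
  end
  raise "too many CNAME hops for #{name}"
end

result = resolve("calendars.integration.govuk-internal.digital", RECORDS)
puts result[:chain].join(" -> ")
puts result[:addresses].join(", ")
```

The application name (calendars) points at the node group's environment-level name, which in turn points at the stack-specific name (blue) that holds the address records. Presumably, repointing the environment-level CNAME at a different stack would move every application name that aliases it, without touching the application records themselves.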
GOV.UK applications use Plek for service discovery. Plek will return the fully-qualified domain name (FQDN) of the service it is discovering.
irb(main):001:0> Plek.find("publishing-api")
=> "https://publishing-api.integration.govuk-internal.digital"
This will resolve to the associated ELB:
lauramartin@ec2-integration-blue-backend-ip-10-1-5-53:~$ host publishing-api.integration.govuk-internal.digital
publishing-api.integration.govuk-internal.digital is an alias for publishing-api.blue.integration.govuk-internal.digital.
publishing-api.blue.integration.govuk-internal.digital has address 10.1.4.215
publishing-api.blue.integration.govuk-internal.digital has address 10.1.5.50
No internal services should be accessed using the external public load balancers from within the internal network.
We are unable to set the internal domain as the default because some applications do self-referred Plek lookups that affect how applications are presented to the user. We have determined it is safer to set specific overrides for services until this behaviour is changed within the applications.
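The idea of per-service overrides can be sketched as follows. This is a simplified stand-in, not Plek's actual implementation: `sketch_find` is a hypothetical function, and the variable names `GOVUK_APP_DOMAIN` and `PLEK_SERVICE_<NAME>_URI`, as well as the example domain values, are assumptions used for illustration. Check the Plek documentation and the environment's configuration for the real names and values.

```ruby
# Illustrative only: a simplified stand-in for a Plek-style lookup.
# `env` is passed in explicitly so the sketch does not depend on the
# real process environment.
def sketch_find(service, env)
  # Assumed override naming: PLEK_SERVICE_<NAME>_URI, upper-cased,
  # with hyphens replaced by underscores.
  override_key = "PLEK_SERVICE_#{service.upcase.tr('-', '_')}_URI"
  override = env[override_key]
  return override if override && !override.empty?

  # Assumed default: build the URL from the general app domain.
  "https://#{service}.#{env.fetch('GOVUK_APP_DOMAIN')}"
end

# Example values only; these are assumptions, not real configuration.
EXAMPLE_ENV = {
  "GOVUK_APP_DOMAIN" => "integration.publishing.service.gov.uk",
  "PLEK_SERVICE_PUBLISHING_API_URI" =>
    "https://publishing-api.integration.govuk-internal.digital",
}.freeze

puts sketch_find("publishing-api", EXAMPLE_ENV) # override applies
puts sketch_find("calendars", EXAMPLE_ENV)      # falls back to the default domain
```

Under these assumptions, only services with an explicit override are sent to the internal domain, while every other lookup, including an application's lookup of itself, keeps returning the default domain.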
Please see the related ADR for DNS Infrastructure for further detail.
PostgreSQL and MySQL
We are using Amazon Relational Database Service (RDS) to host PostgreSQL and MySQL databases.
To run Puppet against these databases, we have a new instance class: db_admin
db_admin manages MySQL and PostgreSQL RDS instances. It runs nightly backups of all of these to S3 using govuk_env_sync.
Transition has its own class for management: transition_db_admin
Please see the documentation about administering RDS databases.
Applications in AWS are gradually being migrated from self-hosted MongoDB to Amazon DocumentDB. Notable differences include:
- DocumentDB implements the MongoDB 3.6 API, whereas our self-hosted MongoDB is version 2.4.
- DocumentDB instances do not support unauthenticated connections; they require a username and password.
- Storage is allocated automatically and scales automatically.
- DocumentDB does not support arbitrary binary data in fields of type String because it doesn’t allow strings to contain NUL characters (
DocumentDB instances are managed and backed up via the db_admin bastion hosts, similarly to PostgreSQL and MySQL.
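Because of the NUL character restriction listed above, it can be useful to check documents before migrating or writing them. The following is a minimal sketch, assuming documents are represented as plain Ruby hashes and arrays. `nul_string_paths` is a hypothetical helper, not part of any GOV.UK tooling or database driver.

```ruby
# Hypothetical helper: return the dotted key path of every String value
# in a nested document (hashes and arrays) that contains a NUL character.
def nul_string_paths(value, path = [])
  case value
  when String
    value.include?("\0") ? [path.join(".")] : []
  when Hash
    value.flat_map { |key, child| nul_string_paths(child, path + [key.to_s]) }
  when Array
    value.each_with_index.flat_map do |child, index|
      nul_string_paths(child, path + [index.to_s])
    end
  else
    # Numbers, booleans, nil and other values cannot contain NUL characters.
    []
  end
end

document = {
  "title" => "A normal string",
  "body" => "binary\0payload",
  "parts" => [{ "text" => "also\0bad" }],
}

p nul_string_paths(document) # => ["body", "parts.0.text"]
```

Fields flagged this way would need to be encoded (for example as Base64) or stored in a binary field type before being written.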
We are using Amazon ElastiCache instead of managing our own Redis instances.
Removal of load balancer tiers
Merging of MySQL database servers
Traditionally, we had a separate MySQL server for Whitehall. Rather than manage multiple RDS instances, we have merged this into the main MySQL server. See the relevant ADR for details.
See the documentation to make and deploy changes to the infrastructure.
Automated application deployments
If an EC2 instance is terminated, it will be automatically rebuilt by the autoscaling group. If an instance runs deployable applications,
it will automatically start a deployment of the applications it runs using the
Deploy_App Jenkins job. It deploys the
and runs the
If an instance is having issues, terminating it may be the quickest way to ensure a clean redeploy of its applications.
Be aware that for instances that run a lot of applications, this may block ongoing deployments because of the time it takes to deploy multiple applications.
For a list of what applications run on which instance types, see the
node_class: entry in the relevant hieradata for the environment: