A concurrent crawler and site downloader to make a local copy of a website.
This is used by GOV.UK to populate mirrors hosted by AWS S3 and GCP Storage.
Usage
Configuration is handled through environment variables as listed below:
SKIP_VALIDATION: Skip domain accessibility validation before crawling. Useful for offline testing.
Example: SKIP_VALIDATION=true
How to deploy
This needs manual deployment to staging and production. Once the Release GitHub Action has run select the Run workflow
option from the Deploy GitHub action. Then enter the latest tag number and the environment to deploy to.