Last updated: 30 Jan 2025

asset-manager: Local virus scanning with ClamAV

Per the main Asset Manager README, we use ClamAV to scan uploaded assets for viruses before they are made available to the public.

When might you run a local scan?

One reason to run a scan locally is an error such as Heuristics.Limits.Exceeded.MaxFiles FOUND (VirusScanner::InfectedFile). This means the uploaded file exceeds one of our scanning limits and cannot be fully scanned. For this particular error, the limit is the maximum number of files ClamAV will scan within a single upload, set by --max-files. To fix this, we will likely need to increase the relevant limit, and scanning locally lets us experiment with the setting. Various GOV.UK Helm Charts commits have done this: 229e16e, 5c33832, and a01862b.

As well as appearing in Sentry, this problem often surfaces through Zendesk support tickets: a user tries to access an uploaded document and sees a JSON response like this:

{
  "_response_info": {
    "status": "not found"
  }
}

Setup

You'll need to install the ClamAV CLI tool and set up its virus database before you can run a scan. Below are example steps for doing this with Homebrew on an arm64 macOS system (Apple silicon, i.e. M1 or later).

  1. Install the CLI tool: brew install clamav
  2. Start the service: brew services start clamav
  3. Create a config file for setting up the virus database:
    1. cd /opt/homebrew/etc/clamav
    2. cp freshclam.conf.sample freshclam.conf
    3. Edit the file created in the last step (freshclam.conf) and comment out the Example line with a #
    4. (Optional) Edit the config to more closely resemble the config we use in production
  4. Set up the virus database: freshclam
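Step 3.3 above can also be done from the command line. The sketch below is only an illustration: comment_out_example is a hypothetical helper name (not part of ClamAV or Homebrew), and it assumes the sample config contains a line beginning with Example, as the step describes.

```shell
# Hypothetical helper: comment out the "Example" line in a freshclam config file.
comment_out_example() {
  conf="$1"
  tmp="$(mktemp)"
  # Write the edited copy to a temp file first, then copy it back over the
  # original so the original file's permissions are kept.
  sed 's/^Example/#Example/' "$conf" > "$tmp" && cat "$tmp" > "$conf"
  rm -f "$tmp"
}

# Example usage, with the paths from the steps above:
#   cd /opt/homebrew/etc/clamav
#   cp freshclam.conf.sample freshclam.conf
#   comment_out_example freshclam.conf
#   freshclam
```

Going through a temporary file avoids relying on sed's in-place flag, which behaves differently between the macOS and GNU versions of sed.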

Usage

Run the following command, adjusting arguments as appropriate (see the clamscan docs or run man clamscan to learn more). You might want to replicate relevant parts of our production clamd config via these arguments for accurate testing.

clamscan --alert-exceeds-max=yes --max-files=35000 --max-scansize=2000M --max-filesize=500M filename.pdf

Tip: for the example in "When might you run a local scan?", we adjusted the --max-files argument until the scan stopped reporting the Heuristics.Limits.Exceeded.MaxFiles error.
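The trial-and-error approach in the tip above could be scripted roughly as follows. This is a sketch under assumptions, not a tool we maintain: find_max_files and the CLAMSCAN override variable are hypothetical names introduced here, and the candidate limits are whatever values you choose to try. The other arguments are copied from the command above.

```shell
# Hypothetical helper: try each candidate --max-files value in turn and print
# the first one for which the scan no longer reports the MaxFiles limit error.
# Set CLAMSCAN to use a different clamscan binary (defaults to "clamscan").
find_max_files() {
  file="$1"
  shift
  for n in "$@"; do
    if ! "${CLAMSCAN:-clamscan}" --alert-exceeds-max=yes --max-files="$n" \
        --max-scansize=2000M --max-filesize=500M "$file" \
        | grep -qF 'Heuristics.Limits.Exceeded.MaxFiles'; then
      echo "$n"
      return 0
    fi
  done
  # None of the candidate values was high enough.
  return 1
}

# Example usage:
#   find_max_files filename.pdf 10000 20000 35000 50000
```

Each candidate triggers a full scan, so a short list of values is likely to be more practical than a fine-grained search.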

The scan might take a little while. When it completes, it reports the results. The report for a clean file should include lines like those below (along with others).

filename.pdf OK

Scanned files: 1
Infected files: 0
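As well as reading the report, you can check the exit status of clamscan: 0 means no virus was found, 1 means at least one file was flagged, and 2 means an error occurred. A minimal sketch, assuming the same arguments as above; report_scan and the CLAMSCAN override variable are hypothetical names used only for illustration.

```shell
# Hypothetical helper: run the scan, then summarise the result from the
# clamscan exit status (0 = clean, 1 = flagged, anything else = error).
report_scan() {
  file="$1"
  "${CLAMSCAN:-clamscan}" --alert-exceeds-max=yes --max-files=35000 \
    --max-scansize=2000M --max-filesize=500M "$file"
  status=$?
  case "$status" in
    0) echo "clean: $file" ;;
    1) echo "flagged: $file" ;;
    *) echo "scan error (exit $status): $file" ;;
  esac
  return "$status"
}

# Example usage:
#   report_scan filename.pdf
```

Because --alert-exceeds-max=yes is set, a file that exceeds a scan limit is also reported as flagged, so check the report text to see which case applies.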