Last updated: 9 May 2023

locations-api: 1. Add ONS Postcode Data Import

Date: 2023-05-09

Status

Open

Context

Locations API is meant to provide an exhaustive list of postcodes. However, the underlying API (OS Places API) is a list of addresses, so a postcode that is not the primary postcode for an address in the OS Places API database will never appear. As a result, Locations API cannot support Large User Postcodes (LUPs) or retired postcodes.

Many places in Imminence give an LUP as their contact postcode, and without LUPs in Locations API we can neither geocode nor search by these postcodes. The current workaround is to manually look up the longitude and latitude of the postcode in the Office for National Statistics Postcode Directory (ONSPD) file, a CSV list of postcodes which, critically, includes both LUPs and retired postcodes. This is not possible for content second liners and not easy for technical second liners, so the second line tickets generated by these problems are hard to resolve without toil from a handful of specialists.

It also means we cannot support end users who have only a retired postcode for their address. This is more common than expected, apparently because communications about postcode retirement are often missed.

Decision

The ONSPD file is updated around two to three times a year (exact intervals vary) and has a well-understood, open structure: a CSV file, also available as multiple smaller CSV files, which has not changed since 2017. We do not expect it to change in the near future. Given this apparent stability, and the fact that data already imported will remain usable even if the file format changes, we will import it into the postcode table in Locations API. If the file format does change, we can keep using the last import until we update our import methods.

To differentiate these ONSPD-imported records from OS Places API records, two fields will be added to the database:

  • a source field, an enum which will record where the data comes from, and
  • a retired field, a boolean field which will note whether the postcode is active (false) or retired (true).

The rest of the ONSPD data for a postcode will be stored in the results field in a JSON structure, in the same way we currently store OS Places API-derived data, but with a simpler structure:

[
  {
    "ONS" => {
      "AVG_LNG" => <Longitude from ONSPD file>,
      "AVG_LAT" => <Latitude from ONSPD file>,
      "TYPE" => <"L" (Large User Postcode) or "S" (a small, "normal" postcode)>,
      "DOTERM" => <Date the postcode was terminated in YYYYMM format as in the ONSPD file>,
    }
  }
]

(Naming the array element by its source, ONS, is a little redundant here, but it matches the way OS Places array elements are named after their ultimate source, DPA or LPI.)
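As a minimal sketch of how one ONSPD row might map onto this structure, the helper below builds the results value and derives the retired flag. The method names and the column names ("long", "lat", "usertype", "doterm", with a usertype of "1" meaning a large user) are assumptions made for illustration, not something this decision specifies, and the sample row is made up.

```ruby
# Hypothetical helper: builds the simplified results structure from one ONSPD
# row, given as a Hash (or CSV::Row) keyed by column name.
# Assumed column names: "long", "lat", "usertype" ("1" = large user), "doterm".
def onspd_results(row)
  doterm = row["doterm"].to_s.strip
  [
    {
      "ONS" => {
        "AVG_LNG" => Float(row["long"]),
        "AVG_LAT" => Float(row["lat"]),
        "TYPE" => row["usertype"].to_s.strip == "1" ? "L" : "S",
        # Assumption: active postcodes have a blank termination date.
        "DOTERM" => doterm.empty? ? nil : doterm,
      },
    },
  ]
end

# Hypothetical helper: treat a postcode as retired once it has a termination date.
def onspd_retired?(row)
  !row["doterm"].to_s.strip.empty?
end

# Illustrative row with made-up values, not real ONSPD data.
row = { "pcds" => "ZZ9 9ZZ", "usertype" => "1", "doterm" => "202001", "lat" => "51.5", "long" => "-0.12" }
results = onspd_results(row)
retired = onspd_retired?(row)
```

Deriving the retired flag from the termination date keeps the boolean field and the stored DOTERM value consistent, though the real import may choose a different rule.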

We will create import workers to handle the import. A rake task will be created which, given the URL of an ONSPD zip file, will schedule a download worker job. The download worker will retrieve the zip file and extract the smaller CSV files, putting them in an S3 bucket. It will then schedule an import worker job for each CSV file. Each job will read its CSV file and compare it with the existing database. Where a postcode appears in the CSV but is not in the current database, the job will create a simplified record for it, with the source marked as ONSPD and the retired flag set appropriately. These workers will not override any record that has been retrieved from the OS Places API.
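The core of the import step described above could look roughly like the sketch below. It is not the actual worker: the download, S3 and job scheduling parts are left out, a plain Hash stands in for the postcode table, and the names PostcodeRecord and import_onspd_csv, the "onspd" and "os_places" source values, the CSV column names and the postcode normalisation are all assumptions.

```ruby
require "csv"

# Hypothetical in-memory stand-in for a row in the postcode table.
PostcodeRecord = Struct.new(:postcode, :source, :retired, :results, keyword_init: true)

# Hypothetical import step: creates a simplified record for every postcode in
# the CSV that is missing from the store, and returns how many were created.
# `store` is a Hash keyed by normalised postcode.
def import_onspd_csv(csv_text, store)
  created = 0
  CSV.parse(csv_text, headers: true).each do |row|
    # Assumption: postcodes are keyed in upper case without spaces.
    postcode = row["pcds"].delete(" ").upcase
    # Never override an existing record, such as one retrieved from OS Places API.
    next if store.key?(postcode)

    doterm = row["doterm"].to_s.strip
    store[postcode] = PostcodeRecord.new(
      postcode: postcode,
      source: "onspd",
      retired: !doterm.empty?,
      results: [
        {
          "ONS" => {
            "AVG_LNG" => Float(row["long"]),
            "AVG_LAT" => Float(row["lat"]),
            "TYPE" => row["usertype"] == "1" ? "L" : "S",
            "DOTERM" => doterm.empty? ? nil : doterm,
          },
        },
      ],
    )
    created += 1
  end
  created
end
```

Because existing postcodes are skipped rather than updated, running the same file through this sketch twice creates nothing the second time.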

The OS Places API crawler will be updated so that it will not attempt to retrieve information for any postcode whose record source is ONSPD unless it is marked as both not retired (from the retired field) and not an LUP (from the simplified record). This will prevent us from bombarding OS Places API with requests for invalid postcodes, while still attempting to get improved information for any postcode that should be in OS Places API.
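The crawler rule above can be expressed as a small predicate. This is only a sketch: the method name crawl_os_places?, the Hash-shaped record and the "onspd" source value are hypothetical, and the real crawler may structure the check differently.

```ruby
# Hypothetical predicate: should the crawler ask OS Places API about this record?
# `record` is a Hash with :source, :retired and :results keys (assumed shape).
def crawl_os_places?(record)
  # Records that did not come from ONSPD are crawled as before.
  return true unless record[:source] == "onspd"

  ons = record[:results].first["ONS"]
  # ONSPD records are only worth crawling when active and not an LUP.
  !record[:retired] && ons["TYPE"] != "L"
end

# Illustrative records with made-up values.
active_small = { source: "onspd", retired: false, results: [{ "ONS" => { "TYPE" => "S" } }] }
active_lup   = { source: "onspd", retired: false, results: [{ "ONS" => { "TYPE" => "L" } }] }
retired      = { source: "onspd", retired: true,  results: [{ "ONS" => { "TYPE" => "S" } }] }

crawl_os_places?(active_small) # true: OS Places API may hold better data
crawl_os_places?(active_lup)   # false: LUPs are not expected in OS Places API
crawl_os_places?(retired)      # false: retired postcodes are skipped
```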

The GDS Api Adapter methods which call Locations API will be updated to understand how to use the new structures.

We will run a data migration for existing OS Places records to populate the source and retired fields.

Consequences

We can update Imminence to allow geocoding by LUPs and retired postcodes (with warnings for Imminence admin users that those postcodes might need updating). This reduces maintenance for Imminence datasets which currently require longitude/latitude overrides, and generates fewer second line tickets.

We can provide lookups by LUPs and retired postcodes, reducing confusion in Licence, LLM, and Imminence lookups.

We can (if it is deemed useful) return end-user notifications that a postcode is retired in the frontend searches that use Locations API.