Application: content-data-api

Data warehouse that stores content and content metrics to help content owners measure and improve content on GOV.UK

Ownership

#govuk-platform-health owns the application and is responsible for updating its dependencies.

Hosting

The production version of this application is hosted on AWS.

SSH Access (AWS)

gds govuk connect ssh -e integration backend
gds govuk connect ssh -e staging aws/backend
gds govuk connect ssh -e production aws/backend

Run a rake task

README

Warning: the content below is pulled in directly from the repository. Links might not function properly.

A data warehouse that stores content and content metrics, to help content owners measure and improve content on GOV.UK.

This repository contains:

Data is combined from multiple sources, including the publishing platform, user analytics and user feedback.

Introduction

Live examples

Nomenclature

  • Data warehouse: the database where we store all the metrics.
  • ETL: extract, transform, load - how we get data into the data warehouse.
  • Fact: a record containing measurements/metrics.
  • Dimension: a characteristic that provides context for a fact (such as the time it was extracted, or the content item it belongs to)
  • Star schema: the way we structure data in the data warehouse, using fact and dimension tables.
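
To make these terms concrete, here is a minimal in-memory sketch in plain Ruby of an ETL run loading a tiny star schema. Everything in it is an assumption made for illustration: the struct names (`DimensionItem`, `DimensionDate`, `Fact`), the `Warehouse` class, the metric names and the sample records are hypothetical and are not taken from the application's real schema or code.

```ruby
require "date"

# Hypothetical, simplified shapes; NOT the application's real tables.
DimensionItem = Struct.new(:id, :base_path, :title)  # context: which content item
DimensionDate = Struct.new(:id, :date)               # context: which day
Fact = Struct.new(:dimension_item_id, :dimension_date_id, :pageviews, :feedback_comments)

# An in-memory stand-in for the data warehouse.
class Warehouse
  attr_reader :items, :dates, :facts

  def initialize
    @items = {}  # base_path => DimensionItem
    @dates = {}  # Date => DimensionDate
    @facts = []
  end

  # Find or create the dimension row for a content item.
  def item_for(base_path, title)
    @items[base_path] ||= DimensionItem.new(@items.size + 1, base_path, title)
  end

  # Find or create the dimension row for a date.
  def date_for(date)
    @dates[date] ||= DimensionDate.new(@dates.size + 1, date)
  end

  # Load: store one fact that points at its two dimensions.
  def load_row(row)
    item = item_for(row[:base_path], row[:title])
    date = date_for(row[:date])
    @facts << Fact.new(item.id, date.id, row[:pageviews], row[:feedback_comments])
  end

  # Query: sum a metric across all facts for one content item.
  def total_pageviews(base_path)
    item = @items[base_path]
    return 0 unless item

    @facts.select { |fact| fact.dimension_item_id == item.id }.sum(&:pageviews)
  end
end

# Extract: raw records as they might arrive from a source (made-up data).
def extract
  [
    { path: "/vat-rates", title: "VAT rates", day: "2019-01-01", views: "120", comments: "2" },
    { path: "/vat-rates", title: "VAT rates", day: "2019-01-02", views: "80",  comments: "0" },
    { path: "/pay-vat",   title: "Pay VAT",   day: "2019-01-01", views: "45",  comments: "1" },
  ]
end

# Transform: normalise names and types before loading.
def transform(raw)
  {
    base_path: raw[:path],
    title: raw[:title],
    date: Date.parse(raw[:day]),
    pageviews: Integer(raw[:views]),
    feedback_comments: Integer(raw[:comments]),
  }
end

def run_etl(warehouse, raw_records)
  raw_records.each { |raw| warehouse.load_row(transform(raw)) }
  warehouse
end

warehouse = run_etl(Warehouse.new, extract)
puts warehouse.total_pageviews("/vat-rates") # prints 200
```

In this sketch each fact holds only measurements and references to its dimensions, so the content item and the date are each stored once however many facts refer to them. That separation is what gives a star schema its shape.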

Technical documentation

This is a Ruby on Rails application that stores performance metrics and content changes over time, and exposes this information via an API. It is built on a PostgreSQL 9.6 database.

Dependencies

Running the application

See the getting started guide for instructions about setting up and running your development VM.

cd /var/govuk/govuk-puppet/development-vm
bowl content-data-api

The application can be accessed at http://content-data-api.dev.gov.uk, and runs on port 3235 in your development environment.

Running the test suite

To run the test suite:

$ bundle exec rake

Populating data

If you are a GOV.UK developer using the development VM, you can run the replication script to populate the database.

Run ETL processes locally

Licence

MIT License