
Supporting CKAN


There are currently three environments for CKAN:

  • Live —
  • Test —
  • Development —

You can ssh on to these machines with ssh co@<machine-name>. For example, to access the Test machine, you would ssh

If you cannot ssh as above, ask someone in the #platform-health Slack channel to check that your key is in the authorized_keys file.

We are in the process of migrating CKAN to standard GOV.UK infrastructure.


ckanext-dgu is the primary CKAN extension for the current environments. This is being replaced with ckanext-datagovuk as part of the migration process. Although other extensions are used in the deployment, ckanext-dgu and ckanext-datagovuk are the ones that contain our changes to functionality and styling.

Managing CKAN

First check to see if it is possible to complete the task through the system dashboard. You will need a system administrator account.

For tasks that are not available through the user interface, you must connect to the server and run commands there. All of the commands that interact with CKAN use a tool called paster.

Many of these commands take the path to the config file through the -c option. Rather than typing the path, you can pass -c $CKAN_INI, which should resolve to /var/ckan/ckan.ini.
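If you script these commands, a small helper can keep the -c handling consistent. The following is a minimal sketch, not part of CKAN: the function name paster_argv is hypothetical, and it only builds the argument list rather than running anything.

```python
import os
import shlex


def paster_argv(plugin, *args, config=None):
    """Build the argument list for a paster command without running it.

    Hypothetical helper for illustration only; it is not part of CKAN.
    """
    # Fall back to $CKAN_INI, then to the path the text says it resolves to.
    config = config or os.environ.get("CKAN_INI", "/var/ckan/ckan.ini")
    return ["paster", f"--plugin={plugin}", *args, "-c", config]


argv = paster_argv("ckan", "user", "list", config="/var/ckan/ckan.ini")
command_line = shlex.join(argv)
print(command_line)
```

The list form can be passed to subprocess.run on a server where paster is on the path; the printed string is only for checking what would be run.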

On Bytemark servers paster should be run with:

cd /vagrant/src/ckan
. /home/co/ckan/bin/activate

On GOV.UK servers paster should be run with:

cd /var/apps/ckan
sudo -u deploy govuk_setenv ckan venv/bin/paster

A full guide to administering CKAN and Bytemark can be found in the CKAN sysops document.

Further, less commonly used, commands can be found in the CKAN documentation.

There is also a separate historical document of previous admin tasks that you may wish to consult.

Switching between legacy CKAN and Find open data

To access legacy CKAN, append ?legacy=1 to the URL.

If viewing a dataset, the final part of the path must be removed, leaving only the GUID (e.g. on Find open data can be viewed in legacy CKAN at
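The URL rewrite described above can be sketched in Python. This is an illustration under assumptions: the host example.test and the /dataset/GUID/slug path layout are placeholders, and to_legacy_url is a hypothetical helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_legacy_url(url, is_dataset=False):
    """Return the URL with legacy=1 added to its query string.

    For dataset pages, the final path segment is dropped first so that
    the path ends at the GUID. Hypothetical helper for illustration.
    """
    parts = urlsplit(url)
    path = parts.path
    if is_dataset:
        # Remove the trailing segment (assumed to be the slug after the GUID).
        path = path.rstrip("/").rsplit("/", 1)[0]
    query = dict(parse_qsl(parts.query))
    query["legacy"] = "1"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, urlencode(query), parts.fragment)
    )


dataset_url = to_legacy_url(
    "https://example.test/dataset/1234-abcd/some-title", is_dataset=True
)
print(dataset_url)
```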

Accessing the CKAN API

There are times when it can be useful to access the CKAN API when debugging or resolving issues.

Note that the responses will be different depending on your access permissions. The ID can be specified as either the GUID or the URL slug (referred to as a URL name in CKAN).
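As a rough sketch of how such a request could be assembled in Python, the example below builds (but does not send) a request. It assumes the version 3 action API under /api/3/action/ and the package_show action; the host ckan.example.test, the helper name action_request and the slug are placeholders. Sending the API key in the Authorization header is how CKAN identifies the caller, which is why responses vary with permissions.

```python
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://ckan.example.test"  # placeholder host, not a real server


def action_request(action, api_key=None, **params):
    """Build a GET request for a CKAN action without sending it.

    Hypothetical helper; assumes the /api/3/action/ URL layout.
    """
    url = f"{BASE_URL}/api/3/action/{action}"
    if params:
        url += "?" + urlencode(params)
    request = urllib.request.Request(url)
    if api_key:
        # Without a key the request is treated as anonymous.
        request.add_header("Authorization", api_key)
    return request


# The id may be either the GUID or the URL slug.
req = action_request("package_show", id="some-dataset-slug")
print(req.full_url)
```

Passing the request to urllib.request.urlopen would perform the call; that step is left out so the sketch runs without network access.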

Listing all datasets

Viewing a dataset

Searching for a dataset

Find all packages created during a specific timeframe

[2017-06-01T00:00:00Z%20TO%202017-06-30T00:00:00Z]

Find all packages modified during a specific timeframe

[2017-06-01T00:00:00Z%20TO%202017-06-30T00:00:00Z]
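The bracketed date ranges above have the shape of Solr range clauses. A minimal sketch of building one in Python follows, assuming the search fields are metadata_created and metadata_modified (the field names do not appear in the text above, so treat them as an assumption); date_range_query is a hypothetical helper.

```python
from urllib.parse import quote


def date_range_query(field, start, end):
    """Return a Solr range clause of the form field:[start TO end]."""
    return f"{field}:[{start} TO {end}]"


created = date_range_query(
    "metadata_created", "2017-06-01T00:00:00Z", "2017-06-30T00:00:00Z"
)
# Percent-encode the spaces so the clause can be used as a q= value.
encoded = quote(created, safe=":[]")
print(encoded)
```

Swapping the field for metadata_modified gives the equivalent query for modified packages.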

List all publishers

View a publisher record

View a user (e.g. to get CKAN API key for a Drupal user)

Creating a system administrator account

paster --plugin=ckan sysadmin add USERNAME email=EMAIL_ADDRESS -c $CKAN_INI

You will be prompted twice for a password.

Removing a system administrator account

paster --plugin=ckan sysadmin remove USERNAME -c $CKAN_INI

Managing users

Listing users

paster --plugin=ckan user list -c $CKAN_INI

Viewing a user

paster --plugin=ckan user USERNAME -c $CKAN_INI

Adding a user

paster --plugin=ckan user add USERNAME email=EMAIL_ADDRESS -c $CKAN_INI

Removing a user

paster --plugin=ckan user remove USERNAME -c $CKAN_INI

Changing a user’s password

paster --plugin=ckan user setpass USERNAME -c $CKAN_INI

Deleting a dataset

CKAN has two types of deletion: the default soft delete, and a purge. A soft delete can be undone, but a purge removes all trace of the dataset from the system.

Where the following commands mention DATASET_NAME, this should either be the slug for the dataset, or the UUID.
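If a script needs to tell which form of DATASET_NAME it has been given, the standard library can check for a UUID. This is a small sketch with a hypothetical helper name and made-up identifiers; note that a slug made of exactly 32 hexadecimal characters would also be read as a UUID.

```python
import uuid


def is_uuid(value):
    """Return True if the string parses as a UUID, False otherwise."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


# Both identifiers below are made up for illustration.
print(is_uuid("3f2b8c1e-6d4a-4e0b-9c1f-2a7d5e8b9f10"))
print(is_uuid("road-safety-data"))
```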

Deleting a dataset:

paster --plugin=ckan dataset delete DATASET_NAME -c $CKAN_INI

Purging a dataset:

paster --plugin=ckan dataset purge DATASET_NAME -c $CKAN_INI

Rebuilding the search index

CKAN uses Solr for its search index, and occasionally it may be necessary to interact with it to refresh the index, or rebuild it from scratch.

Refresh the entire search index:

paster --plugin=ckan search-index rebuild -r -c $CKAN_INI

Rebuild the entire search index:

paster --plugin=ckan search-index rebuild -c $CKAN_INI

Only reindex those packages that are not currently indexed:

paster --plugin=ckan search-index -o rebuild -c $CKAN_INI

Managing the harvest workers

Although harvesters can mostly be managed from the user interface, it is sometimes easier to perform these tasks from the command line. With a system administrator account, the user interface lists more than 400 harvest configurations, with no clear way of seeing which are currently running.

Listing current jobs

Returns a list of currently running jobs. This will contain the JOB_ID necessary to cancel jobs.

paster --plugin=ckanext-harvest harvester jobs -c $CKAN_INI

Cancelling a current job

To cancel a currently running job, you will require a JOB_ID from the Listing current jobs section.

paster --plugin=ckanext-harvest harvester job_abort JOB_ID -c $CKAN_INI

Purging all currently queued tasks

It may be necessary, if there is a schedule clash and the system is too busy, to purge the queues used in the various stages of harvesting.

Warning: This command will empty the Redis queues.

paster --plugin=ckanext-harvest harvester purge_queues -c $CKAN_INI

Restarting the harvest queues

If the queues stall, it may be necessary to restart one or both of the harvest queues.

The gather jobs retrieve the identifiers of the updated datasets and create jobs in the fetch queue.

sudo supervisorctl restart ckan_gather_queue

The fetch jobs retrieve the datasets from the remote source and perform the relevant updates in CKAN.

sudo supervisorctl restart ckan_fetch_queue
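The relationship between the two queues can be pictured with a toy model. The sketch below is not ckanext-harvest code: it uses an in-memory deque in place of Redis, and every name and data value in it is invented for illustration.

```python
from collections import deque


def run_gather(source, fetch_queue):
    """Gather stage: queue one fetch job per updated dataset identifier."""
    for dataset_id in source["updated_ids"]:
        fetch_queue.append({"source": source["name"], "dataset_id": dataset_id})


def run_fetch(fetch_queue, remote, catalogue):
    """Fetch stage: pull each queued dataset from the remote and store it."""
    while fetch_queue:
        job = fetch_queue.popleft()
        catalogue[job["dataset_id"]] = remote[job["dataset_id"]]


fetch_queue = deque()  # stands in for the Redis-backed fetch queue
remote = {"a1": {"title": "Dataset A"}, "b2": {"title": "Dataset B"}}
source = {"name": "example-source", "updated_ids": ["a1", "b2"]}
catalogue = {}  # stands in for CKAN's own records

run_gather(source, fetch_queue)
queued = len(fetch_queue)
run_fetch(fetch_queue, remote, catalogue)
print(queued, sorted(catalogue))
```

In this model, a stalled fetch stage leaves jobs sitting in the queue while the gather stage keeps adding to it, which is why the two queues can be restarted separately.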

Adding a new Schema

Each new schema for the schema dropdown in CKAN needs a title and a URL …

paster --plugin=pylons shell $CKAN_INI

Then in the REPL that loads:

>>> from ckanext.dgu.model.schema_codelist import Schema
>>> model.Session.add(Schema(url="[URL]", title="[TITLE]"))
>>> model.repo.commit_and_remove()

Find all packages where a resource has a partial URL

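psql ckan
-- The SELECT list is an assumption; choose the columns you need.
SELECT DISTINCT p.id, p.name
FROM package p
INNER JOIN resource_group rg ON rg.package_id = p.id
INNER JOIN resource r ON r.resource_group_id = rg.id
WHERE r.url LIKE '%[PARTIAL_URL]%'
  AND p.state = 'active';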

Stopping a harvester

Find the UUID of the harvester:

psql ckan -c "SELECT id FROM harvest_source WHERE name = '[NAME]'"

Set all jobs belonging to that harvester to finished:

psql ckan -c "UPDATE harvest_job SET finished = NOW(), status = 'Finished' WHERE source_id = '[UUID]' AND NOT status = 'Finished';" 
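To see which rows the UPDATE above touches, the following sketch runs the same WHERE clause against an in-memory SQLite table. It is a stand-in, not the real schema: the table has only the four columns the statement uses, the rows are made up, and CURRENT_TIMESTAMP replaces PostgreSQL's NOW().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down stand-in for harvest_job with invented rows.
conn.execute(
    "CREATE TABLE harvest_job (id TEXT, source_id TEXT, status TEXT, finished TEXT)"
)
conn.executemany(
    "INSERT INTO harvest_job VALUES (?, ?, ?, ?)",
    [
        ("job-1", "source-a", "Running", None),
        ("job-2", "source-a", "Finished", "2017-06-01 00:00:00"),
        ("job-3", "source-b", "Running", None),
    ],
)

# Same condition as the psql command, with the source id as a parameter.
conn.execute(
    "UPDATE harvest_job SET finished = CURRENT_TIMESTAMP, status = 'Finished' "
    "WHERE source_id = ? AND NOT status = 'Finished'",
    ("source-a",),
)

rows = dict(conn.execute("SELECT id, status FROM harvest_job ORDER BY id"))
print(rows)
```

Only the unfinished job for the chosen source changes: jobs that were already finished keep their original finished time, and jobs for other sources are left running.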

Change a publisher’s name

Change the name in the publisher page then reindex that publisher:

paster --plugin=ckan search-index rebuild-publisher [PUBLISHER] -c $CKAN_INI

Register a brownfield dataset

See the supporting manual.

This page was last reviewed . It needs to be reviewed again by the page owner #govuk-platform-health.