Warning This document has not been updated for a while now. It may be out of date.
Last updated: 11 Jul 2017

link-checker-api: Link Checker API's API

This document describes how to interface with the Link Checker API, which you can do through HTTP endpoints and a webhook. The endpoints return, and the webhook sends, the entities described in this document.

Endpoints

Webhook

Entities

GET /check

$ curl -s http://link-checker-api.dev.gov.uk/check\?uri\=https%3A%2F%2Fwww.gov.uk%2F | jq
{
  "uri": "https://www.gov.uk/",
  "status": "pending",
  "checked": null,
  "errors": [],
  "warnings": [],
  "problem_summary": null,
  "suggested_fix": null
}
$ curl -s http://link-checker-api.dev.gov.uk/check\?uri\=https%3A%2F%2Fwww.gov.uk%2F\&synchronous\=true | jq
{
  "uri": "https://www.gov.uk/",
  "status": "ok",
  "checked": "2017-04-12T18:47:16Z",
  "errors": [],
  "warnings": [],
  "problem_summary": null,
  "suggested_fix": null
}

This endpoint checks a single link. If the link has been checked within the period given by checked_within (4 hours by default), it returns the results of that check; otherwise it queues a check and returns a pending report. You can force it to return a completed check with the synchronous parameter.

Query string attributes

  • uri (required)
    • The URI to the link to be checked
  • checked_within (optional, defaults to 14400)
    • An integer number of seconds. A previous check of this link is reused if it was made within this many seconds.
    • Use 0 to ensure the link is checked again
  • synchronous (optional, defaults to false)
    • A boolean value. When true, the URI is checked during this request. This may make the response slow or cause it to time out, so use it with caution.

Returns

A LinkReport
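
The query string attributes above can be assembled as in the following minimal sketch. The helper name check_url is hypothetical, and the base URL is simply the development host used in the curl examples; substitute the host for your environment.

```ruby
require "uri"

# Hypothetical helper: builds the GET /check URL from the documented
# query string attributes. Optional attributes are left out when nil so
# that the API's own defaults apply.
def check_url(uri, checked_within: nil, synchronous: nil,
              base: "http://link-checker-api.dev.gov.uk")
  params = { "uri" => uri }
  params["checked_within"] = checked_within unless checked_within.nil?
  params["synchronous"] = synchronous unless synchronous.nil?

  url = URI.join(base, "/check")
  # encode_www_form percent-encodes the target URI, as in the curl examples.
  url.query = URI.encode_www_form(params)
  url
end

# Force a fresh, synchronous check of a single link.
puts check_url("https://www.gov.uk/", checked_within: 0, synchronous: true)
```

The resulting URL can be passed to any HTTP client, for example Net::HTTP.get, and the JSON response parsed as a LinkReport.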

POST /batch

$ curl -s -H "Content-Type: application/json" -X POST -d '{"uris": ["https://www.gov.uk/", "https://www.gov.uk/search", "https://www.gov.uk/404"], "webhook_uri": "http://my-awesome-micro.service/link-checker-callback", "webhook_secret_token": "AzfenrtbCBMqqta1WEh3BQgViXZQtEdXCxBQ1P9VKN4="}' http://link-checker-api.dev.gov.uk/batch | jq
{
  "id": 137125,
  "status": "in_progress",
  "links": [
    {
      "uri": "https://www.gov.uk/",
      "status": "ok",
      "checked": "2017-04-12T18:47:16Z",
      "errors": [],
      "warnings": [],
      "problem_summary": null,
      "suggested_fix": null
    },
    {
      "uri": "https://www.gov.uk/404",
      "status": "broken",
      "checked": "2017-04-12T16:30:39Z",
      "errors": [
        "Received 404 response from the server."
      ],
      "warnings": [],
      "problem_summary": "404 error (page not found)",
      "suggested_fix": ""
    },
    {
      "uri": "https://www.gov.uk/search",
      "status": "pending",
      "checked": null,
      "errors": [],
      "warnings": [],
      "problem_summary": null,
      "suggested_fix": null
    }
  ],
  "totals": {
    "links": 3,
    "ok": 1,
    "caution": 0,
    "broken": 1,
    "pending": 1
  },
  "completed_at": null
}

This endpoint checks a collection of links, such as those from a webpage. It creates a batch resource, returns it in the response to this request, and lets you query it later for the status of the batch.

JSON Attributes

  • uris (required)
    • An array of URIs to be checked (max length: 5000)
  • checked_within (optional, defaults to 14400)
    • An integer number of seconds. A previous check of a link is reused if it was made within this many seconds.
    • Use 0 to ensure links are all checked again
  • priority (optional, defaults to high)
    • A value of "high" or "low" to indicate the priority of your job
    • If you are running a lot of batches, set this to "low" so that you don't block in-app usage.
  • webhook_uri (optional)
    • A URL that will be requested once the batch is complete.
  • webhook_secret_token (optional)
    • A token used to generate an HMAC-SHA1 signature, which is included with the webhook request so that you can validate its origin.

Returns

A BatchReport. The status code is 202 for an in-progress BatchReport and 201 for a completed one.
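
As an illustration of the JSON attributes and status codes above, here is a minimal sketch that builds, but does not send, a POST /batch request. The helper names build_batch_request and batch_state are hypothetical, and the base URL is the development host from the curl example.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical helper: builds a POST /batch request from the documented
# JSON attributes. Optional attributes that are nil are omitted so that
# the API's defaults apply.
def build_batch_request(uris, checked_within: nil, priority: nil,
                        webhook_uri: nil, webhook_secret_token: nil,
                        base: "http://link-checker-api.dev.gov.uk")
  # The uris array is documented as having a maximum length of 5000.
  raise ArgumentError, "at most 5000 uris per batch" if uris.length > 5000

  body = {
    uris: uris,
    checked_within: checked_within,
    priority: priority,
    webhook_uri: webhook_uri,
    webhook_secret_token: webhook_secret_token,
  }.compact

  request = Net::HTTP::Post.new(URI.join(base, "/batch"),
                                "Content-Type" => "application/json")
  request.body = JSON.generate(body)
  request
end

# Maps the documented status codes to the state of the returned BatchReport.
def batch_state(status_code)
  case status_code.to_i
  when 201 then :completed
  when 202 then :in_progress
  else :unexpected
  end
end
```

The request can then be sent with Net::HTTP, for example Net::HTTP.start(host, port) { |http| http.request(request) }, and the code of the response passed to batch_state.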

GET /batch/:id

$ curl -s http://link-checker-api.dev.gov.uk/batch/137125 | jq
{
  "id": 137125,
  "status": "completed",
  "links": [
    {
      "uri": "https://www.gov.uk/",
      "status": "ok",
      "checked": "2017-04-12T18:47:16Z",
      "errors": [],
      "warnings": [],
      "problem_summary": null,
      "suggested_fix": null
    },
    {
      "uri": "https://www.gov.uk/404",
      "status": "broken",
      "checked": "2017-04-12T16:30:39Z",
      "errors": [
        "Received 404 response from the server."
      ],
      "warnings": [],
      "problem_summary": "404 error (page not found)",
      "suggested_fix": ""
    },
    {
      "uri": "https://www.gov.uk/search",
      "status": "ok",
      "checked": "2017-04-12T18:55:29Z",
      "errors": [],
      "warnings": [],
      "problem_summary": null,
      "suggested_fix": null
    }
  ],
  "totals": {
    "links": 3,
    "ok": 2,
    "caution": 0,
    "broken": 1,
    "pending": 0
  },
  "completed_at": "2017-04-12T18:55:29Z"
}

This endpoint is used to check on the progress of a batch or to access a completed batch.

Path Parameters

  • id (required)

Returns

A BatchReport
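
If you are not using the webhook, one way to follow a batch is to poll this endpoint until its status is "completed". The sketch below is only an assumption about how a client might do that: the helper names, the polling interval and the attempt limit are all hypothetical, and the base URL is the development host from the curl example.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical fetcher: requests GET /batch/:id and parses the BatchReport.
def fetch_batch(id, base: "http://link-checker-api.dev.gov.uk")
  JSON.parse(Net::HTTP.get(URI.join(base, "/batch/#{id}")))
end

# Hypothetical helper: calls `fetch` until the batch is completed.
# `fetch` is any callable that takes a batch id and returns the parsed
# BatchReport hash. Returns the completed report, or nil if the batch is
# still in progress after max_attempts requests.
def wait_for_batch(id, fetch:, interval: 5, max_attempts: 60)
  max_attempts.times do |attempt|
    # Wait between requests, but not before the first one.
    sleep(interval) if attempt.positive?
    report = fetch.call(id)
    return report if report["status"] == "completed"
  end
  nil
end
```

A call such as wait_for_batch(137125, fetch: method(:fetch_batch)) would then return the completed BatchReport or nil. Passing the fetcher in as an argument keeps the polling loop independent of the HTTP client.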

Batch complete webhook

You can specify a webhook_uri in a POST /batch request to receive a callback when a batch is completed. This URL will receive a BatchReport in a JSON POST request.

To use it you will need an endpoint available in your application that is accessible without authentication and can receive POST requests.

Verifying the webhook request

If you specified a webhook_secret_token when calling POST /batch, the webhook request will include an additional header, X-LinkCheckerApi-Signature. Its value is an HMAC-SHA1 signature that can be used to verify the request.

You can compute the expected signature in a Rails application, using the raw JSON of the request as request_body:

OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("sha1"), secret_token, request_body)

Then verify that this matches the value in the header.
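
The verification step can be sketched in plain Ruby as follows. The helper names are hypothetical. The constant-time comparison is an addition that the original text does not ask for; it is a common precaution when comparing signatures, so that timing does not leak how much of a guessed value was correct.

```ruby
require "openssl"

# Computes the signature the same way as the snippet above.
def expected_signature(secret_token, request_body)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("sha1"), secret_token, request_body)
end

# Hypothetical helper: compares two strings without returning early on
# the first differing byte.
def secure_equal?(left, right)
  return false unless left.bytesize == right.bytesize

  difference = 0
  left.bytes.zip(right.bytes) { |a, b| difference |= a ^ b }
  difference.zero?
end

# Hypothetical helper: true when the X-LinkCheckerApi-Signature header
# value matches the signature computed from the raw request body.
def valid_webhook?(secret_token, request_body, signature_header)
  return false if signature_header.nil?

  secure_equal?(expected_signature(secret_token, request_body), signature_header)
end
```

In a Rails controller the raw body is available as request.raw_post and the header as request.headers["X-LinkCheckerApi-Signature"]. ActiveSupport::SecurityUtils.secure_compare could be used in place of the hand-written comparison.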

LinkReport entity

{
  "uri": "https://www.gov.uk/",
  "status": "ok",
  "checked": "2017-04-12T18:47:16Z",
  "errors": [],
  "warnings": [],
  "problem_summary": null,
  "suggested_fix": null
}

Attributes

  • uri
    • The URI that was checked
  • status
    • One of the following values:
      • "pending" - A check is queued or in progress for this link
      • "ok" - The check is completed and there were no issues found with the link
      • "caution" - There were warnings detected for this link but no errors. An end user should apply caution when linking to it.
      • "broken" - There were errors detected for this link. An end user should not link to it.
  • checked
    • An RFC 3339 formatted timestamp. It will be null for a link with a status of "pending".
  • errors
    • An array of strings with details of each error found.
  • warnings
    • An array of strings with details of each warning found.
  • problem_summary
    • A short description of the most critical problem with the link.
  • suggested_fix
    • Where possible, this provides a suggested fix to the user.

errors, warnings, problem_summary and suggested_fix are all designed to be shown to the end user.
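
To show how the attributes above might be consumed, here is a minimal sketch that wraps a parsed LinkReport in a Ruby Struct. The Struct and its method names are hypothetical and are not part of the API.

```ruby
require "json"
require "time"

# Hypothetical wrapper around the LinkReport attributes described above.
LinkReport = Struct.new(:uri, :status, :checked, :errors, :warnings,
                        :problem_summary, :suggested_fix) do
  # Builds a LinkReport from a parsed JSON hash with string keys.
  def self.from_hash(hash)
    checked = hash["checked"]
    new(
      hash["uri"],
      hash["status"],
      # Timestamps in the form used by the examples parse with
      # Time.iso8601; checked is null while a link is pending.
      checked && Time.iso8601(checked),
      hash["errors"] || [],
      hash["warnings"] || [],
      hash["problem_summary"],
      hash["suggested_fix"]
    )
  end

  def pending?
    status == "pending"
  end

  # Text that could be shown to an end user, or nil when no problem
  # was reported. Empty strings are skipped.
  def user_message
    parts = [problem_summary, suggested_fix].reject { |s| s.nil? || s.empty? }
    parts.empty? ? nil : parts.join(" ")
  end
end

report = LinkReport.from_hash(JSON.parse(<<~JSON))
  {
    "uri": "https://www.gov.uk/404",
    "status": "broken",
    "checked": "2017-04-12T16:30:39Z",
    "errors": ["Received 404 response from the server."],
    "warnings": [],
    "problem_summary": "404 error (page not found)",
    "suggested_fix": ""
  }
JSON
puts report.user_message
```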

BatchReport entity

{
  "id": 137125,
  "status": "completed",
  "links": [
    {
      "uri": "https://www.gov.uk/",
      "status": "ok",
      "checked": "2017-04-12T18:47:16Z",
      "errors": [],
      "warnings": [],
      "problem_summary": null,
      "suggested_fix": null
    },
    {
      "uri": "https://www.gov.uk/404",
      "status": "broken",
      "checked": "2017-04-12T16:30:39Z",
      "errors": [
        "Received 404 response from the server."
      ],
      "warnings": [],
      "problem_summary": "404 error (page not found)",
      "suggested_fix": ""
    },
    {
      "uri": "https://www.gov.uk/search",
      "status": "ok",
      "checked": "2017-04-12T18:55:29Z",
      "errors": [],
      "warnings": [],
      "problem_summary": null,
      "suggested_fix": null
    }
  ],
  "totals": {
    "links": 3,
    "ok": 2,
    "caution": 0,
    "broken": 1,
    "pending": 0
  },
  "completed_at": "2017-04-12T18:55:29Z"
}

Attributes

  • id
    • The id of the batch this is associated with.
  • status
    • A value of "in_progress" or "completed", indicating whether all links have been checked.
  • links
    • An array of LinkReport entities, one for each link in the batch.
  • totals
    • An object with numbers summarising the link progress. Contains the following keys:
      • links - The total number of links for this batch
      • ok - The number of links with a status of "ok"
      • caution - The number of links with a status of "caution"
      • broken - The number of links with a status of "broken"
      • pending - The number of links with a status of "pending"
  • completed_at
    • An RFC 3339 formatted timestamp if this batch has a status of "completed", otherwise null.
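
The relationship between links, totals, status and completed_at can be illustrated with a small sketch that recomputes the totals from the links array. The helper names are hypothetical. The consistency check assumes that a batch is "completed" exactly when no links are pending, which is how the attribute descriptions above read but is not stated as a guarantee.

```ruby
# Hypothetical helper: recomputes the "totals" object from the "links"
# array of a parsed BatchReport.
def compute_totals(links)
  counts = links.each_with_object(Hash.new(0)) do |link, tally|
    tally[link["status"]] += 1
  end

  {
    "links" => links.length,
    "ok" => counts["ok"],
    "caution" => counts["caution"],
    "broken" => counts["broken"],
    "pending" => counts["pending"],
  }
end

# Hypothetical helper: true when totals, status and completed_at agree
# with each other and with the links array.
def consistent_batch?(report)
  completed = report["status"] == "completed"

  compute_totals(report["links"]) == report["totals"] &&
    completed == report["totals"]["pending"].zero? &&
    completed == !report["completed_at"].nil?
end
```

A check like this can be useful in client tests, for example to confirm that a stubbed BatchReport fixture is internally coherent.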