Repository: govuk-graphql
Experimental high-performance GraphQL API for GOV.UK
- GitHub
- govuk-graphql
- Ownership
- #govuk-publishing-platform owns the repo. #govuk-publishing-platform-system-alerts receives automated alerts for this repo.
- Category
- Publishing apps
README
EXPERIMENTAL - DO NOT USE IN PRODUCTION
Ask Richard Towers if you want more details about this repository
This is a proof of concept GraphQL API for GOV.UK.
It uses Publishing API’s database to efficiently serve the content needed to render pages on GOV.UK.
The goal of this implementation is to be as fast as possible, at all costs. This means that the code uses some unfamiliar technology, and is at times harder to understand than other implementations, but it is also (much) faster.
This allows us to set a lower bound on what’s possible in terms of performance, so that we can make an informed decision on the trade off between code readability and performance.
Technologies
Like other proof of concepts, we use graphql-ruby as the main library to implement the GraphQL API.
We use the Sequel ORM instead of ActiveRecord, because the heart of this implementation is a complex SQL query which is difficult to construct using ActiveRecord.
Rather than requesting one edition at a time from the database, or batching requests up using a dataloader, this implementation:
- walks the graphql request using lookahead
- works out which link type paths are needed to resolve the request
- uses a single SQL query to fetch all the editions needed to resolve the request
- reconstructs the tree of editions requested in the graphql query from the database results
The resulting SQL query is complex, but initial testing suggests that its performance is very good.
<!– raw HTML omitted –>WITH RECURSIVE
"link_type_paths" AS (
SELECT trim_array(path, 1) as path, path[array_upper(path, 1)] as next
FROM json_to_recordset($1) AS paths(path text[])
),
"reverse_link_type_paths" AS (
SELECT trim_array(path, 1) as path, path[array_upper(path, 1)] as next
FROM json_to_recordset($2) AS paths(path text[])
),
"edition_links" AS (
SELECT 'root' AS "type", '{}'::text[] AS "path", ARRAY["editions"."id"]::int[] AS "id_path", "documents"."content_id" AS "content_id", "editions"."id" AS "edition_id", "editions".*
FROM "editions"
INNER JOIN "documents" ON ("documents"."id" = "editions"."document_id")
WHERE (("state" = 'published') AND ("locale" = 'en') AND ("base_path" = $3)
)
UNION ALL (
WITH "edition_links" AS (SELECT * FROM "edition_links")
SELECT 'forward edition' AS "type", ("edition_links"."path" || "link_type") AS "path", ("edition_links"."id_path" || "editions"."id") AS "id_path", "documents"."content_id" AS "content_id", "editions"."id" AS "edition_id", "editions".*
FROM "edition_links"
INNER JOIN "link_type_paths" ON ("link_type_paths"."path" = "edition_links"."path")
INNER JOIN "links" ON (("links"."edition_id" = "edition_links"."edition_id") AND ("links"."link_type" = "link_type_paths"."next"))
INNER JOIN "documents" ON (("documents"."content_id" = "links"."target_content_id") AND ("documents"."locale" = 'en'))
INNER JOIN "editions" ON (("editions"."document_id" = "documents"."id") AND ("editions"."state" = 'published')
)
UNION ALL (
SELECT 'reverse edition' AS "type", ("edition_links"."path" || "link_type") AS "path", ("edition_links"."id_path" || "editions"."id") AS "id_path", "documents"."content_id" AS "content_id", "editions"."id" AS "edition_id", "editions".*
FROM "edition_links"
INNER JOIN "reverse_link_type_paths" ON ("reverse_link_type_paths"."path" = "edition_links"."path")
INNER JOIN "links" ON (("links"."target_content_id" = "edition_links"."content_id") AND ("links"."link_type" = "reverse_link_type_paths"."next"))
INNER JOIN "editions" ON (("editions"."id" = "links"."edition_id") AND ("editions"."state" = 'published'))
INNER JOIN "documents" ON (("documents"."id" = "editions"."document_id") AND ("documents"."locale" = 'en'))
)
UNION ALL (
SELECT 'forward link set' AS "type", ("edition_links"."path" || "link_type") AS "path", ("edition_links"."id_path" || "editions"."id") AS "id_path", "documents"."content_id" AS "content_id", "editions"."id" AS "edition_id", "editions".*
FROM "edition_links"
INNER JOIN "link_type_paths" ON ("link_type_paths"."path" = "edition_links"."path")
INNER JOIN "link_sets" ON ("link_sets"."content_id" = "edition_links"."content_id")
INNER JOIN "links" ON (("links"."link_set_id" = "link_sets"."id") AND ("links"."link_type" = "link_type_paths"."next"))
INNER JOIN "documents" ON (("documents"."content_id" = "links"."target_content_id") AND ("documents"."locale" = 'en'))
INNER JOIN "editions" ON (("editions"."document_id" = "documents"."id") AND ("editions"."state" = 'published'))
)
UNION ALL (
SELECT 'reverse link set' AS "type", ("edition_links"."path" || "link_type") AS "path", ("edition_links"."id_path" || "editions"."id") AS "id_path", "documents"."content_id" AS "content_id", "editions"."id" AS "edition_id", "editions".*
FROM "edition_links"
INNER JOIN "reverse_link_type_paths" ON ("reverse_link_type_paths"."path" = "edition_links"."path")
INNER JOIN "links" ON (("links"."target_content_id" = "edition_links"."content_id") AND ("links"."link_type" = "reverse_link_type_paths"."next"))
INNER JOIN "link_sets" ON ("link_sets"."id" = "links"."link_set_id")
INNER JOIN "documents" ON (("documents"."content_id" = "link_sets"."content_id") AND ("documents"."locale" = 'en'))
INNER JOIN "editions" ON (("editions"."document_id" = "documents"."id") AND ("editions"."state" = 'published')))
)
)
SELECT * FROM "edition_links"
<!– raw HTML omitted –>
Approach
This implementation takes a slightly different approach to the GraphQL schema.
The goal is still to be able to produce responses which are compatible with content-store. However, the schema is designed to be a bit more flexible, allowing queries for arbitrary link types in both forward and reverse directions.
Specifically, rather than having an explicit field for level_one_taxons
, and the implementation knowing that this
is actually a reverse link of type root_taxon
, the schema has a links_of_type
field which has arguments to specify
the link type and direction. We can use graphql aliases to make the query look like the content-store response.
# Instead of:
level_one_taxons {
base_path
content_id
details
document_type
...
}
We have:
level_one_taxons: links_of_type(type: "root_taxon", reverse: true) {
base_path
content_id
details
document_type
...
}
A more complete query (for the GOV.UK homepage) might look like this:
query homepage {
edition(base_path: "/") {
analytics_identifier
base_path
content_id
description
details
document_type
first_published_at
links {
level_one_taxons: links_of_type(type: "root_taxon", reverse: true) {
base_path
content_id
details
document_type
}
primary_publishing_organisation: links_of_type(type: "primary_publishing_organisation") {
base_path
title
details
document_type
}
}
phase
public_updated_at
publishing_app
publishing_request_id
rendering_app
schema_name
title
updated_at
}
}
Performance
The homepage query above completes in about ~35-60ms on my local machine, of which ~13-20ms is spent in the database:
# Homepage load times
Sequel::Postgres::Database (12.7ms)
Completed 200 OK in 35ms (Views: 0.6ms | GC: 0.0ms)
Sequel::Postgres::Database (18.9ms)
Completed 200 OK in 54ms (Views: 0.7ms | GC: 0.0ms)
Sequel::Postgres::Database (12.1ms)
Completed 200 OK in 34ms (Views: 0.7ms | GC: 0.0ms)
More complicated pages such as the ministers index page take longer to load, but still complete in well under a second:
<!– raw HTML omitted –>fragment Person on Edition {
title
base_path
details
links {
role_appointments: links_of_type(type: "person", reverse: true) {
links {
role: links_of_type(type: "role") {
title
base_path
}
}
}
}
}
fragment Department on Edition {
base_path
links {
ordered_ministers: links_of_type(type: "ordered_ministers") {
base_path
}
ordered_roles: links_of_type(type: "ordered_roles") {
content_id
}
}
}
query ministers_index {
edition(base_path: "/government/ministers") {
title
links {
ordered_cabinet_ministers: links_of_type(type: "ordered_cabinet_ministers") {
...Person
}
ordered_also_attends_cabinet: links_of_type(
type: "ordered_also_attends_cabinet"
) {
...Person
}
ordered_ministerial_departments: links_of_type(
type: "ordered_ministerial_departments"
) {
...Department
}
ordered_assistant_whips: links_of_type(type: "ordered_assistant_whips") {
...Person
}
ordered_baronesses_and_lords_in_waiting_whips: links_of_type(
type: "ordered_baronesses_and_lords_in_waiting_whips"
) {
...Person
}
ordered_house_lords_whips: links_of_type(type: "ordered_house_lords_whips") {
...Person
}
ordered_house_of_commons_whips: links_of_type(
type: "ordered_house_of_commons_whips"
) {
...Person
}
ordered_junior_lords_of_the_treasury_whips: links_of_type(
type: "ordered_junior_lords_of_the_treasury_whips"
) {
...Person
}
}
}
}
<!– raw HTML omitted –>
# Ministers index load times
Sequel::Postgres::Database (120.2ms)
Completed 200 OK in 380ms (Views: 4.6ms | GC: 28.1ms)
Sequel::Postgres::Database (107.5ms)
Completed 200 OK in 331ms (Views: 3.7ms | GC: 5.4ms)
Sequel::Postgres::Database (101.1ms)
Completed 200 OK in 379ms (Views: 4.2ms | GC: 37.0ms)
Where to look at the code?
The big scary SQL query is in app/graphql/resolvers/expanded_edition_resolver.rb
.
The code to walk the graphql loookahead and build link type paths, and the code to build a tree from
the database results is in lib/tasks/path_tree_helpers.rb
.