Find usage of Govspeak in content
Govspeak is an extension for Markdown used in GOV.UK’s publishing applications.
After making a change to the Govspeak gem, you may need to republish content that uses the affected Markdown.
In nearly all cases, Govspeak is converted to HTML within the publishing application before being sent to Publishing API. This means there are two options for finding affected content: searching for raw Govspeak within the publishing application or converted HTML in Publishing API.
Searching for raw Govspeak in Whitehall
There is a Rake task to find published content that matches a regular expression:
rake 'reporting:matching_docs[regex]'
Replace regex with an escaped version of the regular expression. Ruby’s Regexp.escape method might escape too much; in general you’ll need to escape things like backslashes and square brackets. For example, to find all uses of inline attachments contained within steps, you would use the following regex:
^s[0-9]+\..*?\[AttachmentLink:.*?\].*$
This would be escaped and used in the rake task as follows:
rake 'reporting:matching_docs[^s\[0-9\]+\\..*?\\\[AttachmentLink:.*?\\\].*$]'
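Before running the task, it can help to check the pattern against a few sample lines and to generate the escaped form mechanically. The following is a minimal Ruby sketch, not part of Whitehall: escape_for_rake_arg is a hypothetical helper that assumes backslashes and square brackets are the only characters that need escaping, and the sample Govspeak lines are invented for illustration. Ruby's regex engine is not identical to MySQL's, so treat a local match as a sanity check only.

```ruby
# The unescaped pattern from the example above.
PATTERN = /^s[0-9]+\..*?\[AttachmentLink:.*?\].*$/

# Invented sample lines: only the first is a step containing an inline attachment.
SAMPLES = [
  "s1. Download [AttachmentLink:form.pdf] and fill it in",
  "s2. Send the form by post",
  "Read [AttachmentLink:guide.pdf] before you start",
].freeze

# Print the sample lines that the pattern matches.
puts SAMPLES.select { |line| line.match?(PATTERN) }

# Hypothetical helper (an assumption, not a Whitehall API): prefix every
# backslash and square bracket with a backslash, following the escaping
# pattern used in the example above.
def escape_for_rake_arg(source)
  source.gsub(/[\\\[\]]/) { |char| "\\#{char}" }
end

# Print the full command with the escaped pattern substituted in.
puts "rake 'reporting:matching_docs[#{escape_for_rake_arg(PATTERN.source)}]'"
```

If your regex contains other characters that the shell or Rake treats specially, such as commas or single quotes, this helper will not handle them and you would need to escape them by hand.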
You could try running the Rake task on a whitehall-admin pod using the instructions in Run a rake task on EKS. However, you might find that the task quickly times out with errors like these:
ActiveRecord::StatementInvalid: Mysql2::Error: Timeout exceeded in regular expression match. (ActiveRecord::StatementInvalid)
Mysql2::Error: Timeout exceeded in regular expression match. (Mysql2::Error)
In this case, you can:
- Replicate the data locally.
- Adjust your MySQL regex timeout by running the following in your govuk-docker directory:
  - docker exec -it govuk-docker-mysql-8-1 mysql --user=root --password=root
    If you get a No such container error, try starting up Whitehall in GOV.UK Docker and then stopping it.
  - SET GLOBAL regexp_time_limit=20000;
    You could try a shorter limit to start. This variable is explained in the MySQL docs.
- Run the Rake task locally with GOV.UK Docker.
Searching for converted HTML in Publishing API
Follow the instructions to open a Rails console for Publishing API.
Example commands
Here are some example commands you can run. Feel free to modify the regex for your specific use case (and add more here if you fancy :))
Each command takes a few minutes to run because it iterates over a lot of editions.
Find ‘call to action’
Edition.where.not(content_store: nil).find_each { |e| puts "https://gov.uk#{e.base_path}" if e.details.to_s =~ /class=\\"call-to-action/ }
Find YouTube links
Edition.where.not(content_store: nil).find_each { |e| puts "https://gov.uk#{e.base_path}" if e.details.to_s =~ /href=\\"https:\/\/www\.youtube\.com\/watch\?v=/ }
Find hardcoded buttons
Edition.where.not(content_store: nil).find_each { |e| puts "https://gov.uk#{e.base_path}" if e.details.to_s =~ /class=\\"button/ }
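Why the patterns match a backslash before each double quote
The commands above search the string form of details, not the raw HTML. The following standalone Ruby sketch illustrates why that matters, assuming details behaves like a plain Ruby Hash. SAMPLE_DETAILS and details_match? are invented for illustration and are not part of Publishing API.

```ruby
# Invented stand-in for an edition's details (an assumption for illustration).
SAMPLE_DETAILS = {
  body: '<div class="call-to-action"><a href="https://www.youtube.com/watch?v=abc123">Watch</a></div>',
}.freeze

# Hypothetical helper mirroring the `e.details.to_s =~ ...` check used above.
def details_match?(details, pattern)
  details.to_s.match?(pattern)
end

# Hash#to_s returns the inspected form, in which every double quote inside a
# string value is written as \" (a backslash followed by a quote).
puts SAMPLE_DETAILS.to_s

puts details_match?(SAMPLE_DETAILS, /class=\\"call-to-action/) # true
puts details_match?(SAMPLE_DETAILS, /class="call-to-action/)   # false: a backslash sits between = and "

# A literal question mark must be escaped; unescaped, it makes the "h" optional.
puts details_match?(SAMPLE_DETAILS, /watch\?v=/) # true
puts details_match?(SAMPLE_DETAILS, /watch?v=/)  # false
```

Trying a new pattern against a small hash like this is much quicker than iterating over every edition to discover that it never matches.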