Search synchronisation errors
A document doesn’t appear in site search after it is published
When a document is published, Publishing API places a message on RabbitMQ’s search_api_published_documents queue.
Search API v2’s document_sync_worker listens to this queue and co-ordinates the creation of a PublishingAPIDocument, which is then synchronised to the VAIS datastore.
If a document has been successfully published and is live on gov.uk, but is not being returned in search results, the first step is to confirm whether this is the expected behaviour.
- Check the locale of the missing content_item. The document_sync_worker only sends documents with an en locale to the datastore.
- Check the document_type of the missing ContentItem. If it is on the document_type_ignorelist, the content has been intentionally ignored by the worker.
- Check the state of the missing ContentItem. Withdrawn content is desynchronised, i.e. removed from the datastore.
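The three checks above can be combined into one helper that lists every reason a document would be intentionally absent from search. This is a minimal sketch and not the worker's actual code: the method name, the hash keys (including the `withdrawn` flag) and the example ignorelist entry are all assumptions made for illustration.

```ruby
# Hypothetical ignorelist entry; the real document_type_ignorelist is
# defined in search-api-v2 and will contain different values.
EXAMPLE_DOCUMENT_TYPE_IGNORELIST = %w[example_ignored_type].freeze

# Returns the reasons why a document would intentionally be missing from
# search. An empty array means none of the three checks explains the absence.
# The content_item hash and its keys are assumed for this sketch.
def reasons_not_in_search(content_item, ignorelist: EXAMPLE_DOCUMENT_TYPE_IGNORELIST)
  reasons = []
  reasons << "locale is not en" unless content_item[:locale] == "en"
  reasons << "document_type is on the ignorelist" if ignorelist.include?(content_item[:document_type])
  reasons << "document is withdrawn" if content_item[:withdrawn]
  reasons
end
```

If the helper returns an empty array for the missing document, none of the expected-behaviour checks applies and further debugging is needed.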
If you are sure that the document should be visible in results, you can try the following debugging steps:
- Check #govuk-search-alerts for any helpful synchronisation errors coming from search-api-v2.
- Inspect the application logs for the search-api-v2-worker in Kibana. A quick, hacky way to find the logs is to set the appropriate time window and search for ‘ “DiscoveryEngine” AND “ ” ’. A more systematic approach is to filter by kubernetes.labels.app_kubernetes_io/name: search-api-v2 worker and search for the content_id.
- Look for a message that explains why the worker has failed to synchronise the document. One possible cause is that the payload version of the message received from the queue is lower than the version recorded the previous time the document was synced. If this is the cause, clear the Redis cache and then resynchronise the document.
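To illustrate the payload version cause described above, here is a minimal sketch of a version check. It uses an in-memory hash as a stand-in for the Redis cache; the class and method names are hypothetical and the real worker may implement this differently.

```ruby
# Hypothetical in-memory stand-in for the Redis-backed record of the
# payload version last synced for each content_id.
class ExampleVersionCache
  def initialize
    @last_synced = {}
  end

  # Returns true (and records the version) when the message should be
  # processed, false when its payload version is lower than the one
  # recorded the last time the document was synced.
  def accept?(content_id, payload_version)
    last = @last_synced[content_id]
    return false if last && payload_version < last

    @last_synced[content_id] = payload_version
    true
  end

  # Stand-in for clearing the cached version before resynchronising.
  def clear(content_id)
    @last_synced.delete(content_id)
  end
end

cache = ExampleVersionCache.new
cache.accept?("example-content-id", 10) # => true
cache.accept?("example-content-id", 9)  # => false, version is lower than the last sync
cache.clear("example-content-id")
cache.accept?("example-content-id", 9)  # => true, nothing cached to compare against
```

This shows why clearing the cache matters: until the recorded version is removed, a resynchronisation carrying a lower payload version would keep being rejected.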
An unpublished/withdrawn document is still present in search results
When a document is withdrawn or unpublished via the UI in Whitehall, a message is placed on RabbitMQ’s search_api_published_documents queue.
Search API v2’s document_sync_worker listens to this queue and co-ordinates the desynchronisation of the document from the VAIS datastore if the document type is on the UNPUBLISH_DOCUMENT_TYPES list.
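The document type check described above can be sketched as follows. The list values here are assumed for illustration only and should be checked against the UNPUBLISH_DOCUMENT_TYPES constant in the search-api-v2 source; the method name is hypothetical, and the sketch deliberately ignores the worker's other checks (locale, ignorelist, withdrawn state).

```ruby
# Assumed values for illustration only; the authoritative list is the
# UNPUBLISH_DOCUMENT_TYPES constant in search-api-v2.
EXAMPLE_UNPUBLISH_DOCUMENT_TYPES = %w[gone redirect substitute vanish].freeze

# Decides, from the document type alone, whether a message would lead to
# the document being removed from or written to the datastore.
def sync_action(document_type, unpublish_types: EXAMPLE_UNPUBLISH_DOCUMENT_TYPES)
  unpublish_types.include?(document_type) ? :desynchronise : :synchronise
end
```

A message whose document type is not on the list would not trigger desynchronisation under this check, which is one thing to rule out when an unpublished document lingers in results.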
If an unpublished document is still visible in search results, you can take the following debugging steps:
- Confirm whether the document_sync_worker received a request to update the document. Do this by searching for the content_id in Kibana.
- Check for a message which explains why the desynchronisation failed.
- If there is no log for this request, it could be that the document was manually deleted from the Whitehall database, so no message was placed on the search_api_published_documents queue. If this is the case, you can manually delete the document from the datastore.
Wider synchronisation issues
When documents are published or unpublished via a publishing app, a message is placed on RabbitMQ’s search_api_published_documents queue.
Search API v2’s document_sync_worker listens to this queue and should co-ordinate the creation of a PublishingAPIDocument that is then synchronised to or desynchronised from the VAIS datastore.
If documents that have been successfully published or unpublished on gov.uk are not being returned as expected in search results, it could be a sign that the synchronisation process is failing for all documents. You can take the following debugging steps:
- Check #govuk-search-alerts for any helpful synchronisation errors coming from search-api-v2.
- Confirm whether the document_sync_worker received requests to update the documents. Do this by searching in Kibana by content_id for a few example documents.
- Check for messages which explain why the synchronisation or desynchronisation failed.
- One cause is that the payload version of the message received from the queue is lower than the version recorded the previous time the document was synced. If this is the cause, clear the Redis cache and then resynchronise all documents.
- If you are seeing a lot of Google::Cloud errors in the logs, the likely cause is that VAIS is having issues. Check the error rates for site search and follow the steps for what to do if site search is unavailable.