Fetch analytics data for search failed
This check reports the latest build state of a nightly job in production Jenkins. The job updates all documents in the search index with page view data from Google Analytics.
If the job does not run, the only consequence is that the popularity data is slightly out of date. Do not run this job during working hours.
This job has two parts:
- Page view data is fetched from Google Analytics. This step sometimes appears to hang, resulting in stuck jobs.
- The search index is locked and the popularity score of every page is updated.
If the job fails or gets stuck in part 1, you can cancel it and leave it to run the next day. You may see the following error in the Jenkins job output:
jenkins.util.io.CompositeIOException: Unable to delete '/var/lib/jenkins/workspace/search-api-fetch-analytics-data'.
This is because the job runs a Docker container as root, and if the job fails, the files it created are not cleaned up properly. You can check by running the following commands on the machine:
$ sudo su - jenkins
$ ls -al /var/lib/jenkins/workspace/search-api-fetch-analytics-data
total 20
drwxr-xr-x  5 jenkins jenkins 4096 Dec 31 21:17 .
drwxr-xr-x 78 jenkins jenkins 4096 Feb 15 14:18 ..
drwxr-xr-x  4 jenkins jenkins 4096 Dec 31 21:17 analytics_fetcher
drwxr-xr-x  2 root    root    4096 Dec 23 04:10 cache
drwxr-xr-x  3 jenkins jenkins 4096 Dec 31 21:17 gapy
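In a large workspace it can be hard to spot the root-owned entries by eye. As a minimal sketch, the helper below uses find to list everything under a directory that is not owned by a given user. The function name list_not_owned_by is hypothetical and not part of the job or of Jenkins.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Jenkins job): list every entry under
# a directory that is NOT owned by the given user name or numeric user ID.
# Usage: list_not_owned_by DIR USER
list_not_owned_by() {
    dir="$1"
    owner="$2"
    # "! -user" negates find's ownership test, so only entries owned by
    # someone else are printed. No output means everything belongs to USER.
    find "$dir" ! -user "$owner" -print
}

# Example, on the Jenkins machine (path taken from the error message above):
# list_not_owned_by /var/lib/jenkins/workspace/search-api-fetch-analytics-data jenkins
```

For a workspace like the one listed above, the output would include the cache directory, which is owned by root.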
The workspace, including the root-owned files, can be removed by running:
$ sudo rm -r /var/lib/jenkins/workspace/search-api-fetch-analytics-data
The job should then run successfully on its next overnight run.
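Because the removal command above runs rm -r with sudo, a mistyped path could delete far more than intended. The following is a sketch of a guarded alternative, assuming workspaces live directly under /var/lib/jenkins/workspace as in the error message. The function name remove_workspace and the WORKSPACE_ROOT variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: remove one Jenkins workspace by name, refusing any
# name that could resolve outside the workspace root.
# WORKSPACE_ROOT is an assumption based on the path in the error message;
# override it if your workspaces live elsewhere.
WORKSPACE_ROOT="${WORKSPACE_ROOT:-/var/lib/jenkins/workspace}"

remove_workspace() {
    name="$1"
    # Reject empty names and anything containing "/" or "..". This is
    # deliberately conservative: a name such as "a..b" is also refused.
    case "$name" in
        ""|*/*|*..*)
            echo "refusing to remove '$name'" >&2
            return 1
            ;;
    esac
    target="$WORKSPACE_ROOT/$name"
    if [ ! -d "$target" ]; then
        echo "no such workspace: $target" >&2
        return 1
    fi
    rm -r -- "$target"
}

# Example, from a root shell on the Jenkins machine:
# remove_workspace search-api-fetch-analytics-data
```

The function needs root privileges to delete root-owned files, so on the Jenkins machine it would have to be defined and called from a root shell rather than prefixed with sudo.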
If the job fails or gets stuck in part 2, the search indices may still be locked. If that’s the case, documents will be missing from search. First unlock the indices:
bundle exec rake search:unlock SEARCH_INDEX=all
Then see the “fix out-of-date search indices” docs.
If the job fails often, make sure the team currently responsible for search (or Platform Health) are aware of the problem.