Application: search-api-v2-dataform

Google Dataform workflow settings and pipeline definitions for GOV.UK Search

Ownership
#govuk-search
Category
Data engineering

README

What’s in this repo

This repo contains:

  • definitions/: Google Dataform SQLX pipeline definitions for processing and transforming GA4 data into Google Vertex AI Search datasets
  • workflow_settings.yaml: Google Dataform workflow settings that apply to all pipelines
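
For orientation, a Dataform workflow settings file generally looks something like the sketch below. The key names are standard Dataform settings, but every value here is a hypothetical placeholder and is not taken from this repo. Check the actual file for the real project, location and dataset names.

```yaml
# Illustrative sketch only: all values below are assumed placeholders,
# not the settings used by GOV.UK Search.
defaultProject: example-gcp-project        # GCP project that pipelines write to
defaultLocation: europe-west2              # BigQuery location for datasets
defaultDataset: dataform                   # dataset used when a definition names none
defaultAssertionDataset: dataform_assertions  # dataset for assertion results
dataformCoreVersion: 3.0.0                 # version of the Dataform core framework
```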

What’s not in this repo

The Terraform definitions for the Dataform resources, which provision the corresponding workflow and release configurations, live in govuk-infrastructure: alphagov/govuk-infrastructure/blob/main/terraform/deployments/search-api-v2/dataform.tf.

Usage

Install the Dataform CLI for local development.

No Dataform Workspace is currently set up for pipeline development in the Google Cloud Console. All development is done locally, then pushed progressively through each environment's branch. Each Search API v2 GCP project/environment has a separate Dataform release configuration that maps onto the corresponding branch (e.g. integration, staging).

Documentation

See the Google Dataform documentation.

Team

The GOV.UK Search team looks after this repo. If you’re inside GDS, you can find us in #govuk-search.