Case studies

Evidence that agentic AI can support real statistical production.

The first flagship story is the Samoa Agriculture Survey 2025 cleaning pilot: a realistic test of agent-assisted cleaning in a national statistics workflow.

Samoa Agriculture Survey 2025

Impact Engines partnered with the Samoa Bureau of Statistics on a cleaning workflow for de-identified survey data comprising 3,214 raw records, 1,438 variables, 47 linked files, and eight questionnaire versions.

The agent prepared review logs, inspected missingness and outliers, supported open-end recoding, generated modular Stata do-files, reran the pipeline, and produced a reproducible cleaned package for human review.

  • Final cleaned sample of 3,187 households.
  • 47 cleaned Stata files and a full reproducible package.
  • 610 review rows across missingness, outliers, validation errors, open ends, ISCO clarification, inconsistencies, and questionnaire-version routing issues.

[Figure: Survey data cleaning workflow diagram]