About Nesta

Nesta is an innovation foundation. For us, innovation means turning bold ideas into reality and changing lives for the better. We use our expertise, skills and funding in areas where there are big challenges facing society.

Using AI to generate policy blueprints

What if you could read hundreds of research studies and policy reports in minutes, and quickly pinpoint which ideas could have the greatest impact?

This is what we are aiming to achieve with Policy Atlas, an AI-powered tool that helps policymakers synthesise global evidence on what works and, in turn, supports more innovative and effective public policy.

We have built an early alpha version (a functioning prototype) of the app that demonstrates an end-to-end evidence-synthesis workflow for policymakers, from information retrieval to screening, to synthesis. While we continue to iterate and improve the tool, in this project update we walk through the current version and outline next steps.

Searching for evidence

Imagine you are researching options for financing and business models to design a new policy supporting uptake of green technologies, such as heat pumps. You want to know what has already been tried and whether it worked.

You begin by entering your search query in the Policy Atlas app.

The app guides you through a step-by-step process to refine your query and make it more specific. It suggests optional targeted or related sub-questions, rapidly generated by a large language model (LLM) from your initial query. You can then select any of these sub-questions, add your own, or continue with your original search query.
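As an illustration of this step, the sketch below asks an LLM for numbered sub-questions and parses its reply. The prompt wording and the parsing helper are our assumptions for the example, not Policy Atlas's actual implementation; a real system would send the prompt to an LLM API rather than use a canned reply.

```python
import re

# Hypothetical prompt template for generating sub-questions.
SUBQUESTION_PROMPT = (
    "A policymaker is researching: '{query}'.\n"
    "Suggest 3 more specific sub-questions, one per line, numbered 1-3."
)

def parse_subquestions(llm_reply: str) -> list[str]:
    """Extract numbered lines like '1. ...' from the model's reply."""
    questions = []
    for line in llm_reply.splitlines():
        match = re.match(r"\s*\d+[.)]\s*(.+)", line)
        if match:
            questions.append(match.group(1).strip())
    return questions

# Canned reply standing in for a real LLM API call:
reply = (
    "1. What financing models have increased heat pump uptake?\n"
    "2. Which subsidy designs were most cost-effective?\n"
    "3. How did uptake vary by household income?"
)
print(parse_subquestions(reply))  # three parsed sub-questions
```

Parsing a constrained, numbered format rather than free text makes the suggestions easy to render as selectable options in the app's workflow.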

You can then select search parameters, such as the evidence source. For academic publications, Policy Atlas uses OpenAlex, an open index of more than 250 million research publications. For policy documents, we are testing Overton, a proprietary index of roughly 18 million government publications and think-tank reports from around the world.
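OpenAlex exposes a free public API, so a search like this can be sketched concretely. The query text and filter choices below are illustrative assumptions about how a tool like Policy Atlas might use it, not its actual query logic.

```python
from urllib.parse import urlencode

# OpenAlex's public works endpoint (no API key required).
OPENALEX_WORKS = "https://api.openalex.org/works"

def build_openalex_url(query: str, from_year: int, to_year: int,
                       per_page: int = 25) -> str:
    """Build a search URL filtered to a publication-date window."""
    filters = ",".join([
        f"from_publication_date:{from_year}-01-01",
        f"to_publication_date:{to_year}-12-31",
    ])
    params = {"search": query, "filter": filters, "per-page": per_page}
    return f"{OPENALEX_WORKS}?{urlencode(params)}"

url = build_openalex_url("heat pump financing business models", 2015, 2025)
print(url)
# Fetching this URL (e.g. with requests.get(url).json()) returns a "results"
# list with the title, abstract and metadata of each matching work.
```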

You can also specify the time period and geography of interest for your search query, as well as further criteria for including or excluding documents from the results, such as the methodology used or the types of outcomes reported.

By building these optional parameters into the app’s workflow, we hope to help policymakers refine their query to the desired level of specificity and, as a result, generate a ‘policy blueprint’ that is more closely tailored to their needs.

Analysing policy options

After you start the search, Policy Atlas retrieves document summaries from the selected databases, screens them for relevance using AI and, where available, analyses full-text documents. The analysis identifies:

  • key issues in the policy area
  • interventions proposed or tested to tackle those issues
  • reported or proposed outcomes.
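A minimal sketch of the screening step, assuming the LLM is asked to return a structured JSON verdict for each document summary. The JSON schema and field names are our assumptions for illustration, not Policy Atlas internals; the canned replies stand in for real LLM calls.

```python
import json

def parse_screening_verdict(llm_reply: str) -> dict:
    """Parse a JSON verdict like {"relevant": true, "reason": "..."}."""
    verdict = json.loads(llm_reply)
    if not isinstance(verdict.get("relevant"), bool):
        raise ValueError("verdict must contain a boolean 'relevant' field")
    return verdict

def screen(summaries: list[dict], verdicts: list[str]) -> list[dict]:
    """Keep only the documents the model judged relevant."""
    kept = []
    for doc, reply in zip(summaries, verdicts):
        if parse_screening_verdict(reply)["relevant"]:
            kept.append(doc)
    return kept

docs = [{"title": "Heat pump subsidy trial"}, {"title": "Unrelated paper"}]
replies = [
    '{"relevant": true, "reason": "Tests a financing intervention."}',
    '{"relevant": false, "reason": "Off-topic."}',
]
print(screen(docs, replies))  # keeps only the first document
```

Validating the model's output before acting on it matters here: a malformed verdict should fail loudly rather than silently include or drop a document.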

You can then explore detailed information about specific interventions. We interpret “interventions” broadly: policy measures, technologies, business models and other initiatives. You can review tangible examples and outcomes, and click through to the original source documents.

Policy Atlas uses LLMs to estimate potential impact and evidence strength for interventions, on a scale from one to five. For evidence strength, one corresponds to anecdotal evidence, whereas five corresponds to a robust randomised controlled trial with a large sample. For predicted impact, one corresponds to speculative claims, whereas five indicates strong causal evidence. The LLM provides a short explanation for every rating it gives.
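The rating step can be sketched as validating the structured output an LLM returns for each intervention. The field names and schema below are assumptions for the example, not the app's actual format.

```python
import json

SCALE = range(1, 6)  # valid ratings: 1 to 5

def parse_rating(llm_reply: str) -> dict:
    """Validate a rating object: two 1-5 scores plus an explanation."""
    rating = json.loads(llm_reply)
    for field in ("evidence_strength", "predicted_impact"):
        if rating.get(field) not in SCALE:
            raise ValueError(f"{field} must be an integer from 1 to 5")
    if not rating.get("explanation"):
        raise ValueError("every rating needs a short explanation")
    return rating

# Example reply (hypothetical content):
reply = json.dumps({
    "intervention": "Zero-interest heat pump loans",
    "evidence_strength": 4,
    "predicted_impact": 3,
    "explanation": "Two quasi-experimental studies report increased uptake.",
})
print(parse_rating(reply)["evidence_strength"])  # → 4
```

Requiring the explanation field enforces the design choice described above: no rating is surfaced to the user without a justification attached.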

We take inspiration from established evidence frameworks, such as Nesta’s blueprint for halving obesity and the Education Endowment Foundation’s Teaching and Learning toolkit, which publish carefully researched lists of interventions with assessments of impact, rigour and even the cost of implementation. Our ambition is that analyses of this kind can be reliably automated to give policymakers a concise policy ‘blueprint’: a collection of the best-evidenced interventions, examples from across the world and guidance on applying them to their specific context.

Diving deeper into the evidence

Users can also navigate their results by key issues within a policy area, then explore interventions linked to a chosen issue. This helps policymakers focus on the underlying problems they are trying to solve, and trace how different interventions address similar challenges.

For deeper inquiry into the evidence, an AI chatbot can answer your questions using retrieval-augmented generation, which grounds its answers in the collected evidence and thereby reduces the risk of AI hallucinations.
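A toy sketch of the retrieval half of retrieval-augmented generation: rank evidence passages by word overlap with the question, then prepend the top passages to the prompt so the model answers from the collected evidence. Production systems typically rank by embedding similarity; word overlap is used here only to keep the example self-contained, and all passage text is invented.

```python
def retrieve(question: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages sharing the most words with the question."""
    q_words = set(question.lower().split())
    return sorted(
        passages,
        key=lambda p: -len(q_words & set(p.lower().split())),
    )[:k]

def build_prompt(question: str, passages: list[str]) -> str:
    """Assemble a grounded prompt from the retrieved evidence."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages))
    return (
        "Answer using ONLY the evidence below; say so if it is insufficient.\n"
        f"Evidence:\n{context}\nQuestion: {question}"
    )

corpus = [
    "Heat pump grants in Scotland raised installations by 20%.",
    "School meal reforms improved attendance.",
    "Low-interest loans for heat pumps had mixed uptake results.",
]
print(build_prompt("What happened to heat pump uptake?", corpus))
```

The explicit "use only the evidence below" instruction is what ties the model's answer back to the retrieved documents, which is the mechanism that reduces hallucination risk.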

Reliability and trustworthiness

User testing is highlighting improvements to be made across the app’s features, and our automated approach to identifying interventions and assessing evidence strength and impact is still at an early stage. We are currently developing evaluations to test the reliability of the analysis and to guide improvements.

For example, we need to assess whether the app reliably retrieves the most relevant information from academic and policy databases, as this underpins all subsequent analyses. We also need to evaluate how accurately the LLMs extract information from source documents, and whether the resulting conclusions align with expert-produced policy blueprints.

We are also exploring how to use expert input to verify and elevate the automated analysis. The ambition is that any UK policymaker could use the app to become rapidly better informed on any policy topic. There will, however, always be a need for domain expert validation. We would like to facilitate this through collaborative features on the app - for example, including expert rankings of interventions, or allowing experts to provide suggestions to include or exclude certain types of evidence.

What next?

We are conducting internal user research with the alpha version, developing evaluations and iterating to make the tool reliable and accurate. We are also considering what it would take to scale the app to reach all 35,000 policymakers in the UK civil service.

AI-powered evidence synthesis is a rapidly developing field; we are keen to work in the open, learn from best practice and share our learning. If this project aligns with your interests or expertise, please get in touch with the Mission Discovery team.

If you prefer a video version of the app demonstration, watch the demo on YouTube.

Author

Karlis Kanders

Head of Data Science for Discovery and Innovation

Karlis is the Head of Data Science for Discovery and Innovation, working in the Discovery and data science teams.

Shabeer Rauf

Principal Data Scientist, Data Science Practice

He/Him

Shabeer is a principal data scientist working in the Data Science practice.

Aidan Kelly

Junior Data Scientist, Data Science Practice

Aidan is a junior data scientist in the Data Science Practice, embedded in the sustainable future mission to focus on the reduction of carbon emissions from UK households.
