An update on our early version of an AI-powered tool for rapid evidence synthesis in policymaking
What if you could read hundreds of research studies and policy reports in minutes, and quickly pinpoint which ideas could have the greatest impact?
This is what we are aiming to achieve with Policy Atlas, an AI-powered tool that helps policymakers synthesise global evidence on what works and, in turn, supports more innovative and effective public policy.
We have built an early alpha version (a functioning prototype) of the app that demonstrates an end-to-end evidence-synthesis workflow for policymakers, from information retrieval to screening, to synthesis. While we continue to iterate and improve the tool, in this project update we walk through the current version and outline next steps.
Imagine you are researching options for financing and business models to design a new policy supporting uptake of green technologies, such as heat pumps. You want to know what has already been tried and whether it worked.
You begin by entering your search query in the Policy Atlas app.
Policy Atlas landing page
The app guides you through a step-by-step process to refine your query and make it more specific. It does this by suggesting optional targeted or related sub-questions, which a large language model (LLM) rapidly generates from your initial query. You can then select any of these sub-questions, add your own, or continue with your original search query.
Policy Atlas search page
You can then select search parameters, such as the evidence source. For academic publications, Policy Atlas uses OpenAlex, an open index of more than 250 million research publications. For policy documents, we are testing Overton, a proprietary index of roughly 18 million government publications and think-tank reports from around the world.
You can also specify the time period and geography of interest for your search query, and set further criteria for including or excluding documents from the results, such as the methodology used or the types of outcomes reported.
By including these optional parameters within the app’s user workflow, we hope to help policymakers refine their query to their desired level of specificity and, as a result, generate a ‘policy blueprint’ that is more closely tailored to their needs.
Policy Atlas refinement options
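To make the search step more concrete, here is a minimal sketch of how a query and a time period could be turned into a request to the public OpenAlex works endpoint. It only builds the URL and sends nothing. The function name and defaults are illustrative assumptions, and this is not a description of how Policy Atlas itself calls OpenAlex.

```python
from urllib.parse import urlencode

OPENALEX_WORKS_URL = "https://api.openalex.org/works"


def build_openalex_url(query, from_date=None, to_date=None, per_page=25):
    """Build an OpenAlex works-search URL (hypothetical helper; no request is sent)."""
    filters = []
    if from_date:
        filters.append(f"from_publication_date:{from_date}")
    if to_date:
        filters.append(f"to_publication_date:{to_date}")

    params = {"search": query, "per-page": per_page}
    if filters:
        # OpenAlex combines comma-separated filters with AND.
        params["filter"] = ",".join(filters)
    return f"{OPENALEX_WORKS_URL}?{urlencode(params)}"


url = build_openalex_url(
    "heat pump financing business models", from_date="2015-01-01"
)
print(url)
```

Geography and methodology criteria are left out here because they do not map cleanly onto a single OpenAlex filter; in practice they might be applied later, during screening.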
After you start the search, Policy Atlas retrieves document summaries from the selected databases, screens them for relevance using AI, and - where available - analyses full-text documents. The analysis identifies:
Policy Atlas summary 1
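The retrieve, screen and analyse flow described above can be sketched roughly as follows. This is an illustration under stated assumptions: `keyword_relevance` is a simple word-overlap stand-in for the AI relevance screening, and the `Document` class, threshold and sample records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    title: str
    summary: str
    full_text: Optional[str] = None  # not every record has full text


def keyword_relevance(query, doc):
    """Stand-in for an LLM relevance judgement: share of query words in the summary."""
    query_words = set(query.lower().split())
    summary_words = set(doc.summary.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & summary_words) / len(query_words)


def screen_and_select(query, docs, scorer=keyword_relevance, threshold=0.5):
    """Keep relevant documents and choose the richest text available for analysis."""
    selected = []
    for doc in docs:
        if scorer(query, doc) >= threshold:
            # Use the full text where available, otherwise fall back to the summary.
            text = doc.full_text if doc.full_text else doc.summary
            selected.append((doc.title, text))
    return selected


# Made-up records for illustration only.
docs = [
    Document("A", "heat pump grants raised uptake", full_text="Full report on heat pump grants."),
    Document("B", "school meals and attainment"),
    Document("C", "low-interest loans for heat pump installation"),
]
results = screen_and_select("heat pump", docs)
print(results)
```

Passing the scorer in as a parameter means the word-overlap stand-in could be swapped for a model-based judgement without changing the rest of the flow.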
You can then explore detailed information about specific interventions. We interpret “interventions” broadly: these could include policy measures, technologies, business models and other initiatives. You can review tangible examples and outcomes, and click through to the original source documents.
Policy Atlas summary 2
Policy Atlas uses LLMs to estimate potential impact and evidence strength for interventions, on a scale from one to five. For evidence strength, one corresponds to anecdotal evidence, whereas five corresponds to a robust randomised controlled trial with a large sample. For predicted impact, one corresponds to speculative claims, whereas five indicates strong causal evidence. The LLM provides a short explanation for each rating it gives.
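One way to keep ratings like these well-formed is to validate them as structured data before they are shown. The sketch below assumes a hypothetical `Rating` record that rejects scores outside the one-to-five scale and ratings without an explanation. The class names and example values are invented for illustration and do not reflect the app's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    score: int
    explanation: str

    def __post_init__(self):
        # Enforce the one-to-five scale and require a rationale for every rating.
        if not isinstance(self.score, int) or not 1 <= self.score <= 5:
            raise ValueError("score must be an integer from 1 to 5")
        if not self.explanation.strip():
            raise ValueError("every rating needs an explanation")


@dataclass(frozen=True)
class InterventionAssessment:
    intervention: str
    evidence_strength: Rating
    predicted_impact: Rating


# Made-up values for illustration only.
assessment = InterventionAssessment(
    intervention="Hypothetical heat pump loan scheme",
    evidence_strength=Rating(4, "Made-up example: study with a comparison group."),
    predicted_impact=Rating(3, "Made-up example: moderate effect, some causal support."),
)
print(assessment.evidence_strength.score, assessment.predicted_impact.score)
```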
We take inspiration from established evidence frameworks, such as Nesta’s blueprint for halving obesity and the Education Endowment Foundation’s Teaching and Learning toolkit, which publish carefully researched lists of interventions with assessments of impact, rigour and even the cost of implementation. Our ambition is that analyses of this kind can be reliably automated to give policymakers a concise policy ‘blueprint’: a collection of the best-evidenced interventions, examples from across the world and guidance on applying them to their specific context.
Users can also navigate their results by key issues within a policy area, then explore interventions linked to a chosen issue. This helps policymakers focus on the underlying problems they are trying to solve, and trace how different interventions address similar challenges.
Policy Atlas interventions
For deeper inquiry into the evidence, an AI chatbot can answer your questions using retrieval-augmented generation, which grounds its answers in the collected evidence, thereby reducing the risk of AI hallucinations.
Policy Atlas chatbot
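For readers unfamiliar with retrieval-augmented generation, the sketch below shows the basic idea: find the passages most related to a question, then hand only those passages to the model as numbered evidence. Word overlap stands in for the embedding-based search a real system would more likely use, no model is called, and the function names, prompt wording and passages are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, passages, k=2):
    """Rank passages by word overlap with the question (stand-in for embedding search)."""
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(p)), i) for i, p in enumerate(passages)]
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [passages[i] for score, i in ranked[:k] if score > 0]


def build_prompt(question, passages, k=2):
    """Assemble a prompt that restricts the model to the retrieved evidence."""
    context = retrieve(question, passages, k)
    numbered = "\n".join(f"[{n}] {p}" for n, p in enumerate(context, start=1))
    return (
        "Answer using only the numbered evidence below and cite the numbers you use. "
        "If the evidence does not answer the question, say so.\n\n"
        f"Evidence:\n{numbered}\n\nQuestion: {question}"
    )


# Placeholder passages for illustration only.
passages = [
    "Placeholder passage about heat pump grants.",
    "Placeholder passage about school meals.",
    "Placeholder passage about loans for heat pump installation.",
]
prompt = build_prompt("Which grants support heat pump uptake?", passages)
print(prompt)
```

The resulting prompt would then be sent to an LLM. Because the model sees only retrieved passages, its answer can be checked against numbered sources.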
Our user testing is highlighting improvements to be made across the app’s functionalities, and our automated approach to identifying interventions and assessing evidence strength and impact is still at an early stage. We are currently developing evaluations to test the reliability of the analysis and to guide improvements.
For example, we need to assess whether the app reliably retrieves the most relevant information from academic and policy databases, as this underpins all subsequent analyses. We also need to evaluate how accurately the LLMs extract information from source documents, and whether the resulting conclusions align with expert-produced policy blueprints.
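As one illustration of what a retrieval evaluation could look like, the sketch below computes recall at k: the share of documents that experts marked as relevant which appear in the top k search results. The choice of metric, the function name and the document identifiers are assumptions for illustration, not the evaluations being developed for the app.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Share of expert-marked relevant documents found in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant) / len(relevant)


# Made-up identifiers: ranked search results and an expert-curated relevant set.
retrieved = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d1", "d4", "d8"}

score_top3 = recall_at_k(retrieved, relevant, k=3)  # only d1 is in the top 3
score_top5 = recall_at_k(retrieved, relevant, k=5)  # d1 and d4 are in the top 5
print(round(score_top3, 2), round(score_top5, 2))
```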
We are also exploring how to use expert input to verify and elevate the automated analysis. The ambition is that any UK policymaker could use the app to quickly become better informed on any policy topic. There will, however, always be a need for validation by domain experts. We would like to facilitate this through collaborative features in the app - for example, including expert rankings of interventions, or allowing experts to suggest that certain types of evidence be included or excluded.
We are conducting internal user research with the alpha version, developing evaluations and iterating to make the tool reliable and accurate. We are also considering what it would take to scale the app to reach all 35,000 policymakers in the UK civil service.
AI-powered evidence synthesis is a rapidly developing field; we are keen to work in the open, learn from best practice and share our learning. If this project aligns with your interests or expertise, please get in touch with the Mission Discovery team.
If you prefer a video version of the app demonstration, watch the demo on YouTube.