Artificial intelligence (AI) techniques could play an important role in the mission to tackle COVID-19, from helping to discover new drugs and vaccines to predicting the spread of infection and testing patients. At the same time, many AI techniques are experimental, rely on big, sensitive datasets and might be difficult to apply in high-stakes domains such as health.
This report studies the levels, evolution, geography, knowledge base and quality of AI research in the COVID-19 mission field using a novel dataset taken from open preprints sites arXiv, bioRxiv and medRxiv, which we have enriched with geographical, topical and citation data.
Although there has been rapid growth in the levels of AI research to tackle COVID-19 since the beginning of the year, AI remains underrepresented in this area compared to its presence in research outside of COVID-19. So far in 2020, 7.1 per cent of research on COVID-19 references AI, while 12 per cent of research on topics outside COVID-19 references it. After growing rapidly earlier in the year, the share of AI papers in COVID-19 research has stagnated in recent weeks.
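The underrepresentation described above can be expressed as a simple ratio of the two shares. The sketch below is illustrative only: the function name `representation_ratio` is an assumption introduced here, not something taken from the report's own analysis, and the only inputs are the two percentages quoted in the text.

```python
def representation_ratio(share_in_field: float, share_outside: float) -> float:
    """Ratio of AI's share of research inside a field to its share outside it.

    Values below 1 indicate that AI is underrepresented in the field.
    (Hypothetical helper for illustration; not the report's method.)
    """
    if share_outside <= 0:
        raise ValueError("share_outside must be positive")
    return share_in_field / share_outside


# Shares quoted in the text: 7.1 per cent of COVID-19 research references AI,
# compared with 12 per cent of research outside COVID-19.
ratio = representation_ratio(7.1, 12.0)
print(f"AI representation ratio in COVID-19 research: {ratio:.2f}")
```

On these figures the ratio is below 1, meaning that AI's share of COVID-19 research is somewhat more than half of its share in research on other topics.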
More than a third of AI publications to tackle COVID-19 involve predictive analyses of patient data and in particular medical scans. AI is also being deployed to analyse social media data, predict the spread of the disease and develop biomedical applications.
China, the US, the UK, India and Canada are the global leaders in the development of AI applications to tackle COVID-19, accounting for 62 per cent of the institutional participations for which we have geographical data. China in particular is overrepresented in COVID-19 AI research. We have also identified a substantial number of publications involving institutions that we are unable to match with the global research institution database we are using. This is consistent with the idea that new actors are entering the COVID-19 mission field.
AI and non-AI researchers working in COVID-19 tend to draw on different bodies of knowledge. The share of citations that AI papers make to computer science is five times higher than that of non-AI papers, and their share of citations to medicine is a third lower. These differences hold even after we control for the topic within COVID-19 that different publications focus on.
In general, AI papers to tackle COVID-19 tend to receive fewer citations than other papers in the same topic. The population of AI researchers active in the COVID-19 mission field also tends to have a less established track record, as proxied by the citations they have received in recent years. This result holds when we compare researchers working in the same topics, suggesting that it is not simply driven by variation in the citation behaviours of different communities and disciplines.
Our analysis highlights the velocity with which research communities, including AI researchers, are mobilising to tackle the COVID-19 pandemic. We find many opportunities to apply powerful AI algorithms to prevent, diagnose and treat the virus. At the same time, deep learning algorithms’ reliance on big datasets, difficulties interpreting their findings, and a disconnect between AI researchers and relevant bodies of knowledge in the medical and biological sciences may limit the impact of AI in the fight against COVID-19. Two of our findings are consistent with the notion that AI may play a limited role in tackling this pandemic: the persistent underrepresentation of AI research in the COVID-19 field, and that research's focus on computer vision analyses, which play to the strengths of current algorithms but require substantial investments in hardware and changes to how hospitals work.
There is also the risk that researchers facing low barriers to entry into the field may produce low-quality contributions, making it harder to find valuable studies and discouraging interdisciplinary contributions that could take longer to develop. Our findings that AI research tends to be less cited than other research, even inside the same publication topics, and that AI researchers entering the field have a weaker track record on average than others, lend some support to these concerns.
How can we harness AI’s potential to tackle COVID-19 and future pandemics, while reducing some of its risks?
In the shorter term, creating bigger, higher-quality open datasets related to COVID-19 could make it easier to deploy state-of-the-art deep learning algorithms. Spurring interdisciplinary collaborations that bring together AI researchers and subject experts may help to prioritise those AI applications with the greatest relevance and value. It might also reduce the risk of ‘AI imperialism’, where AI researchers ignore relevant bodies of knowledge about the complex biological and social systems where their techniques will be applied, reducing the value of those techniques and creating unintended consequences. We also need technological and social solutions for the challenge of navigating a vast and fast-growing body of research of uncertain quality. Going forward, research funders should encourage the development of AI algorithms that are easier to deploy in small-data, high-stakes domains.
Novel data sources and methods, such as those we have used in our analysis, can play an important role in informing these strategies.
The dataset used in this report is open for other researchers to analyse and build on.