Findings – Topical composition
We use topSBM, a topic modelling algorithm, to analyse the composition of COVID-19 research and identify topical clusters based on the language in their abstracts. This analysis reveals 31 topical clusters involving 193 topics.
The bar chart at the top of Figure 4 shows the share of all artificial intelligence (AI) and non-AI COVID-19 papers in each cluster, and the heatmap below it shows how prevalent different topics (which we summarise with their most salient terms) are in those topical clusters. For example, the heatmap shows that a topic that uses the terms ‘classification, images and deep learning’ is prevalent in cluster 4, and a topic related to ‘Social Media, Twitter and tweets’ is prevalent in cluster 16.
The table ‘Highly-cited examples of AI research in different topical clusters’ summarises the content of some of the topical clusters with the highest levels of AI activity and gives examples of highly-cited AI publications in those clusters.
The first thing to note is some overlap in the themes of clusters, consistent with the idea that it is not trivial to classify many publications into a single area of activity. Having said this, we note that AI applications in the COVID-19 mission field include:
- Predictive analyses of medical scans to diagnose COVID-19 (clusters 4, 18 and 19).
- Analyses of digital (e.g. social media) data and development of digital solutions such as apps (cluster 16).
- Predictive analyses of the spread of the pandemic (clusters 5 and 10).
- Biomedical research for drug and vaccine discovery (clusters 12 and 14).
Over a third of AI applications to tackle COVID-19 involve predictive analyses of patient data.
In relative terms, AI is overrepresented among predictive analyses of image data (where almost all publications involve AI), as well as analyses of social media data. Although there is a large number of AI papers analysing the spread of COVID-19 (cluster 5), they are in the minority in that category.
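The distinction between absolute counts and relative representation can be made concrete with a small calculation. The sketch below is illustrative only: the function name `ai_representation`, the input format and the toy numbers are assumptions made for this example, not the data or code behind the report. It computes, for each cluster, the share of papers that involve AI and the ratio of that share to the AI share in the whole corpus (a ratio above 1 indicates overrepresentation).

```python
from collections import Counter


def ai_representation(papers):
    """Return {cluster: (ai_share, ratio_to_overall_ai_share)}.

    `papers` is an iterable of (cluster_id, is_ai) pairs. This input
    format is a hypothetical one chosen for the example.
    """
    papers = list(papers)
    total = Counter(cluster for cluster, _ in papers)
    ai = Counter(cluster for cluster, is_ai in papers if is_ai)

    n_ai = sum(ai.values())
    if n_ai == 0:
        raise ValueError("no AI papers: relative representation is undefined")
    overall_ai_share = n_ai / len(papers)

    result = {}
    for cluster, n in total.items():
        share = ai[cluster] / n  # AI share inside this cluster
        result[cluster] = (share, share / overall_ai_share)
    return result


# Toy data (invented): cluster 5 has more AI papers than cluster 4 in
# absolute terms, but they are a small minority of that cluster.
toy = (
    [(4, True)] * 9 + [(4, False)] * 1
    + [(5, True)] * 20 + [(5, False)] * 170
)

for cluster, (share, ratio) in sorted(ai_representation(toy).items()):
    print(f"cluster {cluster}: AI share {share:.2f}, ratio to overall {ratio:.2f}")
```

With these invented numbers, the hypothetical cluster 4 is strongly overrepresented in AI even though cluster 5 contains more AI papers, which mirrors the pattern described above.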
Figure 4: AI research to tackle COVID-19 focuses on predictive analyses of patient (especially imaging) data and analyses of social media data
Example AI papers:
AI-assisted CT imaging analysis for COVID-19 screening: Building and deploying a medical AI system in four weeks
COVID-CT-Dataset: A CT Scan Dataset about COVID-19
Automated Methods for Detection and Classification Pneumonia based on X-Ray Images Using Deep Learning.
Example AI papers:
A deep learning algorithm using CT images to screen for Corona Virus Disease (COVID-19)
Deep learning-based model for detecting 2019 novel coronavirus pneumonia on high-resolution computed tomography: a prospective study
Deep learning-based Detection for COVID-19 from Chest CT using Weak Label
COVID-19 Public Sentiment Insights and Machine Learning for Tweets Classification
A Machine Learning Application for Raising WASH Awareness in the Times of COVID-19 Pandemic
A Novel AI-enabled Framework to Diagnose Coronavirus COVID 19 using Smartphone Embedded Sensors: Design Study
COVID-19 Outbreak Prediction with Machine Learning
COVID-19 Infection Forecasting based on Deep Learning in Iran
Generating Similarity Map for COVID-19 Transmission Dynamics with
Deep Learning System to Screen Coronavirus Disease 2019 Pneumonia
EXPLAINABLE-BY-DESIGN APPROACH FOR COVID-19 CLASSIFICATION VIA CT-SCAN
A Machine Learning Solution Framework for Combating COVID-19 in Smart Cities from Multiple Dimensions
Severe acute respiratory syndrome-related coronavirus - The species and its viruses, a statement of the Coronavirus Study Group
Host and infectivity prediction of Wuhan 2019 novel coronavirus using deep learning algorithm
COVID-19 coronavirus vaccine design using reverse vaccinology and machine learning
Predicting commercially available antiviral drugs that may act on the novel coronavirus (2019-nCoV), Wuhan, China through a drug-target interaction deep learning model
Potentially highly potent drugs for 2019-nCoV
Large-scale ligand-based virtual screening for SARS-CoV-2 inhibitors using deep neural networks
Impacts of Social and Economic Factors on the Transmission of Coronavirus Disease 2019 (COVID-19) in China
Finding an Accurate Early Forecasting Model from Small Dataset: A Case of 2019-nCoV Novel Coronavirus Outbreak
Quantifying the effect of quarantine control in Covid-19 infectious spread using machine learning
COVID-19 Epidemic in Switzerland: Growth Prediction and Containment Strategy Using Artificial Intelligence and Big Data
Coronavirus Geographic Dissemination at Chicago and its Potential Proximity to Public Commuter Rail
COVID-19 and Company Knowledge Graphs: Assessing Golden Powers and Economic Impact of Selective Lockdown via AI Reasoning
Figure 5 presents the sources of articles in different clusters, distinguishing between AI and non-AI publications in each row. It shows that although clusters 4 and 18 focus on a similar topic (predictive analyses of image data), the former is dominated by publications from arXiv (which we assume primarily involve computer scientists) while the latter is dominated by publications from medRxiv (which we assume primarily involve medical scientists). Cluster 16, about analyses of social media data, is also dominated by publications from arXiv, while clusters 5 and 19 include a mix of computer science and medical science publications. AI papers focused on drug discovery (cluster 12) generally come from bioRxiv. By contrast, cluster 14, which is also about biomedical applications, includes many contributions from arXiv.
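A breakdown of this kind amounts to a cross-tabulation of source by cluster and AI status. The following is a minimal sketch under stated assumptions: the function names `source_shares` and `dominant_source`, the record format and the toy counts are hypothetical and are not taken from the analysis behind Figure 5.

```python
from collections import Counter, defaultdict


def source_shares(papers):
    """Return {(cluster, is_ai): {source: share_of_papers}}.

    `papers` is an iterable of (cluster_id, is_ai, source) tuples,
    a hypothetical record format used only for this example.
    """
    counts = defaultdict(Counter)
    for cluster, is_ai, source in papers:
        counts[(cluster, is_ai)][source] += 1

    shares = {}
    for key, counter in counts.items():
        n = sum(counter.values())
        shares[key] = {source: k / n for source, k in counter.items()}
    return shares


def dominant_source(shares_for_row):
    """Return the source with the largest share in one row."""
    return max(shares_for_row, key=shares_for_row.get)


# Toy data (invented counts): two clusters on a similar theme whose
# AI papers come mostly from different preprint servers.
toy = (
    [(4, True, "arXiv")] * 8 + [(4, True, "medRxiv")] * 2
    + [(18, True, "medRxiv")] * 7 + [(18, True, "arXiv")] * 3
)

shares = source_shares(toy)
for (cluster, is_ai), row in sorted(shares.items()):
    print(cluster, "AI" if is_ai else "non-AI", dominant_source(row), row)
```

Normalising within each (cluster, AI status) row, as here, makes rows of very different sizes comparable; whether the figure normalises in the same way is an assumption.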
Figure 5: What disciplines are focusing on what topical clusters of AI research?
2. We have removed 38 very generic topics that tend to appear in most papers.