Tracking Synthetic Biology
As part of a series of research grants for projects using online data to understand specific emerging technologies, Nesta part-funded Abdullah Gök and Philip Shapira’s research, which uses advanced text-mining techniques to help assess expectations, applications, and concerns about synthetic biology in a way that is useful for policymakers.
Synthetic biology involves redesigning biological components and systems found in the natural world or making new ones from scratch. Proponents expect synthetic biology to radically restructure many existing industries and create significant new ones. Synthetic biology is a domain of rapidly growing research interest. The UK is the world’s second largest publisher of research papers in synthetic biology, after the US and ahead of Germany, France and China. Commercial interest is increasing, with potential applications of synthetic biology spanning a broad swath of sectors: engineered plants in agriculture, synthetic biofuels in energy, synthetically designed natural fragrances in fine chemicals, and rapidly synthesised viruses to speed up vaccine development in healthcare. The UK government is investing a significant amount of resources in synthetic biology research, as well as supporting business development and regulation.
At the same time, synthetic biology’s ground-breaking prospects are accompanied by concerns about societal implications. These include the ethics of engineering nature, risks associated with uncontrolled (and controlled) release of synthetically-engineered organisms into the environment, bioterrorism, patents, effects on traditional agriculture and industries and the implications of synthesising life. As synthetic biology moves out of the lab, what applications are likely to emerge and who is developing them? What are the implications of those applications? Such questions require a better understanding of synthetic biology applications, expectations and concerns as well as insights into how these issues are discussed.
A new way of looking at synthetic biology
We analysed posts and discussions on social media to rapidly gather information about this emerging technology. The constraints of relying on conventional data sources, such as scientific papers (which report research findings but not who might use them) or patent applications (which often take time to be made public) make it attractive to explore what can (and cannot) be obtained through social media analysis. The everyday posting and responding on social media by millions of people from all walks of life has presented opportunities for social scientists to mine this data to explore questions ranging from changes in popular sentiment to the spread of flu symptoms. We analysed social media to see whether it can serve as a leading-edge indicator of policy-relevant issues and debates about synthetic biology.
In a study, supported in part by Nesta, we piloted an emerging technologies monitoring system based on Twitter and website data. We analysed nearly 21,000 tweets on the topic of synthetic biology posted by more than 8,200 unique users worldwide (in English) between January and May 2014.
Twitter is used by different actors for different reasons. Researchers working on synthetic biology share their ideas and new results, and comment on other research and the broader development of the field. Businesses use social media to promote their synthetic biology products, as well as their perspectives on policy and regulation. Non-profit groups, research sponsors, policymakers and members of the general public are also active in sharing their perspectives and comments on synthetic biology. Social media acts as a highly compressed mirror of more conventional dialogue. The format doesn’t lend itself to extended debate: a tweet is limited to 140 characters and is discoverable only as part of a long chronological list. But, as an indicator of what is happening in synthetic biology, where, and who is involved, social media offers timely clues.
Significantly, users frequently embed pointers to further sources of information in their tweets. We found that 86% of the synthetic biology tweets in our dataset included a URL, linking to a total of 4,443 unique webpages. For this reason, we use Twitter as an entry point to more detailed sources – we gather and text mine these webpages as the basis of our analysis. At least one-fifth of these websites are UK-based.
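As a rough illustration of how tweets can serve as an entry point to webpages, the following Python sketch extracts links from tweet text and reports the share of tweets containing a link and the set of unique URLs. It is not the pipeline used in the study: the function names, the regular expression, and the example tweets are all assumptions made for illustration, and treating a `.uk` hostname as "UK-based" is only a crude stand-in for however the study located websites.

```python
import re
from urllib.parse import urlsplit

# Simple pattern for http(s) links; real tweets often carry shortened
# links that would need to be resolved first (not done here).
URL_PATTERN = re.compile(r"https?://\S+")


def extract_urls(tweet_text):
    """Return the URLs found in one tweet, with trailing punctuation stripped."""
    return [u.rstrip(".,;:!?)\"'") for u in URL_PATTERN.findall(tweet_text)]


def summarise_links(tweets):
    """Return (share of tweets with at least one URL, set of unique URLs)."""
    unique_urls = set()
    tweets_with_link = 0
    for text in tweets:
        urls = extract_urls(text)
        if urls:
            tweets_with_link += 1
        unique_urls.update(urls)
    share = tweets_with_link / len(tweets) if tweets else 0.0
    return share, unique_urls


def is_uk_domain(url):
    """Crude proxy (an assumption): treat hostnames ending in .uk as UK-based."""
    host = urlsplit(url).hostname or ""
    return host.endswith(".uk")


# Invented example tweets, for illustration only.
example_tweets = [
    "New #synbio paper http://example.org/paper.",
    "Thoughts on engineering yeast",
    "Funding call https://example.ac.uk/call and http://example.org/paper",
]
share, urls = summarise_links(example_tweets)
print(f"{share:.0%} of tweets contain a link; {len(urls)} unique URLs")
```

The unique URLs would then be fetched and their page text passed on to the classification and text-mining steps.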
After extensive data cleaning, the first step in our analysis was to classify the webpages included in our dataset. We devised a three-layer classification based on the type of organisation that owned the website, the type of information that the page contains and the scope of the interest in synthetic biology. We deployed advanced text mining techniques to identify the applications of synthetic biology mentioned in webpages, the concerns raised, and expectations of governments. We ran formal statistical tests to verify our results; in this blog we present the findings visually.
- Community broadening. While much discussion about synthetic biology (about two thirds of webpages) is posted on specialist science, technology or synthetic biology websites, we also find considerable discussion (the other one third) on other kinds of websites. About half of all websites provide institutional and scientific information related to synthetic biology; the other half present news and opinion. Most websites discussing synthetic biology belong to universities, academic publishers and government organisations. But around one third are operated by individuals, non-governmental organisations, and companies. As commercial interest grows, we anticipate more material about synthetic biology on corporate sites, even though some companies may not disclose all details.
- Multiple applications. Specific applications of synthetic biology are mentioned in about one third of pages linked to synthetic biology tweets. Medical and healthcare applications (at around one-fifth of all application references) slightly edge out others, followed by applications related to energy and environment, food and agriculture, and consumer products. This widening of discussion about synthetic biology applications from medical domains to a broad array of resource, intermediate, and consumer sectors suggests that policymakers and other stakeholders need to consider how current regulatory systems will deal with the potential variety and diversity of synthetic biology applications. Military and non-peaceful applications of synthetic biology are discussed in only about 5% of webpages, in contrast to their representation in the academic literature.
- Concerns. The safety of synthetic biology is currently the most prominent issue, highlighted in about two-fifths of webpages linked to synthetic biology tweets. Ethical concerns and issues related to fairness and social justice are each discussed in around 20% of webpages. About one-tenth of webpages raise other issues including concerns related to hubris and religion. Again, there is a signal here to policymakers – and also to companies seeking to commercialise synthetic biology – to pay attention to safety, ethical, and other concerns and to address the issues raised.
- Linkages. Mapping applications to concerns reveals interesting results. A mention of a specific application significantly increases (by around two times) the probability that a webpage raises any type of concern; fairness and social justice concerns are linked with a mention of a specific application to a greater extent than other concerns. In other words, concerns about synthetic biology are amplified in discussions of a specific product.
- Expectations of government. Some 20% of the webpages discuss expectations of governments and regulatory bodies. Government is expected to intervene to address safety and social justice concerns, although expectations vary by specific applications.
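The linkage between applications and concerns described in the findings above can be pictured as a comparison of two proportions: the share of pages raising a concern among pages that mention an application, divided by the same share among pages that do not. The Python sketch below is a simplified stand-in, not the study's method (which used more advanced text mining and formal statistical tests). The keyword lists, function names, and example pages are all hypothetical, and the printed ratio illustrates only the arithmetic, not the study's results.

```python
# Hypothetical keyword lists (assumptions); substring matching is crude
# and would produce false positives on real pages.
APPLICATION_TERMS = {"biofuel", "vaccine", "fragrance", "crop"}
CONCERN_TERMS = {"safety", "risk", "ethic", "bioterror", "justice"}


def mentions_any(text, terms):
    """True if any term appears as a substring of the lower-cased text."""
    lowered = text.lower()
    return any(term in lowered for term in terms)


def concern_ratio(pages):
    """Ratio of P(concern | application) to P(concern | no application).

    Returns None when either group is empty or when no page without an
    application raises a concern, since the ratio is then undefined.
    """
    counts = {(a, c): 0 for a in (True, False) for c in (True, False)}
    for text in pages:
        has_app = mentions_any(text, APPLICATION_TERMS)
        has_concern = mentions_any(text, CONCERN_TERMS)
        counts[(has_app, has_concern)] += 1

    with_app = counts[(True, True)] + counts[(True, False)]
    without_app = counts[(False, True)] + counts[(False, False)]
    if with_app == 0 or without_app == 0 or counts[(False, True)] == 0:
        return None

    p_with = counts[(True, True)] / with_app
    p_without = counts[(False, True)] / without_app
    return p_with / p_without


# Invented page snippets, for illustration only.
example_pages = [
    "Synthetic biofuel raises safety questions",
    "A new vaccine platform announced",
    "Engineered crop trials and social justice",
    "Designer fragrance from yeast",
    "Conference announcement for researchers",
    "Lab safety training schedule",
    "University seminar series",
    "New funding call opens",
]
print(concern_ratio(example_pages))
```

A ratio above 1 means that concerns appear more often on pages mentioning an application; whether such a difference is statistically meaningful would still need a formal test.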
These results indicate the breadth and nature of discussion of applications, expectations and concerns related to synthetic biology in social media and on the web. Exchanges about applications come with discussion about safety and equity issues rather than less tangible ethical debates.
We are continuing to work on the rich dataset we have collected. We will present more detailed findings in open-access work, further explain the methods, and share our data when the project is complete.
Visualisation of the data available here