Good innovation policy requires accurate, reliable and timely data, but obtaining such data is becoming harder as the pace of change in the economy accelerates.
We have been working with new data sources and methods to close this gap between innovation as it happens and our ability to measure it. Our ultimate goal is to turn these data into information that policymakers can act on, supporting evidence-based decision making.
In his recently published independent review, Sir Charles Bean set out a radical new vision for the creation and use of economic statistics in the UK. Several of the recommendations in Bean’s report relate directly to Nesta and our partners’ areas of work over recent years, from the need to exploit new methods like web-scraping, to the requirement for skills and technology to support flexible exploitation of large datasets.
Today we have published ‘Innovation Analytics: A guide to new data and measurement in innovation’. This short report acts as a summary of our different strands of work in this area. We split this work up into three broad categories:
Administrative data sources such as the Inter-departmental Business Register (IDBR) and surveys like the Annual Business Survey and Labour Force Survey form the basis of official economic estimates that innovation policymakers use. Our work tries to more accurately define and measure innovative areas with these data, most notably the creative, hi-tech and information economies.
For example, our “dynamic” mapping approach to measuring creative industries takes official Standard Industrial Classification codes and provides a systematic way of grouping them into higher-level sectors based on where creative employees work (since many creative people work outside the traditional creative industries), instead of using ad hoc lists of sectors.
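The core of this dynamic approach can be sketched in a few lines of code: classify a SIC code as creative when the share of its workforce in creative occupations passes a threshold. The SIC codes, employment figures and threshold below are all illustrative, not the actual values used in our mapping.

```python
# A minimal sketch of the "dynamic mapping" idea: flag SIC codes as
# creative industries when their "creative intensity" - the share of
# employment in creative occupations - exceeds a threshold.
# All figures and the threshold are invented for illustration.

def creative_intensity(creative_jobs, total_jobs):
    """Share of employment in creative occupations for one SIC code."""
    return creative_jobs / total_jobs if total_jobs else 0.0

def classify_sic_codes(employment, threshold=0.3):
    """Return the SIC codes whose creative intensity exceeds the threshold."""
    return sorted(
        code
        for code, (creative, total) in employment.items()
        if creative_intensity(creative, total) > threshold
    )

# Hypothetical employment counts: {SIC code: (creative jobs, total jobs)}
employment = {
    "62.01": (5200, 8000),    # computer programming
    "74.10": (3100, 3500),    # specialised design
    "47.11": (200, 90000),    # retail (low creative intensity)
}
print(classify_sic_codes(employment))  # → ['62.01', '74.10']
```

The attraction of this rule-based approach is that it is systematic and repeatable: when employment data are updated, the sector grouping updates with them, rather than depending on a one-off expert list.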
But there are a number of well-known limitations in using official statistics to understand new and innovative sectors. Difficulties emerge in large part because official SIC codes are infrequently updated, and so do not capture new industries in a timely way. The surveys and administrative sources underlying official data also do not typically capture relational data - the ‘networks’ between individual innovators or places - that are of interest to innovation policy analysts.
As Sir Charles Bean put it in his recent review:
“The volume of data – both public and private – that can be employed in principle in measuring the economy, together with the technological capacity for handling it, has exploded as a result of the digital revolution.”
We and our partners have been expanding our use of new types of data, from gathering more specific detail by harvesting information from company websites, to better capturing the institutional relationships that exist beyond the firm (the meetups, networks and knowledge transfer behind successful innovation).
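Harvesting information from company websites can be as simple as extracting the text of a homepage and matching it against sector keyword lists. The sketch below uses Python's standard-library HTML parser; the sector names and keywords are invented for illustration and are far cruder than the text-analysis methods used in practice.

```python
from html.parser import HTMLParser

# Hypothetical keyword lists - a real classifier would use richer
# text-analysis methods and validated vocabularies.
SECTOR_KEYWORDS = {
    "fintech": {"payments", "banking"},
    "games": {"game", "studio"},
}

class TextExtractor(HTMLParser):
    """Collect the visible text of an HTML page as lowercase words."""
    def __init__(self):
        super().__init__()
        self.words = []

    def handle_data(self, data):
        self.words.extend(data.lower().split())

def classify_site(html):
    """Return the sectors whose keywords appear in the page text."""
    parser = TextExtractor()
    parser.feed(html)
    words = set(parser.words)
    return {sector for sector, kws in SECTOR_KEYWORDS.items() if kws & words}

homepage = "<html><body><h1>Acme Studio</h1><p>We build game worlds.</p></body></html>"
print(classify_site(homepage))  # matches the 'games' keywords
```

Run at scale over thousands of company websites, even a rough classifier like this can surface firms and activities that official sector codes miss.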
Big data isn’t only about volume, but also about variety. There is a lot to gain from combining data sources and methods, as we did in our recent Tech Nation analysis with Tech City UK. The project combined a dynamic mapping of the digital economy, data collated by web-scraping of company websites and job advertisement websites, and data sourced from online platforms like Meetup and Github. With these data we were able to build a more detailed picture of the digital economy and the sectors, skills and informal relationships that constitute it.
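The combination step itself is conceptually straightforward once each source has been reduced to figures for a common unit of analysis, such as a local area. The area codes, field names and numbers below are invented purely to show the shape of the merge.

```python
# A toy sketch of combining sources, assuming each has already been
# aggregated to counts per local area. All codes and figures are invented.
sic_mapping = {"E07000178": {"digital_firms": 420}}
web_scrape = {"E07000178": {"job_ads": 1300}}
meetups = {"E07000178": {"tech_meetup_members": 2100}}

def combine(*sources):
    """Merge per-area records from several sources into one picture."""
    merged = {}
    for source in sources:
        for area, fields in source.items():
            merged.setdefault(area, {}).update(fields)
    return merged

picture = combine(sic_mapping, web_scrape, meetups)
print(picture["E07000178"])
```

In practice most of the effort goes into the step this sketch assumes away: cleaning each source and matching entities (firms, places, people) consistently across them.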
It’s no good measuring just for the sake of it; policy-makers need to be able to extract value from the data, and they need to be able to interact with it in helpful and intuitive ways. This might be through the use of interactive data visualisations, which - in using ‘preattentive’ features like the efficient use of colour, size and shape in a single space - are particularly well suited to an era of increasingly large and complex datasets.
The web's connectivity also makes it possible to combine many different datasets, and update them in real time. We are able to move from static reports to interactive data tools whose users can interrogate the data in new ways. Moreover, the open data generated through this process can be harvested and further analysed by others, increasing our understanding of the innovation system, and the value that can be created from this data.
This is what we are aiming to do with our Welsh innovation analytics project: 'Arloesiadur'. In line with previous work, this project also sets out to understand how the people, skills and processes behind innovation policy might need to adapt to harness new data sources and methods.
These methods should still be regarded as experimental. They require dealing with messy data, which often have not been created for analytical purposes or quality assured by statisticians. As suggested in our video games study, it’s important that researchers openly share new techniques and methodologies for using these data so that they can be replicated, or the results reproduced and scrutinised. In our work we’re hoping to create something that is as open as possible, so others can overlay their data and develop new ways of interrogating the existing datasets.
We are actively looking for partners in this ambitious programme of work. We encourage potential collaborators to get in touch, including anyone with:
interesting datasets that can be analysed to help us better understand innovation;
policy-relevant research questions that can be addressed with these methods;
research funds to support this work;
analytical skills to extract and communicate information from these complex datasets.
Beyond our economy-related work, see, for example, The Political Futures Tracker. This project used web, social media and long-form text data to create a comprehensive picture of political themes, attitudes and ‘future thinking’ in the run-up to the 2015 UK General Election.