Maximising open data’s impact is about incentives and rethinking the boundary of the state
Government is making more and more of its data accessible as open data, with over 30,000 datasets now available. It's great that more public data is being opened up. We, as the taxpayers who have paid for its creation, should be able to obtain it in accessible form, and important things are being done with it.[1] Locking it away does nobody any good - vested interests aside. Nevertheless, making data freely available, even in a readily accessible form, does not on its own necessarily get the most out of it. Maximising open data's impact is also about the incentives for analysing it and, ultimately, rethinking the interface between the citizen and the state.
1. The expansion in public open data is happening while a lot of other data is becoming available, but data analysis skills are still in limited supply
Although access to data is a precondition for analysing it, the main cost in data analysis is often not the data itself but the skills and time needed to analyse it - an investment of time which, like all investments, may not always pay off and which has to be weighed against the returns to other data analysis options. This matters because, at the same time as public open data has increased, many new private sector data sources have become available from administrative systems, website analytics, web scraping, sensors and Application Programming Interfaces (APIs). These data sources are, implicitly, competing with public open data for analysts' attention. There is now more open public data, but there is more private data too (both open and closed), and in the short term it is much easier to increase the availability of open data than the supply of people with the skills to analyse it.
2. The fact that data is open access affects economic incentives to invest in analysing it
Opening up data gives companies access to a resource on which they can build products and businesses. Open access means that, relevant skills aside, there are few barriers to entering this market, allowing companies to compete, innovate and do good things with the data. If one company can access and analyse it, another can too. This does, though, potentially affect the incentives for investing in analysis. While the value of prominent large tech companies is not exclusively due to data (a lot of it is often about having a large user base to show adverts to), the commercial value of their data, and of many companies' data, derives from the fact that it is private and competitors cannot access it. The data is an asset of the business and, for reasons of commercial interest as well as data protection, it is often quite closed. This means that closed data can offer returns to analysis which may not always exist with open data - something particularly relevant given a shortage of data analysis skills. This is certainly not to say that very successful businesses cannot be built on open data, but it does mean that incentives in this area are not completely straightforward. There are reasons why companies can be keen to pay to secure access to key inputs, or to bring them in-house.
3. The incentives for volunteering to analyse open data are often personal
Impressive work is being done in data volunteering, such as that by Pro Bono Economics and DataKind. Open data makes it possible for people outside government to freely use their skills and public data to benefit society. However, government reliance on the volunteering of specialised skills to address public policy issues can raise equity issues: the skills are in limited supply, and people, understandably, are often motivated to help specific causes of personal interest. This may not correspond to need, and data analysis skills are probably not evenly distributed around the country.
The UK improving its data analysis skills would help address all these issues.[2] There are, though, other ways in which we can try to get more out of open data. Some are already happening; others are more long-term.
1. Making the successes of open data visible
One of the reasons why people invest a lot of resources in mining commodities like gold, gas and oil is the knowledge, from market prices, that they have value if extracted, processed and sold. Data, because of its heterogeneity, isn't quite like that. It typically isn't a homogeneous commodity with a clear market price - an end in itself. This is important because it means that the market signals for where data is most useful are not that clear. Many people have inferred its value indirectly from other markets, such as the rising salaries for data analysts and the valuations of successful tech companies. In the case of the analysis of open data for public benefit, things are even less clear, as the benefits may not always have a direct monetary value. It is therefore important to demonstrate the value that open data analysis can bring through practical applications, such as the work of the Open Data Institute (ODI), our recent Wise Council report on the use of data in local authorities, and the work being done on the London Office of Data Analytics, to ensure that lessons learnt are spread most effectively.[3] Indeed, one of the findings from Wise Council was that there is no natural audience for open data - it has to be nurtured and grown. When councils actively engage with local data people - programmers, civic enthusiasts and so on - and give them a structured way to get involved, more happens with open data.
2. Opening up data, but with strategic incentives: for example through challenge prizes
An issue with open data and data volunteering, from a policy perspective, is that they do not in themselves necessarily direct potential analysts towards the areas of greatest possible impact among the large amount of data available. Data challenge prizes are one way to address this: a particular issue or question is identified, and a reward made available for the best response to it using the data. We have been doing this with the Open Data Challenge Prizes, run jointly with the ODI across a range of areas.[4] The prize gives a clear incentive, and specifying a question in advance helps focus attention on areas of interest within a large amount of data. It also helps streamline the analysis, making it easier for third parties to engage, since the most time-consuming part of analysis is not always the technical data manipulation but working out which questions to ask in the first place.
Access to public data is not an abstract issue of principle: we can see its importance in the removal of environmental data from public websites in the US and the debates around it. Even when data is available, getting the most out of it, and out of data analysis in government, involves rethinking the role of open data in government in the longer term.
1. Using open data to discharge the democratic accountability functions of government more efficiently
For years government has hired hundreds of highly qualified analysts through a demanding recruitment process and then bogged them down in the machinery of democratic accountability, turning out identikit tables and answering idiosyncratic parliamentary questions. The accountability of government is an essential function, but the time spent fulfilling it has sometimes come at the expense of more in-depth and strategic analysis of arguably greater benefit. This has reduced the efficiency and impact of internal data analysis within government. Indeed, the recent Bean review of economic statistics described the Office for National Statistics as having in the past operated 'somewhat like a factory', with less exploratory research.[5] The review's recommendations and recent data science programmes within government are a welcome change in this direction.[6] A key role of open data should be to deliver the state's duty of democratic transparency as efficiently as possible, freeing up government's analytical functions to undertake research that helps give us better policies. We have found from our work on the use of data by local authorities that open data is already helping officials deal with freedom of information requests more efficiently.[7] Maximising this kind of benefit in the longer term probably involves more than just releasing data: ultimately it means building tools on top of open data to make government more transparent. It is also very important that the public sector remains a primary consumer of its own data since, in using it, it becomes clearer what the issues with it are and how they can be addressed.
2. Looking towards a future with open data supporting digital democracy (and vice versa)
A factor that limits the impact of data analysis outside government is that it is harder for external analysts to engage with the policy process: they are simply outside the system. Conversely, for new digital democracy initiatives to work effectively, citizens will need to be able to access government data in a readily usable form. This is why open data formed part of the pilots in the Nesta-managed D-CENT project on digital citizen engagement. Getting the most out of open data in public policy therefore involves rethinking the borders of the state through the digital democracy agenda. There is a symbiotic relationship between the two: the expansion of open data is a precondition for digital democracy, and maximising the impact of open data in public policy would be facilitated by digital democracy. Thinking about how best to achieve this is an exciting challenge for the future. Turning the state inside out, in a good way.
Acknowledgements: Thanks go to Katja Bego, Kostas Stathoulopoulos and Tom Symons for their helpful comments.
[1] Open Data Institute (2015), 'Open data means business: UK innovation across sectors and regions', London, UK. Seglias, A. (2017), 'Open data for better outcomes', Civil Service Quarterly, January 2017.
[2] Nesta (2015), 'Analytic Britain: Securing the right skills for the data driven economy'; and Mateos-Garcia, J., Bakhshi, H. and Windsor, G. (2015), 'Skills of the datavores', Nesta.
[3] Symons, T. (2015), 'Wise Council', Nesta. Copeland, C. and Collinge, A., 'London Office of Data Analytics Challenge workshop report', Nesta and GLA.
[4] Parkes, E. and Phillips, B. (2016),’Open data challenge series handbook’, Nesta and the ODI.
ODI (2017), ‘What works in open data challenges.’
The data challenge platform Kaggle provides another example of this, with companies making their data available through the platform (competitions with both private and publicly available data are possible); after the competition, the prize host pays the prize money in exchange for a licence to use the winning entry. This sort of mechanism allows companies to access a wide pool of data skills to solve their problems.
[5] Bean review (2016), ‘Independent Review of UK Economic Statistics’, p6.
[6] For a discussion of how the role of data analysis in government is changing, see Civil Service Quarterly (January 2017) and Government Transformation Strategy: better use of data (2017).
[7] Symons, T. (2015), op. cit., p32.
[8] Mulgan, G. (2016), ‘A new family of data commons’.