Collective action and anonymity - two tools to stimulate innovation in the data economy
Data is different from other assets like gold and oil in that almost all of us, as individuals, generate it. Giving us more control over our own data will stimulate innovation, but realising the full potential will also involve collective action and anonymisation.
Power to the people: Giving us control over our own data
We should be empowered to control our own data. The case for this rests on our part in its creation, our right to privacy for our personal information, the growing likelihood that this data will be processed in ways that affect us, and the economic and social good that can be realised from it.
There are mechanisms to encourage this in the impending European General Data Protection Regulation (GDPR), which contains a right to data portability for individuals’ data. Although part of data protection legislation, this right also has the objective of stimulating competition and innovation.
It is important that consumers switching companies can take their data with them, otherwise they risk becoming locked into their existing provider. For example, if you want to change email supplier but cannot take your emails to the new provider, this acts as a barrier to moving. Switching helps encourage competition between companies and helps generate improvements and innovation. This is likely to become more important as data is integrated into more services to personalise them.
Policy makers should not, however, assume that individual control of data is a panacea, as there are some practical issues with relying on individual choice to drive change.
Issues in individual control of data
- Consumers do not always engage with choice (or data protection). It has been argued that Google and Facebook are utilities, but we have seen in other utilities such as energy and banking that consumers often do not use the opportunity to move between suppliers even when there are relatively clear financial incentives to do so (and with many data products the financial incentives are not that clear as the products are free). There is also evidence that users do not engage with data protection economically despite saying they value it.
- There are practical limitations on the usefulness of the data that will be transferable. It has recently been argued by Vanberg and Ünver that the right to portability in GDPR may be limited in its effectiveness, as it need only apply to data provided by the user (as opposed to the related data companies generate from analysing it) and the data which can be exported may not necessarily be compatible when switching between company systems. This potentially limits the benefits to the individual of transferring their data to another provider.
- As some data companies earn their income from business sales, individual choice does not have the same effect as in other sectors. Google and Facebook, and other advertising-based data companies, earn most of their income from companies that pay for targeted adverts to consumers. This limits the cost to the platform of people switching to competitors, as those people are not directly generating revenue. There are negative effects on the platforms from individuals leaving, but they are less direct than where sales are made straight to consumers. In addition, two further factors apply:
- Low value of individual data. Much of the commercial value of an individual consumer's data is obtained by using the data of many people to inform targeted advertising and having the capability to show adverts to large numbers of people to maximise the chance they click on them. This means that the value of an individual’s data is arguably small, so there is less incentive for an individual to engage with it.
- Network effects. In the case of a social network like Facebook, there are network effects whereby the more people that use the platform the more attractive it is to use. This means people are less likely to unilaterally switch to another provider, as their decision to use the platform depends on other people’s decisions to use the platform. While not directly network based, the improved service companies can provide if they have more users, and so more data, can similarly help keep people on a platform.
Getting the most out of individual personal choice may therefore involve:
1. The Collective: allowing individuals' data to be collectively managed
The limited value of individual data, the limited bargaining power of the citizen, and their often limited engagement and skills in managing data are an argument for some kind of collective pooling and management of the data, with their consent, which empowers them. Individuals, while retaining control of their data, could choose for it to be managed in a collective format, either to obtain a better deal in terms of data privacy from internet platforms or to be used for the social good.
Given the scale at which large internet platforms operate, a collective intermediary would probably also have to operate at a large scale to exert influence on them. However, the greater public visibility of, and focus on, these issues that an intermediary would bring might help facilitate change. In addition, not all companies that use our data operate at large scales.
There may also be social benefits. For example, individuals might be able to donate their data for the social good. Even at a relatively small scale this could be useful. As an illustration, official national statistical surveys (at least in the UK) typically have sample sizes in the tens to hundreds of thousands. This is far below the scale of data used by many internet platforms, but the data is still very useful. It has been argued that cities are a particularly productive arena for collective data sharing due to their large populations and citizens' shared issues. We are working on this at Nesta through the DECODE project, with a series of practical city-level pilots in Amsterdam and Barcelona to highlight the opportunities in this area.
2. The known unknowns: Using anonymisation to align data analysis with data protection
An alternative way to stimulate innovation is to make data available in anonymised form, in ways that do not compromise the privacy of individuals or breach other legal restrictions on data sharing, such as those in competition law. Anonymising data is not straightforward, or without trade-offs, but it does offer another way to unlock its potential.
We are working on this approach at Nesta in a B2B (Business to Business) context with the Open Up Challenge, which builds on the emergence of "open banking" to address competition issues and stimulate innovation in small business banking. As part of this, challenge participants can access a data sandbox which provides real anonymised small business transaction data to help them develop useful innovations targeted at small businesses. Innovations supported through the Challenge include ways to make raising capital significantly easier and to radically reduce financial admin for microbusinesses. Academic research is also developing new technological solutions to anonymisation, such as techniques that allow data sets to be analysed while individuals' data itself remains private and accessible to its original owner only.
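As a rough illustration of how a data set can be analysed while each individual's data stays private, the sketch below uses additive secret sharing, one common building block of secure multi-party computation. It is an assumption chosen for illustration only: it does not describe how the Open Up Challenge sandbox or any particular research system works, and the function names, the modulus and the example figures are all hypothetical.

```python
import secrets

# Hypothetical modulus: all arithmetic is done modulo a large prime.
PRIME = 2**61 - 1


def make_shares(value, n_parties):
    """Split `value` into n shares that sum to it modulo PRIME.

    Any subset of fewer than n shares looks like random numbers
    and reveals nothing about `value`.
    """
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares):
    """Recombine shares into the value they encode."""
    return sum(shares) % PRIME


def private_sum(values, n_parties=3):
    """Compute the total of `values` without any party seeing a raw value."""
    # Each data owner splits their own value; party j gets the j-th share.
    all_shares = [make_shares(v, n_parties) for v in values]
    # Each party adds up the shares it holds (one from every owner).
    partial_sums = [sum(column) % PRIME for column in zip(*all_shares)]
    # Only the combined partial sums reveal anything: the overall total.
    return reconstruct(partial_sums)


# Illustrative (made-up) monthly turnovers of three small businesses.
turnovers = [12_500, 8_300, 40_100]
print(private_sum(turnovers))  # prints 60900
```

In this sketch only the aggregate is ever reconstructed, so an analyst learns the total but not any single contribution. It assumes the parties holding the shares do not collude, and a real system would need to handle much more than simple sums.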
Technological change has transformed data into the new asset class of the 21st century. The characteristics of data, and the market structures it creates, mean that we should, in addition to regulation, use technology to build on individual choice as a driver for innovation by also enabling collective control and anonymisation.
Acknowledgements: Thanks go to Theo Bass, Chris Gorst, Juan Mateos-Garcia, and Tom Symons for their comments on this post and its predecessor, and to Christiane Wendehorst for discussions on related topics.
Information Commissioner’s Office (ICO) (2017), ‘Overview of the General Data Protection Regulation’. The legislation is set to come into force in the UK in May 2018.
Preibusch, S., Kubler, D. and Beresford, A. (2013), ‘Price versus privacy: an experiment into the competitive advantage of collecting less personal information’, Electronic Commerce Research, 2013, Issue 4, pp. 423-455. Davies, J. (2015), ‘The price of being free’.
Diker Vanberg, A. and Ünver, M.B. (2017), ‘The right to data portability in the GDPR and EU competition law: odd couple or dynamic duo?’, European Journal of Law and Technology, Vol 8.
Wendehorst, C. (2017), Presentation on data intermediaries at the European Law Institute (ELI) conference on digitalisation, 31 March.
Mulgan, G. (2017), ‘A new family of data commons’, Nesta. Smart City Expo World Congress (2016), ‘Roundtable Session - A New Deal on Data: What role for Cities?’. Panellists José Luis De Vicente, Evgeny Morozov, Francesca Bria.
Symons, T., Bass, T. and Copeland, E. (2017), ‘It’s time to start taking control of our personal data seriously’, Nesta.
ICO (2017), ‘Anonymisation: managing data protection risk code of practice summary’. There are, for example, long-standing prohibitions on the sharing of information between businesses due to the potential for collusion that it opens up. OFT (2004), ‘Agreements and concerted practices’.
There is also more general interest in the role of data in competition policy, as evidenced by the recent joint paper from the French and German competition authorities on competition law and data. Autorité de la concurrence and Bundeskartellamt (2016), ‘Competition Law and Data’.
Zyskind, G., Nathan, O. and Pentland, A. (2015), ‘Enigma: Decentralized Computation Platform with Guaranteed Privacy’, arXiv:1506.03471.