With the right skills, we can create big value from big imperfect data
Big data is riddled with biases, mirages and self-fulfilling prophecies. Avoiding them - and preventing big fails - requires access to the right skills and organisation. On the 9th of July, Nesta will launch Model Workers, a report that explores the skills implications of the data revolution for educators, policymakers and managers.
Agencies of magic
In his seminal ‘Cybernetics’, Norbert Wiener uses the story of ‘The Sorcerer’s Apprentice’ to illustrate the risks of outsourcing our agency to machines: the apprentice’s desire for a free lunch gets the better of him after his wishes are misinterpreted by ‘literal minded agencies of magic’, those pesky (robotic) brooms.
The way some describe it, ‘big data’ – the vast volumes of data that are being generated in ‘real time’ from many sources - can sound like another magical agent descended from The Cloud to grant many of our wishes. These include faster growth, more efficiency and customer satisfaction, less waste and less risk. Big data can also act as a wrathful god, and punish us with disruption if we fall behind. As with many other things, Dilbert has best illustrated the messianic aspects of the ‘Big Data revolution’ (“In the past, our company did many evil things, but if we put big data in our servers, we will be saved from bankruptcy.”)
What are the problems with big data, according to its discontents? That it is biased, that it does not allow us to distinguish between correlation and causation, that it generates self-fulfilling prophecies, and that by multiplying the number of possible analyses, it also multiplies the number of spurious results we will get (if we torture the data long enough). A few examples are regularly paraded to illustrate how things can go wrong with big data, most egregiously Google Flu Trends’ overestimation of 2012’s flu, and the skewed picture of the effects of Hurricane Sandy drawn from Twitter data.
These big data pitfalls make one thing clear: being too literal in interpreting information is not the prerogative of machines (and agencies of magic). We humans make the same mistake when we approach big data as if it were an oracle instead of an often-warped mirror, for example by forgetting that the behaviour of search engine users changes over time, or that web data is skewed by unequal access to smartphones and broadband.
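The point about multiplying analyses can be made concrete with a small simulation. The sketch below is ours, not something taken from the report: it compares one random ‘target’ series with a growing number of equally random candidate series and counts how many look correlated purely by chance. The function names, the 20 observations per series and the cut-off of 0.444 (roughly the 5% significance level for 20 observations) are all assumptions chosen for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_spurious(n_candidates, n_obs=20, threshold=0.444, seed=0):
    """Count random series that 'correlate' with a random target by chance.

    Every series is independent noise, so any hit is spurious by construction.
    The threshold is an illustrative assumption (about the 5% level for n=20).
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    hits = 0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        if abs(pearson(target, candidate)) > threshold:
            hits += 1
    return hits


if __name__ == "__main__":
    # More analyses against the same target means more chance 'findings'.
    for n in (10, 100, 1000):
        print(n, "candidate series ->", count_spurious(n), "spurious correlations")
```

Because nothing here is related to anything else, we would expect about one candidate in twenty to clear the bar, however many we try. The share of false alarms stays roughly constant, but their absolute number grows with the number of analyses we run.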
A big job
The response to this situation is not to make the true the enemy of the probable, and reject big data, but to accept its flaws, and overcome them. There are some tasty lunches out there, but they are not free. Getting them requires hard work, skills and organisation. Same as it ever was.
- What are the skills needed to create value from big data? We have talked before about data scientists – creative workers with the programming skills to get and process data, and the analytical skills - including things like research methods, statistical sampling and inference, or experimental design - to find answers inside it. Another important element of the data scientist’s toolset is domain knowledge about the field where she works, including which questions are worth asking, which data sources are best suited to answering them, and what their limitations and biases are. All of these skills should help us reduce the risk of big data fails - but is there enough talent like this to deal with all the data we are creating? What does the data revolution mean for skills and education policy in the UK?
- Where does organisation come into play here? Big data projects often involve creative talent, unconventional data and new business models, and this raises management challenges in many areas, from recruitment (where to find this talent?) to organisational design (where to put it in the organisation?) and project selection (how to keep data talent motivated with interesting work while at the same time preventing them from going down analytical rabbit holes?)
On Wednesday the 9th of July, we will be launching Model Workers, a report that offers answers to some of the questions above. Model Workers is based on 45 interviews with experts working at the coalface of the data economy. It is the first output of a collaboration between Nesta, Creative Skillset and The Royal Statistical Society aimed at creating robust, independent and actionable evidence about the skills implications of the data revolution for UK policymakers, educators and managers.
These are some of the questions that we address:
- What are the data analysis skills needs of leading companies?
- What are the barriers to accessing top data talent in the UK?
- What good management practices can be adopted to build high-performing data teams?
- How do policy and education need to change to improve the supply of data talent?
You will have to wait until the 9th to find out what we learned (register for the event here). For now, suffice it to say that many of the companies we interviewed highlight the importance of approaching big data wisely (‘a degree of scepticism is wise, but too much scepticism would be foolish’), and of avoiding the mistake of looking for technology solutions to talent problems, lest big data cause big fails when it is put in the hands of people lacking the skills to analyse it creatively and critically, the way that, at least for now, only humans can.
In a follow-up report later in the year, we will publish the findings of a larger survey of firms that also explores the skills implications of the data revolution.