Strata - Big Data comes to London
I spent two days this week at Strata, the Big Data conference that O'Reilly has been running for the last 18 months, which was held outside the United States for the first time.
I was there to talk about healthcare, innovation and data with Laura Bunt. This is linked to a project we are pursuing on the idea of a health knowledge commons - a set of shared, common data that can be used to make healthcare decisions.
The talks at Strata ranged from the highly technical to questions of strategy, marketing, ethics, randomisation, history, and engagement. The slides and videos of some of the talks can be found online. With so many parallel sessions, I saw only a fraction of the talks that went on, but a few highlights for me were:
- Arfon Smith from Zooniverse, the team behind the crowdsourced galaxy classification system Galaxy Zoo, talked about some of the strategies they have seen amongst participants in their supernovae project. They can group participants by strategy: those who use all three scores (definitely not a supernova, possibly a supernova, definitely a supernova) in balance; the optimists (who think that almost everything is a supernova); the pessimists (who think almost nothing is); the extremists (who classify almost nothing in the middle); and the uncertain (who mark most things as possible, and some as definitely not).
- Brian Bot from Sage Bionetworks talked about their clearScience project to make science more reproducible.
- JD Vogt from Salesforce talked about machines as collaborators, and gave some examples of the emerging capabilities of robots and machine learning. One was voice recognition applied to radiologist dictation, which can identify key clinical phrases and link them to medical ontologies and clinical meaning (e.g. viral pneumonia is a type of pneumonia, which is in turn a lung disease).
- Zena Wood and Alasdair Allan are using an iOS app to track students at the University of Exeter as they move around campus, to see how social groups form, and even how freshers' flu is transmitted.
- William Spooner from Eagle Genomics talked about the Pistoia Alliance, a collaborative approach to genetic analysis, and showed a dramatic chart of the fall in sequencing costs since 2007.
The final word should perhaps go to Mike Lynch, founder of Autonomy, who spoke at Nesta this week. When asked about big data, he replied: data may be big, but is it clever? We need to be intelligent about data if it's going to be a real source of innovation.