From detecting to engaging: An analysis of emerging tech topics using Meetup data
Innovation policymakers are interested in finding new stuff first: new technologies, new industries, new clusters and new businesses. Traditionally, this has involved lots of networking, sleuthing and hard graft - going to events, talking to lots of people, reading the trade press. How could new data sources and data science methods help them in their job?
Innovation policymakers are constantly on the lookout for emerging technologies and sectors. This search serves two main functions:
Innovation detection: Many technology races have a first mover advantage element, which means that those countries that first bet on a future growth sector are better positioned to benefit from its expansion. This is a consequence of ‘network effects’, as with web platforms that become more attractive to new users the more users they have already, or ‘learning by doing’, where businesses become more productive over time, as they gain experience about customer needs, and iron out glitches in their processes. The sooner you get started, the better you can do.
Innovator engagement: Businesses in emerging sectors may have completely different skills, finance or infrastructure needs from existing ones - for example, the R&D expenses that a pharma company incurs are very different from those of a video games studio, so R&D tax credits designed for the former might not support the latter. If innovation policymakers want to put in place programmes that work for emerging sectors, they need to find them first, and talk to them.
Traditionally, both activities have been informed by 'local' knowledge, personal contacts and ‘snowballing’ from existing networks into new ones. Innovation policymakers go to tech conferences, read the trade press, and meet start-ups and researchers to find ‘what’s hot’. This doesn't tend to be a data-driven process because, in general, official data sources are not very good for identifying and measuring new activities that don’t fit with existing industrial codes.
Web data - including company websites, and websites used by innovative companies and workers - offers new opportunities to identify emerging technologies, industries and businesses at a higher level of granularity, and closer to real time. For example, big data B2B marketing companies like GrowthIntel are analysing company websites in order to offer their clients more relevant leads. There is no reason why innovation policymakers couldn’t benefit from similar information.
In the second data pilot for Arloesiadur, our innovation data analytics project for Welsh Government, we have explored some web data sources in order to understand their potential for tracking emerging technology trends for innovation detection, and innovator engagement. Here is what we did, and here is what we found:
Meetup.com is an interesting platform to analyse emerging tech topics: this is a website used by technologists, coders and entrepreneurs to organise events. If these people are getting interested in a new tech trend, perhaps we will find an uptick in the number of meetup groups and events around it.
In order to test this idea, we extracted information about UK tech meetups, events and their attendees from the platform’s API from its beginning until March 2016. We then tracked levels of activity in three ‘hot’ tech-topics that we know are of interest to policymakers: Bitcoin (distributed digital currencies and online transaction ledgers), Deep Learning (a sophisticated method for building Artificial Intelligences) and Virtual Reality (technologies to generate highly immersive digital environments, often used in video games). We identified tech groups active in these areas based on the keywords that their organisers use to label them.
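The keyword-matching step can be sketched roughly as follows. This is a minimal illustration rather than the code used in the pilot: the keyword lists, the group records and the `tag_group` helper are all hypothetical, and we assume each group comes with a list of organiser-supplied labels.

```python
# Hypothetical keyword lists: the exact terms used in the pilot are not reproduced here.
TOPIC_KEYWORDS = {
    "Bitcoin": {"bitcoin", "blockchain", "cryptocurrency"},
    "Deep Learning": {"deep learning", "neural networks"},
    "Virtual Reality": {"virtual reality", "oculus rift"},
}


def tag_group(group_keywords):
    """Return the sorted list of topics whose keyword set overlaps the group's labels."""
    labels = {keyword.strip().lower() for keyword in group_keywords}
    return sorted(
        topic for topic, terms in TOPIC_KEYWORDS.items() if labels & terms
    )


# Made-up group records, standing in for what an API download might contain.
groups = [
    {"name": "Example Coin Club", "keywords": ["Bitcoin", "Startups"]},
    {"name": "Example VR Makers", "keywords": ["Virtual Reality", "Game Development"]},
    {"name": "Example Python Night", "keywords": ["Python", "Open Source"]},
]

# Map each group name to the topics it matches (an empty list means "not of interest").
tagged = {group["name"]: tag_group(group["keywords"]) for group in groups}
```

Matching on exact, lower-cased labels keeps the sketch simple; a real pipeline would probably also need to handle spelling variants and near-synonyms.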
Why did we focus on these three?
While these three topics are now relatively well established, during the period of research they fit with the most recent definition of 'emerging technology' in the innovation study literature, by Rotolo, Hicks and Martin, which characterises emerging technologies in terms of:
- Radical novelty: Emerging tech helps solve completely new problems, or solve old problems in ways that weren't considered possible before. A great example of this is how Deep Learning methods were used to beat human masters at Go, a feat that many experts thought was years or even decades into the future. The crowdsourced authentication method adopted by Bitcoin tackles the "Byzantine Generals problem" of coordinating the activities of untrusted agents in a network of transactions.
- Relatively fast growth: All these technologies have received a lot of interest, in terms of new start-ups, investment and Wired magazine covers.
- Coherence: These technologies have their own identity, separate from others in their wider area, and are developed by specialised communities. Virtual Reality is quite different from other kinds of video games. Deep Learning is often distinguished from statistical machine learning methods, and relies on its own software packages and algorithms.
- Prominent impact: All these technologies have transformative potential: Bitcoin and distributed ledgers set out to overhaul the concept of money, and automate trust. Highly immersive virtual reality technologies could alter the meaning of reality and space forever, by enabling the creation of completely digitised environments, and bringing together people located far away. Deep Learning is a key tool in the search for 'general purpose' artificial intelligences.
- Uncertainty and ambiguity: There is still uncertainty about the prospects for these technologies. The bitcoin community is riddled with controversies (e.g. about the size of blocks); motion sickness could limit the adoption of Virtual Reality; some of the leaders of the Deep Learning field acknowledge that its potential has been exaggerated.
As in previous Arloesiadur pilots, we have uploaded the code we used to download the data to GitHub. You can get it here.
We have identified 58 groups with relevant keywords (25 in VR, 18 in Bitcoin, and 15 in Deep Learning). These groups have set up 449 events since September 2012, when the first groups of interest were formed (Augmenting Reality and Coinscrum, in August and September 2012, respectively). The first Deep Learning meetup group - Deep Learning London - didn’t appear until April 2014. It’s worth noting that all three of these pioneering groups were formed in London, consistent with high levels of meetup activity in the city, including in cutting-edge fintech, creative industries and analytics areas relevant for the three technologies we’re looking at.
The two charts below show a timeline of activity in the three tech topics we picked, smoothed in three-month periods to remove some of the noise. The first chart considers total levels of activity according to different metrics (number of events, number of attendees and average attendees per event), while the second one normalises by levels of activity in a random sample of 200 meetup groups with the aim of controlling for Meetup’s growing popularity.
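For readers who want to reproduce this kind of chart, here is a rough sketch of the two steps. We assume a trailing three-month moving average for the smoothing and a simple ratio against the baseline for the normalisation; the pilot's code may have done either differently, and the monthly counts below are made up.

```python
def rolling_mean(values, window=3):
    """Trailing moving average; early points use however many months are available."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def normalise(topic, baseline):
    """Ratio of topic activity to baseline activity; None where the baseline is zero."""
    return [t / b if b else None for t, b in zip(topic, baseline)]


# Made-up monthly event counts, purely for illustration.
topic_events = [0, 3, 6, 9, 3, 0]
baseline_events = [10, 10, 10, 20, 20, 20]  # e.g. events in a random sample of groups

smoothed_topic = rolling_mean(topic_events)
smoothed_baseline = rolling_mean(baseline_events)

# Values rising over time suggest the topic is growing faster than Meetup as a whole.
relative_activity = normalise(smoothed_topic, smoothed_baseline)
```

Dividing by the baseline means a flat line indicates a topic growing in step with the platform, so only above-trend growth shows up as an increase.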
Even after smoothing the data and normalising by baseline levels of activity in Meetup, we see spikes followed by slower periods. What explains them?
Interestingly, some of the spikes appear to be linked to milestone tech moments ‘in real life’ (we have represented those with vertical lines, using the same colour to identify each tech topic).
- Meetup activity around Bitcoin appears to go hand in hand with the rapid appreciation of the digital currency towards the end of 2013. At their peak, bitcoin events were attracting hundreds of participants, and speakers from high profile bitcoin start-ups and projects such as Ethereum. Interest in the topic has deflated - to a degree, and especially in terms of event attendees - with the decline in its value after the collapse of Mt Gox. It also seems that activity has become fragmented, with many small groups active, perhaps reflecting the increasing application of blockchain-style solutions outside of finance.
- Deep Learning appears at around the same time as the acquisition of DeepMind, a London start-up doing cutting edge AI work. After a few inactive months, it picks up again in 2015, just as the world started to get excited about this area, culminating with AlphaGo’s famous victories over Go champions Fan Hui and Lee Sedol in January 2016 and March 2016. As of the cut-off date for our analysis, Meetup activity around deep learning was still climbing at a faster clip than the other tech areas we are interested in, as well as the random baseline.
- The trend is less clear for Virtual Reality - perhaps because the concept for the technology has existed for a longer time. Having said this, there seem to be higher levels of activity between the end of 2013 and the beginning of 2014, around the time when Oculus VR started operating, to be eventually acquired by Facebook for $2bn (one of the groups formed around this period is in fact called the “London Oculus Rift/VR developer” meetup). There appears to be an uplift in activity in 2016, perhaps reflecting increased interest in innovative VR and AR (Augmented Reality) products being developed by HTC and Valve, Microsoft and Magic Leap.
If our interpretation of the data is correct, this means that activity around new technology in Meetup isn’t so much focused on its development as on its diffusion or popularisation among wider communities of practitioners. The initial Research & Development seeding these new technologies is probably happening elsewhere, in universities and businesses.
Does this mean that Meetup.com is more suitable for innovator engagement than innovation detection? Not necessarily. First, the data generated by this analysis is still quite timely, and could give those organisations that react to it fast a significant lead over the competition. Second, the analysis we report is constrained by the fact that we are searching for meetup groups and events using agreed-upon terms for ‘emerged’ technologies. Could one use meetup data to analyse interesting combinations of disciplines giving rise to new topics before those even get a name? This is a topic we will continue exploring as part of Arloesiadur - check the conclusion for some ideas.
So far, we have looked at ‘emerged’ tech areas. What about the present and the future, when we don’t know what keywords to focus on? We have explored this by measuring the levels of Meetup activity in groups labelled with novel, popular keywords (that is, the 20 most popular keywords among those that didn’t appear on the platform until after March 2015). The charts below show what these keywords are, and their levels of activity.
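One way to pick out such novel, popular keywords is sketched below. It is an illustration under stated assumptions, not the pilot's actual code: we measure a keyword's popularity by the number of groups using it, treat a keyword as novel if the earliest group carrying it was created after the cut-off, and use made-up group records and a hypothetical `novel_popular_keywords` helper.

```python
from collections import Counter
from datetime import date


def novel_popular_keywords(groups, cutoff, top_n=20):
    """Rank keywords first seen after `cutoff` by the number of groups using them."""
    first_seen = {}
    counts = Counter()
    for group in groups:
        for keyword in set(group["keywords"]):
            counts[keyword] += 1
            if keyword not in first_seen or group["created"] < first_seen[keyword]:
                first_seen[keyword] = group["created"]
    novel = [kw for kw, seen in first_seen.items() if seen > cutoff]
    # Most-used keywords first; ties are broken alphabetically for a stable order.
    return sorted(novel, key=lambda kw: (-counts[kw], kw))[:top_n]


# Made-up group records, purely for illustration.
groups = [
    {"created": date(2014, 5, 1), "keywords": ["python", "data"]},
    {"created": date(2015, 6, 1), "keywords": ["python", "ethical hacking"]},
    {"created": date(2015, 9, 1), "keywords": ["ethical hacking", "visual effects"]},
]

ranking = novel_popular_keywords(groups, cutoff=date(2015, 3, 31))
```

In this toy data, "python" is excluded because it was already in use before the cut-off, even though it also appears in a later group.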
The first noticeable thing is that few of these keywords seem particularly 'new'. Rather, they appear to capture growing specialisation inside existing domains. It also seems, in some cases, that we are picking up existing industries (e.g. in the cybersecurity community - ethical hackers - and in the creative industries - visual effects, as well as e-commerce businesses) that perhaps are becoming more active on Meetup, or more active networkers.
All this is again consistent with the idea that most meetups are about diffusion and dissemination rather than development. This means that, at least when using the simple keyword-based methods we adopted in this blog, Meetup may be a more suitable platform for innovator engagement than for innovation detection. This is no mean feat - a recent analysis of the latest UK Innovation Survey suggests that innovative businesses rarely benefit from government support. New sources of data, like Meetup, that help UK innovation agencies promote and target their programmes more effectively could be used to address this.
Our analysis in this blog was descriptive rather than predictive, and based on user-specified keywords, rather than coherent collections of keywords defining a tech topic. As we have seen, this could make it difficult to discover emerging tech areas that one isn’t already aware of - an area of interest to policymakers and other new tech ‘cool-hunters’ such as Venture Capitalists and open innovation teams in large corporations. Our next goal is to use more sophisticated methods for this innovation detection. There are at least three ways to pursue this:
- Supervised analysis: What features of a technology area are predictive of its eventual ‘emergence’? Can we use them to identify new areas with high probability of emerging?
- Unsupervised analysis: Can we use text mining and topic modelling to identify sufficiently granular topics, and track their evolution?
- Social network analysis: Does the formation of new ties between previously unconnected areas indicate that something new is happening?
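To make the third idea a little more concrete, here is a minimal sketch of how new ties between existing areas could be spotted. It assumes that two keywords are 'tied' when they label the same group, and compares an earlier and a later period; the `new_ties` helper and the keyword lists are hypothetical, and this is one possible approach rather than a method we have tested.

```python
from itertools import combinations


def keyword_pairs(groups):
    """All unordered keyword pairs that co-occur in at least one group."""
    pairs = set()
    for keywords in groups:
        pairs.update(combinations(sorted(set(keywords)), 2))
    return pairs


def new_ties(earlier_groups, later_groups):
    """Pairs of already-known keywords that co-occur only in the later period."""
    known_keywords = {kw for keywords in earlier_groups for kw in keywords}
    fresh_pairs = keyword_pairs(later_groups) - keyword_pairs(earlier_groups)
    return sorted(
        pair
        for pair in fresh_pairs
        if pair[0] in known_keywords and pair[1] in known_keywords
    )


# Made-up keyword lists for groups created in two consecutive periods.
earlier = [["finance", "banking"], ["cryptography", "security"]]
later = [["finance", "cryptography"], ["finance", "banking"], ["finance", "quantum"]]

ties = new_ties(earlier, later)
```

Restricting the result to keywords that already existed separates 'old areas newly combined' from brand-new labels, which the novel-keyword analysis above already covers.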
We’ll keep you posted about what we find. Drop us a line if you are interested in any of this.
This blog benefitted from helpful feedback and comments from Katja Bego, Gail Dawes, John Davies, Antonio Lima and Jen Rae.
 One important exception to this is patent data, which is often analysed to identify new and interesting science and technology areas - one limitation with this is that patent data comes with a big time lag.
 Before using keywords, we explored the possibility of extracting ‘topics’ of interrelated keywords from the data using text mining methods. The results we were able to obtain within the timetable for the pilot were insufficiently robust. When extracting a low number of topics, the categories were too coarse to capture emerging areas of activity; when extracting a high number of topics, some of the categories became noisy, and generated many false positives. Fine tuning our topic modelling algorithms to capture new and interrelated keywords is an important follow-up we come back to in the conclusion.