The goal of the EURITO project is to develop Relevant, Inclusive, Timely, Trusted and Open (RITO) indicators for research and innovation (R&I) policy. We are currently collecting ideas for the exploratory pilots we will be carrying out over the next six months, and have built a framework around which to structure both our search for data and our assessment of it.
To make the most of this opportunity, we need your help… After reading the blog, we invite you to participate in our experimental wiki survey of R&I policy questions. The survey aims to capture new pilot ideas as well as to rank existing ones, and the responses will ultimately help to inform our decisions on which pilots to carry out. See the FAQ section at the end of the blog for more information.
In this blog, I describe the EURITO data audit framework (EDAF), a theoretical and methodological tool to help guide both the search for, and the assessment of, data for the EURITO project. The EDAF starts to weave together threads of R&I policy questions and data sources, laying the foundation for the work to come.
The EDAF is shown in Figure 1. It was developed to be flexible and adaptable, reflected in the fact that users can approach it from three entry points: with a construct (Step 1: Conceptual Anchoring), an interesting data source (Step 2: Basic Data Audit), or a pilot idea (Step 3: Pilot Ideas). Step 4 (Advanced Data Audit) is reserved only for those data sources that are sufficiently grounded in a policy-relevant pilot question to be explored in more depth.
The sections below describe the components of the EDAF and include some reflections and lessons learned during its development. For a more detailed description of the EDAF and its components, as well as the accompanying Excel workbook, please see here and here.
The first step of the EDAF is called ‘Conceptual Anchoring’, and it’s composed of two components: a conceptual backbone of indicators drawn from an existing R&I framework, and space for new constructs. Before continuing, take a moment to ponder these questions: What is ‘organisational culture’? What is ‘social innovation’? What do these concepts have in common?
If you were unsure how to answer, don’t worry — both are difficult to describe and challenging to measure. What they have in common is that they are constructs — mental abstractions of ideas or phenomena. Constructs are often broad and abstract, so must be given a name and defined before they can be measured. This is often easier said than done, however. For instance, in the case of social innovation this excerpt from the 2015 book New Frontiers in Social Innovation Research provides some telling insight: ‘much of the discussion of social innovation is vague, and there are many competing definitions [...] Some present it as simply a new term for the study of nonprofits; for others it can encompass almost anything from new types of democracy to the design of products for poor consumers.’
Realising that this challenge exists more broadly across the R&I landscape, making space for constructs in the EDAF allows us to capture phenomena which are relatively novel, and for which no widely accepted indicators currently exist. These include constructs such as open innovation, open design, and open education.
Making space for constructs does not mean that we’ve abandoned existing R&I evidence frameworks. After all, a great deal of time, research and expertise have gone into developing the methodologies of the Oslo Manual and European Innovation Scoreboard (EIS), to name but two prominent examples. For this reason, the Conceptual Anchoring component of the framework also includes the EIS framework as a backbone against which new constructs can be placed. For example, the EIS places the indicators ‘R&D expenditure in the public sector’ and ‘venture capital expenditure’ under the ‘Investments’ group and ‘Finance and Support’ dimension of its methodological framework. A new construct placed alongside this group/dimension pairing is ‘crowdfunding’, which can be considered an additional funding source for early-stage innovators.
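One way to picture the backbone is as a lookup from group/dimension pairings to existing indicators, with new constructs recorded against the same pairings. The sketch below is purely illustrative: the dictionary layout and the `place_construct` helper are assumptions made for this example, not part of the EDAF workbook. Only the group, dimension, indicator and construct names come from the example above.

```python
from collections import defaultdict

# Existing EIS indicators keyed by (group, dimension). Only the pairing
# mentioned in the text is included; the layout itself is an assumption.
eis_backbone = {
    ("Investments", "Finance and Support"): [
        "R&D expenditure in the public sector",
        "Venture capital expenditure",
    ],
}

# New constructs, recorded against the same (group, dimension) keys.
new_constructs = defaultdict(list)


def place_construct(group, dimension, construct):
    """Record a new construct alongside an existing group/dimension pairing."""
    key = (group, dimension)
    if key not in eis_backbone:
        raise KeyError(f"{key} is not part of the backbone")
    if construct not in new_constructs[key]:
        new_constructs[key].append(construct)
    return new_constructs[key]


place_construct("Investments", "Finance and Support", "Crowdfunding")
```

Keying both structures on the same pairing keeps the established indicators untouched while making it easy to see which new constructs sit next to them.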
The EURITO project provides an exciting opportunity to consider a vast range of data sources that might be analysed to better understand and monitor the R&I landscape in Europe. This vastness is at the very root of the structure of the Basic Data Audit step of the EDAF, which allows framework users to easily and quickly capture data sources that may be of interest, recording only the conceptual framing dimensions (i.e. Group, Dimension, Construct, Data Source), a short description, and the URL.
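As a rough illustration of how little a Basic Data Audit entry needs to hold, the record could be sketched as a small data class. The class name `BasicAuditEntry`, the field names and the example values are hypothetical; the fields simply mirror the items listed above (Group, Dimension, Construct, Data Source, a short description, and the URL), and the data source and URL are placeholders rather than real audit entries.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class BasicAuditEntry:
    """One Basic Data Audit record: conceptual framing plus a pointer to the data."""
    group: str
    dimension: str
    construct: str
    data_source: str
    description: str
    url: str


# Hypothetical entry: the data source, description and URL are placeholders.
entry = BasicAuditEntry(
    group="Investments",
    dimension="Finance and Support",
    construct="Crowdfunding",
    data_source="Example crowdfunding platform",
    description="Campaign listings from a crowdfunding platform (placeholder).",
    url="https://example.org/crowdfunding",
)

print(asdict(entry))
```

Questions of access, coverage and quality are deliberately absent from this record, since they belong to the later, more detailed audit.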
Why create space to capture data sources in such a minimalist fashion, intentionally leaving the more robust auditing of data access, coverage, and other important factors until Step 4? In short, we believe that this approach reflects an evolving paradigm in how R&I evidence is approached. This is notable because, not so long ago, anyone suggesting that a data source (rather than a hypothesis) could be the starting point of empirical inquiry was likely to meet fierce objections on the grounds of sampling bias and false discovery. However, dramatic advances in data availability and computing capacity have opened new and promising avenues of exploratory analysis, data mining, and data-driven hypothesis generation (Salganik, 2017; Carmichael and Marron, 2018). While a strict dichotomy between starting with a hypothesis and starting with a data source is likely too stark, acknowledging this debate and its implications for the EURITO project is nonetheless helpful framing.
A simple but elegant way to think about this broadening scope of evidence is offered by Princeton University’s Matthew Salganik. In his excellent 2017 book Bit by Bit: Social Research in the Digital Age, Dr. Salganik uses the terms ‘readymade’ and ‘custommade’ to describe this shifting paradigm. ‘Readymade’ describes the work of Marcel Duchamp, who famously repurposed ordinary objects into art. ‘Readymade’ data sources would therefore include, for example, social media or administrative data which were not originally collected/produced with the explicit aim of being used for research or policy. Notably, this category would also include databases of patents and peer-reviewed publications. Michelangelo, on the other hand, spent years creating magnificent ‘custommade’ sculptures from raw material such as marble. Through this lens, ‘custommade’ data include those collected through, for example, the Community Innovation Survey — a carefully crafted tool developed and deployed with the explicit intention of collecting data that could be used for research and policy.
The Basic Data Audit step of the EDAF explicitly makes space for the Duchamp-ian data in the R&I landscape, as the Michelangelo-like R&I data have been extensively inventoried in past projects (see, for example, the inventories developed in a project on the use of data mining in the development, monitoring and evaluation of R&I policy).
The third step of the EDAF provides a soft landing site for ‘pilot’ ideas. The easiest way to think about a pilot in this context is as a process through which we develop a proof of concept around the application of new data and analytics to R&I policy questions. In other words, pilots allow us to explore whether using new data (or new combinations of data) or analytics is feasible, appropriate and likely to be robust and scalable, while leaving room to abandon ideas that are ultimately unlikely to deliver value in the long run. The pilot ideas currently captured in the EDAF were identified during the project’s conception and through engagement with policy stakeholders. A third (experimental) avenue we’re exploring is the ranking of existing ideas and collection of new ideas through our wiki survey. Please take a moment to try it out!
Crucially, this step of the EDAF links the Basic Data Audit, where interesting and potentially insightful data sources are captured without too much attention paid to whether they might address policy-relevant questions, to the Advanced Data Audit, where the nuances of the data are explored in depth against the backdrop of a specific R&I policy question.
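The linking role of pilot ideas can be sketched in a few lines: only data sources that at least one pilot idea draws on move forward for deeper auditing. Everything in this sketch is hypothetical, including the pilot and source names and the `sources_for_advanced_audit` helper. It illustrates the gating logic only and is not how the consortium actually selects sources.

```python
# Hypothetical pilot ideas, each listing the data sources it would draw on.
pilots = {
    "Pilot A": ["Source 1", "Source 2"],
    "Pilot B": ["Source 2"],
}

# Hypothetical data sources captured during the basic audit.
basic_audit_sources = ["Source 1", "Source 2", "Source 3"]


def sources_for_advanced_audit(pilot_ideas, captured_sources):
    """Return captured sources referenced by at least one pilot idea,
    keeping the order in which they were captured."""
    linked = {source for needed in pilot_ideas.values() for source in needed}
    return [source for source in captured_sources if source in linked]


# "Source 3" is not linked to any pilot, so it stays at the basic audit stage.
print(sources_for_advanced_audit(pilots, basic_audit_sources))
```

In practice the judgement of whether a source is sufficiently grounded in a policy-relevant question is qualitative; the set membership test here is only a stand-in for that decision.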
The Advanced Data Audit is the stage of the EDAF in which pilot ideas meet data sources, triggering an in-depth exploration of the data’s structure, coverage, nuances and potential limitations. Table 1 outlines the characteristics of the data we expect to capture, with an expectation that these categories may evolve as we begin the advanced auditing process.
Note that at the time of writing, the advanced data audits have not been carried out as we have not yet decided which pilot ideas will be further developed.
One of the most exciting aspects of the EURITO project is that it provides the necessary time and space to explore a wide array of possibilities in both R&I policy questions and data sources. The EDAF will be used to move us from a divergent, exploratory state, toward convergence upon specific pilot ideas.
As described above, the auditing process — when broadly conceptualised — touches upon several contemporary debates in evidence-based policy and academic circles, such as whether inquiry can/should begin with a data source or a theoretical hypothesis, as well as the opportunities and challenges of working with ‘readymade’ data (i.e. data that were not originally intended for research or statistics). We believe that the EDAF provides a starting point from which to begin engaging with some of these questions, but that it’s just one piece of a bigger puzzle. Many other factors will ultimately play into which data and pilot ideas are selected and developed, as well as which pilots go on to be scaled into full RITO indicators over the course of the project.
If you’re interested in participating in defining the future of EURITO, don’t forget to complete our wiki survey!
The EDAF was collaboratively developed by the members of the EURITO consortium, many of whom also provided helpful feedback on this blog post. If you’re interested in learning more about the EURITO project, please see the information below or have a read over the first blog post about the project.
Note that developments since the original EDAF was created have resulted in some slight discrepancies between the terminology and structure used here and those in the project documentation.
Campbell, David, Chantale Tippett, David Brooke Struck, Christian Lefebvre, Grégoire Côté, and Éric Archambault (2017). Data Mining on Key Innovation Policy Issues for the Private Sector: Technical Report. Prepared by Science-Metrix for the European Commission.
Carmichael, I. and Marron, J.S. (2018). Data science vs. statistics: two cultures? Jpn J Stat Data Sci. https://doi.org/10.1007/s42081-018-0009-3
Hollanders, H. and Es-Sadki, N. (2017). European Innovation Scoreboard 2017, Methodology Report.
Nicholls, A., Simon, J. and Gabriel, M. (2015). New Frontiers in Social Innovation Research. https://doi.org/10.1057/9781137506801
OECD/Eurostat (2005). Oslo Manual: Guidelines for Collecting and Interpreting Innovation Data, 3rd Edition, The Measurement of Scientific and Technological Activities, OECD Publishing, Paris, https://doi.org/10.1787/9789264013100-en.
Salganik, M.J. (2017). Bit by bit: social research in the digital age. Princeton University Press.
The EURITO team have developed a wiki survey to obtain input on pilot ideas. It can be accessed here, and will remain open until 12 July 2018.
If you have a question that is not covered here, please reach out to us directly.
EU Relevant, Inclusive, Timely, Trusted, and Open Research Innovation Indicators
This project has received funding from the European Union’s Horizon 2020 research and innovation framework programme under Grant Agreement n° 770420.