GitHub: the Swiss Army knife of civic innovation?
Nesta Tech4Labs Issue #2
There is a joke within the startup community about multi-purpose products that can do pretty much anything: "When was the last time you saw someone using a Swiss Army knife at the dinner table?"
So, when you see such a product reaching wide acceptance amongst legislators, academics and government workers, you pay attention.
What are Git and GitHub?
Git (created by Linus Torvalds, the father of Linux) is a distributed revision control system (RCS). You can think of it as “Track Changes” in your favourite word processing application, but on steroids: multiple documents, multiple revision histories. Git was not the first revision control system, but it was among the first distributed ones to be widely adopted by mainstream software projects.
For people writing software, an RCS serves as a bookkeeping tool where each individual contribution is kept, modifications from different people are easy to manage and combine, and changes can be packaged into releases. This functionality has been essential for the development of open source software at scale. What civic innovators have come to realise is that such bookkeeping features are also useful for legislative drafting, or for any activity that requires transparently keeping track of who contributed what and how a set of files changed over time.
At the core of GitHub is the fork & pull model. Anyone can get a copy of some content (fork) and make their own changes, no questions asked. For the changes to be incorporated into the original, they need to be proposed (as a pull request), then approved and merged by the owner of the content. Pull requests allow line-by-line feedback, so the proposer, the owner and anyone else can have a conversation around the changes.
To paraphrase Clay Shirky: 'A data scientist in Edinburgh and a public official in London can both get a copy of the same piece of legislation. Each of them can make changes, and they can merge them after the fact even if they didn't know of each other's existence beforehand. This is cooperation without coordination.'
"This model reduces the amount of friction for new contributors and is popular with open source projects because it allows people to work independently without upfront coordination." [GitHub]
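The "merge after the fact" idea can be sketched in a few lines of Python. This is a deliberately simplified illustration and not how Git itself merges: it assumes both contributors only modify existing lines (no insertions or deletions), and the function name `merge_lines` and the sample legislation text are hypothetical.

```python
def merge_lines(base, ours, theirs):
    """Toy three-way merge for texts with the same number of lines.

    Simplifying assumption: contributors only modify existing lines,
    they never insert or delete lines. Real Git merging is more general.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles same-length texts")

    merged, conflicts = [], []
    for number, (b, o, t) in enumerate(zip(base, ours, theirs), start=1):
        if o == t:        # both agree (or neither changed the line)
            merged.append(o)
        elif o == b:      # only the second contributor changed it
            merged.append(t)
        elif t == b:      # only the first contributor changed it
            merged.append(o)
        else:             # both changed the same line differently
            merged.append(b)
            conflicts.append(number)
    return merged, conflicts


# Hypothetical piece of legislation, copied (forked) by two people.
base = [
    "Article 1: Data shall be published yearly.",
    "Article 2: Data shall be free of charge.",
]
edinburgh = [
    "Article 1: Data shall be published monthly.",
    "Article 2: Data shall be free of charge.",
]
london = [
    "Article 1: Data shall be published yearly.",
    "Article 2: Data shall be free of charge and openly licensed.",
]

merged, conflicts = merge_lines(base, edinburgh, london)
print("\n".join(merged))
print("conflicting lines:", conflicts)
```

Because the two contributors touched different lines, both changes end up in the merged text without either person having coordinated with the other. Only when both edit the same line differently does a human need to step in and resolve the conflict.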
GitHub is a private company that offers "Git as a service" over a Web interface and an API.
Because of its clean interface, rich functionality and friendly mascot Octocat, GitHub has gained wide acceptance among tech people and beyond.
So why should YOU care?
First, GitHub is trendy. Many open source projects are using it, private companies contributing to open source are using it, and governments are using it. Clay Shirky even predicted it will transform government.
By having a GitHub presence, you instantly gain credibility with the developer community, because you now have a home in a recognisable environment. You create repositories (usually one per project), you add people to them (or not), and people can start contributing content. GitHub is domain-agnostic, which is why people in government and other industries are starting to use it for non-software projects.
GitHub is free, as long as you are fine with making all your content public. It comes with nice features such as a way to create GitHub Wikis, as well as a hosting mechanism called GitHub Pages. With Pages, the content of your repository (usually code and data for machines to process) is used to automatically create a website (for humans to view).
GitHub for your civic projects
Most successful civic projects require a combination of transparency, collaboration and participation, and GitHub can serve them rather well.
Versioning is key to transparency. Not only is all content always publicly available, but all versions and changes are too, with timestamps and identities of contributors. A website, in contrast, only shows the current version. Transparency extends to projects, contributors, etc. If you are interested in a given project, you can follow it using the watch feature. GitHub's free hosting, Git-backed Wikis and GitHub Pages are also great tools for transparency, as they make deployment to a wide audience much easier.
The fork & pull model is well suited for collaboration, as it reduces the friction for contributors yet provides control over the quality of contributions. GitHub comments and GitHub issues provide easy ways to create conversations. Both can be interfaced with existing productivity tools (calendar, chat, etc.) via the GitHub API.
The GitHub web interface makes it easy for most people to participate with limited knowledge of the underlying framework. Click-based navigation encourages the curious to learn more about content and its history. The issue system is usually the easiest form of participation for people. GitHub's social features (e.g. user profiles, aggregate statistics about people’s contributions) are also great incentives and gauges for a community.
Git and GitHub offer some advanced workflow functionality (hooks) that can be triggered by various actions on a repository. For instance, a new dataset contribution (pull request) can trigger schema validation before it is merged. A merged contribution can trigger post-processing, such as creating a data bundle (e.g. a zip file) and notifying an external system.
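As a rough idea of what such a validation step might look like, here is a minimal Python sketch of a check that a hook could run on a contributed CSV file. The schema (`EXPECTED_COLUMNS`), the function name `validate_csv` and the sample data are all assumptions made for illustration; how the check gets wired to a pull request depends on the hook or service you use.

```python
import csv
import io

# Hypothetical schema for a contributed dataset: column name -> converter.
EXPECTED_COLUMNS = {
    "station_id": int,
    "name": str,
    "latitude": float,
    "longitude": float,
}


def validate_csv(text):
    """Return a list of human-readable problems; an empty list means valid."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != list(EXPECTED_COLUMNS):
        return [f"unexpected header: {reader.fieldnames}"]

    problems = []
    for row in reader:
        for column, convert in EXPECTED_COLUMNS.items():
            try:
                convert(row[column])  # raises if the value has the wrong type
            except (TypeError, ValueError):
                problems.append(
                    f"line {reader.line_num}: bad {column!r} value {row[column]!r}"
                )
    return problems


# Made-up contributions: one valid, one with a non-numeric latitude.
good = "station_id,name,latitude,longitude\n1,Central,50.94,6.96\n"
bad = "station_id,name,latitude,longitude\n1,Central,north,6.96\n"

print(validate_csv(good))
print(validate_csv(bad))
```

A hook would typically run a check like this on the files changed by a contribution and refuse or flag the merge when the list of problems is not empty.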
GitHub shortcomings worth mentioning
Like any good Swiss Army knife, however, GitHub is not perfect.
Its interface can be intimidating to new users. There is no real-time collaboration like in Google Docs, as its original model was geared towards asynchronous collaboration.
Git works best when dealing with line-based text content (e.g. plain text, code, CSV, XML, JSON) where line-by-line changes are easy to spot, interpret and merge. Binary formats such as word processor documents or images are much harder to compare and merge.
If you plan to use it as a data repository, support for very large datasets is not great. There is limited support for data visualisation, and there is no query language to manipulate data. Open data portal solutions like Socrata, CKAN or OpenDataCatalog might be better suited for your needs.
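To see why line-based text is such a good fit, here is a small Python sketch that uses the standard library's `difflib` module to compare two versions of a CSV file line by line. It only illustrates the kind of output a line-based comparison produces; the file names and the figures in the data are made up.

```python
import difflib

# Two versions of a small, made-up dataset.
before = [
    "town,population",
    "Town A,1000",
    "Town B,2500",
]
after = [
    "town,population",
    "Town A,1050",
    "Town B,2500",
]

# unified_diff yields header lines, then each changed line
# prefixed with "-" (removed) or "+" (added).
diff = list(
    difflib.unified_diff(
        before, after, fromfile="before.csv", tofile="after.csv", lineterm=""
    )
)
print("\n".join(diff))
```

Only the one changed line shows up as removed and re-added, so a reviewer can see at a glance what a contributor modified.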
GitHub Analytics have only been introduced recently and cannot compare with more advanced solutions.
Some corporate environments restrict access to GitHub because people use it to upload arbitrary content which could include sensitive information.
Having said that, the GitHub public API can help you address some of these shortcomings (e.g. Prose for a better text editor), and there is an emerging ecosystem of apps and services leveraging GitHub as their storage platform (e.g. GitHubUploader to upload pictures in one click).
GitHub is not the only "Git as a service" offering; alternatives such as Bitbucket and GitLab exist.
GitHub is a very versatile tool. Its raison d’être – version control – is essential for any civic effort that relies on transparency, collaboration and participation. If you haven’t yet, you should give it a try. Receiving is said to be harder than giving. So create an account if you don't have one, find a GitHub repository dealing with some data you care about, fork it, make a change and start "giving back".
Here are some concrete possible next steps:
See if any of your local governance organisations are using GitHub already at https://government.github.com/community/.
Take a look at the federal laws of Germany, which are online in their entirety. Browse through the history of changes, where each change can be compared line-by-line.
Look at a few examples of governments using GitHub tools to supply geographic data, and solicit feedback and corrections on that data.
Check the repositories from Cologne in Germany and Montpellier in France. Their data can be modified, and a request can be made to merge the changes back in.
If you’re a lawmaker, or in a position to propose policy, try placing drafts of your legislation on GitHub for comment. Get some inspiration from a member of the New York City Council who’s doing this.
And let's continue the conversation:
on Twitter, using hashtag #Tech4Labs.
on the Civic-Discourse forum.
on GitHub: open an issue :-)