Creating a custom git flow visualization

When we decided to make the switch from Subversion to git for funda's version control, I had a number of worries. The main one was whether we (me, actually) would ever understand git well enough to deal with the complications of having five teams working on the same codebase. This postponed our adoption for a while, but eventually, by doing a few proofs of concept and reading a lot (this article by Nick Farina helped a lot), we felt secure enough. So we set out to decide how exactly we would organize git in terms of branches and workflow.

Git flow, feature branches, release branches

Git is so flexible that you have to think about which way of working fits your organization best. We were coming from a workflow in which we created a fresh 'integration' branch every week and had everyone commit to that. Then once a week we had a code freeze and we would merge that branch to the trunk and create a new integration branch. We wanted to move to a flow where most of the work would happen in a feature branch, so that we could postpone the decision of merging a feature into a release until that feature was 'done'. We also wanted a workflow that would allow teams to perform a code review and peer test, with the merge coinciding with the moment of approval. But we also wanted to keep the tightly controlled period after the 'code freeze' for deploying to an acceptance environment and stabilizing before release. After evaluating various approaches (this article from Atlassian helped a lot), we decided that git flow as described by Vincent Driessen was the best fit for us. It has clear terminology and naming conventions, some tool support, and both feature branches and stabilization branches (called release branches).

git log --graph, eeeeh what?

So we set out on this path, migrated our Subversion repositories, set up Atlassian's Stash to host git, did a small workshop with all our developers on git and git flow, and got to work. And people lost their way in git, mostly in understanding which branch was which and what to merge into what. The main problem was the graph. Git flow is explained by using a commit graph that looks like this:

Git flow model

It is slightly intimidating at first, but quite readable once you get used to it. The main feature is that two of the branches run from top to bottom and the other branches exist only temporarily. Master and develop are the constants in every commit graph for git flow. Git can also show a graph of the recent commits. Let's have a look at a repository using git flow by asking git for a graph view (we use a public GitHub repo owned by Vincent Driessen, the 'inventor' of git flow, containing the git flow tooling):

git log --graph

This shows us the commits from our currently checked out branch (which is a feature branch). The results look like this:

Git log

Note that our last commit is not very recent (today is November 2014). Also, the commits use a lot of space. For readability, we'll do the same, but using only one line per commit:

git log --graph --oneline

Now we see more, but still not much context. Where are master and develop currently? How many commits have been done on develop since we branched? Which commits in the graph are on the feature branch and where is the point where we branched from develop? It is hard to see:

Git log

By adding a few more flags to the call, we can get more context:

git log --graph --oneline --decorate --all

Now we get all branches instead of only our current one (--all), and the --decorate flag labels each commit with the branches that point to it. Have a look at the output and see if you can figure out which line represents the develop branch in the recent past:

Git log

In this graph, if you look carefully, you can recognize the labels for master and develop (in red and green), as well as our currently checked out feature branch. However, this is not easy and the graph looks nothing like the chart as used by Vincent Driessen. There are Stash add-ons that show commit graphs, but they typically look very much like git's own.

JavaScript to the rescue: a Stash add-on

So I set out to create a visualizer for Stash that looks like the original git flow graphs. It turns out that Stash has some nice JSON-based APIs that you can call from a Stash add-on using a user's credentials. To do as little server-side processing as possible, I tried to create a JavaScript library that can take the Stash API output and visualize it as a graph in the browser, using d3.js to render HTML and SVG. I ran into 3 main difficulties:

  1. For drawing the git flow graph, you need to know not only which commit is the current develop branch, but also which commits have been develop in the past. This is harder than you'd expect. Git has very deep knowledge about the parentage of commits, but branches are rather transient things for git. It doesn't keep any historic information on them. Thus, there is often no way to be 100% sure which side of a merge was the original develop branch and which was a feature branch. You can come close by recognizing certain commit messages, but it will always be an educated guess.
  2. The git flow graph assumes that a commit is always on only one branch. It is either on develop OR on a feature branch OR on a release branch. However, git is not like that. A commit can be on a feature branch now, but if you later fast-forward merge the branch to develop, the same commit is suddenly part of develop. If you want to draw branches as parallel lines (as we do), you have to make a judgment call on some commits every now and then. Again, this makes the chart an approximation of the real history, but in most cases, this works out fine.
  3. Some repositories are very large. It is important to be able to draw a graph by looking at only the most recent commits, but when a user scrolls down far enough, you have to incrementally download extra commit data. This will cause a complete redraw, because the new commits may have an effect on our estimates of which commits were on develop at the time. This makes it infeasible to scroll back more than a few thousand commits.
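
To make the first two difficulties a bit more concrete, here is a minimal sketch of the kind of educated guess involved. It is not the code of the actual library. The commit shape (`id`, `parents`, `message`), the function name `guessDevelopLine` and the regular expression are all assumptions made up for this illustration. The idea: start at the commit that develop points to now and walk back through the first parent, except when the commit message says that develop was merged into some other branch. In that case the second parent is more likely to be the old develop.

```javascript
// Hypothetical commit shape for this sketch: { id, parents: [ids], message }.
// Assumption: parents[0] is the branch that was checked out when the merge
// was made (git's first-parent convention), parents[1] is what was merged in.

// Matches git's default message when develop is merged INTO another branch,
// e.g. "Merge branch 'develop' into feature/login".
const MERGED_DEVELOP_IN =
  /^Merge (?:remote-tracking )?branch '(?:origin\/)?develop' into /;

// Returns a Set with the ids of the commits we guess to have been on develop.
function guessDevelopLine(commits, developHeadId) {
  const byId = new Map(commits.map((c) => [c.id, c]));
  const onDevelop = new Set();

  let current = byId.get(developHeadId);
  while (current) {
    onDevelop.add(current.id);

    const [first, second] = current.parents;
    let nextId = first; // default guess: the first parent is the old develop

    // If the message tells us develop was the branch being merged in,
    // the second parent is the better guess.
    if (second !== undefined && MERGED_DEVELOP_IN.test(current.message)) {
      nextId = second;
    }

    // Becomes undefined at the root commit or when the parent has not been
    // downloaded yet; in both cases the walk stops here.
    current = byId.get(nextId);
  }
  return onDevelop;
}

// Example: a feature branch merged develop into itself (commit d) and was
// later fast-forwarded into develop, so develop now points to e.
const commits = [
  { id: 'e', parents: ['d'], message: 'Finish login form' },
  { id: 'd', parents: ['c', 'b'], message: "Merge branch 'develop' into feature/login" },
  { id: 'c', parents: ['a'], message: 'Start login form' },
  { id: 'b', parents: ['a'], message: 'Fix typo on develop' },
  { id: 'a', parents: [], message: 'Initial commit' },
];

const developLine = guessDevelopLine(commits, 'e');
console.log([...developLine].join(' ')); // prints "e d b a"
```

In this example commit c ends up off the develop line, while e and d are drawn on develop even though they started life on the feature branch. That is exactly the kind of call described in the second point. Because the walk stops at the first commit that has not been loaded, a sketch like this would also have to be run again from the top whenever older commits are downloaded.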

The result is clearly not created by a designer, but gives a much better overview of all branches in a git flow environment than the default git tooling does. It looks like this:

Gitflow visualization

Now (freeish) for everyone

If you would like to use the git flow chart, or if you'd just like to have a look at it, you actually can. The code is not open source. It is also not owned by funda, as it was created as a private project, sparked by a need at funda. The main code is the JavaScript library used for evaluating the commit data from Stash. You can have a look here; if you'd like to use it for a free service, chances are that you can use it for free, but please ask. If you want to see the chart at work, you can have a look at beta.gitflowchart.com, where you can use the library to visualize any open GitHub repository. As an example: Vincent Driessen's gitflow repo that we used above.

Enjoy!

written by Teun Duynstee, Software Architect at funda.