Lessons learned when tracking OSS projects (and what Inner Source projects can learn)

[Extra material is available in the Open Source Leadership Summit talk and in its slides on Bitergia’s Speakerdeck account]

We are all used to open source projects. Community, code review, continuous integration, geographically distributed contributions, community managers, and a whole range of other terms and collaborative ways of working are familiar to all of us. Enterprises are learning from this open process and are shifting their development models toward a more open one within the organization. Initiatives such as the Inner Source Commons, where companies such as PayPal or Bloomberg publicly present their cases, help others deal with the usual problems they face.

This post is not about the advantages or disadvantages of an inner source approach, but let me list some of the benefits of applying this method:

  • Reduce time to market
  • Increase engagement of developers
  • Scale large organizations
  • Reuse the code
  • Increase the quality of the code

With this in mind, this post is about how to measure all of this in order to achieve an organization’s goals when applying inner source. It also draws on experience analyzing large open source communities and on how that experience can be useful for inner source.

When using metrics, it is necessary to understand why we need them. For example, we can use them for awareness, to understand the current overall situation. They can be used to lead a process change, as they shed some light on how far that change is from its goal. They can serve motivational purposes, pushing specific policies and giving developers and managers a well-defined path to follow. And, of course, they provide transparency, a fundamental piece of the puzzle for generating trust in the process and engagement from employees.

The inner source process is mainly a matter of cultural change at the executive, middle management and developer levels. Metrics can help when organizations are willing to adapt their software development process to a more scalable, open, transparent and community-based one. Organizations need to understand whether they are achieving their goals thanks to a more transparent and community-oriented process such as inner source and, if not, make decisions to redirect the process.

Open and inner source processes have different goals. Moreover, each open source project has its own peculiarities that make its community unique: infrastructure (from code review on mailing lists to a more structured system such as Gerrit), governance (from a benevolent dictator to flat decision structures), and others. In the same way, there is no ideal inner source community, as each one depends on its context. But open and inner source projects share some goals that call for a similar approach. Let’s detail some of those:

One of the main goals of any open source community is to have a large base of users and developers to foster the product (let us not get into the developers’ reasons for contributing). For any community manager it is key to understand the attraction and retention rates of developers. Turnover is inevitable, but it can be measured. Inner source communities are no exception. When applying inner source, the idea is to break down the silos of developers and allow them to work together while minimizing hierarchies.
Some metrics of interest:

  • Attraction of newcomers and evolution over time
  • Retention of developers and evolution over time
  • For how long developers keep contributing
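As a rough illustration, the three metrics above could be computed from a plain list of contributions. This is only a minimal sketch: the `commits` sample data, the function names and the cutoff-based definition of retention are assumptions made for the example, not a standard definition or the output of any particular tool.

```python
from collections import defaultdict
from datetime import date

# Hypothetical input: one (author, date) pair per contribution.
commits = [
    ("alice", date(2016, 1, 10)),
    ("alice", date(2016, 3, 5)),
    ("bob", date(2016, 2, 1)),
    ("carol", date(2016, 2, 20)),
    ("carol", date(2016, 2, 25)),
    ("bob", date(2016, 3, 15)),
]

def activity_span(commits):
    """Map each author to the (first, last) date they contributed."""
    spans = {}
    for author, day in commits:
        first, last = spans.get(author, (day, day))
        spans[author] = (min(first, day), max(last, day))
    return spans

def newcomers_per_month(commits):
    """Attraction: authors whose first contribution falls in each (year, month)."""
    counts = defaultdict(int)
    for first, _ in activity_span(commits).values():
        counts[(first.year, first.month)] += 1
    return dict(counts)

def retention_rate(commits, cutoff):
    """Retention (assumed definition): share of authors who started before
    `cutoff` and contributed again on or after it."""
    spans = activity_span(commits)
    cohort = [a for a, (first, _) in spans.items() if first < cutoff]
    if not cohort:
        return 0.0
    kept = [a for a in cohort if spans[a][1] >= cutoff]
    return len(kept) / len(cohort)

def days_active(commits):
    """Days between the first and last contribution of each author."""
    return {a: (last - first).days
            for a, (first, last) in activity_span(commits).items()}

print(newcomers_per_month(commits))
print(retention_rate(commits, date(2016, 3, 1)))
print(days_active(commits))
```

Running the same functions over successive months or cutoffs gives the evolution over time mentioned in the list.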

Mentorship and helping newcomers

This is probably one of the key aspects to foster in open and inner source communities. Mentors are, in general, developers who help other contributors create great pieces of code (or any other contribution). They are usually experienced committers who help new ones understand how the community works and where to look for specific documentation, who carefully review the code written by newcomers and who, in general, walk developers along the yellow brick road until a piece of code is finally accepted.
Mentors can also be tracked. Take Gerrit as the code review system: any developer reviewing code can potentially act as a mentor. Programs such as Google Summer of Code and Outreachy are great places where awesome mentors help students participate for the first time in an open source project.
Some examples of metrics of interest:

  • Number of developers helping others (reviewing code, answering questions, etc)
  • Number of developers becoming mentors for the first time
  • Relative number of mentors per project
  • Number of newcomers per mentor
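A minimal sketch of how two of these metrics could be derived from code review records follows. The `reviews` tuples, the `newcomers` set and the function names are hypothetical; in practice the data would come from the review system (Gerrit, for instance), and what counts as a newcomer or a mentor is an assumption each community has to make.

```python
from collections import defaultdict

# Hypothetical review records: (reviewer, change author, year of the review).
reviews = [
    ("dana", "alice", 2015),
    ("dana", "bob", 2016),
    ("erin", "bob", 2016),
    ("erin", "carol", 2016),
    ("dana", "carol", 2016),
]

# Assumption: the set of authors considered newcomers is decided elsewhere,
# for example from the date of their first contribution.
newcomers = {"bob", "carol"}

def newcomers_per_mentor(reviews, newcomers):
    """Count the distinct newcomers each reviewer has helped."""
    helped = defaultdict(set)
    for reviewer, author, _ in reviews:
        if author in newcomers and reviewer != author:
            helped[reviewer].add(author)
    return {mentor: len(people) for mentor, people in helped.items()}

def first_time_mentors(reviews):
    """Count reviewers by the year in which they reviewed for the first time."""
    first_year = {}
    for reviewer, _, year in reviews:
        first_year[reviewer] = min(year, first_year.get(reviewer, year))
    per_year = defaultdict(int)
    for year in first_year.values():
        per_year[year] += 1
    return dict(per_year)

print(newcomers_per_mentor(reviews, newcomers))
print(first_time_mentors(reviews))
```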

Contributors funnel
From a broader point of view, it is also possible to analyze how contributors evolve in the community, from the first traces of activity left as emails or bug reports to the first piece of code committed or the first review done as a mentor. It is also interesting to analyze how many of the initial users of the product end up contributing in some way to the community.
Some examples of metrics of interest:

  • Percentage of users that become developers or code reviewers
  • Time to become a developer or a reviewer
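The funnel can be sketched with a single stream of events per person. The example below is only an illustration: the `events` data, the event kinds and the `funnel` function are assumptions, and it treats anyone who left any trace as a "user" and anyone with at least one commit as a "developer".

```python
from datetime import date
from statistics import median

# Hypothetical activity traces: (person, kind of event, date).
events = [
    ("alice", "email", date(2016, 1, 1)),
    ("alice", "commit", date(2016, 1, 31)),
    ("bob", "bug", date(2016, 2, 1)),
    ("carol", "email", date(2016, 2, 10)),
    ("carol", "commit", date(2016, 2, 20)),
    ("carol", "review", date(2016, 4, 1)),
]

def first_dates(events, kinds):
    """Earliest date on which each person produced an event of the given kinds."""
    firsts = {}
    for person, kind, day in events:
        if kind in kinds:
            firsts[person] = min(day, firsts.get(person, day))
    return firsts

def funnel(events):
    """Return (percentage of people who ever committed,
    median days from first trace to first commit)."""
    seen = first_dates(events, {"email", "bug", "commit", "review"})
    coded = first_dates(events, {"commit"})
    conversion = 100.0 * len(coded) / len(seen) if seen else 0.0
    days = [(coded[p] - seen[p]).days for p in coded]
    return conversion, (median(days) if days else None)

conversion, days_to_first_commit = funnel(events)
print(conversion, days_to_first_commit)
```

The same `first_dates` helper with `{"review"}` would give the time to become a reviewer.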

Development cycle
As in open source, the total cycle time from a feature request to the merge of the code into master is a necessary metric for understanding how fast the community translates requirements into code. The organizations behind the code already have information about their own internal processes and probably about their times to deployment. From a company perspective, it is even more valuable to have the median or some percentiles of the time a feature request usually takes from the beginning to the end of its development.
The whole process in open and inner source projects should include at least: feature request, implementation, CI, review process, some extra CI, and merge into master. The time between each pair of consecutive steps is also needed in order to look for potential bottlenecks.
Some examples:

  • Median (mean, percentiles…) time to close a feature request or user story
  • Median time from the feature request to its implementation, from implementation to CI, to code review and to merge into master
  • Total number of people involved and number of iterations (changes requested by the reviewer) the code went through until it was merged
  • Percentage of abandoned code review processes
  • Backlog management index, which can also be seen as the effectiveness of the community at closing tasks
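Some of these numbers can be sketched as follows. The stage names, the `changes` sample and the function names are hypothetical and simplify the process to four timestamps per change. The backlog management index is computed here with one common definition (items closed divided by items opened in the same period, times 100), which is an assumption about what the list above refers to.

```python
from datetime import date
from statistics import median

# Simplified, assumed stages of the cycle, in order.
STAGES = ["requested", "implemented", "reviewed", "merged"]

# Hypothetical changes; the third one was abandoned before review.
changes = [
    {"requested": date(2016, 1, 1), "implemented": date(2016, 1, 4),
     "reviewed": date(2016, 1, 5), "merged": date(2016, 1, 6)},
    {"requested": date(2016, 1, 2), "implemented": date(2016, 1, 10),
     "reviewed": date(2016, 1, 15), "merged": date(2016, 1, 16)},
    {"requested": date(2016, 1, 3), "implemented": date(2016, 1, 8)},
]

def median_stage_days(changes):
    """Median number of days between each pair of consecutive stages."""
    result = {}
    for start, end in zip(STAGES, STAGES[1:]):
        durations = [(c[end] - c[start]).days
                     for c in changes if start in c and end in c]
        if durations:
            result[f"{start}->{end}"] = median(durations)
    return result

def abandoned_percentage(changes):
    """Percentage of changes that never reached the final stage."""
    if not changes:
        return 0.0
    abandoned = sum(1 for c in changes if STAGES[-1] not in c)
    return 100.0 * abandoned / len(changes)

def backlog_management_index(opened, closed):
    """Items closed per 100 items opened in the same period."""
    if opened == 0:
        raise ValueError("no items were opened in the period")
    return 100.0 * closed / opened

print(median_stage_days(changes))
print(abandoned_percentage(changes))
print(backlog_management_index(opened=40, closed=30))
```

A stage whose median is much larger than the others is a first candidate when looking for bottlenecks, and an index below 100 means the backlog grew during the period.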

Spreading the knowledge
As mentioned, turnover happens, and open and inner source communities both face it. One of the initial goals of inner sourcing is to allow developers to work in areas not directly related to their business unit. But this raises the question of who maintains the code, even more so if that developer leaves the community or the company. It is important to know who the main contributors are, but also their areas of knowledge, that is, the areas of the source code they have mastered. Some analyses in this respect are: territoriality, which measures the number of files touched by only one developer (other definitions may apply); orphaned code, defined as the number of lines that have no parent because their developers left the project; and social network analysis, which may help to identify who acts as a bridge of knowledge between two groups of developers (betweenness).
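Territoriality and orphaned code, as defined above, can be sketched in a few lines. The `touches`, `blame` and `active` inputs are hypothetical; in a real repository they would be derived from the commit history and from per-line authorship information, and the set of active developers would depend on how the organization defines "left the project".

```python
from collections import defaultdict

# Hypothetical (file, author) pairs, one per commit touching the file.
touches = [
    ("core/api.py", "alice"),
    ("core/api.py", "bob"),
    ("core/db.py", "alice"),
    ("docs/intro.md", "carol"),
    ("core/db.py", "alice"),
]

# Hypothetical per-line authorship: file -> author of each surviving line.
blame = {
    "core/api.py": ["alice", "bob", "bob"],
    "core/db.py": ["alice", "alice"],
    "docs/intro.md": ["carol"],
}

# Assumption: developers still considered part of the community.
active = {"alice", "bob"}

def territorial_files(touches):
    """Files that have been touched by exactly one developer."""
    authors = defaultdict(set)
    for path, author in touches:
        authors[path].add(author)
    return sorted(path for path, who in authors.items() if len(who) == 1)

def orphaned_lines(blame, active):
    """Number of lines whose author is no longer active."""
    return sum(1 for lines in blame.values()
               for author in lines if author not in active)

print(territorial_files(touches))
print(orphaned_lines(blame, active))
```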

There are many more analyses that can be carried out, and other metrics may apply instead of the ones mentioned, but the important point is that you can compare your inner source organization and community with open source communities of reference. You can decide what governance or financial model applies to your organization and check how far you are from others such as the Apache Software Foundation, OpenStack or the Linux Foundation projects.

As open source communities have publicly available data sources, you can benchmark your own community against them and tell whether you are meeting the goals the organization set at the start of inner sourcing. This is an advantage that companies deciding to inner source their development processes should make the most of.
