On the 8th of October we joined LinuxCon to share our Gender Diversity Analysis of technical contributions to the Linux Kernel.
We are aware of the diversity gap in the tech industry and the efforts of some institutions, like Linux Foundation, are doing to solve it. So, we decided to add our five cents, working over the weeks to bring some stats about women’s contribution to Linux Kernel.
Before diving deep into our results, let’s take a look at some data we already have from the tech sector. Women represent:
The Xen project is an open source software project that does pre-commit peer code review. This means that every change to the source code follows a code review process, in which any developer can participate, before being accepted in the code base. During that process, the patch is carefully inspected and improved thanks to the contributions of the reviewers. The process takes place in the xen-devel mailing list.
Code review is a very important process to improve the quality of the final product, but it can also be very time-consuming, and impose too much effort on the reviewers and the developers themselves. Therefore, when a few people in the Xen Project detected an apparent increase in the number of messages devoted to code review, they become concerned. They decided to use the services of Bitergia to determine what was happening, and how it could impact the code review process of the project. And the report “Code review process analysis in the Xen Project” was born.
The main objective of the analysis (composed of two stages) was to verify if this apparent increase was really related to the code review process, and to study other parameters to determine if the code review process was deteriorating in some way. The first stage has already been completed, with three key findings:
Time-to-merge, probably the most important parameter to express the toll that a project is paying for reviewing code, is under control. Time-to-merge is counted from the moment a change is proposed, to the moment that change is merged, after running its corresponding code review.
Time-to-merge increased from 2012 to the first semester of 2014, running from about 15 days (for 75% of the changes), to close to 30 days. But since then, time-to-merge has decreased: 28 days in the second semester of 2014, and 20 in 2015 (again, for 75% of changes).
The trend of time-to-merge is similar despite the size of the change. The same trend is observed for changes composed of one, two, three, four or more than four patches (individual components of a change).
Currently, the second stage of the analysis is being performed. It is expected that this stage will produce actionable dashboards with detailed information that will allow to track in detail the main parameters of the Xen code review process.
These findings and more will be shown in our talk at OSCON. Remember that we will be exhibiting there and you can get a discount using BITERGIA25 code… Don’t miss the chance to visit us there!
The October-Devember 2015 penStack Community Activity Report shows a stable growth of the OpenStack Community. As new repositories and teams keep being added, the number of projects keeps growing. On the other hand it is worth mentioning the decrease in activity during the latest quarters in project teams such as Nova, or stabilization of some others, such as Horizon or Cinder. This is a clear signal of the maturity reached by the some of the project teams in the OpenStack Foundation.
Active Core Reviewers reach a new peak
Although Git activity (changesets merged in the code base) does not show a large increase of activity, if compared to Gerrit, the development effort in the project keeps increasing. This last quarter of 2015 the number of Active Core Reviewers reached a new record of 449 different developers.
Time to merge keeps decreasing
During the third quarter of 2015, a small increase in time to merge seemed to signal a change in trend. However, this last quarter of 2015 keeps the previous decreasing trend. During this period the median time to merge a changeset into master decreased from 2.91 days down to 2.38 days.
Efficiency closing tickets decreases
It is noticeable the decrease of the relative number of tickets closed in OpenStack projects (also known as the efficiency of the ticket closing process). Previous quarters topped at about 60% of closed tickets (with respect to tickets opened during the same period), while this quarter shows a much lower 44%. This could be seen as a poor performance indicator of the project teams.
However, the efficiency closing changesets in the review system (number of changesets merged or abandoned with respect to number of new changesets being proposed for review) remains stable at around 80%.
The next January 15th Wikipedia turns 15 years. In Bitergia, we don’t want to miss the opportunity to say Happy Birthday and congratulate Wikimedia Foundation for “encouraging the growth, development and distribution of free, multilingual, educational content, and to providing the full content of these wiki-based projects to the public free of charge“.
Wikipedia is one Wikimedia’s projects. In fact, it is the oldest, and largest, Wikimedia project, predating the Wikimedia Foundation itself. Wikipedia is often described as a wiki, but it is in fact a collection of over 200 wikis, one for each language, all running on the MediaWiki software. MediaWiki is a free software open source wiki engine, originally developed for and used by Wikipedia, that now is used on other projects of the non-profit Wikimedia Foundation. MediaWiki is freely available for others to use (and improve), and it is used by severals organizations around the world.
After 15 years, many contributors have participated in the MediaWiki project. In Bitergia, we collaborate with the Wikimedia Foundation by analyzing and providing up-to-date development community metrics. The project is characterized by being very inclusive accepting code. Let’s see some of the numbers behind the development of the Foundation’s project in these years (all data available at Wikimedia Foundation Dashboard so feel free to play with the dash and find out data you are interested in):
These are some of the big numbers, however there are much more interesting data in the project. Some examples: Who is contributing code, what regions have more weight in the development or how is the evolution of the merged commits? Take a look in the Wikimedia project evolution:
Liberty is the new release of OpenStack. This shows an increase in activity and people participating in the development of OpenStack.
There are 25,268 commits in total merged into master thanks to the work of 1,873 different developers.
In order to have that code into master, it was necessary the effort of 2,239 people that submitted at least one patchset to Gerrit. That means that 83% of them are actual contributors of Liberty release.
In terms of community, Launchpad activity shows that 9,919 people helped participating in the bug tracking process, opening, commenting and closing tickets.
The mailing lists are a busy channel of communication with 1,742 participants, but IRC seems to be the preferred channel with more than 6,000 detected nicknames.
Ask.openstack.org is also an interesting communication channel where there have been 1,386 people participating and around 2,200 different questions.
As a disclaimer, this post will not focus on organizations participating in the OpenStack development, but in the software development process. Organizations information can be easily retrieved in the Activity Board. We believe that process in the OpenStack community is important and even more when we are talking about a team of more than 2,000 different contributors.
Efficiency of the community closing changesets
As mentioned, there have been more than 2,200 people in Liberty release that aimed at contributing with a patchset in the OpenStack community. And only a subset of those pieces of code were eligible to be part of the project and merged into master. However, not all of the people that were part of the that set were even reviewed. Each quarter, the community leaves around a 20% of the changesets population open. This can be observed in the following chart. The y-axis represents the percentage of changesets that were closed (abandoned or merged) per quarter (x-axis).
Finally we released a new version of the OpenStack Quarterly report. This is intended to provide insights about the software development process of the OpenStack projects. This covers information from several data sources such as Git, Launchpad, Gerrit, Mailing lists, ask.openstack.org and IRC channels. And it aims at providing quantitative and qualitative information about the activity, community and software process.
The executive summary of the document is as follows:
Users community keeps growing. With an increase of more than a 300% in the total number of questions posted in ask.openstack.org, this is the unique communnication channels with such measured increase. Other channels such as IRCor mailing lists also grow but slower. Although this is the activity measured for the last year, the last quarter shows a significant activity drop.
Active Core Reviewers are increasing. Although mid-release cycle analysis usually show drops of activity in most of the areas, the total number of active core reviewers has reached a new peak. Up to 339 core reviewers participated in the review process. This shows an increase of around 9% if compared to the previous quarter analysis.
Process keeps stable. The time to merge patches for the main projects show similar numbers than in previous quarter. However, Nova seems to be the project out of the common time to review with up to 10 days. On the other hand, Glance is more in line with the rest of the projects if compared to previous quarter. In any case, Glance shows the second highest meadian time to review with up to 7 days.
IRC activity recovers old activity. Due to unknown issues, the total logs analyzed in the last two previous quarters indicated a huge drop of activity. However, last analysis on this data source shows activity in line with previous quarters. Although this is still an unknown issue, the IRC activity has recovered expected activity.
Project teams are also covered in this quarterly report. This quarterly report follows the Governance file for projects, but ignores specs files. Previous versions of the quarterly report such as 2015-Q1 or 2014-Q4 divided projects into the integrated project, specs, clients and others, with specific sections.
Kilo, the new OpenStack release, shows a continuous increase of activity if compared to Juno. From Icehouse to Juno, there was an increase of 6.22% in the number of commits and 17,07% in the number of unique authors. From Juno to Kilo, there’s a higher jump in terms of commits (11,23%) and a lower increase in terms of authors (11,16%). However, with this increase, there is a new peak in the number of unique authors contributing to the OpenStack Foundation projects with close to 1,600 different people participating in its development.
After the continuous increase of activity from release to release that we observed in the past, Kilo, the latest release of OpenStack is showing some stabilization. The differences between Juno (the previous release) and Kilo are the lowest in the history of the analysis we’ve performed for the OpenStack Foundation. Although this release has reached a new peak in contributors, close to 1,500 different persons, the increase from Juno to Kilo was of around 900 commits and 200 authors while from Icehouse to Juno it was of 700 commits and 70 developers.
The list of organizations participating in the development of OpenStack keeps growing as well: close to 170 different organizations have contributed with at least one commit to the development of Kilo.
As the top ten contributors, we find the following organizations:
Regarding to the community itself, the timezones analysis shows a widespread activity around the world. OpenStack is a truly 24 hours-a-day continuous development community. There are three main groups of activity: America, on the left side of the chart, Europa/Africa in the center and Asia, on the right.
Ignoring the UTC 0 activity, that may be biased by developers using UTC 0 as their timezone with independence of their point of residence, the rest of the activity shows North America East and West coasts as the main contributors in number of commits. Europe/Africa is quite close to this activity (most of it due to Europe), although biased by the UTC peak of activity. India could be represented by the the small peak in UTC+5, and finally the rest of Asia, with China and Japan in first place, which is consistent with the localization of some contributing companies.
Within a few hours the OpenStack Juno release will be delivered. At the moment of writing this analysis the OpenStack Activity Board shows 91,317 commits spread across 108 repositories. All of this activity was performed by close to 2,600 developers, affiliated to about 230 different organizations. In addition, around 75,000 changesets have gone through code review, submitted by 3,082 developers.
With respect to community communication channels, there were more than 3 million messages exchanged in the IRC channels, close to 10,000 questions asked on the Askbot instance of the OpenStack Foundation and about 3,600 people posting to the mailing lists.
Focusing on the Juno six-month release cycle, activity was intense:
18,704commits, 8.65% increase if compared to the previous release cycle, Icehouse.
More than 130organizations contributing.
Close to 1,420developers, an increment of about 16%.
IRC channels grew from 926K to 1,024Kmessages, 10% more than for Icehouse.
Mailing lists reduced activity (8.8% less messages). However, there is an increase of 18% in the number of new questions in Askbot, with a total of about 3,000new questions during the Juno release cycle.
BMW (> 1,500 commits) keeps growing in the community, being the leader of the open source development process in GENIVI ecosystem, or at least to the publicly available repositories that Bitergia has had access. However, we do not find another automobile and engine manufacturing company until Ford (10 commits), with a discrete development participation. Although raising to the 5th position if counting commits from its owned subsidiary Livio Connect.
Bitergia, the software development analytics company
Bitergia is focused on software development analytics. We aim to produce useful information about how software projects are performing, how different actors are contributing and how they could be improved.
We provide tools and means to track all these aspects, and to evaluate how policies and decisions are shaping the development processes.