[Updated results based on methodological changes]
Kilo, the new OpenStack release, shows a continued increase in activity compared to Juno. From Icehouse to Juno, the number of commits grew by 6.22% and the number of unique authors by 17.07%. From Juno to Kilo, the jump in commits is higher (11.23%) and the increase in authors is lower (11.16%). Even so, this increase sets a new peak in the number of unique authors contributing to the OpenStack Foundation projects, with close to 1,600 different people participating in their development.
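The release-over-release percentages above are plain relative increases. A minimal sketch of that arithmetic follows; the function name and the counts are made up for illustration and are not the real OpenStack figures.

```python
def percent_increase(previous, current):
    """Relative growth from one release to the next, as a percentage."""
    return (current - previous) / previous * 100.0


# Made-up counts, only to show the calculation.
commits_previous_release = 200
commits_current_release = 250

growth = percent_increase(commits_previous_release, commits_current_release)
print(f"Commits grew by {growth:.2f}%")
```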
[This post is part of the lightning talk presented at FOSDEM 2015. The talk was titled “Data, data and data about your favourite community”, and its slides are available on Bitergia’s Speakerdeck page. The IPython notebook used for visualization purposes is accessible through nbviewer and can be downloaded from GitHub. This is a basic introduction to GrimoireLib.]
GrimoireLib aims to provide a transparency layer between the database and the user. This avoids direct access to the databases while providing a list of available metrics.
This is a Python-based library that expects an already generated database produced by one of the Metrics Grimoire tools. CVSAnalY, MailingListStats, Bicho and most of the other tools are already supported by this library.
Within a few hours the OpenStack Juno release will be delivered. At the time of writing this analysis, the OpenStack Activity Board shows 91,317 commits spread across 108 repositories. All of this activity was performed by close to 2,600 developers, affiliated with about 230 different organizations. In addition, around 75,000 changesets have gone through code review, submitted by 3,082 developers.
Last week I took part as master of ceremonies in a special event for FLOSS developers at... the Microsoft Spain offices! The idea of the meeting was to explore the different FLOSS technologies already supported by Microsoft Azure, with speakers from different companies and communities such as MongoDB, PhoneGap/Cordova, etc.
The event is part of the new openness strategy that is driving the company. But I have wondered: how open is this movement really? Of course, they are releasing a lot of code as Open Source, but is the company contributing to other FLOSS projects beyond its own? Surprisingly, the answer has come from our own dashboards.
[This post is based on the Executive Summary and other sections of the full report about OpenStack and the Icehouse release (part of OpenStack reports) and data retrieved from the OpenStack Activity Board, both developed by Bitergia]
Less than two weeks remain until a new release of the OpenStack software. As usual, we at Bitergia keep contributing to this project through the Community Activity Board project, as part of the openstack-infra project. A beta version of our analysis of companies for the Icehouse release is already available at the OpenStack releases dashboard, where previous releases are accessible as well: Havana, Grizzly, Folsom and Essex.
An interesting fact: while the set of contributing organizations changed a lot in previous releases, from Havana to Icehouse the top contributors remain stable. There are no big changes in the top organizations, nor in the number of commits. The only new entry in the top ten is Intel, with the rest contributing much as they did in Havana.
Turnover is inevitable. Developers leave a project and others join it. This effect may be more harmful in open source communities than in companies. Depending on the community, it can be hard to find new people willing to participate. Moreover, those who stop developing leave a knowledge gap behind. So the issue is twofold: people leave, and they leave a knowledge gap that in some cases is hard to fill.
However, is it possible to analyze this regeneration of developers? How good is my community at retaining developers? Is it possible to measure the number of newcomers joining the community? Having this type of information is essential to define policies to attract new members, retain current ones and check whether the current situation is steering the community in a good direction.
This post is an example of the type of things that we at Bitergia are building on top of the CVSAnalY tool. In previous posts we introduced the concept of a commit, its peculiarities as a metric, and several ways to calculate it, adding filters for bots, merges or branches.
The demographics of open source communities allow us to understand how a community has evolved, and potentially how it will evolve over time. Demographics in open source communities can be seen as the typical population pyramid analysis of countries or cities. Typically the oldest people are found at the top of the chart, and age decreases toward the bottom. These charts are called pyramids because of their typical triangular shape. However, during the last decades and in developed countries, this shape has been moving towards an inverted pyramid, although this is another discussion :).
Thanks to the study of the demographics of developers, it is possible to know a bit more about the community. We already introduced the demographics of the Linux Kernel, and this post focuses on the analysis of the OpenStack community as a case study. The following figure shows the demographics of the OpenStack community (updated daily in the OpenStack activity dashboard). The x-axis indicates the number of developers, while the y-axis shows the timeframe of activity.
Green bars show the number of developers who started contributing, with at least one commit, in each period. Blue bars show how many of those developers still contribute to the community. By definition, a developer is still contributing to the community if a commit by that developer has been detected during the last six months. If not, the developer is considered to have left the community. It may happen that a developer returns after more than six months and submits another change to the source code. In this specific context, that developer would appear as not having left the community.
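The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the dashboard: the input is a hypothetical list of (author, commit date) pairs, periods are taken to be calendar years, and "six months" is approximated as 183 days.

```python
from datetime import date, timedelta

# Hypothetical input: one (author, commit_date) pair per commit.
commits = [
    ("alice", date(2013, 1, 10)),
    ("alice", date(2014, 11, 2)),
    ("bob", date(2013, 2, 5)),
    ("carol", date(2014, 3, 20)),
    ("carol", date(2014, 12, 1)),
    ("dave", date(2014, 4, 1)),
]


def demographics(commits, today, active_window_days=183):
    """Count, per year of first commit, who joined and who is still active."""
    first, last = {}, {}
    for author, day in commits:
        first[author] = min(first.get(author, day), day)
        last[author] = max(last.get(author, day), day)

    # A developer is "still contributing" if the latest commit is recent enough.
    cutoff = today - timedelta(days=active_window_days)

    joined, retained = {}, {}
    for author, start in first.items():
        period = start.year  # assumption: one cohort per calendar year
        joined[period] = joined.get(period, 0) + 1       # green bars
        if last[author] >= cutoff:
            retained[period] = retained.get(period, 0) + 1  # blue bars
    return joined, retained


joined, retained = demographics(commits, today=date(2015, 1, 15))
for period in sorted(joined):
    print(period, "joined:", joined[period], "still active:", retained.get(period, 0))
```

Because only the most recent commit is compared against the cutoff, a developer who comes back after a long gap counts as still contributing, which matches the behaviour described above.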
In a previous post (Commits: that metric), we talked about all of the flavors we should take into account when measuring commits.
An example was provided, and in some cases, depending on the development policy of the project, commits ignoring merges represented around 50% of the total activity that we can find.
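As a rough illustration of how such filters change the count, the sketch below compares total commits with commits ignoring merges and bots. The records, the `is_merge` flag and the bot name are all hypothetical; how merges and bots are actually detected depends on the tool and the project.

```python
# Hypothetical commit records; "is_merge" and the bot account are assumptions.
commits = [
    {"rev": "a1", "author": "alice", "is_merge": False},
    {"rev": "b2", "author": "bob", "is_merge": False},
    {"rev": "c3", "author": "jenkins", "is_merge": True},
    {"rev": "d4", "author": "alice", "is_merge": True},
    {"rev": "e5", "author": "jenkins", "is_merge": False},
]
BOTS = frozenset({"jenkins"})


def filter_commits(commits, skip_merges=True, bots=frozenset()):
    """Return the commits left after dropping merges and/or bot authors."""
    return [
        c for c in commits
        if not (skip_merges and c["is_merge"]) and c["author"] not in bots
    ]


no_merges = filter_commits(commits)
no_merges_no_bots = filter_commits(commits, bots=BOTS)

share = len(no_merges) / len(commits)
print(f"All commits: {len(commits)}")
print(f"Ignoring merges: {len(no_merges)} ({share:.0%} of the total)")
print(f"Ignoring merges and bots: {len(no_merges_no_bots)}")
```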
CVSAnalY is one of the tools used as input for our dashboards. It specializes in version control systems, and parses the logs provided by some of the most widely used ones in the open source world. It does this with the priceless help of Repository Handler, which is in charge of adding a transparency layer.
Its procedure is simple: CVSAnalY reads a log from SVN, CVS or Git and builds and populates a relational database. For other distributed version control systems, such as Mercurial or Bazaar, there are hooks to migrate from them to Git.
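The general idea of turning a log into a relational database can be sketched as follows. This is not CVSAnalY's code nor its real schema: the two tables, their columns and the already-parsed log entries are simplified assumptions made up for this example, using an in-memory SQLite database.

```python
import sqlite3

# Hypothetical, already-parsed log entries: (revision, author, ISO date).
log_entries = [
    ("r1", "alice", "2014-05-01"),
    ("r2", "bob", "2014-06-12"),
    ("r3", "alice", "2014-11-03"),
]

conn = sqlite3.connect(":memory:")
# Simplified illustrative schema, not the one CVSAnalY generates.
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute(
    "CREATE TABLE commits ("
    "rev TEXT PRIMARY KEY, "
    "author_id INTEGER REFERENCES people(id), "
    "date TEXT)"
)

for rev, author, day in log_entries:
    # Store each author once, then link the commit to that row.
    conn.execute("INSERT OR IGNORE INTO people (name) VALUES (?)", (author,))
    (author_id,) = conn.execute(
        "SELECT id FROM people WHERE name = ?", (author,)
    ).fetchone()
    conn.execute("INSERT INTO commits VALUES (?, ?, ?)", (rev, author_id, day))

# Once the data is relational, metrics become simple queries.
n_commits, n_authors = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT author_id) FROM commits"
).fetchone()
print(f"{n_commits} commits by {n_authors} unique authors")
```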
To illustrate this post, the publicly available database for the OpenStack project is used. This database is the foundation of the dashboard that can be viewed at the OpenStack Activity Dashboard page. Bitergia provides this database and updates it daily, so this analysis is done with a dataset that is current as of today.