The OpenStack Icehouse release: activity and organizations

[This post is based on the Executive Summary and other sections of the full report about OpenStack and the Icehouse release (part of OpenStack reports) and data retrieved from the OpenStack Activity Board, both developed by Bitergia]

At the time of this analysis, OpenStack projects are close to reaching 74,000 commits since their start, as observed in the Activity Board. That activity was carried out by more than 2,000 different contributors, who at some point started 68,000 code review processes and sent and reviewed close to 270,000 different patches. There are more than 33,600 reports in the ticketing system, opened by 3,303 different participants. High activity is also registered in the discussion forums, with close to 52,000 email messages posted by 2,800 participants and more than 6,200 questions in the OpenStack question and answer tool.

Focusing on development activity, developers can be divided into 246 core developers, 461 with regular activity and 1,214 occasional ones who at some point submitted patches and contributed to the code.

Structure of the OpenStack community of developers

Companies contributing to Icehouse: preliminary results

Less than two weeks remain until a new release of the OpenStack software. As usual, we at Bitergia keep contributing to this project through the Community Activity Board project, as part of the openstack-infra project. A beta version of our company analysis of the Icehouse release is already available at the OpenStack releases dashboard, where previous releases are accessible as well: Havana, Grizzly, Folsom and Essex.

Preliminary number of commits per organization in the Icehouse release

An interesting fact: while contributing organizations changed a lot across previous releases, from Havana to Icehouse the top contributors remain stable. There are no big changes in the top organizations, nor in their number of commits. The only new entry in the top ten is Intel, with the rest contributing at a level similar to Havana.

Measuring demographics: OpenStack as case study

Turnover is inevitable. Developers leave a project and others join it. This effect may be more harmful in open source communities than in companies. Depending on the community, it can be hard to find new people willing to participate. Moreover, those who stop developing leave a knowledge gap behind. So the issue is twofold: people leave, and their departure leaves a knowledge gap that in some cases is hard to fill.

However, is it possible to analyze that regeneration of developers? How good is my community at retaining developers? Is it possible to measure the number of newcomers joining the community? Having this type of information is essential to define policies to attract new members, retain current ones and check whether the current situation is leading the community in a good direction.

This post is an example of the kind of things that we at Bitergia are building on top of the CVSAnalY tool. In previous posts we introduced the concept of commit, its peculiarities as a metric, and several ways to calculate it, adding filters such as bots, merges or branches.

The demographics of open source communities allow us to understand how a community has evolved, and potentially how it will evolve over time. Demographics in open source communities can be seen as the typical analysis of population pyramids in countries or cities. Typically the oldest people are found at the top of the chart, while age decreases towards the bottom. They are called pyramids because of their typical triangular shape. However, during the last decades and in developed countries, this shape has been moving towards an inverted pyramid, although this is another discussion :).

Thanks to the study of the demographics of developers, it is possible to know a bit more about the community. We already introduced the demographics of the Linux Kernel, and this post focuses on the analysis of the OpenStack community as a case study. The following figure shows the demographics of the OpenStack community (updated daily in the OpenStack activity dashboard). The x-axis indicates the number of developers, while the y-axis shows the timeframe of activity.

Demographics of the OpenStack developers community

Green bars show the number of developers who started contributing, with at least one commit, in each of the periods. Blue bars show how many of those developers still contribute to the community. By definition, a developer is still contributing to the community if a commit by that developer has been detected during the last six months. If not, the developer is considered to have left the community. It may happen that a developer returns after more than six months and submits another change to the source code. In this specific context, that developer would appear as not having left the community.
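
The definition above can be sketched in a few lines of Python. This is only a minimal illustration, not the code behind the dashboard: the function name, the half-year periods, the 182-day approximation of "six months" and the sample data are all assumptions made for the example.

```python
from collections import Counter
from datetime import date, timedelta


def demographics(commits, today, active_window=timedelta(days=182)):
    """Return (joined, retained) counters keyed by (year, half-year).

    commits: iterable of (author, commit_date) pairs.
    A developer counts as retained if the last commit is at most
    `active_window` old (assumed here to approximate six months).
    """
    first, last = {}, {}
    for author, day in commits:
        first[author] = min(first.get(author, day), day)
        last[author] = max(last.get(author, day), day)

    joined, retained = Counter(), Counter()
    for author, start in first.items():
        # The period of the first commit defines the developer's cohort.
        cohort = (start.year, 1 if start.month <= 6 else 2)
        joined[cohort] += 1  # green bar
        if today - last[author] <= active_window:
            retained[cohort] += 1  # blue bar
    return joined, retained


# Made-up sample data, for illustration only.
sample_commits = [
    ("alice", date(2012, 3, 1)),
    ("alice", date(2014, 2, 10)),
    ("bob", date(2012, 5, 20)),
    ("carol", date(2013, 9, 2)),
    ("carol", date(2014, 1, 15)),
    ("dave", date(2013, 11, 30)),
]

joined, retained = demographics(sample_commits, today=date(2014, 3, 15))
for cohort in sorted(joined):
    print(cohort, "joined:", joined[cohort], "still active:", retained[cohort])
```

Since only the first and the last commit of each developer are considered, someone who comes back after a long break is counted as retained, which matches the behavior described above.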

How to measure commits: merges, branches, repositories and bots

In a previous post (Commits: that metric), we discussed the different flavors that should be taken into account when measuring commits.

An example was provided there and, in some cases, depending on the development policy of the project, commits ignoring merges represented around 50% of the total activity that we can find.

CVSAnalY is one of the tools used as input for our dashboards. It is specialized in versioning systems, and parses the logs provided by some of the most used ones in the open source world. It does this with the priceless help of Repository Handler, which is in charge of adding a transparency layer.

Its procedure is simple: CVSAnalY reads a log from SVN, CVS or Git, and builds and feeds a relational database. For other distributed versioning systems, such as Mercurial or Bazaar, there are hooks to migrate from them to Git.

To illustrate this post, the publicly available database for the OpenStack project is used. This database is the basis of the dashboard that can be visualized at the OpenStack Activity Dashboard page. Bitergia provides this database and updates it daily, so this analysis is done with a dataset that is up to date as of today.
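
Once the log is stored in a relational database, the effect of filters such as ignoring merges or bots can be checked with simple queries. The following is a minimal sketch using an in-memory SQLite database. The `commits` table, its columns and the rows are hypothetical and deliberately simplified; they do not reproduce the schema generated by CVSAnalY, nor the OpenStack data.

```python
import sqlite3

# Hypothetical, simplified table: one row per commit.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE commits ("
    "rev TEXT PRIMARY KEY, author TEXT, is_merge INTEGER, is_bot INTEGER)"
)
conn.executemany(
    "INSERT INTO commits VALUES (?, ?, ?, ?)",
    [
        ("a1", "alice", 0, 0),
        ("a2", "alice", 0, 0),
        ("b1", "bob", 0, 0),
        ("m1", "ci-bot", 1, 1),  # merge done by an automated account
        ("m2", "ci-bot", 1, 1),
        ("m3", "bob", 1, 0),     # merge done by a person
    ],
)


def count_commits(conn, ignore_merges=False, ignore_bots=False):
    """Count commits, optionally filtering out merges and/or bots."""
    query = "SELECT COUNT(*) FROM commits WHERE 1 = 1"
    if ignore_merges:
        query += " AND is_merge = 0"
    if ignore_bots:
        query += " AND is_bot = 0"
    return conn.execute(query).fetchone()[0]


total = count_commits(conn)
no_merges = count_commits(conn, ignore_merges=True)
print("all commits:", total)
print("ignoring merges:", no_merges)
print("ignoring bots:", count_commits(conn, ignore_bots=True))
```

In this made-up dataset half of the commits are merges, so the same repository yields very different numbers depending on the filter that is applied.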

Commits: that metric

Source code versioning systems are tools that make the life of developers easier. Basically, they are used to keep a list of all of the changes in the source code, and they allow developers to navigate and recover old versions of the project. Each of those changes to the source code is defined as a commit, and this may be considered the core piece of information in these systems.

Commits are nowadays considered a “good” metric to get an initial idea of the total effort put into a project. However, this is not as simple as it seems, and each versioning system, and even each project with its particularities, may distort this metric. So we all need to be a bit careful before hailing it as “the most wonderful, marvelous and incredible metric in the world”.

So, in the first place, what kind of information can we find in a commit? Typically commits provide information about the time when the change took place, the files that were affected by that change, the added, removed or modified lines, the author of the commit, and maybe extra information such as the reviewer, specific acknowledgements and others. The following example shows information that can be found in a specific commit (using the git log command):

commit 160ae59a76e2ce3fb6589137d90bb9e80f056fa0
Author: Daniel Izquierdo <dizquierdo@bitergia.com>
Date:   Fri Mar 7 13:32:25 2014 +0100

Add turnover in ITS and SCR

diff --git a/vizGrimoireJS/alerts.py b/vizGrimoireJS/alerts.py
index ff5a703..12b1de6 100755
--- a/vizGrimoireJS/alerts.py
+++ b/vizGrimoireJS/alerts.py
@@ -82,15 +82,29 @@ if __name__ == '__main__':

[...]

However, the definition of commit is really specific to the versioning system. As an example, a commit in CVS is a modification of one file, so N modified files imply N commits. On the other hand, Subversion or Git may have several “touched” files in the same commit. Are projects using different versioning systems comparable at the level of commits? The answer is probably that they are not comparable by simply counting commits. A somewhat more advanced way of counting them is needed.
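
As a small illustration of the difference, the same history can be counted in both ways. This is just a sketch with made-up data and a hypothetical function name, and it is only one possible way of making the numbers comparable, not necessarily the one used in our analyses.

```python
def count_commits_and_file_changes(changesets):
    """Return (commits, file_changes) for a list of changesets.

    changesets: one list of touched file names per Git/Subversion-style
    commit. In a CVS-style count every touched file is a commit of its
    own, so the second number is what CVS would report.
    """
    commits = len(changesets)
    file_changes = sum(len(files) for files in changesets)
    return commits, file_changes


# Made-up history: three changesets touching six files in total.
history = [
    ["alerts.py", "report.py", "README"],
    ["alerts.py"],
    ["setup.py", "report.py"],
]

commits, file_changes = count_commits_and_file_changes(history)
print("Git/Subversion-style commits:", commits)
print("CVS-style commits (one per touched file):", file_changes)
```

With this history, a Git or Subversion repository would show three commits, while the same work stored in CVS would show six.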

Intense 2014 start

It has been an intense start to 2014. First, we worked hard in January to update the look & feel of our published dashboards, to have them ready for FOSDEM 2014. Beyond the two talks we gave there (A comparison between MediaWiki, TWiki and XWiki communities and Project development & community metrics for fun and profit), we had a lot of meetings during FOSDEM due to the interest in the services we provide. So, new customers are coming (stay tuned for updates), and current ones are starting to show their dashboards in public, like the oVirt dashboard, part of the work we are doing for Red Hat. There aren’t too many photos from Brussels, but you can find some in our Google+ profile.

On the other hand, we have been included in two European R&D projects, and during February we attended the MARKOS project plenary meeting in Berlin. The project aims to build the prototype of a service and an interactive application providing an integrated view of the open source projects available on the web, focusing on functional, structural and license aspects of software code.

And this is only the start of a promising year.

Let’s meet in 2014!!

2013 is almost over and it has been an intense year starting up Bitergia. We have worked hard and we would like to thank our customers for trusting us. It’s time to face 2014 with energy!

We continue improving our free / open source tools and services, starting by presenting our latest developments at FOSDEM 2014 (two talks approved, stay tuned!), and working hard to help free / open source software communities succeed through the use of metrics and software development analyses as a key factor.

Happy 2014!!