Bitergia participated in the last LinuxTag event in Berlin, which brought together industry and FLOSS (Free/Libre/Open Source Software) communities.
I had the pleasure of presenting the basics of analyzing FLOSS communities from a quantitative point of view, with a specific focus on the analysis of companies. OpenStack was the project selected as a case study: volunteers and companies work together on it to produce open source software for building public or private clouds.
Among other questions, the talk addressed the following:
- Main developers: understanding who they are, since a company could be interested in hiring them or providing financial support for specific activities.
- Typical patterns of activity: in order to estimate the effort that is actually being invested.
- Regeneration of developers: turnover is almost impossible to avoid, but some policies could be derived in order to avoid knowledge loss.
- Study of participating companies: some companies could be interested in better understanding what other companies are doing and which regions of the source code they are modifying, or even their importance in terms of number of developers and overall productivity.
- Responsiveness of the community: for instance, how quickly issues are handled in the issue tracking system, or how much support is provided in the mailing lists or forums.
- Evolution of licensing and the issues derived from it: this is probably a key difficulty when redistributing source code and integrating third-party software.
- Orphaned areas of the source code, which might be more prone to bugs, as well as poorly maintained areas of the source code.
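Some of the questions above can be sketched with a few lines of Python. This is only a minimal illustration, assuming commit records have already been extracted into plain tuples; the record layout, the sample data, and the function names are hypothetical and are not taken from the presentation or from any Bitergia tool.

```python
from collections import Counter
from datetime import date

# Hypothetical commit records: (author, company, commit date, path touched).
# The company is None when the affiliation is unknown.
COMMITS = [
    ("alice", "CompanyA", date(2012, 1, 10), "compute/api.py"),
    ("alice", "CompanyA", date(2012, 3, 2), "compute/api.py"),
    ("bob", "CompanyB", date(2012, 2, 14), "network/manager.py"),
    ("carol", "CompanyA", date(2012, 4, 20), "compute/scheduler.py"),
    ("dave", None, date(2011, 6, 1), "docs/legacy.txt"),
]


def commits_per_author(commits):
    """Count commits per author, to spot the main developers."""
    return Counter(author for author, _, _, _ in commits)


def commits_per_company(commits):
    """Count commits per company; unknown affiliations are grouped together."""
    return Counter(company or "unaffiliated" for _, company, _, _ in commits)


def stale_paths(commits, today, max_age_days=180):
    """Return paths whose most recent commit is older than max_age_days."""
    last_touch = {}
    for _, _, when, path in commits:
        if path not in last_touch or when > last_touch[path]:
            last_touch[path] = when
    return sorted(
        path
        for path, when in last_touch.items()
        if (today - when).days > max_age_days
    )


if __name__ == "__main__":
    print(commits_per_author(COMMITS).most_common(3))
    print(commits_per_company(COMMITS).most_common())
    print(stale_paths(COMMITS, today=date(2012, 6, 1)))
```

Commit counts are a rough proxy at best: they say nothing about the size or quality of a change, and the 180-day threshold for a "stale" path is an arbitrary choice made for this example.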
Companies, public administrations, and other actors in the open source world might be interested in this kind of information:
- Companies releasing a product as free software and wanting to create a community around it. These companies are interested in attracting other actors that will help the community evolve. But how can that activity be measured? Are other companies coming to my product? Who are its main developers? In short, this type of company is interested in checking the adoption of its product.
- In other cases, companies have to decide which open source project to rely on, and the community of developers and users usually plays a key role in that decision. Is there a way to evaluate several open source projects against a common set of metrics? How responsive is the community to open bug reports or to questions in the mailing lists? What does the typical maintenance activity of the project look like? Are there abandoned areas of the source code?
- Forges in general could also be interested in this type of analysis, since they want to attract as many users and developers to their websites as possible. Is it possible to measure new authors coming to the forge? What about providing a whole set of metrics that makes their users' lives easier? This type of analysis also extends to another type of actor, public administrations, for whom adoption is again a key factor.
- Finally, it is worth mentioning that some companies are not aware of the advantages of open source in general. This may lead to situations where open source is not seen as a trustworthy source of software. These companies need help with their first steps in the open source world: the general structure of a community, its typical activity, its usual channels of communication, or even training in common open source tools.
All in all, this information is necessary and valuable for those who make decisions. Tools such as CVSAnalY or Bicho help to retrieve information from open source projects.
For more information, please have a look at Bitergia’s speakerdeck presentation, which gives more details using OpenStack as a case study. As a disclaimer, the dataset presented for OpenStack is based only on commits. This brings some well-known limitations, but a 30-minute presentation did not leave time to deal with them.
For specific questions about the presentation or other inquiries, please do not hesitate to contact us.