Bicho was born in the summer of 2007. Its main goal was to automatically retrieve information from the SourceForge trackers. At that time there were no tools to retrieve information from such data sources, and we at LibreSoft decided to start working on one. Since then, mainly driven by requirements from research projects, support for Bugzilla (initially the KDE, GNOME and Apache instances) and Jira was added.
With the creation of Bitergia as a spin-off of the research group, we agreed that Bicho was getting old and needed an extra set of functionality (in 2007 neither GitHub nor Allura existed!). The first step was to create a neutral place, open to everyone, where the data mining tools could attract developers and users. As explained in previous posts, that place is named Metrics Grimoire.
So, together with the Metrics Grimoire community, we have added support for the GitHub, Allura, Google Code and Launchpad trackers.
In general, the tool retrieves information from the specified tracker and stores it in a MySQL database. Since each tracker offers different information, the fields that are not common to all trackers are kept in an extra table.
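To make the idea of a common table plus an extra table more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the table names (`issues`, `issues_ext`), the columns, the key/value layout of the extra table and the `store_issue` helper are hypothetical and are not Bicho's actual schema or API. An in-memory SQLite database stands in for MySQL so that the example runs without a server.

```python
import sqlite3

# Hypothetical schema, NOT Bicho's real one. SQLite (in memory) stands in
# for MySQL so the sketch can run anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE issues (              -- fields shared by every tracker
        id           INTEGER PRIMARY KEY,
        tracker      TEXT NOT NULL,
        summary      TEXT NOT NULL,
        status       TEXT NOT NULL,
        submitted_on TEXT NOT NULL
    );
    CREATE TABLE issues_ext (          -- fields that only some trackers have
        issue_id INTEGER NOT NULL REFERENCES issues(id),
        field    TEXT NOT NULL,
        value    TEXT,
        PRIMARY KEY (issue_id, field)
    );
""")


def store_issue(conn, tracker, common, extra):
    """Store common fields in `issues` and tracker-specific ones in `issues_ext`."""
    cur = conn.execute(
        "INSERT INTO issues (tracker, summary, status, submitted_on) "
        "VALUES (?, ?, ?, ?)",
        (tracker, common["summary"], common["status"], common["submitted_on"]),
    )
    issue_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO issues_ext (issue_id, field, value) VALUES (?, ?, ?)",
        [(issue_id, field, value) for field, value in extra.items()],
    )
    return issue_id


# Two made-up issues from different trackers, each with its own extra fields.
bugzilla_id = store_issue(
    conn,
    "bugzilla",
    {"summary": "Crash on start", "status": "NEW", "submitted_on": "2012-03-01"},
    {"severity": "critical", "platform": "x86_64"},
)
github_id = store_issue(
    conn,
    "github",
    {"summary": "Typo in README", "status": "open", "submitted_on": "2012-03-02"},
    {"labels": "documentation"},
)

# Read back the tracker-specific fields of the Bugzilla issue.
bugzilla_extra = dict(
    conn.execute(
        "SELECT field, value FROM issues_ext WHERE issue_id = ?", (bugzilla_id,)
    ).fetchall()
)
```

With a layout like this, queries over the common fields work the same way for every tracker, while tracker-specific data is still available when needed.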
But why should you use Bicho? As mentioned, this tool parses information found in the most common issue tracking systems. Information from such data sources is useful to measure any open source project from several perspectives (and proprietary ones using those data sources!). Among others, the following questions provide a first glimpse of the project:
- What is the typical time to fix an issue? Or, how does a technical decision affect the typical time to fix an issue? In general, technical decisions are made to improve parts of a project such as the process, the product, the interaction among the members of the community or any other aspect. However, does a given decision add bureaucracy and, as a result, delay bug fixes? At the very least it is important to know this and to make decisions based on data!
- Responsiveness of the community: how much time passes between a report being opened and it being assigned or reviewed by a member of the community? This type of metric could help to smooth the relationship between the community and its general user base. A fast response when a report is opened may improve the perception of a dynamic and active community, compared to a slow one.
- Are more severe issues closed faster? Fixing critical issues as soon as possible is probably a key aspect of the project. However, is it true that critical issues are closed faster than any other? If it is not, perhaps more resources should be devoted to this type of report.
- Most active developers: most studies focus only on the analysis of the versioning system, and there are not so many analyses focused on the ticketing system. Thanks to Bicho it is possible to measure the activity of the developers found in the versioning system. Figures such as the number of open and closed reports per developer, time to fix a bug, general commenting activity, participation in the components or number of attachments are easily obtained.
- Size of the community: the use of Git repositories, and development in general, is usually restricted to developers. Moreover, developers are usually also users, and in some cases they are willing to improve the product they use but do not have the time to do so. A good approach to measuring the size of the community is to study the ticketing system. In fact, there are users who open issue reports based on their experience of a crash in their systems.
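As an illustration of the first and third questions above, the following sketch computes the median time to fix, overall and per severity. It is only a sketch under stated assumptions: the records are made-up sample data, and the field names (`severity`, `opened`, `closed`) and the `median_days_to_fix` helper are hypothetical, not part of Bicho. In practice the records would come from the database that Bicho fills.

```python
from datetime import datetime
from statistics import median

# Made-up sample data; the field names are assumptions for this example.
issues = [
    {"severity": "critical", "opened": "2012-01-01", "closed": "2012-01-03"},
    {"severity": "critical", "opened": "2012-01-05", "closed": "2012-01-09"},
    {"severity": "normal", "opened": "2012-01-02", "closed": "2012-01-12"},
    {"severity": "normal", "opened": "2012-01-10", "closed": "2012-01-30"},
    {"severity": "normal", "opened": "2012-01-15", "closed": None},  # still open
]


def days_to_fix(issue):
    """Number of days between the opening and the closing of an issue."""
    opened = datetime.strptime(issue["opened"], "%Y-%m-%d")
    closed = datetime.strptime(issue["closed"], "%Y-%m-%d")
    return (closed - opened).days


def median_days_to_fix(issues, severity=None):
    """Median days to fix for closed issues, optionally for one severity.

    Returns None when no closed issue matches.
    """
    durations = [
        days_to_fix(issue)
        for issue in issues
        if issue["closed"] is not None
        and (severity is None or issue["severity"] == severity)
    ]
    return median(durations) if durations else None


overall = median_days_to_fix(issues)
critical = median_days_to_fix(issues, severity="critical")
normal = median_days_to_fix(issues, severity="normal")
```

The median is used here instead of the mean because a few very old reports can otherwise dominate the result. Comparing `critical` against `normal` gives a first answer to whether severe issues are really closed faster.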
And how has the development of Bicho gone so far? As you can see in the following chart, it has been discontinuous over the last years. Four main peaks took place from 2007 to 2012.
- First one: the initial lines of the project, adding SourceForge tracker and Bugzilla support together with Storm, an object-relational mapper that gives transparent access to databases.
- Second: Beautiful Soup was added to ease the parsing of HTML, along with general improvements to the database schema.
- Third one: improvements to the Bugzilla backends and incremental support, and the GitHub and Jira backends were added. This activity was done together with the ALERT project, an EU-funded FP7 project.
- And finally the fourth peak, where Bitergia started to work. Launchpad, Allura and Google Code were added, together with general bug fixing and maintenance activity. As can be observed, a more stable phase has started, with a minimum of activity every month.
If you are interested in participating, there are several channels of communication where we will be very pleased to help you:
- Metrics Grimoire at GitHub: there you will find the source code of Bicho (and some other data mining tools), a tracking system to open reports (bugs, features and any other technical request) and the README file with information about the usage and installation steps.
- IRC channel: you can find us at #metrics-grimoire on the freenode server. (You can also find us in the #bitergia or #libresoft channels.)
- Mailing list: Metrics Grimoire uses a mailing list kindly hosted by the LibreSoft group at the University Rey Juan Carlos in Madrid. It is a low-traffic list where you are more than welcome to introduce yourself and ask for any advice you may need. The address is the following: https://lists.libresoft.es/listinfo/metrics-grimoire .
- Bitergia team: you can reach us using our contact form or sending an email to info at bitergia dot com.