One benefit of the transparency of open source COVID-19 tracing apps is how quickly these solutions can scale up to help stop disease outbreaks. This post shares some insights from a previous analysis that investigates how robust the software development activity is across different open source contact tracing apps.
What can we learn from software development activity
Governments and public administrations from several countries have been relying on open source tracing apps to help citizens while simultaneously advancing the technology ecosystem. Some of the examples include Immuni (Italy), NHSCovid19 (UK), Corona-warn-app (Germany) and more.
There is a lot we can analyze in these open source apps, from code effectiveness to how data is anonymized. Today’s post focuses on something less well known: how robust the software development activity is. The study was inspired by other Bitergians who performed similar analyses.
What can software development analytics tell us about these apps, you might be asking. First of all, let me introduce a set of questions we can answer with this data:
Q1: Is there an open source community around these projects or are they just focused on releasing code?
Many companies have seen open source as a promotional resource to generate positive word of mouth. This way of thinking can lead to an inefficient use of open source if companies focus only on the license and ignore the community benefits that come with open source projects.
Q2: How well is the project onboarding developers?
If there is a community involved in the project, how does it respond? How does the number of active developers evolve? How responsive is the community overall?
Q3: What does organizational diversity look like across the different projects?
Besides the official companies in charge of developing these apps, are there any other entities or individuals involved?
Scope and patterns
I decided to focus on the following list of COVID-19 contact tracing apps that you can find in this blog. So that anyone can replicate the analysis, I used the Cauldron.io tool, a SaaS platform built on top of GrimoireLab.
The data sources analyzed were Git (commits) and GitHub (pull requests and issues), the channels involved in the open source development of these apps:
- Commits: code contributions.
- Pull requests: can be seen as “answers” provided by the community, from people who submit feature requests and changes to be merged into the code.
- Issues: can be seen as “questions” raised by the community, from people who report bugs and ask questions.
Having said that, let’s discuss the four key points I found relevant after looking at the data:
- Deployment pattern
- Onboarding pattern
- Community response pattern
- Organizational diversity pattern
For the deployment pattern, it was interesting to see that Radar Covid (Spain) and Covid Tracker (Ireland) behaved the same way in terms of lines added and lines removed over time.
Both seem to have been developed in private repos first and then open sourced all at once in a single commit.
This suggests a low level of transparency, as contributors cannot see a proper git history showing how the apps evolved (what was changed, what did not work, what worked in the past but no longer does, etc.).
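One rough way to spot this "single commit" pattern in your own data is to check whether one commit accounts for most of the lines ever added to a repository. The sketch below assumes you already have per-commit line counts (for example, exported from a git log); the `CommitStat` type, the `find_code_dumps` helper, the sample numbers and the 50% threshold are all illustrative assumptions, not part of Cauldron or GrimoireLab.

```python
from dataclasses import dataclass


@dataclass
class CommitStat:
    """Hypothetical per-commit summary: hash plus lines added and removed."""
    sha: str
    lines_added: int
    lines_removed: int


def find_code_dumps(commits, threshold=0.5):
    """Return commits that contribute at least `threshold` of all added lines."""
    total_added = sum(c.lines_added for c in commits)
    if total_added == 0:
        return []
    return [c for c in commits if c.lines_added / total_added >= threshold]


# Made-up history: one huge initial commit followed by small changes.
history = [
    CommitStat("a1", 48_000, 0),
    CommitStat("b2", 120, 30),
    CommitStat("c3", 75, 10),
]

dumps = find_code_dumps(history)
print([c.sha for c in dumps])
```

A repository developed in the open from day one would usually return an empty list here, although a legitimate large import (a vendored library, for instance) can also trigger the check.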
For the onboarding pattern, I look at a visualization showing when people made their first commit (newcomers) and when they made their last commit (contributors leaving). This gives us an idea of how well these projects onboard developers and whether they are losing contributors.
Taking Radar Covid as an example, we can clearly see how it progressively lost contributors right after its major release.
Looking at all the apps analyzed, the trend is either a stable number of developers or a declining one.
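The newcomers and leavers counts can be sketched from a plain list of commits. This is a minimal example with made-up authors and dates; the `onboarding_counts` helper is a hypothetical name, and it is not necessarily how Cauldron builds its visualization.

```python
from collections import Counter
from datetime import date


def onboarding_counts(commits):
    """Count first commits (newcomers) and last commits (leavers) per month.

    `commits` is an iterable of (author, commit_date) pairs.
    Returns two Counters keyed by "YYYY-MM".
    """
    first, last = {}, {}
    for author, day in commits:
        if author not in first or day < first[author]:
            first[author] = day
        if author not in last or day > last[author]:
            last[author] = day
    newcomers = Counter(d.strftime("%Y-%m") for d in first.values())
    leavers = Counter(d.strftime("%Y-%m") for d in last.values())
    return newcomers, leavers


# Made-up commit log.
commits = [
    ("ana", date(2020, 8, 3)),
    ("ana", date(2020, 9, 10)),
    ("ben", date(2020, 9, 1)),
    ("ben", date(2020, 9, 20)),
    ("eva", date(2020, 9, 5)),
]

newcomers, leavers = onboarding_counts(commits)
print(dict(newcomers))
print(dict(leavers))
```

Keep in mind that a last commit in the most recent months does not necessarily mean the contributor has left, so the tail of the leavers series should be read with care.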
For the community response pattern, we first needed to make sure there was a community in the first place, as some apps were focused only on releasing code with no community involved. A good example is Covid Tracker (Ireland), where no issues have been created since March.
Once we are sure there is actual community interaction within a project, we can compare the median review duration and the median time to close across projects.
These metrics tell us how fast the community is at answering issues (time to close) and reviewing code (review duration, or time to merge).
| Project | Median review duration | Median time to close |
|---|---|---|
| Immuni | 0.06 days | 2.07 days |
| Radar Covid | 4.25 days | 2.97 days |
| NHS COVID19 | 9.53 days | 30.73 days |
| Corona-warn-app | 0.07 days | 0.73 days |
| Coronamedler | 0.02 days | 7.97 days |
| Stay Away Covid | 0.47 days | 3.94 days |
It’s interesting that Corona-warn-app, the project with the most issues and PRs created, is also the fastest at closing issues (time to close) and among the fastest at reviewing code (time to merge).
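For readers who want to compute similar figures from raw data, here is a minimal sketch of a median duration in days. It assumes each issue or pull request is reduced to an (opened, closed) pair of timestamps and that items still open are skipped; the post does not state how Cauldron handles open items, so treat that choice, the `median_days` helper and the sample data as assumptions.

```python
from datetime import datetime
from statistics import median

SECONDS_PER_DAY = 86_400


def median_days(items):
    """Median open-to-close duration in days; items with closed=None are skipped."""
    durations = [
        (closed - opened).total_seconds() / SECONDS_PER_DAY
        for opened, closed in items
        if closed is not None
    ]
    return median(durations) if durations else None


# Made-up issues: three closed, one still open.
issues = [
    (datetime(2020, 9, 1, 9), datetime(2020, 9, 1, 21)),  # 0.5 days
    (datetime(2020, 9, 2, 9), datetime(2020, 9, 4, 9)),   # 2.0 days
    (datetime(2020, 9, 3, 9), None),                      # still open
    (datetime(2020, 9, 5, 9), datetime(2020, 9, 6, 9)),   # 1.0 days
]

print(median_days(issues))
```

The same function works for pull requests if the second timestamp is the merge or close time. The median is used rather than the mean so that a handful of very old items does not dominate the result.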
Let’s now move on to identifying working-hours patterns that might tell us something about a project’s organizational diversity. Here we can see two different situations:
The first scenario shows activity during weekends as well as weekdays. This is the case for Corona-warn-app, Immuni and Coronamedler. The most common reasons for this are:
- Overwork risk: developers from the official companies in charge of the app are working more than expected, committing code during weekends too, which creates a risk of developer burnout.
- High diversity: a project that only releases code during the week, within regular working hours, is more likely to be a company-based project (one whose only contributors are company developers). Projects with weekend activity might therefore have several kinds of contributors, from developers at private companies to individual open source enthusiasts, which implies richer organizational diversity among contributors.
The second scenario (Radar Covid and Stay Away Covid), on the other hand, might tell us just the opposite: company-based projects with a clear working-hours schedule.
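A simple number that separates the two scenarios is the share of commits made on weekends or outside office hours. The sketch below assumes commit timestamps are already expressed in each author's local time and that office hours run from 09:00 to 18:00; both are assumptions, as are the `off_hours_share` helper and the sample data.

```python
from datetime import datetime


def off_hours_share(timestamps, start_hour=9, end_hour=18):
    """Fraction of commits made on weekends or outside [start_hour, end_hour)."""
    if not timestamps:
        return 0.0
    off = sum(
        1
        for t in timestamps
        # weekday() returns 5 for Saturday and 6 for Sunday
        if t.weekday() >= 5 or not (start_hour <= t.hour < end_hour)
    )
    return off / len(timestamps)


# Made-up commit times.
commit_times = [
    datetime(2020, 9, 7, 10, 0),  # Monday morning, office hours
    datetime(2020, 9, 8, 23, 0),  # Tuesday late evening
    datetime(2020, 9, 5, 11, 0),  # Saturday
    datetime(2020, 9, 9, 15, 0),  # Wednesday afternoon, office hours
]

print(off_hours_share(commit_times))
```

A value close to zero points to the second scenario, while a clearly higher value points to the first. The number alone cannot tell overwork apart from volunteer contributions.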
We can also see organizational diversity in git by looking at commit email domains and comparing them with the official organization or organizations in charge of developing the app, to identify potential third-party developers involved.
Two good examples are Radar Covid and Corona-warn-app:
The Spanish tech company Indra is the only official company involved in Radar Covid. Looking at the data, we can see that Indra and its child companies (such as Minsait) write most of the code.
Corona-warn-app, on the other hand, shows a different scenario. Even though more than one official company is working on the app (SAP, Healthy Together, Helmholtz, and more), we see a large number of “other” contributors, which might indicate a more diverse set of organizations involved.
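The email domain comparison can be approximated by grouping commit author emails into official domains and an “other” bucket. The domains below are placeholders and `commits_by_domain` is a hypothetical helper; note also that many contributors commit with personal addresses, so this is only a rough proxy for affiliation.

```python
from collections import Counter


def commits_by_domain(author_emails, official_domains):
    """Count commits per official email domain; everything else goes to 'other'."""
    counts = Counter()
    for email in author_emails:
        # Take the text after the last "@" and normalize case.
        domain = email.rsplit("@", 1)[-1].lower()
        counts[domain if domain in official_domains else "other"] += 1
    return counts


# Placeholder data: one email per commit.
emails = [
    "a@vendor.example",
    "b@vendor.example",
    "c@mail.example",
    "D@Vendor.example",
]

counts = commits_by_domain(emails, official_domains={"vendor.example"})
print(dict(counts))
```

A large "other" share relative to the official domains is the kind of signal described above for Corona-warn-app.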
Closing thoughts: What transparency brings to the development and adoption of tracing apps
Transparency is important for building a sense of fair play (among contributors): we can start building a better environment to onboard and engage skilled contributors who bring new ideas to the project to improve code efficiency, test accuracy and code scalability and, overall, increase innovation. This can also encourage national tech companies to get involved in developing the app.
Transparency is important for building a sense of trust (among third parties): some citizens are still reluctant to download and use these apps. Transparency can help increase adoption of the app among third parties.
Cauldron.io: How to run your own analysis
If you find this study interesting, feel free to run the same analysis in Cauldron for the projects that matter to you and share your results in the comments on this post. Not sure how to start? During my last talk at OSSummit, I briefly explained to attendees how they could get and visualize the same data I used for this analysis with the current Cauldron UI.
If you are still interested in knowing more about Software Development Analytics, you can get your monthly dose of knowledge by signing up to the Bitergia newsletter.