[Update: we have published a more accurate and validated report; please have a look at it.]
WebKit is a well-known free, open source software project that produces the core of several of the most popular web browsers. Several companies (and other actors) collaborate to build this component, which is key to many of them. The two main players in WebKit are Apple and Google, but it is less well known that many others participate actively as well. Each of them is far behind the big players, but together they account for a sizable fraction of the total activity.
This post is the first of a series on different aspects of WebKit development, based on the analytics we at Bitergia are gathering about it. Our take is that WebKit is one of those projects massively used by the industry, and therefore worth studying with the aim of providing quantitative and objective data about it.
Specifically, this post focuses on how the activity of companies in the WebKit source code management repository (currently Subversion, formerly CVS) has evolved, both since it was released as an open project back in 2005 and before that, when it was still an internal project at Apple (if you don’t know about it, have a look at the fascinating history of the project since its ancient origins in KDE). Analyzing this activity provides useful information to understand, for example, how strongly companies are betting on the project (in terms of contributions to it) and, what is probably more relevant, which companies have some kind of “soft” control.
Since WebKit is an open community, clear policies and procedures have been established to avoid control by companies in the traditional, direct sense. But meritocracy, together with varying amounts of contribution and involvement in the community, lets some companies be more central to the project. For this reason, we were interested in developers with rights to directly modify the source code (committers) and in their activity (commits).
With respect to commit activity, Figure 1 shows a general overview of the project over its whole lifetime, with commits assigned to companies according to the affiliation of developers. Apple (close to 40% of commits), followed very closely by Google (about 38%), clearly leads the activity. There are also some other actors with a relevant level of activity, such as Nokia with 4.91% of the total commits, Igalia with 3.4%, Research in Motion (RIM) with 3.12%, and the University of Szeged with 2.23%.
But a very different picture emerges if we focus on the latest activity, and consider only commits during 2012.
Figure 2 shows how during 2012 (up to October 24th) Google is by far the most active company, with almost 50% of all commits. Apple is now second, with about 19%, while previously mentioned players such as Nokia, RIM, Igalia or the University of Szeged are among the most active in 2012 as well.
From the differences between Figures 1 and 2, it seems clear that Google has been the major contributor for quite some time, certainly for longer than just 2012. Indeed, the policy for granting committer status asks developers to “have submitted around 10-20 good patches, shown good judgment and understanding of project policies, and demonstrated good collaboration skills”, which means that a company cannot suddenly increase its number of committers by a large amount: each of them must first go through a period of training and testing. When did Google overtake Apple in terms of activity? Can we visualize this process? These two questions are answered by Figures 3 and 4 (below).
In Figure 3, we have isolated the activity (again, number of commits) per year of Google and Apple. It shows how Google reached its current level of activity after about three years of growth, starting in 2009, and how this happened even though Apple has maintained a stable level of activity since 2007-2008. It can be said, therefore, that Google’s activity came on top of Apple’s, not as a substitute for it. In other words, Apple has been contributing steadily for more than five years, but for about three years Google has been adding its own contribution on top of that; right now it is close to double Apple’s, which significantly helps to boost the project.
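For readers who want to reproduce this kind of per-year comparison on their own data, here is a minimal sketch in Python. It is not the tooling behind the figures: it assumes you already have one (year, company) pair per commit, and the function names `commits_per_year` and `first_year_ahead` are made up for this illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def commits_per_year(
    records: Iterable[Tuple[int, str]]
) -> Dict[str, Dict[int, int]]:
    """Map company -> year -> number of commits, from (year, company) pairs."""
    table: Dict[str, Dict[int, int]] = defaultdict(lambda: defaultdict(int))
    for year, company in records:
        table[company][year] += 1
    return table


def first_year_ahead(
    table: Dict[str, Dict[int, int]], challenger: str, incumbent: str
) -> Optional[int]:
    """Earliest year in which `challenger` has more commits than `incumbent`."""
    challenger_years = table.get(challenger, {})
    incumbent_years = table.get(incumbent, {})
    # Look at every year in which either company was active, oldest first.
    for year in sorted(set(challenger_years) | set(incumbent_years)):
        if challenger_years.get(year, 0) > incumbent_years.get(year, 0):
            return year
    return None  # the challenger never had the higher yearly count
```

Calling `first_year_ahead(table, "Google", "Apple")` on a table built from real commit records would return the first year in which Google's yearly count exceeded Apple's, or `None` if that never happened.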
Figure 4 shows the activity of the next ten companies, which follow several different patterns. In addition to Apple and Google, which contribute almost 70% of the activity, there is a second group of companies and institutions which have been clearly increasing their participation in recent years: Nokia, Igalia, RIM and the University of Szeged. They account for about 17% of the activity, have been increasing their net activity in recent years, and are currently at between 1,000 and 2,000 commits per year. Finally, there is a third group with yet more actors involved, with lower (but no less important!) activity. In that group we can find names such as Collabora, Adobe, Nuanti, Openbossa, Samsung or Intel, all well below 500 commits per year.
All in all, this analysis shows how Google is pushing WebKit together with Apple, and also gives a glimpse of the structure of the community of companies participating in it. Behind these numbers there is certainly a story of strategies, competition and collaboration between competitors waiting to be written. But that’s another story: we’re only providing the numbers ;-)
From a methodological point of view, this analysis is based on commit activity in the Subversion repository of the WebKit project. This means that authors (that is, the developers actually submitting the changes, when they are different from the committers) are (so far) not considered: we are only taking into account contributions by people who have the right to commit to the project repository.
In addition, we scrutinized commit-queue activity (commit-queue is the bot which actually commits changes to the source code in many cases, such as those that follow code review procedures). Out of a total of about 10,000 commit-queue commits, we identified the code reviewers responsible for about 8,500, and considered them as the committers of those commits.
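As an illustration of this reattribution step, the sketch below credits a bot commit to its reviewer. It rests on assumptions that are ours for this example: that the bot commits under the identity stored in `BOT_ID`, and that the reviewer can be read from a "Reviewed by ..." line in the commit message. How reviewers were actually identified for the analysis may differ.

```python
import re
from typing import Optional

# Assumption: identity used by the commit-queue bot in the repository log.
BOT_ID = "commit-queue@webkit.org"
# Assumption: the reviewer is named on a line of the form "Reviewed by NAME."
REVIEWED_BY = re.compile(r"^\s*Reviewed by (.+?)\.?\s*$", re.MULTILINE)


def find_reviewer(message: str) -> Optional[str]:
    """Return the first reviewer named in a commit message, or None."""
    match = REVIEWED_BY.search(message)
    if match is None:
        return None
    # Several reviewers may be listed ("A, B and C"); keep only the first.
    return re.split(r",| and ", match.group(1))[0].strip()


def effective_committer(committer: str, message: str) -> str:
    """Credit bot commits to the reviewer when one can be identified."""
    if committer != BOT_ID:
        return committer  # a human committed directly, nothing to change
    reviewer = find_reviewer(message)
    # Bot commits with no identifiable reviewer stay assigned to the bot.
    return reviewer if reviewer is not None else committer
```

Commits for which no reviewer can be found stay with the bot, which corresponds to the roughly 1,500 commit-queue commits that could not be reassigned.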
We then sorted committers by number of commits, and tried to find the institutions (companies or other) to which they are affiliated. We succeeded with a high degree of certainty for committers accounting for more than 95.6% of the commits. In fact, all committers with more than 600 commits were linked to an institution, and only 4 committers with more than 200 commits could not be linked to one. In total, we have identified affiliations for 387 committers (out of a total of 439 identities found in the Subversion repository), corresponding to 29 institutions. We have tracked only the current company: if a committer worked for a different company in the past, all their commits from that time are wrongly assigned to the current one. Of course, there could be some errors in the assignment of affiliations, but the data is correct to the best of our knowledge.
The exact data we used for this analysis and all the methodological details will be published when we release our upcoming report on WebKit. The data shown in this post may contain (small) errors, which should not affect its general statements.
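The bookkeeping behind these figures (commits per institution, each institution's percentage, and the share of commits with a known affiliation) can be sketched as follows. This is a simplified illustration, not our actual tooling: it assumes one committer identity per commit and a hand-built identity-to-institution dictionary, and the names `commits_by_institution`, `shares` and `coverage` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable

# Label used for committers whose affiliation could not be found.
UNKNOWN = "(unknown)"


def commits_by_institution(
    committers: Iterable[str], affiliation: Dict[str, str]
) -> Counter:
    """Count commits per institution; `committers` holds one entry per commit."""
    counts: Counter = Counter()
    for committer in committers:
        counts[affiliation.get(committer, UNKNOWN)] += 1
    return counts


def shares(counts: Counter) -> Dict[str, float]:
    """Convert absolute commit counts into percentages of all commits."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}


def coverage(counts: Counter) -> float:
    """Percentage of commits whose committer has a known affiliation."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 100.0 * (total - counts[UNKNOWN]) / total
```

Because the dictionary holds a single institution per committer, this sketch shares the limitation described above: commits made while a developer worked elsewhere are counted for the current employer.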