[Update (2013.03.01): New post in the series: Reviewers and companies in the WebKit project]
Today Bitergia presents the first of a series on analytics for the WebKit project. After the preview we published some weeks ago, we finally have more detailed and accurate numbers about the evolution of the project. In this case, we’re presenting a report on the activity of the companies contributing to WebKit based on the analysis of reviewed commits.
Some interesting results are the share of contributions by the two main companies behind the project (Apple and Google), and how it has evolved from a project clearly driven by Apple, before 2009, to the current situation, with Google leading the top contributors table, and both Apple and Google being almost equal in contribution share over the whole history of the project. In recent years, it is also noteworthy how the diversity of the project has been increasing, with new players starting to show significant activity.
With respect to our former preview, the data has not changed radically, despite this report being methodologically much more solid. Reviewed commits are those that went through a review process, and were accepted by a developer other than their original author. Therefore, they are a good indicator for the “non-maintenance” activity of the project. In addition, we have used the data that the project itself maintains about who is the author of each changeset (commit), so the data is more reliable from that point of view. For those interested, we have written some notes about the methodology.
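For readers who want to experiment with the published data, the definition of a reviewed commit can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the code behind the report: the record layout and the field names (`author`, `reviewer`, `company`) are hypothetical and do not reflect the actual database schema.

```python
from collections import Counter

# Hypothetical commit records; field names are assumptions, not the real schema.
commits = [
    {"author": "alice", "reviewer": "bob", "company": "Apple"},
    {"author": "carol", "reviewer": "carol", "company": "Google"},  # accepted by its own author
    {"author": "dave", "reviewer": None, "company": "Nokia"},       # never reviewed
    {"author": "erin", "reviewer": "alice", "company": "Google"},
]

def is_reviewed(commit):
    """A commit counts as reviewed if someone other than its author accepted it."""
    reviewer = commit["reviewer"]
    return reviewer is not None and reviewer != commit["author"]

def company_shares(commits):
    """Return each company's fraction of all reviewed commits."""
    counts = Counter(c["company"] for c in commits if is_reviewed(c))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {company: n / total for company, n in counts.items()}

shares = company_shares(commits)
```

With the toy data above, only two of the four commits qualify as reviewed, so the shares are computed over those two.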
Coming back to the results, the number of reviewed commits and the number of authors through the life of the project is interesting as well. Apple and Google seem to have contributed very similar amounts of changesets by now, but it is also interesting that the share for other parties is about 25%, and growing. This can be observed more clearly in the pie chart for authors per company, where those other parties account for almost half the authors. The number of Google and Apple authors is also interesting, and it probably says a lot (in combination with the other charts) about how they are organizing the contributions to WebKit and their own browsers based on it.
The evolution of the number of active companies per month is also interesting, showing how the diversity of the project is increasing. From about five companies contributing actively around 2007 to the more than 20 contributing these days, the situation in WebKit is very different.
The comparison of the evolution of the contributions (as number of reviewed commits) by different companies is worth mentioning. In particular, the comparison of Apple with Google shows two of the most important stories for the future of the project. While the contributions of Apple have been relatively stable since as early as 2008 (with a peak in late 2010 and early 2011), those by Google have grown very quickly from late 2008, when they were hardly relevant, to now, when they amount to about 800-1,000 commits per month (of a total of about 1,500-2,000).
The case of Nokia (the third contributor by number of commits) tells a very different story. After the quick rise from 2008 to 2010, and later to the peak in late 2011, their contribution is quickly becoming of little relevance. It is interesting to notice, in fact, how the project seems to tolerate without much trouble that no less than its third contributor is dramatically reducing its involvement.
Just to compare, the evolution of RIM is almost the opposite. After a first period with a higher level of contributions, around 2006-2010, its contributions dropped abruptly in mid 2011. But since then, its participation has grown to a very substantial (in comparison) level.
After all these graphs, probably one could infer (or at least glimpse) corporate decisions, and company policies with respect to the browser of choice for their products, or for other technological bets. The more one digs into the numbers, the more fascinating facts in these areas can be found.
Some notes about the report itself. We are providing the whole database (stripped of email addresses) on which the study is based, and from which the JSON files supporting the visualizations are produced. This means that it is possible to play with the data, create some queries, and in general improve the analysis. As said in older posts, our main concern is to be as open and trustworthy as possible, and this is the reason why our tools, scripts and datasets are open source. Of course, if you dig up some interesting information, we would be more than happy to know about it.
With respect to the details, the report only contains information from reviewed commits. This means that only activity in the source code that was peer reviewed is taken into account in this dataset. In addition, extra work has been put into integrating all of the identities found in the ChangeLog files in WebKit. There are close to 1,600 different identities (including bots). Those identities could also belong to several companies in different timeframes. This has also been addressed by adding such information to the database, although it is tracked only for specific developers (not all of them).
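To make the idea of time-dependent affiliations more concrete, here is a minimal sketch of how a commit could be attributed to the company its author belonged to on the commit date. It is an assumption-laden illustration rather than the report's actual implementation: the `AFFILIATIONS` table, the identity string, the company names and the `company_at` helper are all hypothetical.

```python
from datetime import date

# Hypothetical affiliation periods: (identity, company, start, end).
# An end of None means the affiliation is still current.
AFFILIATIONS = [
    ("dev@example.org", "CompanyA", date(2006, 1, 1), date(2009, 6, 30)),
    ("dev@example.org", "CompanyB", date(2009, 7, 1), None),
]

def company_at(identity, when, affiliations=AFFILIATIONS, default="Unknown"):
    """Return the company an identity belonged to on a given date."""
    for ident, company, start, end in affiliations:
        if ident != identity:
            continue
        if start <= when and (end is None or when <= end):
            return company
    # Identities without a matching period fall back to a default label.
    return default
```

For example, `company_at("dev@example.org", date(2008, 3, 1))` would attribute a commit from March 2008 to the first company, while a commit from 2012 by the same identity would go to the second one.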
All this said, we have to remind the reader that measuring commits is just one of the possible approaches to learn about the activity of companies in a project. We are aware that not all commits are equal or contribute equally to the project, and that there are other areas of activity. Therefore, the presented results should be considered as “a certain view” of the activity of companies, and not as “the view”. To complement this “certain view”, some more reports, from other angles, will follow.
To finish, we have to thank all of you who sent feedback after our previous post comparing the main companies participating in WebKit. It has allowed us to improve the methodology, emphasizing the distinction between reviewed and non-reviewed commits. Thanks also to the WebKit developers who have answered our questions about their project, and have helped us to validate the results.