Source code versioning systems are tools that make developers' lives easier. They record every change made to the source code, so that developers can browse the project's history and recover old versions. Each of those changes is called a commit, and it can be considered the atomic unit of information in these systems.
Commits are nowadays considered a “good” metric for getting an initial idea of the total effort invested in a project. However, this is not as simple as it seems: each versioning system, and even each project with its own particularities, may distort this metric. So we all need to be a bit careful before raising it to “the most wonderful, marvelous and incredible metric in the world”.
So, in the first place, what kind of information can we find in a commit? Typically, a commit records the time when the change took place, the files affected by it, the lines added, removed or modified, the author of the commit, and perhaps extra information such as the reviewer, specific acknowledgements and others. The following example shows the information that can be found in a specific commit (using the git log command):
Author: Daniel Izquierdo <email@example.com>
Date: Fri Mar 7 13:32:25 2014 +0100
Add turnover in ITS and SCR
diff --git a/vizGrimoireJS/alerts.py b/vizGrimoireJS/alerts.py
index ff5a703..12b1de6 100755
@@ -82,15 +82,29 @@ if __name__ == '__main__':
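To make those fields concrete, here is a minimal sketch of how the author, date, message and touched files could be pulled out of one such entry. It assumes the text follows the usual layout of git log output with patches; the names CommitInfo and parse_commit are hypothetical and only serve this illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class CommitInfo:
    # Hypothetical container for the fields discussed above.
    author: str = ""
    email: str = ""
    date: str = ""
    message: str = ""
    files: List[str] = field(default_factory=list)


def parse_commit(text: str) -> CommitInfo:
    """Extract author, date, message and touched files from one log entry."""
    info = CommitInfo()
    message_lines = []
    in_diff = False  # becomes True once the first "diff --git" header is seen
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("diff --git "):
            in_diff = True
            # Header looks like "diff --git a/<path> b/<path>": keep the new path.
            m = re.match(r"diff --git a/(.+) b/(.+)$", stripped)
            if m:
                info.files.append(m.group(2))
        elif in_diff:
            continue  # index lines, hunk headers and patch content are ignored
        elif line.startswith("commit "):
            continue  # the hash line, when present, is not needed here
        elif stripped.startswith("Author:"):
            m = re.match(r"Author:\s*(.*?)\s*<(.*?)>", stripped)
            if m:
                info.author, info.email = m.group(1), m.group(2)
        elif stripped.startswith("Date:"):
            info.date = stripped[len("Date:"):].strip()
        elif stripped:
            message_lines.append(stripped)
    info.message = "\n".join(message_lines)
    return info


sample = """Author: Daniel Izquierdo <email@example.com>
Date: Fri Mar 7 13:32:25 2014 +0100
Add turnover in ITS and SCR
diff --git a/vizGrimoireJS/alerts.py b/vizGrimoireJS/alerts.py
index ff5a703..12b1de6 100755
@@ -82,15 +82,29 @@ if __name__ == '__main__':
"""

info = parse_commit(sample)
print(info.author, "|", info.date, "|", info.files)
```

A real analysis would more likely ask git for a machine-friendly format instead of parsing the human-readable one, but the fields involved are the same.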
However, the definition of a commit is specific to each versioning system. For example, a commit in CVS is a modification to a single file, so N modified files imply N commits. Subversion and Git, on the other hand, may have several “touched” files in the same commit. Are projects that use different versioning systems comparable at the level of commits? Probably not if we simply count commits. We need a slightly more advanced way to count them.
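One possible way to bring CVS closer to the Subversion or Git notion of a commit is to group file-level revisions that share author and log message and that happened close together in time. The sketch below illustrates that heuristic. It is an assumption for illustration, not necessarily the counting method meant here; the names FileRevision and group_into_changesets, and the 300-second window, are hypothetical choices.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class FileRevision:
    # One CVS-style commit: a change to a single file.
    author: str
    message: str
    timestamp: int  # seconds since the epoch
    path: str


def group_into_changesets(
    revisions: List[FileRevision], window: int = 300
) -> List[List[FileRevision]]:
    """Group single-file revisions into multi-file change sets.

    Two revisions land in the same change set when they share author and
    message and the gap to the previous revision of that set is at most
    `window` seconds (an arbitrary, assumed threshold).
    """
    changesets: List[List[FileRevision]] = []
    latest: Dict[Tuple[str, str], List[FileRevision]] = {}
    for rev in sorted(revisions, key=lambda r: r.timestamp):
        key = (rev.author, rev.message)
        current = latest.get(key)
        if current is not None and rev.timestamp - current[-1].timestamp <= window:
            current.append(rev)
        else:
            current = [rev]
            changesets.append(current)
            latest[key] = current
    return changesets


revisions = [
    FileRevision("alice", "fix bug", 0, "a.c"),
    FileRevision("alice", "fix bug", 10, "b.c"),
    FileRevision("bob", "docs", 15, "README"),
    FileRevision("alice", "fix bug", 20, "c.c"),
    FileRevision("alice", "fix bug", 1000, "a.c"),
]

changesets = group_into_changesets(revisions)
print(len(revisions), "file-level commits,", len(changesets), "change sets")
```

With this kind of grouping, five CVS commits in the toy data collapse into three change sets, which is closer to what a Git or Subversion log would report for the same work. The result depends on the chosen window, so the number is still an approximation.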