The Pony Factor: Metric of the Month, November 2022

Welcome to the second chapter of the Metric of the Month! We’re very excited to continue this series, which presents a complete guide to a different metric each month so you can understand more about them.

Last month we talked about the Elephant Factor; this month it is the turn of the metric that inspired it: the Pony Factor. Daniel Gruno published this indicator in 2015, following discussions in the Apache Software Foundation and inspired by the Tragedy of the Commons.

Gruno’s idea was to calculate the smallest number of code contributors who had committed 50 percent of the codebase in the past two years. After applying this metric to all ASF projects, Gruno found the results so interesting that he also applied it to popular FLOSS projects. Many trendy projects had a very low Pony Factor, meaning that a very small set of contributors was responsible for half of the codebase.

The Pony Factor: how sustainable is the project?

A higher Pony Factor means that a project is more likely to survive if one or more of its core contributors leave. In 2015, Gruno found that many critical projects were maintained mainly by a small set of developers: their Pony Factor was low.

To see how this indicator can offer insights, let’s imagine a scenario where you decide to import two JavaScript libraries for use within your organization. The first one has a Pony Factor of 2, while the other has a Pony Factor of 7. With only this information, our recommendation would be to mark the first one as “risky” because half of its code contributions come from only two contributors.

In the real world, this metric needs to be put into context. If we go back to our scenario, other indicators can help us make better decisions. Using the Elephant Factor we mentioned in a previous blog post, we would have an idea of the organizations behind those contributors. A Pony Factor of 7 looks much better, but what if all those developers work for the same company? In that case, the decision is not that easy.

Do not get discouraged by having more questions than answers. It is common when we start digging into the data sets. What is important is that you start having more tools to allow you to make informed decisions.

Figure: pie chart of commits per contributor. Two contributors account for 50% of the commits, so the Pony Factor for this project is 2.

What is the difference between Pony and Bus Factors?

The two terms are closely related and are often used interchangeably, but their origins differ. The Bus Factor traces back to a discussion on the Python mailing list in June 1994. The subject of the thread speaks for itself: “If Guido was hit by a bus?” Michael McLay sent a message expressing concern about the project’s dependency on Guido van Rossum.

“I just returned from a meeting in which the major objection to using Python was its dependence on Guido. They wanted to know if Python would survive if Guido disappeared. This is an important issue for businesses that may be considering the use of Python in a product,” said Michael McLay.

Years later, the Bus Factor became a recurring concept in software development projects. By definition, this indicator measures the risk of a project stalling by giving the number of key team members that can be lost before that happens. The idea behind it was simple and powerful, but loosely defined.

Figure: the Bus Factor for the CHAOSS Project in 2020, when considering only git commits, was 5.

The Pony Factor, on the other hand, was specific from the very beginning. It was designed to measure the number of key code developers your team can lose before development is at risk. After the results of calculating the Pony Factor were published, derivative indicators introduced some improvements. The Augmented Pony Factor, for instance, excludes inactive contributors from the calculation.

If you are in doubt about which term to use, we recommend the Bus Factor as defined by the CHAOSS project, for simplicity. It can be applied to code contributions and to other types of contribution (issues, pull requests, etc.).

Implementation

The calculation of the Pony Factor is simple. The complexity is hidden in the platform that collects and manages the information about individual contributors. Its dataset ensures that contributors are properly represented even if they use different accounts, because those accounts are grouped into what we call unified profiles: each contributor has one unified profile on our Platform. That way, we avoid overcounting the contributors to the project.
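To make the idea of unified profiles more concrete, here is a minimal Python sketch. It is not the Platform's implementation: the mapping `UNIFIED_PROFILES`, the function `commits_per_contributor`, and the e-mail addresses are all hypothetical, and a real system would need a far more careful identity-matching process.

```python
from collections import Counter

# Hypothetical mapping from raw identities (here, e-mail addresses)
# to the unified profile they belong to.
UNIFIED_PROFILES = {
    "jane@work.example": "Jane Doe",
    "jane@home.example": "Jane Doe",
    "sam@corp.example": "Sam Lee",
}


def commits_per_contributor(commit_identities, profiles):
    """Count commits per unified profile.

    Identities missing from `profiles` are counted under their own name.
    """
    return Counter(profiles.get(identity, identity) for identity in commit_identities)


# Hypothetical data: one entry per commit, holding the raw author identity.
raw = ["jane@work.example"] * 3 + ["jane@home.example"] * 2 + ["sam@corp.example"] * 4

print(len(Counter(raw)))                               # distinct raw identities
print(commits_per_contributor(raw, UNIFIED_PROFILES))  # commits per unified profile
```

Counting raw identities would report three contributors here, while counting unified profiles reports two, which is the overcounting the unified profiles are meant to avoid.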

If you are interested in calculating this metric yourself, you need each contributor’s share of the total commits and a threshold percentage, which is 50% for the Pony Factor. If you are looking at a pie chart of commits per contributor, the Pony Factor is the smallest number of slices that together cover 50% of the pie.
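The pie chart description above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the input is a list with one author name per commit, contributors are already unified, and the function name `pony_factor` and the sample data are made up for the example.

```python
from collections import Counter


def pony_factor(commit_authors, threshold=0.5):
    """Smallest number of contributors whose commits reach `threshold` of the total."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0  # no commits, so no contributors to count
    covered = 0
    # Walk the "slices" from largest to smallest until the threshold is covered.
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered >= threshold * total:
            return rank
    return len(counts)


# Hypothetical data: one entry per commit, holding the author's name.
commits = ["ana"] * 30 + ["bo"] * 25 + ["cy"] * 20 + ["di"] * 15 + ["el"] * 10
print(pony_factor(commits))  # 2: ana and bo together made 55 of the 100 commits
```

Taking the largest slices first is what guarantees that the count is the smallest possible one.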

The Pony Factor changes depending on the time period selected: the value for the whole history of a 10-year project will differ from the value for its last year. Our recommendation is to calculate it over a medium-sized period, such as “last year,” and use the time selector to see how it has evolved over the past years.
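The effect of the time period can be illustrated with a small Python sketch. It assumes commits are available as `(author, datetime)` pairs; the function name `pony_factor_in_window`, its parameters, and the sample history are hypothetical and only serve the example.

```python
from collections import Counter
from datetime import datetime, timedelta


def pony_factor_in_window(commits, end, days=365, threshold=0.5):
    """Pony Factor over the `days` leading up to `end`.

    `commits` is an iterable of (author, datetime) pairs.
    """
    start = end - timedelta(days=days)
    counts = Counter(author for author, when in commits if start < when <= end)
    total = sum(counts.values())
    covered = 0
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered >= threshold * total:
            return rank
    return 0  # no commits in the window


# Hypothetical history: one early author dominates the old commits,
# while recent work is spread evenly across three people.
history = (
    [("ana", datetime(2015, 3, 1))] * 80
    + [("bo", datetime(2022, 5, 1))] * 10
    + [("cy", datetime(2022, 6, 1))] * 10
    + [("di", datetime(2022, 7, 1))] * 10
)
end = datetime(2022, 11, 1)

print(pony_factor_in_window(history, end, days=3650))  # roughly the last ten years
print(pony_factor_in_window(history, end, days=365))   # the last year only
```

With this data the ten-year window gives a Pony Factor of 1, because one author made 80 of the 110 commits, while the one-year window gives 2.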

A short video by our Bitergian Luis Cañas-Díaz explains the Pony Factor and how to set up this metric on our Platform.

Where can I find this metric?

GrimoireLab and Bitergia Analytics provide this metric out of the box, not as a single number but as a visualization: a pie chart, as shown in the examples above.

  • View an example on the CHAOSS instance of Bitergia Analytics.
  • Download and import a ready-to-go dashboard containing examples for this metric visualization from the GrimoireLab Sigils panel collection.
  • Add a sample visualization to a dashboard following these instructions:
    • Create a new Pie chart
    • Select the git index
    • Metrics Slice Size: “Unique Count” aggregation for the field “hash”
    • Buckets: Split the slices by “Terms” aggregation with the “author_name” field
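For readers who prefer a single number to a pie chart, the same two aggregations can be written as an Elasticsearch-style query body and reduced to a Pony Factor. The sketch below is an assumption-laden illustration, not what the dashboard runs internally: the function names, the bucket limit of 1000, and the sample response are hypothetical, and only the field names “hash” and “author_name” come from the instructions above. Sending the query to a cluster is left out.

```python
def pony_factor_query(size=1000):
    """Query body: unique commit hashes ("Unique Count") per author ("Terms")."""
    return {
        "size": 0,  # only the aggregations are needed, not the documents
        "aggs": {
            "authors": {
                "terms": {"field": "author_name", "size": size},
                "aggs": {"commits": {"cardinality": {"field": "hash"}}},
            }
        },
    }


def pony_factor_from_response(response, threshold=0.5):
    """Reduce the aggregation buckets to a Pony Factor.

    Assumes every author fits in the returned buckets; otherwise the total,
    and therefore the result, would be skewed.
    """
    buckets = response["aggregations"]["authors"]["buckets"]
    counts = sorted((b["commits"]["value"] for b in buckets), reverse=True)
    total = sum(counts)
    covered = 0
    for rank, n in enumerate(counts, start=1):
        covered += n
        if covered >= threshold * total:
            return rank
    return 0  # no buckets, so no commits


# Hand-written sample response with the shape of a terms + cardinality result.
sample_response = {
    "aggregations": {
        "authors": {
            "buckets": [
                {"key": "ana", "doc_count": 40, "commits": {"value": 40}},
                {"key": "bo", "doc_count": 35, "commits": {"value": 35}},
                {"key": "cy", "doc_count": 25, "commits": {"value": 25}},
            ]
        }
    }
}

print(pony_factor_from_response(sample_response))  # 2
```

Note that unique counts in Elasticsearch are approximate for large cardinalities, so the result should be read as an estimate.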

Want to know more about Bitergia Analytics Platform?

Bitergia Analytics is a complete solution for getting full insights into software development projects. It includes an analytics platform, training, support, and consultancy to help you make the most of it.

Customers of the Bitergia Analytics services can reach their goals and make data-driven decisions about their software development teams through four services: Strategy, Customization, Analysis, and Reporting.

Next metric chapter: Attracted Developers

In the next Metric of the Month chapter, we’ll continue talking about code contributions to projects, but this time we will study how many new contributors are attracted to the project. The Attracted Developers metric is defined as “the number of new developers that join the project over time.” Don’t forget to subscribe to our Newsletter to get your next edition first!

Did you like this chapter? Give us your opinion in the comments or share it on social media. Every suggestion or comment is welcome. The Owl wants to share its knowledge with you!
