As more and more organizations become geographically dispersed, remote, and multinational, there is increasing competitive pressure to build high-performance teams for the success, growth, and survival of organizations. This applies to software development teams as well. As major organizations rely on team-based software development, being able to measure and improve software team performance is crucial for success. But what is a high-performing team, and how do we measure the performance of a team? I recently read Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations by Nicole Forsgren et al. The authors present some great ideas; I wanted to write about them and add to them from my own experience.
Measuring the performance of teams in software engineering is not straightforward. First of all, in contrast to more traditional engineering teams, the products are not easy to quantify, and they continuously evolve after launch. You could have an amazingly well-performing team working for months to add new features to an existing product without launching any new product, so the number of products launched may not be a relevant metric in software. Most volume-related metrics are not applicable in software development: a refrigerator manufacturer can measure the number of refrigerators manufactured per employee in a given year, but there is no equivalent in software. Because of these challenges, many invalid metrics are in use today.
What doesn’t work in measuring performance:
There are commonly measured metrics in organizations that simply measure output. I have seen each of the following used as a performance metric, and in my experience, every time it brought more damage than benefit.
- Lines of code: It’s not a great idea to use metrics based on lines of code as a performance measure. First of all, you don’t want more lines of code; your product will become harder to maintain. Second, it’s easy to game by simply adding more comments or using unnecessarily verbose syntax. Third, some programming languages are more verbose than others, meaning 2x lines of code in one language could be doing less work than x lines of code in another.
- The number of commits per developer, or the number of days on which developers make commits: These metrics are not as bad as lines of code, but they are easy to game and can push engineers to make unnecessary modifications just to log one more commit. They can also cause engineers to prioritize making a commit every day over focusing on the actual objective.
- Points per developer (velocity): This is widely used in the industry, but it really does not add enough value to earn such popularity. The major problem is that the points assigned to work items are subjective. If the points are left to engineers, they can (and will) grade each work item as high as possible to make the velocity look better. Another issue is that you might manage to complete a huge number of points while none of the work has any business impact.
- Utilization of the team: This is typically measured by how many points a team delivers, how much work engineers have during a specific period, or how close they are to their maximum capacity in points per sprint. It combines the problems of the metrics above, and focusing on utilization can be counterproductive. Engineers could be fully utilized without working on anything that has business value or moves the project forward. You also don’t actually want to keep your team fully utilized: in engineering there are always unexpected problems, and you need a buffer to absorb them. For instance, production issues can come up mid-sprint, and if your team is fully utilized and evaluated by points, they won’t have any room to handle them. Even if the product is stable and production issues are rare, you still want slack in your planning so engineers can step back, resolve issues that come up during the sprint, and think about the work they are doing instead of rushing to finish things. In my opinion, you never want to plan to utilize your team beyond 80% of capacity on average.
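To make the 80% rule concrete, here is a minimal sketch of sprint planning with an explicit buffer. The point values and variable names are entirely made up for illustration:

```python
# Hypothetical sprint-capacity sketch: the numbers are made up, but the
# idea is to plan to at most ~80% of nominal capacity so the team keeps
# a buffer for production issues and other surprises.
TEAM_CAPACITY_POINTS = 50   # what the team could nominally deliver in a sprint
TARGET_UTILIZATION = 0.8    # plan to ~80% on average

planned_points = int(TEAM_CAPACITY_POINTS * TARGET_UTILIZATION)
buffer_points = TEAM_CAPACITY_POINTS - planned_points  # room for the unexpected
```

With a nominal capacity of 50 points, the team would commit to 40 and keep 10 in reserve.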
What works in measuring performance:
First of all, team performance must be based on the outcome, in other words, on business objectives, because in essence a software team, or any other team, is a group of individuals who come together to achieve a business result. Naturally, a team’s performance will be significantly impacted by the performance of the other teams it interacts with. It will also be significantly impacted by the software architecture the team works within (which is related to its interfaces with other teams). Obviously, you can have the best technology and perfectly performing teams, but if you are not developing features your customers need, you can’t have a successful outcome. Instead of prescribing measurable criteria, I would like to propose three simple questions that relate to team performance. Depending on the situation and the project, quantitative metrics can be derived to measure each question. I would like to think that the team members and the business stakeholders can derive the most important metrics around these questions, depending on the situation they are in.
Question 1: Is the customer happy?
This question relates to our past and current interactions with and deliverables for the customer, and to being able to support our existing product in case of failures. Our customer could be an internal group in the company, an external business partner, or simply the public. Teams should find the most important metrics relevant to their situation, but some examples are:
- How quickly we can fix defects
- How quickly we can respond to and how quickly we can recover from production issues. The time to respond and the time to fix and deploy can differ, and I think it’s beneficial to measure both.
- What percentage of our product releases have defects (or, if the code is not in production yet, releases to the highest-level environment available)
Some lower-level measurement examples that can surface code or architecture issues:
- How many deployments do we need to fix defects? If you have to deploy many services to fix one defect, that is something to focus on improving.
- How many people need to be involved to fix defects? If you need engineers from multiple teams, and perhaps cross-functional members (business analysts, etc.), to fix defects, that might indicate a broader issue.
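The response-versus-recovery distinction above can be computed directly from incident data. Here is a small sketch; the incident records and field names are hypothetical, and in practice the timestamps would come from your incident tracker or on-call tooling:

```python
from datetime import datetime, timedelta

# Hypothetical incident records with three timestamps each
incidents = [
    {"opened": datetime(2021, 3, 1, 9, 0),
     "acknowledged": datetime(2021, 3, 1, 9, 10),
     "resolved": datetime(2021, 3, 1, 11, 0)},
    {"opened": datetime(2021, 3, 5, 14, 0),
     "acknowledged": datetime(2021, 3, 5, 14, 30),
     "resolved": datetime(2021, 3, 5, 16, 0)},
]

def mean_delta(pairs):
    """Average the elapsed time between two timestamps across incidents."""
    pairs = list(pairs)
    total = sum((later - earlier for earlier, later in pairs), timedelta())
    return total / len(pairs)

# Mean time to acknowledge (respond) vs. mean time to recover (fix and deploy)
mtta = mean_delta((i["opened"], i["acknowledged"]) for i in incidents)
mttr = mean_delta((i["opened"], i["resolved"]) for i in incidents)
```

Tracking both numbers separately shows whether the bottleneck is noticing the problem or shipping the fix.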
Question 2: Can we keep the customer happy?
In this context, the question relates to being able to deliver new features to the customer. Many factors affect the ability to deliver new features, and they can be complicated: you could have dependencies on other teams, or your code could be messy, which translates into spending more time on coding whenever changes are needed. Even though this is not an isolated measure, continuously measuring it brings significant benefit; you have to start somewhere. Teams should find the most important metrics for themselves, but some example metrics for keeping the customer happy are:
- How quickly we can implement new features for our customers (aka lead time)
- How many teams need to communicate to implement a feature (If the average is much higher than 3, this can indicate an architectural or organizational issue)
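Lead time, the first metric above, is straightforward to compute once each work item carries a request timestamp and a deployment timestamp. A minimal sketch, with illustrative data standing in for what your issue tracker and deployment pipeline would provide:

```python
from datetime import datetime
from statistics import median

# Hypothetical work items: when the feature was requested vs. deployed
work_items = [
    {"requested": datetime(2021, 4, 1), "deployed": datetime(2021, 4, 8)},
    {"requested": datetime(2021, 4, 2), "deployed": datetime(2021, 4, 16)},
    {"requested": datetime(2021, 4, 10), "deployed": datetime(2021, 4, 13)},
]

lead_times_days = [(w["deployed"] - w["requested"]).days for w in work_items]
median_lead_time = median(lead_times_days)  # days from request to production
```

The median is usually more informative than the mean here, since a single stuck item can skew the average badly.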
Question 3: Is the team happy?
Traditionally, culture-related metrics are excluded from team performance, but since the culture of the team plays a critical role in performance, I think teams should develop metrics to measure how they are doing themselves. Team culture is about what people do and how they do it. A happy team has a strong culture. Although happiness is a challenge to measure, here are some ideas that I think work to a certain extent:
- Is the team innovating? The best teams innovate; they come up with ideas to improve the code base, the work process, and the overall organization. Measuring innovation can be tough. One idea is to create a database where the team enters new ideas, and then measure the percentage of the team’s work that comes from the new-ideas database.
- Is the team empowered? If the team can make decisions themselves and implement those decisions without asking management, they will have a direct stake in the success of those ideas. In my opinion, empowered and independent teams come up with ideas and are more willing to go the extra mile. You don’t want a team that only does what it is told; then you are not taking their brainpower into account when making decisions for them.
- Are we collecting data from the team? Running anonymous surveys to learn what team members think about the current workload or team dynamics is a good start. Does the team know the common objective? Do the team members know what they need to do to reach it? Does the current plan make sense? Do they feel empowered to make decisions? Are they burned out?
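Those survey questions can be turned into a simple trend you can track over time. Here is a sketch; the questions, the 1–5 scale, and the responses are invented for illustration, and only aggregates are reported so individual answers stay anonymous:

```python
from statistics import mean

# Hypothetical anonymous responses on a 1-5 agreement scale
responses = [
    {"clear_objective": 4, "feel_empowered": 5, "burned_out": 2},
    {"clear_objective": 3, "feel_empowered": 4, "burned_out": 3},
    {"clear_objective": 5, "feel_empowered": 4, "burned_out": 1},
]

# Average score per question, rounded for reporting
summary = {
    question: round(mean(r[question] for r in responses), 2)
    for question in responses[0]
}
```

Comparing the summary sprint over sprint matters more than any single absolute number.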
In the end, even if you follow the best practices, there are things that team managers or teams cannot change in a given organization. And because team performance is highly dependent on the broader organization, I recommend recalling Reinhold Niebuhr’s serenity prayer when teams hit organizational barriers.
I’d like to hear your opinion and experience about how to measure software team performance. I’m always looking to meet curious software engineers. Please drop me a line at firstname.lastname@example.org
Thanks to Andy Souvalian, Akshay Bagai, and Hakan Sonmez for their valuable review and comments.