Understanding Engineering Team Performance Metrics and How to Get Started

Jan 27, 2022

In this article, I will talk about understanding engineering team delivery performance. Measuring team delivery performance is a topic that people may have mixed feelings about. Therefore, it is important to understand why we do it, what it really means and how it can help engineering teams.

Revised in Sept 2022

Understanding Performance

One of the most referenced resources, when talking about performance in technology organisations, is the book Accelerate. In this book, the authors talk about the metrics that are predictive of performance and outcome-based software engineering. I would like to explore each of the metrics below and try to explain how they can help teams to understand their performance.

Delivery lead time
Deployment frequency
Mean Time to Restore — MTTR
Change Failure Rate
Unplanned Work

Why should teams care about it?

Teams often build systems to provide rich data insights to customers. What is missing is data to improve their own software delivery lifecycle.

Your team is likely responsible for building, deploying, operating and observing software in production. Bugs and other anomalies will happen, so the team should react to them in a timely fashion. The system needs to be reliable, and so does your team. It is important to understand how unplanned work affects your feature quality and lead time. Is your team reacting quickly enough to fix bugs in production? What is the ratio between deliverables with business value and bugs? If teams can understand how they are doing on these and the other metrics that I am going to highlight here, they are likely to end up improving not only their software but their processes as well.

Delivery Lead Time

This is the time it takes for a feature to be fully available to customers. The clock starts when the engineering team is able to start working on it, so time spent on product discovery, design and validation is not taken into account (Donald G. Reinertsen called it the “fuzzy front end” in his book The Principles of Product Development Flow, 2009).

A common pitfall when capturing delivery lead times is not knowing whether they relate to outcomes or to outputs, so it is important to understand the difference between the two.

Let’s try to understand the distinction between the lead times by reading the picture above from left to right. First, we have the desired impact (e.g. “Increase profitability”). To actually create that impact, a team has to deliver outcomes or features (e.g. “Provide return insights to partners”) so that customers can behave in a way that creates the expected impact. An outcome is normally a collection of outputs (e.g. “Consume data from Stream”). The outputs are part of the daily team activities: they are the issues/tasks that a team works on to create an outcome.

Measuring the lead time of the outputs usually yields values of hours or a few days, while the lead time of the outcomes usually ranges from weeks to months.

It is common to see teams measuring the lead time of the outputs, as it can be easily obtained from existing tools. Obtaining the lead time of the outcomes requires some extra work, but it could help to establish better communication between product and engineering.
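To make the difference concrete, here is a minimal sketch of how both lead times could be computed from the same issue data. The record layout (the "outcome", "started" and "released" fields) and the sample values are assumptions for illustration only, not the export format of any particular issue tracker.

```python
from datetime import datetime

# Hypothetical issue records: each output (issue/task) belongs to one outcome.
issues = [
    {"id": "OUT-1", "outcome": "return-insights",
     "started": "2022-01-03T09:00", "released": "2022-01-04T15:00"},
    {"id": "OUT-2", "outcome": "return-insights",
     "started": "2022-01-05T09:00", "released": "2022-01-05T17:00"},
    {"id": "OUT-3", "outcome": "return-insights",
     "started": "2022-01-10T09:00", "released": "2022-01-24T09:00"},
]


def hours_between(start, end):
    """Elapsed hours between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 3600


def output_lead_times(issues):
    """Lead time in hours for each individual output."""
    return {i["id"]: hours_between(i["started"], i["released"]) for i in issues}


def outcome_lead_times(issues):
    """Lead time in days per outcome: first output started to last output released."""
    spans = {}
    for i in issues:
        first, last = spans.get(i["outcome"], (i["started"], i["released"]))
        # ISO timestamps in the same format sort correctly as plain strings.
        spans[i["outcome"]] = (min(first, i["started"]), max(last, i["released"]))
    return {name: hours_between(first, last) / 24
            for name, (first, last) in spans.items()}


print(output_lead_times(issues))   # hours per output
print(outcome_lead_times(issues))  # days per outcome
```

With this sample data the outputs take between 8 and 336 hours each, while the single outcome takes 21 days, which is the kind of gap described above.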

Deployment Frequency

Measures how frequently the team ships changes to production. A high frequency suggests that the team works in small batches and that pull requests are likely short and easy to review. Small pull requests may translate into more reliable software, since reviewers focus on one small change at a time, which makes the review process more effective. It may also reduce the time to restore (see MTTR below) by making it easier to isolate the specific change that caused an anomaly or incident. Delivering multiple times in small batches may increase confidence while building complex features.
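As a rough illustration, deployment frequency can be derived from nothing more than a list of production deployment dates. The dates below are made up, and grouping by ISO week is just one possible choice; where the dates come from (CI/CD logs, release tags) depends on your tooling.

```python
from collections import Counter
from datetime import date

# Hypothetical production deployment dates.
deploy_dates = [
    date(2022, 1, 3), date(2022, 1, 3), date(2022, 1, 5),
    date(2022, 1, 12), date(2022, 1, 13),
]


def deployments_per_week(deploy_dates):
    """Count deployments per (ISO year, ISO week)."""
    counts = Counter()
    for d in deploy_dates:
        iso = d.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return dict(counts)


def average_per_week(deploy_dates):
    """Average over the weeks that had at least one deployment."""
    weekly = deployments_per_week(deploy_dates)
    return sum(weekly.values()) / len(weekly) if weekly else 0.0


print(deployments_per_week(deploy_dates))  # {(2022, 1): 3, (2022, 2): 2}
print(average_per_week(deploy_dates))      # 2.5
```

Note that this simple average ignores weeks with no deployments at all, so a team that ships in bursts would look better than it should. Filling in the empty weeks is a reasonable refinement.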

Mean Time to Restore — MTTR

Time to recovery is the time it takes for the team to recover from an undesired system state (e.g. recovery from an incident).

It is necessary to identify what kind of bug requires immediate action, as not all bugs have high severity. For that, a team may decide to categorise the bugs and tag the ones that count towards this metric as "blockers". Moreover, it is important to capture how much time was spent understanding the bug before fixing it. The MTTR may include the time spent on investigation, the time the pull request took to be approved and the time the change took to traverse the delivery pipeline.

Tip: The lead time of pull requests can be used as an extra metric that would give support not only for the MTTR but also for the lead times and deployment frequency.
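The sketch below shows one way the "blocker" tagging could feed the metric. The field names ("labels", "detected", "restored") and the sample bugs are assumptions. Here "detected" stands for the moment the problem was noticed, so the investigation time is included in the measured interval.

```python
from datetime import datetime
from statistics import mean

# Hypothetical bug records; only bugs labelled "blocker" count for MTTR.
bugs = [
    {"id": "BUG-1", "labels": ["blocker"],
     "detected": "2022-02-01T10:00", "restored": "2022-02-01T12:00"},
    {"id": "BUG-2", "labels": ["minor"],
     "detected": "2022-02-02T10:00", "restored": "2022-02-09T10:00"},
    {"id": "BUG-3", "labels": ["blocker"],
     "detected": "2022-02-10T08:00", "restored": "2022-02-10T12:00"},
]


def restore_hours(bug):
    """Hours from detection until the fix reached production."""
    delta = (datetime.fromisoformat(bug["restored"])
             - datetime.fromisoformat(bug["detected"]))
    return delta.total_seconds() / 3600


def mean_time_to_restore(bugs, label="blocker"):
    """Mean restore time in hours over bugs carrying the given label."""
    durations = [restore_hours(b) for b in bugs if label in b["labels"]]
    return mean(durations) if durations else None


print(mean_time_to_restore(bugs))  # 3.0
```

The low-severity bug that stayed open for a week does not affect the result, which is the point of agreeing on the label up front.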

Change Failure Rate

While a team produces outputs to build a feature, some of these may cause anomalies or incidents. The change failure rate captures this ratio. In practice, taking my team as an example, we use thresholds to classify the performance as high, medium or low. For example, if the number of bugs represents less than 5% of the deliverables, we understand that, for this metric, the team performance is high.
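A minimal sketch of that classification follows. Only the 5% boundary for "high" comes from the example above; the 15% boundary between "medium" and "low" is an assumption that I picked for illustration, and each team should choose its own.

```python
def change_failure_rate(deliverables, bugs):
    """Ratio of bugs to deliverables in the period being measured."""
    if deliverables <= 0:
        raise ValueError("deliverables must be greater than zero")
    return bugs / deliverables


def classify(rate, high_below=0.05, medium_below=0.15):
    """Map a failure rate to a performance level.

    high_below=0.05 mirrors the 5% threshold from the text;
    medium_below=0.15 is an illustrative assumption.
    """
    if rate < high_below:
        return "high"
    if rate < medium_below:
        return "medium"
    return "low"


rate = change_failure_rate(deliverables=50, bugs=2)
print(rate, classify(rate))  # 0.04 high
```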

Unplanned Work

Measures the amount of work that was not planned to be executed during the running sprint/cycle. It can be split into two types:

Unplanned work due to long support queries, incidents or overlooked functionality needed for a feature, for instance.
Unplanned work due to extra capacity in the team (e.g. pulling work from future sprints). That may help to understand how accurate the estimations were.

The unplanned work can be used to reason about the delivery lead times, since high lead times may be due to extra work on support activities or to a lack of planning or design.

Unplanned work that happens due to missing functionality can be particularly harmful. That may lead to last-minute changes in design or implementation and that can cause new bugs and affect the overall system reliability.
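If sprint items carry a flag for whether they were planned and a reason when they were not, the share of each kind of unplanned work falls out of a simple count. The "planned" and "reason" fields and the reason names below are hypothetical; in practice they would likely map to labels in your issue tracker.

```python
from collections import Counter

# Hypothetical items worked on during one sprint.
sprint_items = [
    {"id": "T-1", "planned": True},
    {"id": "T-2", "planned": False, "reason": "incident"},
    {"id": "T-3", "planned": False, "reason": "pulled-forward"},
    {"id": "T-4", "planned": True},
]


def unplanned_share(items):
    """Fraction of all sprint items that were not planned."""
    if not items:
        return 0.0
    return sum(1 for i in items if not i["planned"]) / len(items)


def unplanned_by_reason(items):
    """Fraction of all sprint items per unplanned-work reason."""
    reasons = Counter(i.get("reason", "unknown")
                      for i in items if not i["planned"])
    return {reason: count / len(items) for reason, count in reasons.items()}


print(unplanned_share(sprint_items))      # 0.5
print(unplanned_by_reason(sprint_items))  # {'incident': 0.25, 'pulled-forward': 0.25}
```

Keeping the two types apart matters: work pulled forward because of spare capacity tells a very different story from work forced in by incidents or missing functionality.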

How it can be helpful

How can the performance metrics help me? A simple answer would be “To take actions based on data and not rely only on feeling, memory and ticket data mining” to:

Understand how the unplanned work is affecting the deliverables.
Understand how reliable the software is and take action to improve quality.
Understand team capacity and velocity.
Understand if the team processes need optimisation: a certain level of organisation is necessary to have all the metrics above.
Understand if there is room to optimise pull requests: to reason about user story and pull request sizing.
Reason about the software quality and improve the test strategy, test coverage and overall observability when necessary.

For product managers, the metrics can be helpful to understand the feature turnaround and clarify why an outcome is delayed or delivered off schedule, so any issues at that level can be mitigated. That is important because delays in engineering may have their root cause in early phases like discovery and design (which may relate to the amount of unplanned work), even though these stages are not taken into account when calculating the lead times. It is also important for transparency between engineering and product.

How to get started

Extracting metrics from the process in place will require some effort to start. The different ways that teams work, including the tools in use, are also a factor that makes it a little harder. What works for one team may not work for others. There is no standard recipe.

The first step is to have good “ways of working” in place. That includes good issue management and proper use of labels.

The second step would be to understand what is available in your organisation around this topic and which other tools may help you. Some metrics could be extracted from the standard CI/CD and version control system tools in place, and your company may have something out-of-the-box for you to start.

There are tools like Implement.io. This tool could put you steps ahead in terms of understanding engineering delivery performance.

I wrote an article describing a simple approach for enabling metrics gathering.