Over the past couple of years, my thinking on velocity has shifted significantly. I can still see its value at a team level.
Benefits of Story Pointing
- It aids important conversations and ensures there is a shared understanding of what is required. For instance, in refinement, if one engineer says 1 point and another says 5 points, you can uncover complexities or unknowns by exploring why their estimates differ.
- In a similar vein, it can act as a prompt when thinking about how to structure stories and tasks, and when to split or break a requirement down further. If a requirement is pointed at, say, 8 points, could this be a signal that it is verging on too complex, and would we be better off removing some of the risk by slicing it smaller?
- It can also aid forecasting and help teams work sustainably: using past velocity, we can be realistic in our planning and aim for commitments the team can actually meet.
But there are downsides, and I have come to realise there is a big problem with velocity when it is used as a metric beyond its purpose of aiding forecasting and reflection at a team level. Here's why:
Downsides of Velocity / Story Pointing
- Transparency: There is a lack of transparency around whether what is counted as 'done' is actually delivering value. A team can deliver X story points consistently every sprint, but what does that tell us? It could be that the team is working on a lot of BAU, bugs, and keeping-the-lights-on work, which is important but perhaps not aligned with the strategic direction. It could also be that easier, less complex solutions are favoured if velocity is the main measure of success.
- Output over outcomes: Related to the point above, velocity encourages teams to think about outputs (the number of points they can burn through) above whether what they are doing will drive the right outcome. What use is delivering the wrong thing quickly? The same criticism applies to throughput if we use either metric in isolation. We need to focus on outcomes and validate the direction we are travelling in before we optimise for speed.
"Working software is the primary measure of progress." (Agile Manifesto)
- Waste: I see a lot of precious team time wasted talking about story points, and I strongly believe that time would be better spent developing, or discussing the actual approach and solutions ahead of us – far more important than the arbitrary value the team finally 'agrees' to attach to a piece of work.
- Unknowns: Story points don't account for unknowns and aren't revisited. We learn as we work, but story point estimates rarely change in line with increases or decreases in the size or complexity of the work.
What could we do differently?
Flow Metrics: What does velocity give us that can't be achieved via throughput?
I looked at the data and compared velocity and throughput across all teams for the past three months. As you can see in the graphs below, there's very little difference:
My conclusion from this data is that with either approach, teams could forecast using the average number of issues or the average number of story points and achieve the same results. If velocity is used as a gauge for predictability, switching to throughput would provide the same result.
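To make the equivalence concrete, here is a minimal sketch of the two forecasting approaches side by side. The sprint figures are made-up illustrative numbers, not real team data:

```python
# Hypothetical per-sprint history: same sprints, measured two ways.
velocity = [34, 28, 31, 40, 26, 33]    # story points completed per sprint
throughput = [12, 10, 11, 14, 9, 12]   # issues completed per sprint

# Both forecasts are just an average of recent history.
avg_velocity = sum(velocity) / len(velocity)
avg_throughput = sum(throughput) / len(throughput)

print(f"Next-sprint forecast: ~{avg_velocity:.0f} points or ~{avg_throughput:.0f} issues")
```

Either number gives the team the same planning signal; the throughput figure just arrives without an estimation session.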
The benefit of using throughput is that it is based on reality – a count of finished work rather than a guessed figure – so it could be even more accurate than velocity, and it removes all of the effort and time associated with assigning story points (estimation being a widely accepted waste activity).
Teams could drop story point estimation completely (if they wanted to), claw back the average X minutes per session spent pointing, and use it to do what they love doing. But if they see value in pointing – some of the reasons are covered above – they could, by all means, continue: the field would still be there, and the data could still be inspected in retros and team-level discussions. It just wouldn't be used outside the team.
Optimising for the flow of value
Throughput and cycle time together make a good KPI. In real time, you can see the impact of one on the other, in conjunction with metrics such as WIP and time in status.
When velocity is down, you often don't know why – it tells you nothing about how the squad's work is flowing, and the dip could relate more to the estimates than to the work itself. When throughput is down, however, it tends to paint a picture. For instance, you can look at what's happening with cycle time: if cycle time is up and throughput is down, let's investigate. How's WIP looking? And so on.
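All three flow metrics fall out of the same raw data. A small sketch, using hypothetical issue records (start date, done date, with None meaning still in progress):

```python
from datetime import date

# Hypothetical issue data for one squad.
issues = [
    {"start": date(2024, 5, 1), "done": date(2024, 5, 4)},
    {"start": date(2024, 5, 2), "done": date(2024, 5, 10)},
    {"start": date(2024, 5, 6), "done": None},  # still in progress
    {"start": date(2024, 5, 7), "done": date(2024, 5, 9)},
]

finished = [i for i in issues if i["done"] is not None]
throughput = len(finished)                                   # items completed
cycle_times = [(i["done"] - i["start"]).days for i in finished]
avg_cycle_time = sum(cycle_times) / len(cycle_times)         # days per item
wip = sum(1 for i in issues if i["done"] is None)            # items in flight

print(f"throughput={throughput}, avg cycle time={avg_cycle_time:.1f} days, WIP={wip}")
```

Because the three numbers come from the same records, a dip in one can immediately be cross-checked against the others, which is exactly the conversation velocity can't support.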
Right-sizing: how would we know what we could plan for?
Similarly to using story points as an indicator for splitting requirements or forecasting, teams can get into a good predictable rhythm by using data such as cycle time and throughput.
A prerequisite is right-sizing. As the graphs above show, there are generally the same ups and downs in throughput as there are in velocity. This is normal in our domain: we work in the complex domain, where unknowns remain present throughout the delivery cycle. Regardless of which framework we use, it is proven good practice to break big requirements down into small, independent, vertically sliced batches to minimise the risk associated with the big-batch approach.
If we get the batch sizes right and focus on the question 'could we achieve this in [our average cycle time] days?', the same prompts that arise from story pointing would still show up. One engineer could say it's low in complexity, while another could surface a grey area unknown to some of the team by saying, 'I'm not so sure – it feels like a bigger task to me.'
It's still a gut-feel call, and like story pointing I imagine it will be out a little now and again, but the conversation can still be had.
Using SLE via Cycle-time
Through right-sizing, we could set a Service Level Expectation (SLE): each story 'should' flow through the system in X number of days. This gives us a clear guideline, mid-cycle, on where we may need to optimise and intervene. For instance, if our SLE is 5 days and a story has become stuck, the cycle time data helps us identify the need to split or swarm, just as we would with story points.
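An SLE is commonly expressed as a percentile of historical cycle times rather than an average, e.g. '85% of stories finish within N days'. A sketch with hypothetical cycle times and story ages, using a simple nearest-rank percentile:

```python
import math

# Hypothetical historical cycle times for completed stories, in days.
historical_cycle_times = [2, 3, 3, 4, 5, 5, 6, 8, 9, 12]

def percentile(data, pct):
    """Nearest-rank percentile: the smallest value covering pct% of observations."""
    ordered = sorted(data)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[rank - 1]

# SLE: "85% of stories finish within sle_days days".
sle_days = percentile(historical_cycle_times, 85)

# Hypothetical ages (days in progress so far) of currently open stories.
in_progress_ages = {"STORY-101": 2, "STORY-102": 11, "STORY-103": 5}
stuck = [key for key, age in in_progress_ages.items() if age > sle_days]

print(f"SLE: 85% within {sle_days} days; candidates to split or swarm: {stuck}")
```

Any open story whose age has already blown past the SLE is a concrete, data-backed prompt for the split-or-swarm conversation.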
Forecasting using throughput / Monte Carlo simulation
As already mentioned, there is nothing notable in the data to suggest we couldn't switch from the number of points to the number of issues in our forecasting and planning endeavours.
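The Monte Carlo idea itself is simple: repeatedly sample from historical throughput and count how long a backlog takes to drain, then read forecasts off the resulting distribution. A minimal sketch, with hypothetical weekly throughput history and backlog size:

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

# Hypothetical inputs: issues finished per week, and items left to deliver.
weekly_throughput_history = [3, 5, 2, 6, 4, 5, 3, 4]
backlog_size = 30
simulations = 10_000

weeks_needed = []
for _ in range(simulations):
    remaining, weeks = backlog_size, 0
    while remaining > 0:
        # Assume a future week looks like a randomly chosen past week.
        remaining -= random.choice(weekly_throughput_history)
        weeks += 1
    weeks_needed.append(weeks)

weeks_needed.sort()
p50 = weeks_needed[int(0.50 * simulations)]  # median outcome
p85 = weeks_needed[int(0.85 * simulations)]  # more conservative forecast
print(f"50% chance within {p50} weeks, 85% chance within {p85} weeks")
```

Instead of a single-point estimate, the team gets a probabilistic answer ('85% likely within N weeks'), built entirely from real throughput data with no story points involved.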
Summary
This isn't designed to be exhaustive – more an overview of where my head is at when it comes to velocity and potential alternatives. I'd welcome any thoughts, conversations, or alternative approaches off the back of this blog post, so if it has provoked anything you'd like to discuss further, please reach out.