BLACK SWAN FARMING

How to do a *really* basic forecast

Forecasting delivery is something every organisation should be doing. Unfortunately, hardly any do. This is a shame because it’s actually quite easy, as hopefully you’ll see below.

Even a very basic forecast is better than blindly following a plan. It doesn’t need to be super complicated. There will be flaws, of course, but much like qualitative cost of delay, I’m hopeful you’ll see the benefit and go further.

What you’ll need

You really don’t need a lot, but there are a couple of things you do need:

Throughput – How fast are we going?

There are lots of different ways you can measure this. The easiest is to simply count up the number of things that get completed by the end of a week or sprint. This could be Stories, or Trello/Jira tickets, or Storypoints, or Acceptance Criteria. It doesn’t really matter what you count, as long as you are consistent.

Here’s some actual data from a company I’ve worked with, looking back over the last 6 sprints (2 weeks each):

           Stories  Points
Sprint 25       16      47
Sprint 26       10      43
Sprint 27       15      50
Sprint 28       13      47
Sprint 29       20      52
Sprint 30       14      37

As you can see, I’ve got both the number of stories and the story points. You don’t need both, but you’ll see later why this is useful information to have. (Before you start throwing bricks, they were already doing sprints and estimating in storypoints before I arrived.)

If you’re not doing sprints (or estimating in storypoints), just count the number of tickets that get completed each week. Personally, I prefer to count Acceptance Criteria, but there often isn’t an easy way to extract this from the usual ticket tracking systems. Let’s use what they already have anyway…

Backlog – How far do we think we need to go?

The other key question is, based on what we know today, how far do we think we need to go? For this organisation they had three key things that they needed before they felt they would be ready to launch. The launch required alignment with a marketing campaign, for which there was some lead time to organise, and of course some costs involved. Once the date was set, it was expensive to move, so they really needed a high degree of confidence that the product would be ready enough for the associated marketing campaign.

           Stories  Points
Search          14     100
Configure        7      94
Book            15     172
Total           36     366

Again, you don’t necessarily need Storypoints, but they’d already done them, so let’s see if they’re of any use to us. We can do a very basic forecast using stories first, and points second, and see if there’s any difference. (You may already be able to spot a problem with using stories alone, but we can get to that in a minute.)

 Super Basic Forecast (using number of stories only)

Looking at the historical data, we can quickly get a sense of how fast they are chewing through stories. Using Wolframalpha, we simply input the number of stories delivered in each of the last 6 sprints, one after the other:

https://www.wolframalpha.com/input/?i=16+10+15+13+20+14

Which gives us this result:

Mean: 14.67
Standard deviation: 3.327

So, on average, they deliver 14.67 stories per sprint. The backlog currently has 36 stories to be done. Based on this you might think that it would therefore take another 2.45 sprints for them to complete the remaining stories. Before we run off to tell the marketing team we’ll be done in 6 weeks, let’s try doing the same with the story points…
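If you’d rather not use a website, the same numbers fall out of a few lines of Python. This is only a minimal sketch: the variable names are made up for illustration, and it assumes the sample standard deviation (dividing by n - 1), which is what matches the 3.327 figure above.

```python
import statistics

# Stories completed in sprints 25 to 30, taken from the throughput table
stories_per_sprint = [16, 10, 15, 13, 20, 14]
backlog_stories = 36

mean_stories = statistics.mean(stories_per_sprint)
# statistics.stdev is the sample standard deviation (divides by n - 1)
sd_stories = statistics.stdev(stories_per_sprint)
# Naive forecast: remaining work divided by average throughput
naive_sprints = backlog_stories / mean_stories

print(f"Mean: {mean_stories:.2f}")
print(f"Standard deviation: {sd_stories:.3f}")
print(f"Naive forecast: {naive_sprints:.2f} sprints")
```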

 Super Basic Forecast using storypoints

Again, using Wolframalpha we simply input the number of storypoints delivered over the last 6 sprints, one after the other:

https://www.wolframalpha.com/input/?i=47+43+50+47+52+37

Which gives us this result:

Mean: 46
Standard deviation: 5.367

So, on average, they deliver 46 storypoints per sprint. The backlog currently has 366 storypoints to be done. Based on this you might think that it would therefore take another 7.96 sprints for them to complete the remaining stories. Rounding up, that would be 8 two-week sprints = 16 weeks…

Wait, what?

16 weeks is more than double the 6 weeks we got from using story count. Why the difference?

Well, if you look at the historical data you can see that the total number of stories delivered over the last 6 sprints is 88, with a total of 276 storypoints. So, the average number of storypoints per delivered story is 3.14. However, if we look at the backlog, the average is 366/36 = 10.17 storypoints per story. Clearly, the stories on the backlog are more than three times the size of the typical story we have delivered in the past.
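The size comparison is just two divisions, but it’s worth scripting if you plan to redo it every sprint. A rough sketch, with illustrative variable names and the numbers from the two tables:

```python
# Delivered work over sprints 25 to 30
delivered_stories = [16, 10, 15, 13, 20, 14]
delivered_points = [47, 43, 50, 47, 52, 37]
# Remaining backlog (Search + Configure + Book)
backlog_stories, backlog_points = 36, 366

# Average size of a story as it was actually delivered
points_per_delivered_story = sum(delivered_points) / sum(delivered_stories)
# Average size of a story as it currently sits on the backlog
points_per_backlog_story = backlog_points / backlog_stories
# How much bigger backlog stories are than delivered ones
size_ratio = points_per_backlog_story / points_per_delivered_story

print(f"Delivered: {points_per_delivered_story:.2f} points per story")
print(f"Backlog:   {points_per_backlog_story:.2f} points per story")
print(f"Backlog stories are about {size_ratio:.1f}x the size")
```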

This is a really common scenario. As we refine stories, we find edge cases, things we hadn’t considered, and generally more work, which we break into smaller stories. (This is a good thing.) The trick is that you need to convert your backlog stories somehow to roughly match the size of the stories that you end up actually delivering. In this case, we can estimate that each backlog story is likely to be broken down into ~3 smaller stories.
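One way to do that conversion is to scale the backlog story count by the size ratio and then forecast with story throughput. The sketch below assumes storypoints are a fair proxy for how stories will split, which may not hold for your team; `rescale_backlog` is a hypothetical helper, not something from a library.

```python
import statistics

def rescale_backlog(backlog_stories, backlog_points, delivered_stories, delivered_points):
    """Estimate how many delivery-sized stories the backlog will turn into."""
    delivered_size = sum(delivered_points) / sum(delivered_stories)
    backlog_size = backlog_points / backlog_stories
    split_factor = backlog_size / delivered_size
    return backlog_stories * split_factor

delivered_stories = [16, 10, 15, 13, 20, 14]
delivered_points = [47, 43, 50, 47, 52, 37]

equivalent_stories = rescale_backlog(36, 366, delivered_stories, delivered_points)
# Forecast using story count throughput, now that the sizes are comparable
sprints = equivalent_stories / statistics.mean(delivered_stories)

print(f"Backlog is roughly {equivalent_stories:.0f} delivery-sized stories")
print(f"Forecast at average throughput: {sprints:.2f} sprints")
```

With the sizes aligned, the story-count forecast lands on the same answer as the storypoint forecast, which is a handy sanity check.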

OK, so great – we can go and tell marketing to book the TV spots for 16 weeks from the end of this current sprint, yes? Well, no. Probably not.

How risky do you wanna be?

46 storypoints per sprint is the average. This means you have roughly a 50% chance of delivering more than this, and a 50% chance of delivering less. A 50:50 chance of making a date is probably not sufficient for expensive and difficult-to-delay TV spots. Most senior executives would also be unimpressed if you missed half of the forecast dates you provided.

So, unless you really like to live on the edge, I would not recommend using the average. On top of this, the average says nothing at all about the spread. Not only are you quite likely to fall short, but if the spread is large (which it often is), you may fall short by quite a lot.

The cheap, nasty, but dead simple adjustment

If you’re in a rush and the thought of all that Monte Carlo and Weibull stuff scares you, there’s a super cheap and nasty but very simple way to get a ball-park forecast. Take the average throughput (46 story points) and subtract one standard deviation (5.367) = 40.6 story points per sprint.

If the distribution were Gaussian (hint: it’s not), you would have a roughly 84% chance of delivering at least this many in a sprint. Since the actual distribution is Weibull, with fatter tails than Gaussian, I would suggest being a little more conservative and referring to this as a roughly 70% confidence delivery forecast.
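If you want to check the Gaussian figure for yourself, Python’s standard library can do it. This sketch only covers the idealised Gaussian case; it says nothing about the real, fatter-tailed distribution, which is why the 70% above remains a rule of thumb rather than a calculation.

```python
from statistics import NormalDist, mean, stdev

points_per_sprint = [47, 43, 50, 47, 52, 37]
mu = mean(points_per_sprint)
sigma = stdev(points_per_sprint)  # sample standard deviation

# Hypothetical Gaussian fitted to the six sprints
gaussian = NormalDist(mu, sigma)
# Chance that a sprint delivers more than (mean - 1 standard deviation)
chance_of_beating = 1 - gaussian.cdf(mu - sigma)

print(f"Chance of delivering at least {mu - sigma:.1f} points: {chance_of_beating:.0%}")
```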

So, at a rate of 40.6 story points per sprint (roughly 70% confidence) it would take us just over 9 sprints to deliver 366 story points. Assuming the future is similar to the past, that would take about 18 weeks from the start of the next sprint.
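Since you’ll want to redo this regularly, here is the whole cheap-and-nasty adjustment as one small function. It’s a sketch under the same assumptions as the text (mean minus one sample standard deviation, two-week sprints); `conservative_forecast` is a name I’ve made up for illustration.

```python
import statistics

def conservative_forecast(throughput, backlog, sprint_weeks=2):
    """Forecast using mean throughput minus one sample standard deviation."""
    rate = statistics.mean(throughput) - statistics.stdev(throughput)
    sprints = backlog / rate
    return rate, sprints, sprints * sprint_weeks

points_per_sprint = [47, 43, 50, 47, 52, 37]
rate, sprints, weeks = conservative_forecast(points_per_sprint, backlog=366)

print(f"Conservative rate: {rate:.1f} points per sprint")
print(f"Forecast: {sprints:.2f} sprints, about {weeks:.0f} weeks")
```

Note that the result is fractionally over 9 sprints, so if you can only ship at a sprint boundary you may prefer to round up to a whole sprint.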

That’s it. You’ve produced a forecast. Pretty easy, huh? Don’t forget to update it with new information as it comes in. The backlog is likely to change and your throughput also. Redo the numbers at the beginning of each sprint and/or week.

Going further with Monte Carlo Forecasting

This is just to get you going. There’s a lot more you can do here with not a lot more effort. If you have a copy of Microsoft Excel, I’d recommend downloading Troy Magennis’ Throughput Forecaster.xlsx. (Troy explains it a little here.)

Here’s how I’ve modelled the same data as above. On the “Throughput Samples” tab I’ve punched in the story points per sprint as above, like this:

Then there are a few options on the “Forecast” tab worth exploring, but here’s the simplest setup:

On the same tab, you can see the resulting forecast:

As you can see, this much more thorough Monte Carlo approach gives fairly similar results to our cheap and nasty “70% chance of delivering 366 points in a little over 9 sprints”. In the background though, Troy’s spreadsheet runs 500 simulations, rolling the dice with the same variability as the data you provide. There are also useful options for splitting larger backlog items into smaller delivery items, and (if you don’t have data) the ability to use some guesstimates of throughput.
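To get a feel for what a Monte Carlo forecast is doing, here is a minimal sketch in Python. It is not a reproduction of Troy’s spreadsheet, whose internals aren’t covered here. It assumes the simplest possible model: each future sprint’s throughput is drawn at random, with replacement, from the six historical sprints. The function names, the fixed seed and the percentile rule are all my own illustrative choices.

```python
import math
import random

def simulate_sprints(samples, backlog, rng):
    """Count sprints needed to burn down the backlog in one simulated future."""
    remaining = backlog
    sprints = 0
    while remaining > 0:
        # Assume the next sprint looks like a randomly chosen past sprint
        remaining -= rng.choice(samples)
        sprints += 1
    return sprints

def monte_carlo(samples, backlog, trials=500, seed=1):
    """Run many simulated futures and return the sprint counts, sorted."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sorted(simulate_sprints(samples, backlog, rng) for _ in range(trials))

def sprints_at_confidence(results, percent):
    """Smallest sprint count that at least `percent` of trials finished within."""
    index = max(math.ceil(percent * len(results) / 100) - 1, 0)
    return results[index]

points_per_sprint = [47, 43, 50, 47, 52, 37]
results = monte_carlo(points_per_sprint, backlog=366)

for percent in (50, 70, 85, 95):
    print(f"{percent}% of simulations finished within "
          f"{sprints_at_confidence(results, percent)} sprints")
```

Because this counts whole sprints, it answers a slightly different question from the fractional “just over 9 sprints” figure, so don’t expect the outputs to match digit for digit.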

Try it out

Hopefully this has given you confidence to have a go with some simple forecasting. If nothing else, make sure that whatever date you give, you also communicate the uncertainty. Pretending that uncertain things are certain is a great way to lose credibility and trust. Ignoring the data you do have and trying to twist reality to fit your plans is often a recipe for disaster.

Forecasting is actually quite simple – give it a go! Find some data, try a nasty-simple 70% forecast. Give Troy’s spreadsheet a try and forecast like a pro 🙂