Estimation is hard – but worthwhile

One of the most painful parts of product development is the estimating part. Dan Milstein talks (at length) about the complexity of this in his post about why we all suck at making estimates.

When you first hit this pain, you think “We should just be more careful at the specification stage”. But this turns out to fail, badly. Why? The core reason is that, as you can see from the examples above, if you were to write a specification in such detail that it would capture those issues, you’d be writing the software. And there is really just no way around this. (If, as you read this, you’re trying to bargain this one away, I have to tell you – there is really really really no way around this. Full specifications are a terrible economic idea.)

A few thoughts on this…

It’s the same with estimating value

You know all those problems we have estimating how long something takes? Well, we have the same issues when it comes to estimating the value side of the bargain. The only way we can be absolutely sure about the value is to build it, discover whether people use it and see if it has the desired effect. On top of that, we have the complication of separating the effects of the change from other changes we make at the same time. Add the inevitable swings and roundabouts of the wider market and economy and it really is fiendishly difficult.

The positive side of this uncertainty is that the value is effectively unbounded (but in a good way this time). An example would be the original value estimate for SMS messaging, which would have been fairly small in comparison to the trillions of dollars in additional revenue that it turns out to have delivered. The best case scenario on the cost side is that the cost and duration are zero – so the upside there is bounded. On the value side there exists the possibility of discovering an asymmetric payoff, just like with SMS messaging.

It’s still worth doing

Just because it’s hard doesn’t mean it’s not worth trying. Even poor estimates can have a big impact on the resulting value you get, for a given effort. You may choose to ignore the economics, but the economics won’t ignore you. The alternatives (FIFO scheduling, Eurovision, random selection) are poor solutions from an economics perspective. Consider:

  1. If we don’t attempt to quickly estimate the size (or duration, for CD3) then we will end up prioritising some things that we all expect will take longer. This is a significant suboptimisation.
  2. If we don’t attempt to quickly estimate the value and urgency (and express this as Cost of Delay) then we will end up prioritising things that we all expect to be less valuable and less urgent. Again, a massive suboptimisation.
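
To make the two points above concrete, here is a minimal sketch that compares a FIFO schedule with one ordered by CD3 (Cost of Delay Divided by Duration). The backlog items and their figures are invented purely for illustration, and the sketch assumes items are delivered one at a time with a constant Cost of Delay per week.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    name: str
    cost_of_delay: float  # value lost per week while undelivered (hypothetical figure)
    duration: float       # weeks the item blocks the pipeline (hypothetical figure)


def cd3(item: Item) -> float:
    """Cost of Delay Divided by Duration: higher scores go first."""
    return item.cost_of_delay / item.duration


def total_delay_cost(schedule: List[Item]) -> float:
    """Delay cost incurred when items are delivered one at a time, in order."""
    elapsed = 0.0
    total = 0.0
    for item in schedule:
        elapsed += item.duration          # this item ships once everything before it has
        total += item.cost_of_delay * elapsed
    return total


# Invented backlog, listed in arrival (FIFO) order.
backlog = [
    Item("A", cost_of_delay=1_000, duration=8),
    Item("B", cost_of_delay=4_000, duration=2),
    Item("C", cost_of_delay=3_000, duration=4),
]

fifo = backlog
by_cd3 = sorted(backlog, key=cd3, reverse=True)

print("FIFO:", [i.name for i in fifo], total_delay_cost(fifo))
print("CD3: ", [i.name for i in by_cd3], total_delay_cost(by_cd3))
```

Even with rough inputs, the ordering only depends on the relative scores, so estimates do not need to be precise to change which item goes first.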

We just have to accept that there is inherent uncertainty and try it anyway – by making brave assumptions if necessary. As a first step, it’s a good idea to start by asking the experts – those who have the most knowledge of your customers and users. But don’t stop with the expert (or the HiPPO, who is often wrong). Make sure that your value estimates and assumptions are stated and visible to as many people in your organisation as possible.

Bias and random variability

[Figure: Accuracy vs Precision]

We can separate the difference between estimates and actuals into two parts: bias and random variability. Bias shows up when we aggregate and measure the drift from the actual. We know from Civil Engineering projects that optimism bias leads to estimates of project duration that are about 20% shorter than actual and with cost overruns of about 50%, even after an awful lot of up-front analysis. We should expect the bias to be worse in product development, since the number of degrees of freedom is probably an order of magnitude bigger and we are working in the complex domain. (Thanks to hundreds of years of experiments and empirical knowledge, Civil Engineering projects are now more complicated than complex – or at least our ability to apply heuristics and patterns that we know work is very well developed.) What is interesting, though, is that on the benefits side of the equation, optimism bias is only ~2% (though again, difficult to measure, even post-project). Systematic bias when it comes to estimating value is actually fairly small, at least with civil infrastructure projects.

Bias is ridiculously easy to correct for – by simply applying an adjustment based on past performance. For Civil Engineering projects, the Treasury Green Book mandates exactly this for all public expenditure. Another example, from one place I worked, is that whenever we got an estimate from the Head of Architecture, we would simply multiply it by three. Most organisations will have developed some sense of how wrong their initial estimates are. Unfortunately, in the complex domain, Hofstadter’s law applies – possibly because the target technology domain is often increasing in complexity (technical debt?), or perhaps because the estimator starts to take into account the adjustment that they know will be applied.
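
As a rough sketch of what “an adjustment based on past performance” could look like, the snippet below derives a correction factor from the average ratio of actual to estimated duration. The history is made up, the function names are hypothetical, and a simple mean of ratios is just one possible choice, not the method the Green Book prescribes.

```python
def bias_factor(history):
    """Average ratio of actual to estimated duration across past work."""
    ratios = [actual / estimate for estimate, actual in history]
    return sum(ratios) / len(ratios)


def adjust(estimate, factor):
    """Scale a raw estimate by the historical bias factor."""
    return estimate * factor


# Hypothetical past work: (estimated days, actual days).
history = [(10, 15), (4, 8), (20, 20)]

factor = bias_factor(history)
adjusted = adjust(12, factor)

print(f"bias factor: {factor}, adjusted estimate: {adjusted} days")
```

Note that the individual ratios still vary around the average: the adjustment removes the systematic drift, not the scatter.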

What we can’t correct for though, is random variability. What we can do is recognise that, on aggregate, it cancels out. This is why portfolio theory works. Decisions where we were simply unlucky are cancelled out by decisions where we were simply more lucky than we expected. Of course, the distribution of lucky to unlucky is usually a Pareto distribution – the long tail. With value, we are in the realms of non-deterministic statistics – we cannot know in advance whether an idea will yield the value we expect. The discovery of whether an idea is truly valuable or not is a stochastic process, with a lot of random variables. This is how Venture Capital firms and Incubators (like Y Combinator) work – they are hunting for positive Black Swan events: the random winners that will pay for all of the random losers.
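
A small simulation can illustrate the “cancels out on aggregate” point. This is only a sketch under simplifying assumptions: every item has the same estimate, and each actual deviates from it by unbiased, independent, uniformly distributed noise. A long-tailed distribution like the one described above would converge more slowly.

```python
import random


def relative_error(n_items, rng, estimate=100.0, spread=0.5):
    """Relative error of a portfolio's total when each item's actual value
    deviates from its estimate by unbiased, independent noise (assumed uniform)."""
    estimated_total = n_items * estimate
    actual_total = sum(
        estimate * (1 + rng.uniform(-spread, spread)) for _ in range(n_items)
    )
    return abs(actual_total - estimated_total) / estimated_total


rng = random.Random(42)  # fixed seed so repeated runs give the same numbers
for n in (1, 10, 100, 1000):
    print(n, round(relative_error(n, rng), 3))
```

Any single item can be out by as much as 50% here, yet the total for a large portfolio typically lands within a few percent of the estimate.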

Where does that leave us?

There is no panacea: making estimates of duration and value is really hard. Punishing people for getting their estimates “wrong” certainly won’t reduce the inherent uncertainty. Neither will incentivising them to get them more “right” – they will simply pad their duration estimates and low-ball their value estimates (or focus purely on cost reductions that can be more easily measured). We can break things down and estimate the duration of the smaller parts (or even go for a goldilocks approach, where we flow work that is of roughly equal size). This doesn’t work on the value side of the equation though, so don’t be tempted to throw the baby out with the bathwater either – there is a point to estimates. We are attempting to make the most of a scarce resource. This requires estimates of both how long the pipeline will be blocked for, and the value that we hope will result.