One of the first articles I published highlights the problems created by doing funding and approval in large batches. Unfortunately, this is the status quo in most organisations and a lot of the malaise we see in I.T. is difficult to improve because of this.
Addressing this requires a viable alternative, though. It’s not enough to just point out the problems; you have to provide a credible mechanism to maintain the control and visibility required. Like it or not, there are people in the system who are making investment decisions and have a fiduciary responsibility. Just “doing away” with the controls is not an alternative. How can we improve the system while still providing the necessary control?
The first step to changing this is, as with many things, to make the problem visible. Unless there is some acknowledgement and agreement that there is a problem, then you’re likely to meet significant resistance in changing anything. The incumbent works reasonably well from the perspective of those who have designed and operate the current system – they just can’t see the damage.
Feast and Famine
Every context is different. The negative impact of large batches upstream manifests differently depending on the setup. At Maersk Line, one of the key pain points we highlighted was the effect of funding large batches of work on the development teams.
The funding and approval of large batches upstream created a shock loading on the downstream processes (Scoping, Design, Development, Testing). In particular, the Development Capacity was prevented from working on anything that wasn’t approved. This meant they often ran into starvation mode roughly every 13 weeks. They were basically going from feast to famine, causing more than 10,000 hours of idle time in one year.
So what? Maybe they didn’t need that extra capacity? Even if that was true, it would be more efficient and effective to have a stable team with a constant capacity that is 10,000 hours less, and smooth the flow of demand through this system.
Smooth Sustainable Flow
(The truth is that this system was an organisational bottleneck. Any meaningful change in the organisation was highly likely to be dependent on this system. There was a severe shortage of capacity to deliver changes and it was effectively holding the whole organisation back. They desperately needed that 10,000 hours they were losing but they just couldn’t see the problem. Previous attempts to alleviate this problem foolishly focused on efficiency, which predictably made the problem worse.)
Maersk Line were guilty of the same oversimplification that most large organisations fall into: they were treating I.T. like a tap.
Unfortunately, this is a myth. Great software is not like beer. You can’t just turn on a tap and the magic flows. You can certainly turn it off pretty quickly – and destroy a huge amount of value in minutes. But recreating that capacity and any meaningful flow can take several months.
If you don’t believe me, try looking at the Cumulative Flow Diagram for any project where you pulled together a team and set them to work on something. It’s typically an S-curve, with a slow ramp up while the team work out how to apply themselves to the problem domain, the technology (and any existing codebase) and, not least, how to work with each other.
Alternative Funding Models
But, if we fund the capacity, how do we maintain control over what the team does? Well, there are some options here. When we proposed this to Lucas Vos, the board member at Maersk Line, we presented three options, all of which would enable us to smooth the flow of work:
Just-in-Time
In this model, funding and approval are attached to each and every requirement. This provides maximum control, but it also requires a very fast mechanism for responding on demand. If funding is delayed, the team run dry, with an immediate escalation signal highlighting the source of the delay.
Buffering
In this model, we fund and approve a small buffer of requirements, with the buffer size set based on the 90% confidence throughput (using historical data) for the duration between funding and approval meetings. This provides a high degree of control over both funding and approval, whilst ensuring the team will have the funds to maintain capacity and throughput as well as clarity about the requirements they will be working on. The trade-off is that any high Cost of Delay requirements that arrive immediately after a funding and approval meeting either have to wait until the next window of opportunity, or be dealt with outside the normal controls.
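To make the buffer sizing a little more concrete, here is a minimal sketch in Python. It is only one reading of "90% confidence throughput": it assumes you have a history of weekly throughput counts, resamples them to simulate the gap between meetings, and takes the 90th percentile so that the team is unlikely to run dry. The function name, the sample history and the four-week gap are all made up for illustration.

```python
import random

def buffer_size(weekly_throughput, weeks_between_meetings, confidence=0.90,
                trials=10_000, seed=0):
    """Estimate how many requirements to fund and approve in one buffer.

    Resamples historical weekly throughput to simulate many possible
    intervals between meetings, then returns the total that the simulated
    team stayed at or below in roughly `confidence` of the trials.
    """
    rng = random.Random(seed)  # seeded so the estimate is repeatable
    totals = sorted(
        sum(rng.choice(weekly_throughput) for _ in range(weeks_between_meetings))
        for _ in range(trials)
    )
    # Position of the requested percentile in the sorted simulated totals.
    index = min(trials - 1, int(confidence * trials))
    return totals[index]

# Hypothetical history: requirements completed in each of the last 12 weeks.
history = [3, 5, 4, 6, 2, 5, 4, 7, 3, 5, 4, 6]
print(buffer_size(history, weeks_between_meetings=4))
```

A larger confidence level gives a bigger buffer and less risk of starvation, at the price of approving more work further ahead of need.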
Time Based
In this model, funding and approval are “decoupled”. The team’s capacity is funded for a given period (until the next meeting). There is also visibility of the current priorities on the Dynamic Priority List. Those priorities can and will change though, especially as new high-cost-of-delay work arrives. If the priority is high enough, this can be triaged and delivered within the period. As such, this model requires a higher level of trust that the triage process is both visible and prioritises correctly. Approval can be either implicit (based on priority order – where the top 10 or so requirements are de facto approved) or explicit, where there is a specific check on each requirement before it can be pulled by the downstream capacity. An item that is high on the list but needs specific approval (say, from Enterprise Architecture) is held in the DPL until approval is granted. In the meantime, lower priority but already approved work is available to be pulled and developed. For as long as the capacity is funded, the downstream pipeline never waits – it operates smoothly, pulling and breaking down or slicing work as capacity becomes available.
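The pull rule with explicit approval can be sketched in a few lines of Python. This is an illustration under my own assumptions rather than the tooling we used: the `Requirement` fields, the `pull_next` function and the example items are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Requirement:
    name: str
    priority: int          # 1 is the top of the Dynamic Priority List
    approved: bool = True  # False while waiting on, say, Enterprise Architecture

def pull_next(dpl):
    """Remove and return the highest-priority approved requirement, or None."""
    for item in sorted(dpl, key=lambda r: r.priority):
        if item.approved:
            dpl.remove(item)
            return item
    return None  # nothing is approved, so there is nothing to pull

# Hypothetical list: the top item is still waiting for approval.
dpl = [
    Requirement("New booking API", priority=1, approved=False),
    Requirement("Invoice fix", priority=2),
    Requirement("Report export", priority=3),
]
first = pull_next(dpl)
print(first.name)  # Invoice fix
```

The unapproved top item stays in the list and keeps its place, while the team pulls the next approved item instead of waiting.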
…then select a scheduling method
When presented with the above three options, the Maersk Line board member, together with the senior executives who operated the funding and approval process, opted for the time-based model. Having made the problem visible, the response to the proposal was, “I can’t think of a reason not to do this”.
They chose a rhythm of Quarterly funding for this system. We provided a Quarterly report of key value, flow and quality statistics like value delivered, cycle time and throughput using a cumulative flow diagram, the speed of key feedback loops and other improvements in the speed of the last mile to production. We also reviewed a current snapshot of the top of the Dynamic Priority List. We modelled options to make small adjustments to capacity (up and down) and, based on the Cost of Delay of things currently queued for development, asked them to fund the next quarter. Other systems wanted to do this more frequently, so went for a monthly cadence.
It is possible to use whatever scheduling/prioritisation mechanism you prefer with this funding system. At Maersk they used CD3 – Cost of Delay Divided by Duration – where Duration at the portfolio level was a very simple T-shirt sizing based on a combination of an educated guess of complexity and reference-class comparison with other similar requests. The only caveat with the scheduling method is that it works much better with an effective “slicing” mechanism in place, and some way of encouraging smaller items (which CD3 provides).
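As a rough illustration of how CD3 orders work, here is a short Python sketch. The mapping from T-shirt size to weeks and the three example requests are invented for the example; in practice you would pick whatever relative durations suit your own portfolio.

```python
# Assumed relative durations for T-shirt sizes; these numbers are illustrative.
TSHIRT_WEEKS = {"S": 2, "M": 4, "L": 8, "XL": 16}

def cd3(cost_of_delay_per_week, size):
    """Cost of Delay Divided by Duration: higher scores are scheduled first."""
    return cost_of_delay_per_week / TSHIRT_WEEKS[size]

# Hypothetical requests: (name, Cost of Delay per week, T-shirt size).
requests = [
    ("Large platform change", 80_000, "XL"),
    ("Small pricing tweak", 20_000, "S"),
    ("Medium integration", 30_000, "M"),
]

# Sort so the request with the highest CD3 score comes first.
ranked = sorted(requests, key=lambda r: cd3(r[1], r[2]), reverse=True)
for name, cod, size in ranked:
    print(f"{name}: CD3 = {cd3(cod, size):,.0f}")
```

In this made-up data the small item goes first even though its Cost of Delay is the lowest of the three, because it ties up the scarce capacity for the shortest time. That is the sense in which CD3 encourages smaller items.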
How to increase or decrease capacity?
Of course, the capacity you have today is probably not what you’re going to need in the future. Assuming you take this approach, how can you then make changes?
Well, firstly, changing capacity should be considered a strategic decision, not a tactical decision. Demand almost always exceeds supply, but that supply is neither cheap nor fast to develop. The way to deal with it normally is to turn it into a scheduling problem. CD3 is designed to maximise the outcome possible for a given scarce capacity.
If, however, the Cost of Delay of things that are forced to wait is high enough (and persistently so), this may justify attempting to develop additional capacity. This is not a temporary commitment though: you are developing a strategic asset, the ability to deliver change quickly from Lightbulb to Live. Likewise, the decision to reduce capacity should be carefully considered. It is easy enough to switch off, but getting that capacity back often takes a significant investment of time and money.
A case study
If you want to read about the results of doing this at Maersk Line, we wrote about this in the Black Swan Farming paper, which you can download and read here.