Updates on “Getting Good Estimates”
June 22, 2011 1 Comment
This posting is an update to the Getting Good Estimates article based on the comments received and further research from a number of sources. I include discussion on who should do the estimate, what’s included, references to other estimation techniques, refinements on the probabilistic estimation curve with contrasts to PERT and other techniques. New discussion is had on the “doubling of estimates”, effort vs calendar time, the funnel of uncertainty and finally thoughts on the experience level for shaping estimates against the engineer’s experience.
Who Does the Estimate?
When managers request software estimates from engineers, engineers should frown, look them dead in the eyes, and tell them that making estimates is a managerial/administrative task.
Interestingly is the polar opposite to Joel on Software
Only the programmer doing the work can create the estimate. Any system where management writes a schedule and hands it off to programmers is doomed to fail. Only the programmer who is going to implement a feature can figure out what steps they will need to take to implement that feature.
I tend to agree with Joel on this one. The person solving the problem is in the best position to determine how long a particular task will take. Moving away from software, people get very frustrated with cookie-cutter estimates from tradespeople, independent of the actual effort associated with the problem.
The estimate is not a negotiated value between the engineer and the manager, it is instead a shared consensus on the effort of the task. The engineer’s responsibility is to integrate the assumptions, risks and their individual capability into an estimate. The manager’s responsibility is to provide the bigger picture to provide the information needed for the engineer to integrate, and the commitment to
Another point that was brought up by my friend and colleague, Piranavan, is that estimates should be tempered by the individual’s strengths and experience. An architect who knows the system through and through will usually be able to deliver a task within a considerably shorter time than an intern that is new to a system. This underscores what Joel and I mention above. The estimate should really come from the person doing the work. It can be workable to have experienced engineers create estimates, but before the estimates become plans of record, the person making the estimate needs to temper the estimate with the individual doing the work.
What’s Included in an Estimate?
Immediately below the statement from Joel on Software, there was the following statement.
Fix bugs as you find them, and charge the time back to the original task. You can’t schedule a single bug fix in advance, because you don’t know what bugs you’re going to have. When bugs are found in new code, charge the time to the original task that you implemented incorrectly. This will help EBS predict the time it takes to get fully debugged code, not just working code.
This concurs with what I look for in estimates. Completed work, done… Done, done… Hands off keyboard, done… Delivered with minimal bugs, done…
I look the engineer dead in the eye, ask them to put their hand on their heart confirm that their estimate includes all work that’s needed for the task to be complete. Most engineers will pause and possibly realize that there is other work or unconsidered risks that might affect the estimate.
The intent isn’t to beat the engineer up, the intent is to dig down and expose any assumptions, concerns or other issues that may affect the estimate. Remember that the captured form of the estimate is either an explicit range or a single effort value with confidence interval applied
What Other Estimation Techniques Are There?
There are obviously many different techniques that can be used for estimations. A google query for “Software Estimation” yields 31,400,000 results. 10 pages worth of results in,
There are many, many different methods. Here are a couple of interesting and accessible ones.
Planning Poker is a group consensus system. There is a group discussion on the details regarding the task, and then everybody creates their estimates. These are then combined to determine a group estimate.
Evidence Based Scheduling by Joel Spolsky in 2007 predicates estimations down to less than 16 hours. This forces a level of design as part of the estimate. The estimates are not trusted until they get down to that timeframe. I’d imagine that the estimate/design is revised and improved over time. Jump down to the uncertainty funnel below for a discussion there.
Probabilistic Evaluation and Review Technique (or PERT) for short, provides a full system for estimation. The methodology also takes the probabilistic estimation curve and boils it down to 3 points on the curve: the optimistic, most likely and pessimistic. These are then calculated into a single estimate as shown below.
Probabilistic Estimation Curve
One of the key parts of the previous post presented the characteristic curve. As part of the research for this update post, I saw the curve in multiple places from papers on terminology to more NASA handbooks on estimation. The research provided a lot more nuance to estimation. It is also referenced heavily as the basis for the 3 point estimation technique used in PERT.
Although I haven’t confirmed, I believe that the probability function that closely matches this shape of a particular beta distribution
Refreshing with the graph.
Notice that I’ve marked the three critical parts.
|Absolute Earliest||The absolutely earliest that the task could be completed. This is assuming perfect understanding of the task and no unrealized risks. Basically the impossible estimate. Way too many estimates are based on this value.|
|Highest Confidence (engineer’s estimate)||This represents the highest confidence, and the likely point at which the task will be completed.|
|Mean (planning estimate)||This represents the mean of the estimate. I’ll dig deeper into this shortly.|
PERT provides a basis for determining the estimate based on the formula
Estimate = Mean = (optimistic + 4*Likely + pessimistic)/6.
This of course assumes that the least likely estimate is captured as a number, which in a lot of cases is quite hard to do.
If you’ve been in software engineering for a while, you probably have heard someone say “Take the estimate and double it”. The paper by Grimstad et al actually positions this in context. They make a similar explicit observation that the estimates for any task have a probabilistic shape with the two critical points. The highest confidence and the mean.
These two points carry particular value and should be used in two different scenarios. The highest confidence should be used by the engineering in tempering and improving their estimation. The mean should be used within the project management team to determine a likely cost or planned effort for the project. Both are rooted in the same estimation but are derived differently.
The estimation doubling is triggered by a gross simplification of the estimation process. Simplifying the estimation to a single scalar value from a probabilistic range makes it easier to aggregate numbers, however the aggregation should be the mean rather than the highest confidence. If you conflate the two values together you will end up with poor overall planned effort. Remember that the engineers will optimistically provide something between the absolute earliest and their highest confidence estimate so this is generally the number used as a scalar estimate and hence as the basis for estimate doubling.
Since there is the tendency to use the highest confidence estimate as a basis for planning and these estimates will be typically be lower than the planned effort we end up with a shortfall. To recover from this shortfall, the simplest model is to use an arbitrary multiple. Falling back to our probabilistic model, we see that the mean is a non-linear distance from the highest confidence estimate. The management of risks and unknowns shape the confidence associated with a task.
A well understood task may have a small difference between the absolute minimum, the highest confidence and the mean, a poorly understood task will have a greater spread. The size of the task (or the absolute minimum) carries no direct relationship to the spread of the estimates.
This removes the “double the estimate” for the purpose of planning. The use of the more nuanced mean or planning estimate should be used instead. Or put differently, the factor by which the engineers estimate is transformed into the planning estimate is proportional to the level of risk and number of unknowns. The higher the level of understanding of the risks and issues for a task, the lower the multiple should be.
Of course if this means that doubling the estimate may make sense in some environments, particularly when the estimate carries a lot of unknowns and is known to be optimistic or has not been tempered by the sorts of discussions suggested in the original article.
The Uncertainty Funnel
An implication of the shaping and discovery process I described earlier is that over time the estimates become more accurate as more information is discovered and as the project continues.
A number of papers show this in different forms. Page 7 of the NASA Handbook of Software Estimation shows a stylized funnel, and page 46 of Applied Software Project Management (physically page 15 in the Chapter 3 PDF referenced) book shows an iterative convergence of estimations in its discussion of the Delphi Estimation model.
Both these reference re-iterate that estimates are not static. Estimates should be revisited and re-validated at multiple stages within a project. New information, changes in assumptions and changes in risk profiles will shape the estimate overtime. I’d also suggest that the engineers quickly do a sanity check on the estimates they are working against before starting work on a new task. The estimate will generally improve over time as the risk discovery, problem understanding and task detail awareness increase the accuracy of the estimate.
Visionary Tools provides an interesting observation that if you don’t see estimate uncertainty reducing over time, it is likely that the task itself is not fully understood.
Interdependencies & Effort vs Wall Time.
Piranavan highlighted another area that I had left ambiguous. The discussion on estimates is focused on looking at the particular effort associated with a singular task. It does not expand into managing interdependencies and their effect on meeting an estimate. For the purposes of an estimate in these articles, it is the time applied to delivering the estimate. The external factors such as interdependencies, reprioritization, etc should not affect the value of the estimate. Here I do see the prime value of the manager being in the running defence for the engineer and ensuring that they have the capability and focus to successfully deliver the work with minimal interruption or distraction. This may mean delaying the delivery of the work, or assigning other work elsewhere. Always remember that sometimes you need to take the pills and accept late deliver.
A further subtlety on the interdependencies is that if an interdependent task is not well-defined or delivered cleanly or completely, that may force rework or rescoping of tasks and consequently it is likely that these external factors will inject bumps into the uncertainty funnel.
On Feedback Cycles and Historical References
logjam made the following point:
Managers should collect, maintain, and have access to substantial historical data upon which they can make estimates and other administrative trivia. What else are managers for? Of course, what engineers need to understand is the game at work here: making an estimate is primarily about making you commit to a date, with which you will be flogged by those asking for such an estimate.
I also think that estimating work is something that needs to be adopted in a weekly cycle. Capturing the changing estimates are important to understand that things are changing and need to be accounted for as well as a strong feedback tool for engineers to understand where previous estimates went awry (estimates vs actual). It also gives managers a chance to understand how close engineers were and whether or not that was an estimation error or an outlier (external priority change for example).
Whilst I don’t agree with logjam’s assertion regarding managers making estimations in isolation of the engineers, both of the responding comments point out the need to capture, manage and maintain estimates throughout the life of a project, and if possible educate the engineer on how to improve the accuracy of their estimates. That’s a topic for a later article.
Comments, suggestions or pointers are welcome below.