Regression Isolation vs Code Diving

As developers we deal with regressions on a regular basis.  A regression is a change introduced to a system that causes a potentially unwanted change in behaviour.  Engineers, being wired the way they are, have a tendency to want to fix first and understand later (or understand as part of the fix).  In a large number of cases, however, it is considerably more effective to isolate and understand the cause of the regression before diving into the code to fix it.

This is a continuation of a series of blog posts on regression isolation and bisection, the first of which was “A Visual Primer on Regression Isolation via Bisection”.  If bisection and regressions are terms that you don’t solidly understand, I strongly suggest you read the primer.


A Visual Primer on Regression Isolation via Bisection

Identifying regressions via bisection is one of those software debugging techniques that I find underutilized and underappreciated in the software industry.  Bisection can be used to isolate changes in anything from BIOS updates to software updates to source code changes.  This article provides a backgrounder on what bisection is, and how it is useful in identifying the point where a regression was introduced.

This is the first in a set of three posts covering regressions.
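
As a minimal sketch of the idea (not taken from the primer itself), bisection over an ordered list of versions can be written as a binary search.  The names `first_bad`, `is_bad` and `builds` are hypothetical, and the sketch assumes the first version is known good, the last is known bad, and the regression persists once introduced.

```python
def first_bad(versions, is_bad):
    """Return the first version for which is_bad(version) is true.

    Assumes versions[0] is good, versions[-1] is bad, and every
    version after the first bad one is also bad.
    """
    good, bad = 0, len(versions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(versions[mid]):
            bad = mid    # regression is at mid or earlier
        else:
            good = mid   # regression is after mid
    return versions[bad]


# Hypothetical example: 16 builds, regression introduced in build 109.
builds = list(range(100, 116))
print(first_bad(builds, lambda build: build >= 109))  # prints 109
```

Each test halves the remaining range, so 16 builds need only 4 tests rather than up to 14 when checking each intermediate build in turn.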


Getting Good Estimates

Good estimates are hard to come by.  They are typically too optimistic or too pessimistic or aren’t grounded in reality.  Here is my approach to effort estimation.  I’ve used it successfully in a number of roles and have seen engineers go from poor to reasonable to good estimators.

UPDATE: I have gathered some thoughts and comments and included them in this update.

UPDATE 2: I have an update on the methodology and some further insights in this blog post.

What I look for in Estimates

Typically when asked for an estimate, you will get a single value with no qualification.  “The work will take 3 weeks”.  Experience has shown me that a single value implies a lack of understanding of the nuances of the problems and issues that the task might have lurking just below the surface.

When asking for an estimate, I’m looking for two things:  1) a baseline effort, and 2) a confidence interval.  This comes in one of two forms:

  1. 4 weeks of effort with 60-70% confidence
  2. 3-6 weeks of effort

Both of these forms are effectively the same.  I let the engineers choose whichever one they are comfortable with.

Characteristic Curve

I can’t recall when I began to understand the characteristic curve within the methodology I use for engineering.  I’d say that a long-term colleague, Larry Bonfada, was a strong influence on the thought process, and I have since seen similar characteristic curves in Waltzing with Bears: Managing Risk on Software Projects by Tom DeMarco and Timothy Lister.  I don’t have a sufficient background in statistics to define the shape.  Feel free to leave a comment to educate me on the distribution type.

The critical sections of the curve are described in the table below.

| Section | Description | Confidence |
| --- | --- | --- |
| Absolute Earliest | The absolute earliest date that the task can be complete. | 0% |
| Highest Confidence | The date that represents the highest likelihood of being delivered on or around. | 60% |
| Long Tail | Worst case scenarios, if things go wrong, this date will be hit. | <10% |

Typically engineers will choose one of those sections for their estimates.  Optimists will communicate the absolute earliest date, pessimists will go for the long tail and your more experienced realists will go for the point of highest confidence – somewhere in the middle.

Shaping the Curve

Quite possibly you are thinking that to get this curve you have to apply painful or difficult-to-use models; fortunately, it’s not rocket science.  Most engineers actually have a strong gut feel for the shape of the curve, so it’s a matter of teasing out a good estimate.

The way it works is through a set of questions to the person providing the estimate.

| Question | Answer |
| --- | --- |
| What’s the lowest effort for this task? | 2 weeks |
| What’s the likelihood it will take 20 weeks? | 1% |
| What’s the likelihood it will take 10 weeks? | 5% |
| What’s the likelihood it will take 5 weeks? | 30% |
| What’s the likelihood it will take 4 weeks? | 50% |
| What’s the likelihood it will take 3 weeks? | 60% |

I intentionally use a number of extreme points (10 and 20 weeks) to drive the shape of the curve.  When graphed, the answers produce a curve similar to the one below.

I find that most engineers naturally have a strong gut feel for the estimates and, most of the time, will give numbers that result in more or less the same shape.

Now of course, there is a class of engineers who are either so cautious that they always estimate in the long tail, or so optimistic (or naive about the real effort) that they will always resist this sort of analysis.  My advice is to push through with them (or at least work out a way to interpret their estimates).

From the answers to the questions in the example above, I’d walk away with either of the agreed-to estimates: 2½ – 4 weeks of effort, or 3 weeks with 60-70% confidence.  Each team or organization will have its own sweet spot of acceptable range.  Tightening the estimates and getting them to the right shape usually involves a mixture of analytics and soft management skills.
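
For readers who want to play with the numbers, here is a rough Python sketch that turns the answers from the question table into a peak and a range.  It is only an illustration: treating the 2-week lowest effort as 0% likelihood, interpolating linearly between the answers, and the 30% threshold are all assumptions, and the function names are hypothetical.  The agreed-to range in practice comes out of the conversation with the engineer, not from a formula.

```python
# Answers from the question table: (weeks of effort, likelihood in %).
# The 2-week "lowest effort" answer is assumed to carry 0% likelihood.
elicited = [(2, 0), (3, 60), (4, 50), (5, 30), (10, 5), (20, 1)]


def peak(points):
    """Return the (weeks, likelihood) pair with the highest likelihood."""
    return max(points, key=lambda point: point[1])


def effort_range(points, threshold):
    """Return (low, high) weeks where the curve crosses the threshold.

    Assumes straight lines between neighbouring points; returns None
    if the curve never reaches the threshold.
    """
    pts = sorted(points)
    crossings = []
    for (w0, l0), (w1, l1) in zip(pts, pts[1:]):
        # A segment crosses when its endpoints straddle the threshold.
        if l0 != l1 and (l0 - threshold) * (l1 - threshold) <= 0:
            crossings.append(w0 + (threshold - l0) * (w1 - w0) / (l1 - l0))
    if not crossings:
        return None
    return min(crossings), max(crossings)


print(peak(elicited))              # (3, 60)
print(effort_range(elicited, 30))  # (2.5, 5.0)
```

Raising the threshold narrows the range around the 3-week peak, and lowering it pulls in more of the long tail.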

Tightening the Curve by Managing Unknowns

The uncertainty in the curve is representative of a number of different factors, be it experience, unknown complexity, inter-dependencies and so on.

A hallmark of a large number of unknowns in this sort of analysis is an overly broad range.  I’ve had engineers give a range of 2 weeks to 3 months.  Obviously that estimate isn’t workable by any stretch of the imagination.  The engineer in this case is either being obstructionist or hasn’t looked, or isn’t willing to look, at the unknowns that would drive such a broad estimate range.

The types of questions that I tend to ask the person giving the estimate are along the lines of:

  1. What could happen that will prevent the absolute earliest time from occurring?
  2. What could happen that would push you from the 60% confidence date to a later date?

As each of these questions is answered and the unknown factors become more visible, you can revisit the original estimating questions.  If you are lucky, some of the factors are either issues that can be dealt with or risks that can be mitigated or removed.  In addition, it is worthwhile to have these issues and risks tracked formally as part of the greater project.

I generally find that repeated cycles of this sort of analysis improve the estimates to the point where I am comfortable accepting the estimate into the project.  With each iteration the discovery process either moves the overall curve to the left or the right (smaller or larger effort) or tightens the shape of the curve (increasing the confidence).

Feel free to provide feedback below on how you deal with estimates.

At-Launch Linux Hardware Enablement

The Linux market presents some unique challenges to Independent Hardware Vendors (IHVs) in bringing their products to market with broad support available at the time of launch.  Independent of ideological or pragmatic rationale, both Open Source and proprietary drivers are constrained by similar mechanics.  This article provides a broad outline of the mechanics and considerations that are needed for delivering hardware support at-launch.
