Whether you’re looking at your own code before (or after!) you have shipped it, or you’re picking up someone else’s code after they have shipped it, tracking down and fixing bugs is a fundamental part of programming. If you know the code well, you may be able to make an intuitive leap straight to where the bug is. But how do you go about tracking down a bug when intuition doesn’t help?

The nature of all code is that larger systems are built from smaller underlying systems and components. They in turn are also constructed from smaller components. The bug you are tracking down will have a cause in one of these systems, and will have symptoms that are visible in other systems. The remaining systems work fine (as far as the bug you’re looking for is concerned), and you can use this to quickly and reliably find where the bug is.

Break your larger systems down into smaller systems at logical points, such as different server stacks, APIs, major interfaces, classes, methods and, if necessary, individual lines of code. Test both sides of each divide, with your tests focusing on the data that crosses the divide. If one side works as expected, the bug is not in there, and you can eliminate that side from further testing. Continue testing the remaining systems and components, which you have now isolated, by dividing those up into smaller systems and components. Keep going until you’ve reached the smallest testable system, component, unit, or lines of code that show the fault. Congratulations: you have isolated the fault.
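As a rough illustration of the divide-and-test loop, here is a minimal Python sketch. It is not taken from any real codebase: the `isolate_fault` helper, the three toy stages and their checks are all hypothetical. It also assumes that once the data crossing a divide has gone wrong, it stays wrong at every later divide, which is what allows half of the remaining stages to be eliminated with each test.

```python
from typing import Any, Callable, List, Tuple

Stage = Callable[[Any], Any]   # one component: takes data in, passes data on
Check = Callable[[Any], bool]  # does the data leaving that component look right?


def isolate_fault(stages: List[Tuple[str, Stage, Check]], data: Any) -> str:
    """Return the name of the first stage whose output fails its check."""
    # Run the whole system once, recording the data that crosses each divide.
    outputs = []
    for _name, stage, _check in stages:
        data = stage(data)
        outputs.append(data)

    lo, hi = 0, len(stages) - 1
    if stages[hi][2](outputs[hi]):
        raise ValueError("final output passes its check; nothing to isolate")

    # Invariant: the output at divide `hi` is bad, and every divide
    # before `lo` has been shown to be good.
    while lo < hi:
        mid = (lo + hi) // 2
        if stages[mid][2](outputs[mid]):
            lo = mid + 1  # this side works as expected: eliminate it
        else:
            hi = mid      # the fault is at or before this divide
    return stages[lo][0]


# Hypothetical three-stage system with a deliberate bug in the middle.
def parse(text: str) -> List[int]:
    return [int(part) for part in text.split(",")]


def total(numbers: List[int]) -> int:
    return sum(numbers[1:])  # deliberate bug: skips the first number


def render(value: int) -> str:
    return f"total={value}"


stages = [
    ("parse", parse, lambda out: out == [1, 2, 3]),
    ("total", total, lambda out: out == 6),
    ("render", render, lambda out: out == "total=6"),
]

culprit = isolate_fault(stages, "1,2,3")
print(culprit)
```

In a real system the "stages" are rarely this tidy, and the checks are usually tests you write as you go, but the shape of the search is the same: test at a divide, discard the side that works, and repeat on what remains.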

Apart from being a strategy that allows you to work on code you’ve never seen before, this approach also has the advantage that it is evidence-based. It eliminates guesswork, and it forces developers to challenge their assumptions about how their code actually works in practice. The data never lies, but be aware that it can be misinterpreted!
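One simple way to gather that evidence is to record exactly what crosses a divide, and then compare it against what you assumed would cross it. The Python sketch below is only an illustration: the `record_boundary` decorator, the `observations` list and the `apply_discount` function are hypothetical names invented for this example.

```python
import functools
from typing import Any, Callable, List, Tuple

# Every call that crosses the divide is recorded here as
# (function name, positional arguments, result).
observations: List[Tuple[str, tuple, Any]] = []


def record_boundary(func: Callable) -> Callable:
    """Wrap a function so that each call's inputs and output are recorded."""
    @functools.wraps(func)
    def wrapper(*args):
        result = func(*args)
        observations.append((func.__name__, args, result))
        return result
    return wrapper


@record_boundary
def apply_discount(price, percent):
    # Hypothetical function that sits on the divide being tested.
    return price - price * percent / 100


apply_discount(200, 15)

for name, args, result in observations:
    print(name, args, result)
```

The recorded values are the evidence. Whether they are right or wrong is still a judgement you have to make, which is where the risk of misinterpretation comes in.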

The approach is iterative, and you’ll often find yourself going back and forth between your code and your tests, making your code easier to test and giving your tests clearer, more targeted domains and results. Fix the tests that are relevant to the bug you are tracking down, and make a list of any other issues you find along the way so that you can come back and address them at a later date. Stay on target, and park potential tangents and distractions for another time.

Although this sounds like a slow process when described on paper, with practice it can be executed at high speed during an emergency situation. However, the need to restore service in a timely manner isn’t always compatible with this approach, and you’re normally better off returning to your test environment where you can study the fault without inconveniencing your customers any further.

1 Comment

  1. Jatinder Bhambri says:
    January 12th, 2010 at 9:22 am

    Nice comment

    Isolate to eliminate…

    I also have the same opinion as the others.