Ain't testing finished yet, Ma?

Original URL: https://www.theregister.com/2006/12/19/testing_finished/

Making the most of seminar breaks

Posted in Software, 19th December 2006 00:28 GMT

A week or so ago I met Geoff Brace, a Director of Owldata Ltd, at a BCS CMSG (Configuration Management Special Interest Group) seminar on ITIL - yes Matilda, ITIL is relevant to developers. We got into a discussion about testing, sparked by this article.

It reminded Geoff about some attempts he'd made to predict defect rates in software - "if testing finds a defect rate above a certain threshold," Geoff suggests, "perhaps the code should be rejected [and rewritten] rather than tested further".

This makes sense: if your testing is estimated at 50 per cent efficient, then finding 20 bugs means you probably have 20 left. The more you find, the more likely the software is to be defective when you finish (making some assumptions about the quality and consistency of testing).

Now consider a certain “cowboy” style of programming - code a chunk of a programme and throw it at the compiler, using the compiler to check for the undeclared variables and syntactic errors. You then iterate, refining and correcting the code until it compiles - any defects found in testing are then obviously the tester's fault :-)

Well, Geoff says that he found out the hard way that if you write carefully and desk check thoroughly, you not only get code that compiles first time but also runs correctly. So, he suggests, why not combine these ideas. Write the code and compile it - then count the number of iterations needed to get a clean compile.

Geoff suggests that this number could be a good predictor of the number of run time detectable defects. Unfortunately, it's difficult to prove - he points out that you'd have to deny the code writer use of a syntax directed editor and make access to the compiler such that you can actually count the number of times the same item (at different versions) is submitted.

You'd probably end up with something like the old batch processing of FORTRAN compiles, when you punched the program up on cards, put them in a tray in the computer room and got back a print out the following morning. Well, who'd want to go back to those days - but, thinking about it, desk checking code to get out compile errors did find a lot of logic errors too.

So, we decided that this idea probably didn't get us anywhere in the end. But it did highlight one point. If you don't know how many errors are in your code when you start testing, how do you know whether you've finished testing?

The obvious answer is 'when the time allocated for testing has run out' – i.e., when you hit the go-live date the boss agreed to before anyone knew what the project involved in detail - but that's really hard to defend.

Another approach is to plot 'defects found' against time and to stop when the curve flattens out - but that might just show that the test pack is inadequate....

And there are various mathematical predictors for potential defects, so you can stop when you've found something like that number.

However, it really comes down to balancing risks - the longer you delay going live, the greater the risk of business losses (assuming the program does something useful) caused by using the old processes. The earlier you go live, the greater the risk of the new application not working ,or causing damage to the business. I find risk-based testing (see here, for example) a very attractive approach.

But there's another predictor in all of this. If someone doesn't have a rational idea of the "success factors" for testing and can't come up with a rational approach for deciding when it's finished, I predict that there's a good chance that the application will be rubbish.

Last word to Geoff: "When I was working with a team developing avionic systems, their final acceptance criterion was, 'Would you fly it?' They actually visualised themselves as the pilot. This was an interesting approach and seemed effective."

Ain't testing finished yet, Ma?

Related stories