How can you possibly test modern software fully?

Pairing up for fun and profit

By Keith Vanden Eynden

Posted in Software, 15th July 2007 07:02 GMT

The common assumption about software testing is that "more is better", and that testing all the possible states and variable combinations guarantees you will find all the bugs.

In the real world, however, there is not enough time or enough testers to test every combination of every variable. Not all bugs will be found, making quality assurance a risk management discipline. How can you validate that your product is ready to ship within reasonable time and cost parameters? In other words, how can you manage the risk of not testing everything? One solution is to use structured testing methodologies, supported by proper tools, which help you quantifiably manage this risk.

Practically speaking, the role of quality assurance is to reduce the risk of these bugs ending up in the final product. Software complexity puts a huge burden on QA teams, which are typically much smaller than the development teams writing the software (it's even worse if there isn't a QA team and developers take on the role part time). It is also very easy for one developer to write a small amount of code that requires a significant amount of testing to ensure it functions properly in all situations.

For example, suppose you have to test a dialog box with three drop-down lists to see whether any combination of selections causes the program component to fail. The first list has five options, the second has eight, and the third has three; see Figure 1:

Figure 1.

To determine all the possible combinations, you can create a matrix like the following (Figure 2):

Figure 2.

As you continue adding combinations, you discover that 120 test cases are required to cover all the possible combinations. You can also determine the number of combinations by multiplying the number of values available in each option (5 x 8 x 3 = 120). If each test takes around two minutes to perform, you are faced with about 4 hours of testing on a simple dialog box. What if you need to test 100 dialog boxes? What if some dialog boxes contain 15 options instead of three?
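
The arithmetic is easy to sanity-check. The short Python sketch below is not from the original article, and the two-minutes-per-test figure is simply the rough estimate used above:

    # Combinations multiply, and so does the testing time.
    options_per_list = [5, 8, 3]    # values in each of the three drop-down lists

    combinations = 1
    for n in options_per_list:
        combinations *= n           # 5 x 8 x 3 = 120

    minutes_per_test = 2            # the rough estimate used above
    print(combinations)                            # 120
    print(combinations * minutes_per_test / 60)    # 4.0 hours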

Now take the concept of complete coverage a step further and consider environmental variables such as operating system, database, and hardware components. How do you ensure you find a bug that occurs only when the application is running on Windows XP and is using MySQL without testing all the possible OS and database combinations?

These examples demonstrate how quickly complete coverage becomes unmanageable. Luckily, you can find most bugs without testing all the combinations. The simplest bugs are single-mode faults, which occur when one option causes a problem regardless of the other settings. For example, a printout is always smeared when you choose the duplex option in the print dialog box regardless of the printer or the other selected options.

Another type of bug occurs only when two options are combined: the printout is smeared only when duplex is selected and the printer is a model 394. These are called double-mode faults. Finally, multi-mode faults, which occur when three or more settings combine to produce the bug, are the types of problems that make complete test coverage seem necessary.

However, complete coverage is usually not necessary. A study by Telcordia Technologies found that "most field faults were caused by either incorrect single values or by an interaction of pairs of values" (Cohen et al., 1996). Another study, of the software in medical devices, showed that only three of the 109 failures resulted from the combination of more than two conditions (Wallace and Kuhn, 2000).

If you have limited time and resources, you want to find the most common bugs and those that present the highest risk. Suppose the printer error only occurs when the operating system is Windows, the print option is set to duplex, the print quality is draft, and the collate option is not selected. Is it worth your time to find that bug? Does the bug present a big enough risk to the user or application that it will even require a software fix?

Except in the rare cases where life and death are at stake, you can achieve a statistically acceptable level of quality by testing less than 100 per cent of the combinations. One approach to doing this is called pair-wise or all-pairs testing.

Implementing all-pairs testing

If testing all the possible combinations of values is not necessary, the question becomes how to construct the all-pairs tests. There are several applications that allow you to enter variables and generate the tests. However, it is helpful to try constructing the test cases manually once or twice so you understand exactly how all-pairs testing works.

Identifying the variables

Before you can implement all-pairs testing, you need to identify the variables. For example, single sign-on support was recently added to your company's sales management application, and the login screen was redesigned for this feature. The application runs on a variety of operating systems, supports several databases, and includes a cross-platform graphical user interface (GUI) and a web-based client. The following table (Figure 3) summarises the variables that affect the tests:

Figure 3.

[Note: For the sake of simplicity, the following examples assume that all the variable combinations are valid.]

How do you effectively test single sign-on support based on these different variables? A simple calculation shows there are 48 possible combinations (3 x 4 x 2 x 2). All-pairs testing can significantly reduce that number.
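
If you want to see where the 48 comes from, here is a minimal Python sketch of the example's variables. Only Oracle, Access, Windows, Linux, Solaris, GUI, Web, Internet Explorer, and Firefox are actually named in the article; the remaining database values are stand-ins for whatever Figure 3 showed:

    from itertools import product

    # Hypothetical stand-ins for the values in Figure 3.
    variables = {
        "Database":         ["Oracle", "Access", "DatabaseC", "DatabaseD"],
        "Operating System": ["Windows", "Linux", "Solaris"],
        "Client Type":      ["GUI", "Web"],
        "Browser":          ["Internet Explorer", "Firefox"],
    }

    all_combinations = list(product(*variables.values()))
    print(len(all_combinations))   # 48 = 4 x 3 x 2 x 2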

Creating the first pair of values

After identifying the variables, use a spreadsheet to combine the values from a pair of variables. Before starting, arrange the variables by the number of values they contain from greatest to least. In our example, the variables would be placed in the following order: Database, Operating System, Client Type, and Browser. Label the first column in the spreadsheet with the name of the variable with the most values (Figure 4):

Figure 4.

Label the second column with the name of the variable with the second-highest number of values (Figure 5):

Figure 5.

Pair the first value in the first variable column with the first value in the second variable column (Figure 6):

Figure 6.

Then match the first value from the first variable column with the second value in the second variable column (Figure 7):

Figure 7.

Continue the process until the first value of the first variable is paired with each of the second variable values (Figure 8):

Figure 8.

Skip a row in the spreadsheet to improve readability and to allow for expansion. Then match the first variable's second value with all the values from the second variable (Figure 9):

Figure 9.

Repeat the steps until all values in the first two variables are paired up (Figure 10):

Figure 10.
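
The rows built so far are nothing more than the cross product of the two largest variables. A quick sketch, using the same stand-in database names as before:

    from itertools import product

    databases = ["Oracle", "Access", "DatabaseC", "DatabaseD"]   # stand-in names
    operating_systems = ["Windows", "Linux", "Solaris"]

    rows = list(product(databases, operating_systems))
    for database, operating_system in rows:
        print(database, operating_system)
    print(len(rows))   # 12 rows, one per Database/Operating System pair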

Matching the other pairs of values

To add a third variable, start by entering the values in order in a third column, repeating as necessary (Figure 11):

Figure 11.

Next, compare the combinations and make sure you have all the possible pairs for the second and third variables. In our example, there is a pair for the Windows Operating System and each Client Type (Figure 12):

Figure 12.

There is also a pair for the Linux Operating System and each Client Type (Figure 13):

Figure 13.

Finally, there is a pair for the Solaris Operating System and each Client Type (Figure 14):

Figure 14.

If a pair is missing, rearrange the values to create the necessary combinations. Notice that it only takes six test runs to cover all the Operating System and Client Type combinations (Figure 15):

Figure 15.

Next, compare the combinations between the first and third variables to make sure you have all the possible pairs (Figure 16):

Figure 16.

Notice, from Figure 17, that 10 test runs cover all the Database/Client Type and Operating System/Client Type pairs:

Figure 17.

If there are more variables, continue the same procedure, pairing the values of each additional variable with the values of the variables already in the table.
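
Checking pair coverage by eye gets tedious as variables are added, so a small helper is handy. The function below is an illustration rather than anything from the article; it reports which variable/value pairs a set of test runs still misses. Constraints such as "Browser only applies to the Web client" are not modelled and would have to be filtered from the result:

    from itertools import combinations, product

    def missing_pairs(runs, variables):
        """Return the (variable, value, variable, value) pairs not yet covered.

        Each run is a dict mapping variable name to value; a variable a run
        does not exercise (for example Browser on a GUI run) is simply omitted.
        """
        missing = []
        for (var_a, vals_a), (var_b, vals_b) in combinations(variables.items(), 2):
            for val_a, val_b in product(vals_a, vals_b):
                covered = any(run.get(var_a) == val_a and run.get(var_b) == val_b
                              for run in runs)
                if not covered:
                    missing.append((var_a, val_a, var_b, val_b))
        return missing

Running it against the ten runs from Figure 17, for instance, would list the Browser pairs that still need homes, including the two Database/Browser gaps identified below.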

In our example, the Browser values are only valid when the Client Type is Web. The following table (Figure 18) shows the Browser values added to the Web test runs:

Figure 18.

Verify that all the Web pairs are present for the other variable values. There is a pair for each Browser value and the Web Client Type (Figure 19):

Figure 19.

As you can see in Figure 20, there's a pair for each Operating System and Browser value:

Figure 20.

Finally, check the pairs for the Browser values and the Database values. Notice, from Figure 21, that the Oracle/Internet Explorer and Access/Firefox pairs are missing:

Figure 21.

You could simply add two new test runs to cover the missing Database and Browser combinations. Before you do that, however, check for duplicate pairs to see if you can incorporate the missing pairs into the existing runs. Notice, from Figure 22, that the Solaris/GUI and the Windows/GUI pairs are duplicated in the Access and Oracle groups.

Figure 22.

Because the Solaris/GUI pair in the Oracle group is duplicated, you can change that run and add the Internet Explorer/Oracle pair to it without affecting the other pairs. Similarly, you can change the duplicated Windows/GUI run in the Access group and add the Firefox/Access pair to it, giving us Figure 23:

Figure 23.

From 48 to 12 Test Runs

After pairing up the variables, add a column and number the test runs for easy reference. The following matrix, Figure 24, summarises the final test runs.

Figure 24.

Instead of 48 test runs, only 12 are required to provide the proper test coverage. If particular variable combinations have resulted in a higher number of defects in the past, you might want to include additional runs to cover those combinations.
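
If you would rather let code do the pairing, the greedy heuristic below gives a feel for how the tools mentioned earlier work. It is only a sketch, not the article's method: it uses the same stand-in value names as before, keeps the simplifying assumption that every combination is valid, and repeatedly picks the combination that covers the most not-yet-covered pairs. It arrives at roughly the same number of runs as the hand-built matrix, though not necessarily the identical rows:

    from itertools import combinations, product

    # Stand-in values again; only some of these names appear in the article.
    variables = {
        "Database":         ["Oracle", "Access", "DatabaseC", "DatabaseD"],
        "Operating System": ["Windows", "Linux", "Solaris"],
        "Client Type":      ["GUI", "Web"],
        "Browser":          ["Internet Explorer", "Firefox"],
    }

    def greedy_pairwise(variables):
        """Repeatedly pick the combination that covers the most uncovered pairs."""
        names = list(variables)
        candidates = [dict(zip(names, combo)) for combo in product(*variables.values())]
        uncovered = {(a, va, b, vb)
                     for a, b in combinations(names, 2)
                     for va, vb in product(variables[a], variables[b])}
        runs = []
        while uncovered:
            best = max(candidates,
                       key=lambda run: len({(a, run[a], b, run[b])
                                            for a, b in combinations(names, 2)} & uncovered))
            runs.append(best)
            uncovered -= {(a, best[a], b, best[b]) for a, b in combinations(names, 2)}
        return runs

    runs = greedy_pairwise(variables)
    print(len(runs))   # around a dozen runs for these inputs, not the full 48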

The next step is to find an automated testing tool to code up the tests and manage the test runs. [There'll be a follow-up interview with Keith in Reg Dev soon, where I'll ask him about implementing this technique in practice - David Norfolk, Ed]

Follow-up Resources

Berger, Bernie. "Efficient Testing with All-Pairs", International Conference on Software Testing, 2003.

Dustin, Elfriede. "Orthogonally Speaking: A Method for Deriving a Suitable Set of Test Cases", STQE, September/October 2001.

Cohen, David M., Siddhartha R. Dalal, Jesse Parelius, and Gardner C. Patton. "The Combinatorial Design Approach to Automatic Test Generation", IEEE Software, September 1996.

Wallace, Dolores R. and D. Richard Kuhn. "Converting System Failure Histories into Future Win Situations", Information Technology Laboratory, National Institute of Standards and Technology, January 2000.

About the Author: Keith Vanden Eynden is a senior technical writer at Seapine Software, Inc., with 15 years' experience in performance improvement, training, and documentation.