White-box testing and its testing techniques

White-box testing

White-box test design allows one to peek inside the box; it focuses specifically on using internal knowledge of the software to guide the selection of test data.

White-box testing is also known as glass-box testing, structural testing, clear-box testing, open-box testing, logic-driven testing, and path-oriented testing.

In white-box testing, test cases are selected on the basis of examination of the code, rather than the specifications. 

White-box testing is a software testing approach that examines the program structure and derives test data from the program logic. It is sometimes called clear-box testing, since a literal white box is opaque and does not really permit visibility into the code.

White-box testing requires intimate knowledge of program internals, while black-box testing is based solely on knowledge of the system requirements. Because early work was primarily concerned with program internals, the software engineering literature devoted most of its effort to glass-box tests. However, as the importance of black-box testing gained general acknowledgment, a number of useful black-box testing techniques were also developed.

It is important to understand that these methods are used during the test design phase, and their influence is hard to see in the tests once they're implemented.

Advantages of White-box testing

The main advantages of white-box testing are:
  • Forces test developer to reason carefully about implementation.
  • Approximates the partitioning done by execution equivalence.
  • Reveals errors in hidden code.
  • Beneficial side effects.
  • Optimizations.

Disadvantages of White-box testing

  • Expensive.
  • Misses cases omitted from the code.

White-box testing techniques

There are a number of different forms of white-box testing. The most important of these are:
  • Basis Path Testing
  • Structural Testing
  • Logic-Based Testing
  • Fault-Based Testing

Basis Path Testing

Basis path testing is a white-box technique that allows the design and definition of a basis set of execution paths. Test cases derived from the basis set exercise the program so that every statement is executed at least once.

To determine the different program paths, the engineer needs a representation of the logical flow of control. The control structure can be illustrated by a flow graph. A flow graph can represent any procedural design.
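
As a rough illustration, a flow graph can be held as an adjacency list, and the size of a basis set can then be estimated with McCabe's cyclomatic complexity, V(G) = E - N + 2 for a single connected graph. The graph below and the names FLOW_GRAPH and cyclomatic_complexity are hypothetical, chosen only for this sketch.

```python
# Hypothetical flow graph for a routine with one if/else inside a loop.
# Keys are nodes; values are the nodes that control can flow to next.
FLOW_GRAPH = {
    "entry": ["loop_test"],
    "loop_test": ["if_test", "exit"],  # loop continues or ends
    "if_test": ["then", "else"],       # two-way decision
    "then": ["loop_test"],
    "else": ["loop_test"],
    "exit": [],
}


def cyclomatic_complexity(graph):
    """Return V(G) = E - N + 2 for a single connected flow graph."""
    nodes = len(graph)
    edges = sum(len(successors) for successors in graph.values())
    return edges - nodes + 2


print(cyclomatic_complexity(FLOW_GRAPH))
```

Here the graph has 7 edges and 6 nodes, so the function returns 3, which suggests that three independent paths are enough to form a basis set for this routine.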

Structural Testing

Structural testing examines source code and analyses what is present in the code. Structural testing techniques are often dynamic, meaning that code is executed during analysis. This implies a high test cost due to compilation or interpretation, linkage, file management, and execution time. Test cases are derived from the analysis of the program control flow.

A control flow graph is a representation of the flow of control between program regions such as a group of statements bounded by a single entry and exit point.

Structural testing can't expose errors of code omission but can estimate the test suite adequacy in terms of code coverage, that is, execution of components by the test suite or its fault-finding ability.

Some of the important types of structural testing are:
  • Statement Coverage Testing
  • Branch Coverage Testing
  • Condition Coverage Testing
  • Loop Coverage Testing
  • Path Coverage Testing
  • Domain and Boundary Testing
  • Data-flow Testing

Statement Coverage Testing

This is the simplest form of white-box testing, whereby a series of test cases is run such that each statement is executed at least once. Achieving it is not enough, on its own, to provide confidence in a software product's behavior.

Often, a CASE tool keeps track of how many times each statement has been executed. A weakness of this approach is that there is no guarantee that all outcomes of branches are properly tested.

Example:

     if (s > 1 && t == 0)
         x = 9;

Test case: s = 2; t = 0;

Here, the programmer has made a mistake; the compound conditional (s>1 && t==0) should have been (s>1 || t==0). The chosen test data, however, allow the statement x = 9 to be executed without the fault being detected.
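
The point can be sketched in Python. The function names and the initial value x = 0 are assumptions made for this illustration; the original fragment does not say what x holds beforehand.

```python
def faulty(s, t):
    """Version as written: uses `and` where `or` was intended."""
    x = 0  # assumed starting value, not given in the text
    if s > 1 and t == 0:
        x = 9
    return x


def intended(s, t):
    """Version the programmer meant to write."""
    x = 0  # assumed starting value, not given in the text
    if s > 1 or t == 0:
        x = 9
    return x


# The single test case from the text executes every statement of `faulty`
# and still passes, because both versions agree on this input.
assert faulty(2, 0) == intended(2, 0) == 9

# An input where only one operand is true exposes the fault.
print(faulty(2, 1), intended(2, 1))
```

With s = 2 and t = 1 the two versions disagree, which is the kind of input that statement coverage alone does not force the tester to choose.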

Branch Coverage Testing

Branch coverage is an improvement over statement coverage, in that a series of tests is run to ensure that all branches are tested at least once. It is also called decision coverage. Techniques such as statement or branch coverage are called structural tests.

Branch coverage requires sufficient test cases for each program decision or branch to be executed so that each possible outcome occurs at least once. It is considered to be a minimum level of coverage for most software products, but decision coverage alone is insufficient for high-integrity applications.
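
A minimal sketch of the difference from statement coverage, using a hypothetical routine clamp_negative. The helper decision_outcomes simply re-evaluates the predicate outside the function; a real coverage tool would instrument the code instead.

```python
def clamp_negative(value):
    """Hypothetical routine: replace negative values with zero."""
    if value < 0:
        value = 0
    return value


# One test gives full statement coverage: the `if` body runs.
statement_suite = [-5]

# Branch coverage also needs the outcome where the `if` body is skipped.
branch_suite = [-5, 3]


def decision_outcomes(suite):
    """Return the set of outcomes the decision `value < 0` takes on."""
    return {value < 0 for value in suite}


print(decision_outcomes(statement_suite))
print(decision_outcomes(branch_suite))
```

The first suite only ever sees the true outcome, so a fault that affects non-negative inputs would go unnoticed; the second suite sees both outcomes.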

Condition Coverage Testing

This criterion requires sufficient test cases for each condition in a program decision to take on all possible outcomes at least once. It differs from branch coverage only when multiple conditions must be evaluated to reach a decision.

Multi-Condition coverage requires sufficient test cases to exercise all possible combinations of conditions in a program decision.
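
The following sketch contrasts the two criteria on a hypothetical two-condition decision. The names decision, condition_suite, and multi_condition_suite are illustrative only.

```python
from itertools import product


def decision(a, b):
    """Hypothetical two-condition decision."""
    return a and b


# Condition coverage: each condition is true once and false once.
# These two cases achieve that, yet the decision itself is never true.
condition_suite = [(True, False), (False, True)]

# Multi-condition coverage: every combination of condition outcomes.
multi_condition_suite = list(product([True, False], repeat=2))

print([decision(a, b) for a, b in condition_suite])
print(len(multi_condition_suite))
```

With two conditions there are four combinations; in general the number doubles with each added condition, which is why multi-condition coverage becomes costly for large decisions.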

Loop Coverage Testing

This criterion requires sufficient test cases for all program loops to be executed for zero, one, two, and many iterations covering initialization, typical running, and termination conditions.
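
As a small sketch, the iteration counts named above can be turned into one test case each. The routine total and the case table are hypothetical.

```python
def total(values):
    """Hypothetical routine under test: sum a list with an explicit loop."""
    result = 0
    for value in values:
        result += value
    return result


# One case per iteration count named in the text: (input, expected result).
loop_cases = {
    "zero iterations": ([], 0),
    "one iteration": ([4], 4),
    "two iterations": ([4, 6], 10),
    "many iterations": (list(range(1, 101)), 5050),
}

for name, (values, expected) in loop_cases.items():
    assert total(values) == expected, name
```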

Path Coverage Testing

Path coverage is the most powerful form of white-box testing; all paths are tested. The number of paths in a product with loops can be very large, however, and methods have been devised to reduce the number of paths to be examined.

This criterion requires sufficient test cases for each feasible path, basis path, etc. from start to exit of a defined program segment, to be executed at least once. Because of the very large number of possible paths through a software program, path coverage is generally not achievable. The amount of path coverage is normally established based on the risk or criticality of the software under test. 

One criterion for selecting paths is to restrict test cases to linear code sequences. To do this, one first identifies the set of points, C, from which control flow may jump. Set C contains entry and exit points and branch statements. The linear code sequences are those paths that begin at an element of C and end at an element of C.

Another method of reducing the number of paths is all-definition-use-path coverage. In this technique, each occurrence of a variable is labeled either as a definition of the variable or as a use of it.

Domain and Boundary Testing

Domain testing is a form of path coverage. Path domains are a subset of the program input that causes the execution of unique paths. The input data can be derived from the program control flow graph. Test inputs are chosen to exercise each path and also the boundaries of each domain.

For example, in a program analyzing the height and weight of a population, the input domain is height and weight, where the inputs are real numbers greater than zero and bounded by some upper limit.

The statement

     if (weight >= 50.0 and height >= 1.8) then S1 else S2

would partition the input domain into two path domains, according to whether the predicate evaluates to true or false. A true evaluation results in statement S1 being executed; S2 is executed when the predicate evaluates to false.

Test inputs for branch, statement, and domain testing could be:
Test 1: weight = 48.0 height = 1.8
Test 2: weight = 50.0 height = 1.8
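
A minimal sketch of these two tests, assuming a hypothetical function path_taken that reports which statement the predicate selects:

```python
def path_taken(weight, height):
    """Return which statement the predicate from the text selects."""
    if weight >= 50.0 and height >= 1.8:
        return "S1"
    return "S2"


# The two test inputs listed above fall in different path domains.
print(path_taken(48.0, 1.8))
print(path_taken(50.0, 1.8))
```

Test 1 falls in the domain that executes S2 and Test 2 in the domain that executes S1, so together they cover both outcomes of the decision.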

A boundary test would incorporate test inputs on and slightly off the boundaries of the paths. To determine data slightly off the boundary, an amount E must be added to or subtracted from the value that lies on the boundary.

When the boundary is determined by an integer, E is 1. That is, the value 1 must be added to or subtracted from the value in a predicate to form an input value that is close to the domain boundary. When working with real numbers the procedure is more complex. The value E must be the smallest number distinguishable by the number representation of the program under test. For example, if the reals are single precision, E could be of the order of 0.001, depending on the magnitude of the values involved.

To test the boundary of 'weight = 50.0', the following three input cases would be valid:

Test 1: weight = 50.0 height = 1.8
Test 2: weight = 50.0 height = 1.6
Test 3: weight = 50.000001 height = 1.8
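
One way to obtain values just off a real-valued boundary is to ask the floating-point library for the neighbouring representable numbers. The sketch below assumes Python 3.9 or later, where math.nextafter is available, and uses Python's double-precision floats rather than single precision; path_taken is a hypothetical stand-in for the program under test.

```python
import math


def path_taken(weight, height):
    """Return which statement the predicate from the text selects."""
    if weight >= 50.0 and height >= 1.8:
        return "S1"
    return "S2"


boundary = 50.0
# Nearest representable doubles on either side of the boundary.
just_below = math.nextafter(boundary, -math.inf)
just_above = math.nextafter(boundary, math.inf)

for weight in (just_below, boundary, just_above):
    print(weight, path_taken(weight, 1.8))
```

The value just below the boundary selects S2, while the boundary itself and the value just above it select S1, so a predicate wrongly written with > instead of >= would be caught by the on-boundary case.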

Great care must be taken when working with real numbers in predicates because of the precision problems of reals. Boundary testing aids the identification of these problems and of errors of path selection. However, domain and boundary analysis are only suitable for programs with a small number of input variables and with simple linear predicates.

Data Flow Testing

Data flow testing focuses on the points at which variables receive values and the points at which those values are used. This kind of testing serves as a reality check on path testing, which is why many authors view it as a path testing technique.

This technique requires sufficient test cases for each feasible data flow to be executed at least once. Data flow analysis studies the sequences of actions on variables along program paths. It can be considered and applied as both a static and a dynamic technique.

Test data must traverse all the interactions between a variable definition and each of its uses. The program path between a variable definition and use without an intervening definition is known as a DU path. Variables can be created, killed, and used.
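
As a simplified sketch, definition-use pairs can be collected for a straight-line fragment by remembering the most recent definition of each variable. The fragment, the EVENTS encoding, and the function du_pairs are all hypothetical; code with branches would need the control flow graph as well.

```python
# Hypothetical straight-line fragment, recorded as (line, variable, action).
#   1: x = read()      def x
#   2: y = x + 1       use x, def y
#   3: x = y * 2       use y, def x
#   4: print(x, y)     use x, use y
EVENTS = [
    (1, "x", "def"),
    (2, "x", "use"), (2, "y", "def"),
    (3, "y", "use"), (3, "x", "def"),
    (4, "x", "use"), (4, "y", "use"),
]


def du_pairs(events):
    """Pair each use with the most recent definition of the same variable."""
    last_def = {}
    pairs = []
    for line, var, action in events:
        if action == "def":
            last_def[var] = line
        elif var in last_def:
            pairs.append((var, last_def[var], line))
    return pairs


print(du_pairs(EVENTS))
```

Each tuple names a variable, the line that defines it, and a line that uses that definition; a data flow test suite would aim to execute every such pair at least once.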

Logic-Based Testing

Logic-based testing is used when the input domain and resulting processing are amenable to a decision table representation. The following steps are used to develop a decision table:

  1. List all actions that can be associated with a specific algorithm.
  2. List all conditions during the execution of the algorithm.
  3. Associate specific conditions with specific actions, eliminating impossible combinations of conditions.
  4. Define rules by indicating what action occurs for a set of conditions.

Alternatively, for step 3, develop every possible combination of conditions. This will reveal any inconsistencies or gaps in the user specification, which can then be corrected.
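
The exhaustive alternative can be sketched by enumerating every combination of condition outcomes and checking which ones have no rule. The conditions, actions, and the helper missing_combinations below are invented for illustration.

```python
from itertools import product

# Hypothetical conditions for a discount algorithm.
CONDITIONS = ("is_member", "order_over_100")

# Each rule maps one combination of condition outcomes to an action.
RULES = {
    (True, True): "apply 15% discount",
    (True, False): "apply 5% discount",
    (False, True): "apply 5% discount",
    # (False, False) is deliberately missing to show a gap.
}


def missing_combinations(conditions, rules):
    """List condition combinations that no rule covers."""
    every_combination = product([True, False], repeat=len(conditions))
    return [combo for combo in every_combination if combo not in rules]


print(missing_combinations(CONDITIONS, RULES))
```

The combination reported as missing is a gap in the specification: either a rule needs to be added or the combination needs to be recorded as impossible.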

Fault-Based Testing

Fault-based testing attempts to show the absence of certain classes of faults in code. Code is analyzed for issues such as uninitialized variables and parameter type errors. The main technique for fault-based testing, along with its variants, is called mutation analysis.

Mutation Analysis

Mutation analysis is a fault-based technique for determining the adequacy of a test suite in terms of its test effectiveness. Mutation analysis is one of the most thorough testing techniques. Empirical studies have shown it to be more stringent than other techniques.

A mutant is a copy of the original program under test with one component, such as an operand or operator, altered to simulate a programming fault while remaining syntactically correct. The syntactic transformation is a mutation.


For instance, the statement

     while (index > 10) do

could be mutated to

     while (index >= 10) do

Thus, mutation analysis simulates simple programming errors. The test suite must be enhanced until every non-equivalent mutant is detected, that is, until some test causes it to produce output that differs from the original program's. Mutation analysis incorporates strategies from coverage, data flow anomaly, and domain testing.

For example, the above statement has to be traversed by the test input index = 10 to differentiate between the correct and the incorrect version. All statements, branches, and paths must be executed to differentiate incorrect mutants from the original program.

By altering the constant 10 in the example to the constant 11 or 9, boundary testing is performed. The test inputs must include cases of index = 9, 10, and 11 to detect those mutants. By altering the definition of 'index' or replacing the use of it with another integer variable in scope, data flow anomalies can be detected.
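
A minimal sketch of the idea, modelling only the loop guard. The functions original, mutant, and kills are hypothetical names; a real mutation tool would generate and run mutants of the whole program.

```python
def original(index):
    """Loop guard from the text: index > 10."""
    return index > 10


def mutant(index):
    """Mutated guard: index >= 10."""
    return index >= 10


def kills(test_inputs):
    """True if some input makes the mutant behave differently from the original."""
    return any(original(i) != mutant(i) for i in test_inputs)


print(kills([5, 20]))
print(kills([9, 10, 11]))
```

The first suite leaves the mutant alive because neither input sits on the boundary; the second detects it through index = 10, the only value on which the two guards disagree.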

Mutation analysis provides the tester with guidelines for the development of the test suite. However, it is resource-intensive, requiring a large number of mutants to be created and executed against the test suite. Research indicates that the number of mutants varies with the number of code statements and with the number of variables squared.

A mutation test of a large program, such as would be found in an industrial or commercial environment, would require the generation of a substantial number of mutants. A test on this scale would require careful management of resources. A strategy must be found to make mutation testing applicable to unit testing on a reasonable time scale and without tying up valuable resources such as time and manpower.
