Test Data Generation
This chapter explains the principles behind the Test Data Generation feature of Coco.
Principle of the Genetic Algorithm
Coco discovers new test cases with a genetic algorithm: starting from an existing unit test, it searches for a set of input data that maximizes the code coverage.
First, the user provides a unit test that takes input parameters (integers, floats, strings, ...) and produces some output (also integers, floats, strings, ...). Suppose that test T has 2 parameters, a string and an integer, and 1 output, a float. Each combination of parameter values, together with the output it produces, defines a row of data.
Each time the test is executed, Coco records the code coverage that the execution achieves.
As an example:
| # | Call | Output | Coverage |
|---|---|---|---|
| 1 | T("", 0) | 0.0 | 20% |
| 2 | T("a", 1) | 3.0 | 50% |
Coco tries to find new rows of test data that increase the code coverage by mixing 3 techniques:
- Using a random parameter
- Mutating a parameter
- Performing a crossover of 2 tests
Each newly generated row is kept only if it increases the overall coverage.
Let's run this on our sample. At the beginning, there is no test data available. The algorithm chooses only random values:
| # | Call | Output | Coverage |
|---|---|---|---|
| 1 | T("x", 0) | 0.0 | 20% |
| 2 | T("a", 10) | 3.0 | 50% |
| 3 | T("ab", -4) | 3.0 | same as row 2 |
After these 3 tests are executed, 2 rows are kept. The third row has the same coverage as the second one, so it is redundant. The full list is therefore:
| # | Call | Output | Coverage |
|---|---|---|---|
| 1 | T("x", 0) | 0.0 | 20% |
| 2 | T("a", 10) | 3.0 | 50% |
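The selection rule applied here can be sketched in Python. This is a minimal sketch, not Coco's implementation: it assumes that the coverage of a row is available as a set of covered branch identifiers. The names `keep_useful_rows` and `covered_by`, the branch identifiers, and the coverage data are all hypothetical and were invented to mirror the 20% and 50% figures of the example (2 and 5 branches out of an assumed 10).

```python
from typing import Callable, Hashable, Iterable

Row = tuple[str, int]  # one row of input data for the test T(string, integer)


def keep_useful_rows(
    candidates: Iterable[Row],
    covered_by: Callable[[Row], set[Hashable]],
) -> list[Row]:
    """Keep a row only if it covers something no previously kept row covers."""
    kept: list[Row] = []
    total: set[Hashable] = set()
    for row in candidates:
        new_items = covered_by(row) - total
        if new_items:  # the row increases the overall coverage
            kept.append(row)
            total |= new_items
    return kept


# Hypothetical coverage data, invented for illustration only.
FAKE_COVERAGE = {
    ("x", 0): {"b1", "b2"},
    ("a", 10): {"b1", "b2", "b3", "b4", "b5"},
    ("ab", -4): {"b1", "b2", "b3", "b4", "b5"},  # same coverage as ("a", 10)
}

kept_rows = keep_useful_rows(FAKE_COVERAGE, FAKE_COVERAGE.__getitem__)
print(kept_rows)
```

With this invented data, the third row adds no new branch and is dropped, which matches the table above.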
For the next test, the algorithm can choose to perform a mutation or a crossover. This decision is made randomly.
Suppose that it performs a mutation: it takes a previous row and changes one parameter. For example, it takes T("x", 0) and replaces the second parameter with -1, which gives T("x", -1). If the coverage increases, the result is kept; if not, other alternatives are tried.
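A mutation step of this kind might look as follows in Python. The sketch assumes a particular way of changing a parameter (appending a letter to the string, or moving the integer by 1); the text does not say how Coco picks the new value, so this choice and the name `mutate` are illustrative assumptions only.

```python
import random
import string

Row = tuple[str, int]  # parameters of the test T(string, integer)


def mutate(row: Row, rng: random.Random) -> Row:
    """Return a copy of `row` in which exactly one parameter has changed."""
    text, number = row
    if rng.random() < 0.5:
        # Assumed string mutation: append one random lowercase letter.
        text = text + rng.choice(string.ascii_lowercase)
    else:
        # Assumed integer mutation: move the value by one, e.g. 0 -> -1.
        number = number + rng.choice((-1, 1))
    return (text, number)


rng = random.Random(0)  # fixed seed so that runs are repeatable
mutant = mutate(("x", 0), rng)
print(mutant)
```

The mutated row would then be executed, and kept only if it increases the overall coverage.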
If a crossover is used, the parameters of 2 existing rows are mixed. In this case, we could take the first parameter of the first row and the second parameter of the second row, which gives the test T("x", 10).
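The crossover from this example can be written as a short Python sketch. The function name `crossover` and the `cut` parameter are hypothetical; a single cut point is only one possible way to mix two rows and is assumed here for simplicity.

```python
def crossover(first: tuple, second: tuple, cut: int) -> tuple:
    """Take the parameters before `cut` from `first` and the rest from `second`."""
    if len(first) != len(second):
        raise ValueError("both rows must have the same number of parameters")
    return first[:cut] + second[cut:]


# First parameter of the first row, second parameter of the second row.
child = crossover(("x", 0), ("a", 10), cut=1)
print(child)  # ('x', 10)
```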
This algorithm can be iterated indefinitely.
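Put together, the iteration can be sketched as the loop below. It is an illustration under stated assumptions, not Coco's actual implementation: `toy_coverage` is an invented stand-in for a real coverage measurement, the way a technique is picked (random rows until 2 rows are kept, then a uniform choice among the 3 techniques) is assumed, and all function names are hypothetical.

```python
import random

Row = tuple[str, int]  # parameters of the test T(string, integer)


def toy_coverage(row: Row) -> frozenset[str]:
    """Invented stand-in for a coverage measurement: branches reached by a row."""
    text, number = row
    branches = {"entry"}
    branches.add("non_empty" if text else "empty")
    if number < 0:
        branches.add("negative")
    elif number > 50:
        branches.add("large")
    if text.startswith("a") and number == 7:
        branches.add("special")
    return frozenset(branches)


def random_row(rng: random.Random) -> Row:
    text = "".join(rng.choice("abc") for _ in range(rng.randint(0, 3)))
    return (text, rng.randint(-100, 100))


def mutate(row: Row, rng: random.Random) -> Row:
    text, number = row
    if rng.random() < 0.5:
        return (text + rng.choice("abc"), number)
    return (text, number + rng.choice((-1, 1)))


def crossover(first: Row, second: Row) -> Row:
    return (first[0], second[1])


def generate(iterations: int, seed: int = 0) -> tuple[list[Row], set[str]]:
    rng = random.Random(seed)
    kept: list[Row] = []
    covered: set[str] = set()
    for _ in range(iterations):
        if len(kept) < 2:
            # Not enough data yet for a crossover: use random values only.
            candidate = random_row(rng)
        else:
            technique = rng.choice(("random", "mutation", "crossover"))
            if technique == "random":
                candidate = random_row(rng)
            elif technique == "mutation":
                candidate = mutate(rng.choice(kept), rng)
            else:
                parent_a, parent_b = rng.sample(kept, 2)
                candidate = crossover(parent_a, parent_b)
        gained = toy_coverage(candidate) - covered
        if gained:  # keep the row only if it increases the overall coverage
            kept.append(candidate)
            covered |= gained
    return kept, covered


rows, branches = generate(iterations=200, seed=1)
print(rows)
print(sorted(branches))
```

Because a row is kept only when it reaches a new branch, the list of kept rows can never grow beyond the number of branches, however long the loop runs.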
Benefit of the Genetic Approach
The main benefit of the genetic approach is that once a row of data reaches a previously uncovered branch of code, mutation and crossover are efficient techniques for discovering the code that lies behind that branch. The code coverage measurements guide the algorithm.
Coco v7.0.0 ©2023 The Qt Company Ltd.