
Compute list of dependent tests per fault and not per project

Open jose opened this issue 5 years ago • 0 comments

The current version of the bug-mining framework creates a file named dependent_tests for each mined project. This file lists the test methods that fail due to, e.g., some random behaviour of the project under test or because they depend on the order of execution. That list of dependent test methods is later used by the checkout command, which removes them from the source code of a buggy or fixed project version.

As far as I can see, there are two issues with the above approach.

Issue 1

What if org.TestFoo::testBar is considered a dependent test method of fault 1, but it is the only trigger test of fault 5? The current implementation of the bug-mining framework considers fault 1 a valid and reproducible fault but ignores fault 5, because after the dependent test is removed no remaining test method triggers the faulty behaviour. This occurs because a single list of dependent test methods is created for the whole project and not for each fault.
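The scenario above can be sketched in a few lines of Python. This is only an illustration, not the framework's actual code: the trigger test of fault 1 (org.TestBaz::testQux) and the helper names are hypothetical, and a fault is assumed to be reproducible only if at least one trigger test survives the removal of dependent tests.

```python
# Hypothetical data: only org.TestFoo::testBar comes from the example above.
project_dependent_tests = {"org.TestFoo::testBar"}  # one list for the whole project

trigger_tests = {
    1: {"org.TestBaz::testQux"},   # hypothetical trigger test of fault 1
    5: {"org.TestFoo::testBar"},   # the only trigger test of fault 5
}


def surviving_triggers(fault_id, dependent_tests):
    """Return the trigger tests left once dependent tests are removed."""
    return trigger_tests[fault_id] - dependent_tests


def reproducible_faults(dependent_tests):
    """Return the faults that still have at least one trigger test."""
    return sorted(
        fault_id
        for fault_id in trigger_tests
        if surviving_triggers(fault_id, dependent_tests)
    )


# With a per-project list, fault 5 loses its only trigger test.
print(reproducible_faults(project_dependent_tests))
```

With the per-project list, only fault 1 remains; with an empty list for fault 5, its trigger test would survive.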

Although I do not think this is an issue for the current set of faults in D4J, it might be for new projects/faults.

Issue 2

Suppose a new fault is mined for an existing project in D4J, and for that new fault there is one dependent test method not yet listed in the dependent_tests file. Once that test method is added to the dependent_tests file, two problems could occur: 1) some existing faults are no longer reproducible, because the test that triggers the faulty behaviour is now discarded by the checkout process; 2) even if no existing fault is invalidated, the new dependent test method may be a relevant test of some (in the worst case, all) existing faults in the database, in which case the metadata of those faults needs to be re-generated (which is a very time-consuming task).
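A rough sketch of how the two problems could be detected before adding a test to the shared list. Everything here is an assumption for illustration: the function name, the dictionaries mapping fault ids to trigger and relevant tests, and the rule that a fault is no longer reproducible once none of its trigger tests remain.

```python
def impact_of_new_dependent_test(new_test, trigger_tests, relevant_tests):
    """Classify existing faults affected by adding new_test to the shared list.

    Returns (broken, stale):
      broken: faults whose every trigger test would be discarded (problem 1)
      stale:  faults that stay reproducible but list new_test as a relevant
              test, so their metadata would need re-generation (problem 2)
    """
    broken, stale = [], []
    for fault_id in sorted(trigger_tests):
        remaining = trigger_tests[fault_id] - {new_test}
        if not remaining:
            broken.append(fault_id)
        elif new_test in relevant_tests.get(fault_id, set()):
            stale.append(fault_id)
    return broken, stale


# Hypothetical faults and test names.
triggers = {1: {"A::t1"}, 2: {"B::t2", "C::t3"}, 3: {"D::t4"}}
relevant = {1: {"A::t1"}, 2: {"A::t1", "B::t2", "C::t3"}, 3: {"D::t4"}}

broken, stale = impact_of_new_dependent_test("A::t1", triggers, relevant)
print("no longer reproducible:", broken)
print("metadata to re-generate:", stale)
```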

Proposed solution

Instead of creating a single dependent_tests file per project, I propose creating a directory named dependent_tests containing one file per fault, each listing the dependent tests of that specific fault. The checkout process would read the dependent_tests file of the corresponding fault and discard only the dependent test methods of that specific fault. This solution would address the issues described above and guarantee full isolation of each fault.

-- Best, Jose

jose, Jun 20 '19 22:06