RepoSense
Parallelize backend tests
What feature(s) would you like to see in RepoSense?
The `Test` class in Gradle has a `maxParallelForks` property that would allow multiple JUnit test classes to run at the same time. Running test classes in parallel can reduce the build time.
Potential Problem
A potential problem lies in the `GitTestTemplate` class and the test classes that extend it:
https://github.com/reposense/RepoSense/blob/b12d3485d4b7a3aca5f2545404b5545a5277ff5e/src/test/java/reposense/template/GitTestTemplate.java#L107-L111
The `@BeforeAll` annotation indicates that the method is called once when a test class such as `AnnotatorAnalyzerTest` is started. Within this method, the repo is cloned (in short, the `testrepo-Alpha.git` remote is cloned to the same output folder every time a test class that extends `GitTestTemplate` starts):
https://github.com/reposense/RepoSense/blob/b12d3485d4b7a3aca5f2545404b5545a5277ff5e/src/test/java/reposense/util/TestRepoCloner.java#L28-L33
If two test classes that extend `GitTestTemplate` run in parallel, it is possible that while one test class is still running through some test methods, the other class is re-cloning the repo, which could interfere with the first test class.
Limitation
According to this website, the methods within a single test class do not run in parallel. This means that there won't be time savings for `ConfigSystemTest`.
For this, I think we can try changing `outputFolderName`. Even if two tests use the same repo, their `outputFolderName` values will be different, so the two tests that rely on the same repo can be executed concurrently.
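A minimal sketch of this idea, assuming the clone folder can be derived from the test class. The `ParallelCloneFolders` class and both of its methods are hypothetical names for illustration and are not part of RepoSense or `TestRepoCloner`:

```java
import java.nio.file.Path;

/**
 * Hypothetical helper (not RepoSense code): derives a distinct clone folder
 * for each test class so that classes cloning the same remote repo do not
 * write into the same directory.
 */
final class ParallelCloneFolders {
    private ParallelCloneFolders() {
    }

    /** Builds "<baseFolderName>_<SimpleClassName>", e.g. "testrepo-Alpha_FooTest". */
    static String outputFolderNameFor(String baseFolderName, Class<?> testClass) {
        return baseFolderName + "_" + testClass.getSimpleName();
    }

    /** Resolves the per-class folder under a shared output root. */
    static Path cloneDirectoryFor(Path outputRoot, String baseFolderName, Class<?> testClass) {
        return outputRoot.resolve(outputFolderNameFor(baseFolderName, testClass));
    }
}
```

Using the simple class name keeps the folder names readable, but two test classes with the same simple name in different packages would still collide. Using `testClass.getName()` instead would avoid that at the cost of longer paths.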
Currently, I can bring the time down to about 25s (compared to ~40s originally), but some test cases still fail. I have a half-working branch here.
~~The only problem left is the `.git-blame-ignore-revs`, as it is now in a different directory than the others.~~
Update: the above is resolved by updating the path.
Now the problem is with `systemtest`. I will try to fix it soon.
@yhtMinceraft1010X what do you think about my idea? Perhaps I can give this issue a try?
@yhtMinceraft1010X
For `systemtest`, I realized that it may not be possible to achieve parallel execution since the RepoSense code doesn't really support parallelism. There are a lot of `static` variables that cause errors when I try to run things in parallel.
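To illustrate the kind of conflict `static` variables can cause, here is a small sketch. It assumes the parallel runs share one JVM (for example, as threads), and every name in it is hypothetical rather than taken from RepoSense:

```java
/**
 * Hypothetical illustration (not RepoSense code) of why static state
 * gets in the way of running two analyses at the same time.
 */
final class SharedStateSketch {
    // One copy per JVM: every caller reads and writes the same field.
    static String staticTarget;

    private SharedStateSketch() {
    }

    /** Uses whichever target was written last, regardless of who wrote it. */
    static String describeStatic() {
        return "analyzing " + staticTarget;
    }

    /** Alternative: each run owns its target, so runs cannot overwrite each other. */
    static final class Analyzer {
        private final String target;

        Analyzer(String target) {
            this.target = target;
        }

        String describe() {
            return "analyzing " + target;
        }
    }
}
```

If one run sets `staticTarget` to `"repoA"` and another run then sets it to `"repoB"` before the first has finished, the first run's `describeStatic()` reports `repoB`. Two `Analyzer` instances each keep their own value.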
I tried to break the methods down to achieve as much parallelism as possible, but the improvement in timing is pretty small (~3s). So perhaps I can make a PR to run `test` in parallel first?
Meanwhile, running `test` in parallel drops the time from ~40s to ~16s. This number can go even lower, but I feel that it is not really necessary.
Note to anyone who wishes to work on this: `test` is now running in parallel. More details are in the merged PR above.
`systemtest` is not yet running in parallel, which you can still work on. Of course, you are still welcome to improve `test` if you wish to.
Closing this issue so that the description for `systemtest` can be better phrased.