RepoSense
Backend: Parallelize System Tests
What feature(s) would you like to see in RepoSense?
The running time for system tests is currently quite long. Although unit tests were parallelized in #1806, methods in system test classes still run sequentially.
If possible, describe the solution
Although I previously mentioned in #1770 that methods in ConfigSystemTest could not be parallelized using maxParallelForks in build.gradle, it turns out that #1806 did the parallelization using junit-platform.properties. This allows methods within a test class to be parallelized.
A junit-platform.properties file can be added to systemtest/resources. Since we currently have very few systemtest methods, a fixed approach using junit.jupiter.execution.parallel.config.fixed.parallelism should suffice.
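A minimal sketch of what such a file might contain. The keys are standard JUnit 5 configuration parameters, but the parallelism value of 4 is only a placeholder assumption and has not been decided in this issue:

```properties
# Turn on parallel execution in the JUnit Jupiter engine.
junit.jupiter.execution.parallel.enabled = true
# Run test methods concurrently unless a class or method opts out.
junit.jupiter.execution.parallel.mode.default = concurrent
# Use a fixed number of worker threads.
junit.jupiter.execution.parallel.config.strategy = fixed
# Placeholder value (assumption); tune to the number of systemtest methods.
junit.jupiter.execution.parallel.config.fixed.parallelism = 4
```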
One problem related to concurrency is that all methods within the same systemtest class will clone to an output directory with the same name.
Additional context
Please read through the discussion in #1806 before attempting this.
After investigating ConfigSystemTest.java, I believe a new flag may be necessary to allow each repository to be cloned into a different folder. Currently, test can go down to the lower levels and specify the directory of each cloned repository via EXTRA_OUTPUT_FOLDER_NAME, using the thread number as the folder name. This is not possible in ConfigSystemTest.java, as I believe its main purpose is to run the RepoSense process through the single entry point RepoSense.main.
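To illustrate the idea of using the thread number as a folder name, here is a rough, standalone sketch. It is not RepoSense code: the class ThreadScopedCloneDir, its methods, and the base folder build/systemtest-clones are all hypothetical names made up for this example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of RepoSense): one clone folder per worker thread. */
final class ThreadScopedCloneDir {
    // Hypothetical base folder, chosen only for this sketch.
    static final Path BASE = Paths.get("build", "systemtest-clones");

    /** Builds a folder name from the test name and a thread id. */
    static String folderNameFor(String testName, long threadId) {
        return testName + "-thread-" + threadId;
    }

    /** Resolves the clone folder for whichever thread calls this method. */
    static Path cloneDirForCurrentThread(String testName) {
        return BASE.resolve(folderNameFor(testName, Thread.currentThread().getId()));
    }

    public static void main(String[] args) throws InterruptedException {
        // Collect the folders chosen by each worker thread.
        Set<Path> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 4; i++) {
            pool.execute(() -> seen.add(cloneDirForCurrentThread("ConfigSystemTest")));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        // Each distinct worker thread contributes a distinct folder.
        seen.forEach(System.out::println);
    }
}
```

Tests that run one after another on the same thread would share a folder, which is fine because they never overlap in time. The difficulty described above remains: ConfigSystemTest only goes through RepoSense.main, so a name like this would still have to be passed in somehow.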
We also cannot circumvent this with the technique used by LocalRepoSystemTest.java. LocalRepoSystemTest.java is meant to test local repos (unlike ConfigSystemTest.java), so it can define the parameters for cloning from a remote repo to a local repo and name each repo accordingly.
What do you guys think? Should I proceed with this via specifying a new flag, or does anyone have any other solution for this? @reposense/active-developers @reposense/active-reviewers
- Add a flag to RepoSense to allow the customization of the cloned repository's folder name
  - Pros:
    - Ease of use inside LocalRepoSystemTest
  - Cons:
    - Need to add a new flag, which complicates the usage more
      - But this will likely only add to the developer guide (DG)
    - Need to add a new flag, which means going into the main src folder to make changes
    - All the tests will then rely on the flag working, which is not the case for normal RepoSense usage. This is, however, similar to testMode, which is currently enabled for all of ConfigSystemTest.
> Add a flag to RepoSense to allow the customization of the report’s output folder name
We do have a flag for the output directory. Not sure if this is the one you're looking for.
My bad, I actually meant the cloned repository folder name
> My bad, I actually meant the cloned repository folder name
Personally, I'm okay with adding a new flag for this. What does everyone else think?
I'm ok with it if it provides significant speedup in the system tests. My personal preference would be to reduce the amount of test-only behavior to be as small as possible. For example,
> This is however in a similar fashion to that of testMode, which is currently enabled for all ConfigSystemTest.
This is something that I wish to get rid of at some point since I already believe that is causing some issues as per #1879. So it would be preferable to avoid these if possible.
However, I've also spent quite a bit of time thinking about a nice solution for it and I'm not sure RepoSense is currently configured to be able to achieve something like that. So we can try with your suggestion first.
Just want to add that we should attempt this with an open mind. If the payoff is not significant after we implement it, we should not merge it no matter how much effort the implementation took. Parallelizing using threads often introduces hard-to-locate bugs and complicates the code. More often than not it turns out parallelizing is more trouble than it is worth.
I have given an update/report on #1900 regarding my progress on this issue. I don't believe my PR can justify the tradeoff in parallelizing systemtest, but I think that it may be possible for other developers to try it out and see if they are able to resolve the issues mentioned in my update.