RepoSense Backend: Parallelize System Tests

What feature(s) would you like to see in RepoSense?

The running time for system tests is currently quite long. Although unit tests were parallelized in #1806, methods in system test classes still run sequentially.

If possible, describe the solution

Although I previously mentioned in #1770 that methods in ConfigSystemTest could not be parallelized using maxParallelForks in build.gradle, it turns out that #1806 did the parallelization using junit-platform.properties. This allows methods within a test class to be parallelized.

A junit-platform.properties file can be added to systemtest/resources. Since we currently have very few systemtest methods, a fixed approach using junit.jupiter.execution.parallel.config.fixed.parallelism should suffice.
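A minimal sketch of what such a file could contain, assuming JUnit 5's standard configuration parameters. The parallelism value of 4 is only a placeholder and would need tuning for our CI machines:

```properties
# Turn on parallel execution for JUnit Jupiter tests.
junit.jupiter.execution.parallel.enabled=true
# Run test methods concurrently unless a test opts out.
junit.jupiter.execution.parallel.mode.default=concurrent
# Use a fixed number of worker threads.
junit.jupiter.execution.parallel.config.strategy=fixed
# Placeholder value (assumption); tune to the number of systemtest methods.
junit.jupiter.execution.parallel.config.fixed.parallelism=4
```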

One problem related to concurrency is that all methods within the same systemtest class will clone to an output directory with the same name.

Additional context

Please read through the discussion in #1806 before attempting this.

Jan 28 '23 07:01 yhtMinceraft1010X

After an investigation of ConfigSystemTest.java, I believe a new flag may be necessary in order to allow each repository to be cloned into different folders.

Currently, test is able to get down to the lower levels and specify the directory of each cloned repository via EXTRA_OUTPUT_FOLDER_NAME, using the thread number as the folder name. This is not possible in ConfigSystemTest.java, as I believe its main purpose is to run the RepoSense process through the single entry point RepoSense.main.
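To illustrate the thread-number idea, here is a minimal sketch. It is not actual RepoSense code: the class name, method names, and base folder name are all hypothetical, and it only shows how a per-thread folder name keeps concurrent clones from colliding.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class PerThreadCloneDir {
    // Hypothetical base folder name (assumption), not the one RepoSense uses.
    private static final String BASE_FOLDER_NAME = "systemtest_output";

    /** Builds a folder path that is unique for the given thread id. */
    static Path folderFor(long threadId) {
        return Paths.get(BASE_FOLDER_NAME, "thread_" + threadId);
    }

    /** Convenience overload that uses the id of the calling thread. */
    static Path folderForCurrentThread() {
        return folderFor(Thread.currentThread().getId());
    }

    public static void main(String[] args) throws InterruptedException {
        // Two concurrently running threads each resolve a different folder.
        Runnable printFolder = () -> System.out.println(folderForCurrentThread());
        Thread first = new Thread(printFolder);
        Thread second = new Thread(printFolder);
        first.start();
        second.start();
        first.join();
        second.join();
    }
}
```

The catch is that something like this only works where the test controls the clone location directly. ConfigSystemTest.java goes through RepoSense.main, so it has no place to pass such a path in.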

We are also not able to circumvent this with the technique used by LocalRepoSystemTest.java. In LocalRepoSystemTest.java's case, the idea was to test local repos (unlike in ConfigSystemTest.java). As such, it is able to define the parameters for cloning from a remote repo to a local repo and name each repo accordingly.

What do you guys think? Should I proceed with this via specifying a new flag, or does anyone have any other solution for this? @reposense/active-developers @reposense/active-reviewers

Feb 10 '23 11:02 sikai00

Add a flag to RepoSense to allow customization of the cloned repository's folder name
- Pros:
  - Ease of use inside LocalRepoSystemTest
- Cons:
  - Need to add a new flag, which complicates usage further
    - But this would likely only add to the DG
  - Need to add a new flag, which means going into the main src folder to make changes
  - All the tests will then rely on the flag working, which is not the case for normal RepoSense usage. However, this is similar to testMode, which is currently enabled for all of ConfigSystemTest.

Feb 10 '23 11:02 sikai00

> Add a flag to RepoSense to allow the customization of the report’s output folder name

We do have a flag for the output directory. Not sure if this is the one you're looking for.

Feb 10 '23 14:02 yhtMinceraft1010X

My bad, I actually meant the cloned repository folder name

Feb 10 '23 16:02 sikai00

> My bad, I actually meant the cloned repository folder name

Personally, I'm okay with adding a new flag for this. What does everyone else think?

Feb 11 '23 01:02 yhtMinceraft1010X

I'm ok with it if it provides a significant speedup in the system tests. My personal preference would be to keep the amount of test-only behavior as small as possible. For example,

> This is however in a similar fashion to that of testMode, which is currently enabled for all ConfigSystemTest.

This is something that I wish to get rid of at some point, since I believe it is already causing some issues, as per #1879. So it would be preferable to avoid adding more of these if possible.

However, I've also spent quite a bit of time thinking about a nice solution for it and I'm not sure RepoSense is currently configured to be able to achieve something like that. So we can try with your suggestion first.

Feb 12 '23 20:02 chan-j-d

Just want to add that we should attempt this with an open mind. If the payoff is not significant after we implement it, we should not merge it no matter how much effort the implementation took. Parallelizing using threads often introduces hard-to-locate bugs and complicates the code. More often than not it turns out parallelizing is more trouble than it is worth.

Feb 13 '23 02:02 damithc

I have given an update/report on #1900 regarding my progress on this issue. I don't believe my PR can justify the tradeoff in parallelizing systemtest, but I think that it may be possible for other developers to try it out and see if they are able to resolve the issues mentioned in my update.

Apr 05 '23 20:04 sikai00
