Optimize flacoco

Open andre15silva opened this issue 4 years ago • 16 comments

On the basis of pure observation, flacoco is considerably slower than GZoltar right now.

Some ideas for optimization:

  • Test detection takes some time
  • Parallelizing test execution could help
  • Online instrumentation

I will also do some profiling and update this issue.

andre15silva avatar Jul 29 '21 13:07 andre15silva

Hi @andre15silva

Nice issue to tackle. The problematic code to be refactored should be here: https://github.com/SpoonLabs/flacoco/blob/master/src/main/java/fr/spoonlabs/flacoco/core/test/TestDetector.java

martinezmatias avatar Jul 29 '21 13:07 martinezmatias

Hi @martinezmatias

Yes.

Based on pure observation, though, I'd say the larger bottleneck is on test-runner's side.

I'll try to confirm this with some profiling.

andre15silva avatar Jul 29 '21 13:07 andre15silva

Hot spots of the flacoco process running on math_70 from astor's examples:

Self times: [image: pic-selected-210729-1700-23]

Total times: [image: pic-selected-210729-1702-01]

Optimizations that jump to mind:

  1. SpoonTestMethod's fields can be computed just once, so we don't need to call the model getters every time. (Yields a ~50% CPU time reduction in the flacoco process.)

  2. Loading the serialized file produced by the test-runner process takes ~10% of the flacoco CPU time after optimization 1.

  3. Test detection takes ~80% of the CPU time of the flacoco process after optimization 1 (70s out of 90s on my machine).
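
A minimal sketch of the idea behind item 1, assuming the cost comes from querying the model on every getter call. SlowModel and CachedTestMethod are hypothetical stand-ins for illustration, not flacoco's or Spoon's actual classes, and the "Class#method" identifier format is also an assumption:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive model lookup (not the Spoon API).
class SlowModel {
    // Counts how often the model is queried, to make the saving visible.
    final AtomicInteger lookups = new AtomicInteger();

    String qualifiedClassName(String methodId) {
        lookups.incrementAndGet();
        return methodId.substring(0, methodId.indexOf('#'));
    }

    String simpleMethodName(String methodId) {
        lookups.incrementAndGet();
        return methodId.substring(methodId.indexOf('#') + 1);
    }
}

// Hypothetical test-method wrapper: fields are resolved once, in the constructor.
final class CachedTestMethod {
    private final String className;
    private final String methodName;

    CachedTestMethod(SlowModel model, String methodId) {
        // The only two model lookups this object will ever perform.
        this.className = model.qualifiedClassName(methodId);
        this.methodName = model.simpleMethodName(methodId);
    }

    // Getters read plain fields and never touch the model again.
    String getClassName() { return className; }
    String getMethodName() { return methodName; }
    String getFullyQualifiedName() { return className + "#" + methodName; }
}
```

With this shape, calling the getters thousands of times costs two model lookups per test method in total, instead of one or more per call.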

andre15silva avatar Jul 29 '21 15:07 andre15silva

Hot spots of the test-runner process running on the same example as before:

Self times: [image: pic-selected-210729-1739-13]

Expanded self times for the top result. This highlights that the most time-consuming operation is analyzing the instrumented classes after execution. [image: pic-selected-210729-1739-56]

Total times: [image: pic-selected-210729-1740-26]

Where to optimize?

  1. Coverage analysis, for sure. This is the most time-consuming operation, with ~77% of CPU time being spent there (169s/218s).

andre15silva avatar Jul 29 '21 15:07 andre15silva

Update: I think implementing an org.jacoco.core.internal.analysis.ClassAnalyzer that takes in several ExecutionDataStores is the best way.

Currently we do a full analysis of the binaries each time a test method finishes. What we want to do is store the ExecutionDataStores and do just one pass through the binaries at the end of the test run.

Doing this analysis is costly, and much more so when you do it several thousand times.

Edit: As far as I have come to understand, doing this in a single thread with just one pass over the class files requires an almost entirely new analyzer package for jacoco. WIP on this front.
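
To illustrate the difference between analyzing after every test and doing one pass at the end, here is a rough simulation. It deliberately does not use the jacoco API: CoveragePassSketch, parse, and the map of executed classes per test are all hypothetical, and "parsing a class file" is reduced to incrementing a counter.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CoveragePassSketch {
    // Number of class files parsed so far; parsing stands in for the expensive step.
    static int parsedClassFiles = 0;

    static void parse(String className) {
        parsedClassFiles++;
    }

    // Strategy A: sweep over all class files once per test.
    static Map<String, List<String>> analyzePerTest(
            List<String> classes, Map<String, Set<String>> executedByTest) {
        Map<String, List<String>> coveredBy = new HashMap<>();
        for (Map.Entry<String, Set<String>> test : executedByTest.entrySet()) {
            for (String cls : classes) {
                parse(cls);
                if (test.getValue().contains(cls)) {
                    coveredBy.computeIfAbsent(cls, k -> new ArrayList<>()).add(test.getKey());
                }
            }
        }
        return coveredBy;
    }

    // Strategy B: keep the per-test execution data and sweep the class files once.
    static Map<String, List<String>> analyzeOnce(
            List<String> classes, Map<String, Set<String>> executedByTest) {
        Map<String, List<String>> coveredBy = new HashMap<>();
        for (String cls : classes) {
            parse(cls); // each class file is parsed exactly once
            for (Map.Entry<String, Set<String>> test : executedByTest.entrySet()) {
                if (test.getValue().contains(cls)) {
                    coveredBy.computeIfAbsent(cls, k -> new ArrayList<>()).add(test.getKey());
                }
            }
        }
        return coveredBy;
    }
}
```

Both strategies produce the same class-to-tests mapping; the first parses (tests × classes) class files, the second parses each class file once. How much of this carries over to jacoco's real analyzer is exactly the open question above.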

andre15silva avatar Aug 03 '21 10:08 andre15silva

Using jacoco, I think the best we can do is https://github.com/STAMP-project/test-runner/pull/112

The only way I see to achieve performance similar to GZoltar is to adapt jacoco so that the analyzer takes several ExecutionDataStores at once and does just one sweep (not sure how feasible that is, nor whether it would even reach GZoltar's performance). I'm experimenting with that, but it's still too soon to tell.

andre15silva avatar Aug 04 '21 13:08 andre15silva

Hi @andre15silva

Thanks for the update.

martinezmatias avatar Aug 06 '21 08:08 martinezmatias

Hi @martinezmatias,

Another update. I've come to realize that the problem might be that jacoco analyzes every single class, not just the ones executed. GZoltar only analyzes the ones that were executed.

It is the difference between being quadratic and "quasi-linear". I'm working on having a PR for jacoco that would allow for this.
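
A small sketch of why skipping non-executed classes matters, assuming each test touches only a few classes. The names here (SkipNonExecutedSketch, classesToAnalyze, totalAnalyses) are hypothetical and this is not the jacoco implementation; it only counts how many class analyses each policy would trigger.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;

class SkipNonExecutedSketch {
    // Picks the classes to analyze for one test, given the classes that test executed.
    static List<String> classesToAnalyze(
            Collection<String> allClasses, Set<String> executed, boolean skipNonExecuted) {
        List<String> selected = new ArrayList<>();
        for (String cls : allClasses) {
            if (!skipNonExecuted || executed.contains(cls)) {
                selected.add(cls);
            }
        }
        return selected;
    }

    // Total number of class analyses over a whole test run.
    static long totalAnalyses(
            Collection<String> allClasses, List<Set<String>> executedPerTest, boolean skipNonExecuted) {
        long total = 0;
        for (Set<String> executed : executedPerTest) {
            total += classesToAnalyze(allClasses, executed, skipNonExecuted).size();
        }
        return total;
    }
}
```

Without skipping, the total is (number of tests) × (number of classes), which grows quadratically if both grow with project size. With skipping, it is the sum of the classes each test actually executed, which stays close to linear in the number of tests as long as each test touches a bounded number of classes.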

andre15silva avatar Aug 06 '21 08:08 andre15silva

I have opened a PR, https://github.com/jacoco/jacoco/pull/1212, that introduces an option for skipping non-executed classes in the analysis.

This does improve performance, but we are still a bit far from GZoltar.

The next important steps are:

  • Optimizing test detection. GZoltar, as far as I know and according to the implementation in Astor, doesn't have this step. Still, it is one of the most time-consuming steps of our pipeline, so we should aim at optimizing it.
  • Removing serialization and de-serialization from test-runner. Right now, on my PC on math70, we spend roughly 20 to 25 seconds just saving and loading, out of 1m30s.

andre15silva avatar Aug 06 '21 15:08 andre15silva

Test detection might be feasible by using surefire.

https://github.com/apache/maven-surefire/tree/surefire-3.0.0-M5_vote-1/surefire-providers

https://maven.apache.org/surefire/maven-surefire-plugin/api.html

andre15silva avatar Aug 09 '21 14:08 andre15silva

Hi @andre15silva

I have opened a PR jacoco/jacoco#1212, that introduces an option for skipping non-executed classes in the analysis.

That's a great contribution.

Optimizing test detection. GZoltar, as far as I know and according to the implementation in Astor, doesn't have this step. Still, it is one of the most time consuming steps of our pipeline, so we should aim at optimizing it.

Do you mean to a) retrieve the list with the names of the test methods/classes to execute or b) the test framework from each test?

Removing serialization and de-serialization from test-runner. Right now, on my PC on math70, we spend more than 20/25 seconds just saving and loading, out of 1m30s.

Agree, it's a big portion.

martinezmatias avatar Aug 10 '21 06:08 martinezmatias

Hi @martinezmatias

Do you mean to a) retrieve the list with the names of the test methods/classes to execute or b) the test framework from each test?

Both. Building the spoon model itself already takes around half the time, while checking the frameworks the other half.

See the graph: [image: pic-selected-210810-1041-00]

andre15silva avatar Aug 10 '21 08:08 andre15silva

test-runner inter-process communication optimization PR opened in https://github.com/STAMP-project/test-runner/pull/116

Running flacoco on math70 with this optimization reduces writing/reading time from ~25s to ~3s (~88% reduction).

andre15silva avatar Aug 10 '21 14:08 andre15silva

Hi @andre15silva

Running flacoco on math70 with this optimization reduces writing/reading time from ~25s to ~3s (~88% reduction).

Nice work!

martinezmatias avatar Aug 11 '21 13:08 martinezmatias

Running flacoco on math70 with this optimization reduces writing/reading time from ~25s to ~3s (~88% reduction).

Impressive!

monperrus avatar Aug 12 '21 07:08 monperrus

Running flacoco on math70 with #82 reduces test detection time from ~60s to ~1s (~99% reduction), and it also fixes #80.

andre15silva avatar Aug 13 '21 10:08 andre15silva