Investigate mvn4 large graph performance regression
Affected version
4.0.0-rc-3 (resolver 2.0.9)
Bug description
I use this issue to investigate the performance regression posted on the dev list. Github allows to embed pictures which makes sharing flame graphs easier compared to mail.
benchmark project taken from mail thread (thanks to @seregamorph for coming up with a reproducer!): https://github.com/maven-turbo-reactor/maven-multiproject-generator (https://www.mail-archive.com/[email protected]/msg135612.html)
setup (first line are kernel settings for async-profiler):
sudo sysctl kernel.kptr_restrict=0 && sudo sysctl kernel.perf_event_paranoid=1
export JAVA_HOME=/home/mbien/dev/java/jdk-24.0.1.9.1-corretto
mvn3 run:
export MAVEN_OPTS=-agentpath:/home/mbien/dev/java/async-profiler-4.0-linux-x64/lib/libasyncProfiler.so=start,event=cpu,file=maven3.html
/home/mbien/dev/java/apache-maven-9.9.10_40/bin/mvn clean install
mvn4 run:
export MAVEN_OPTS=-agentpath:/home/mbien/dev/java/async-profiler-4.0-linux-x64/lib/libasyncProfiler.so=start,event=cpu,file=maven4.html
/home/mbien/dev/java/apache-maven-4.0.0-rc-3/bin/mvn clean install
(i ran a warmup run first before measuring)
m3
Total time: 01:03 min (Wall Clock)
m4
Total time: 03:08 min (Wall Clock)
(old intel i7 on NVME)
first observation:
mvn3 has 3 paths leading into DefaultRepositorySystem.collectDependencies, mvn4 has 4. The extra path comes from DefaultConsumerPomBuilder, see second observation.
Graphs show relative execution time. No totals.
note: one path is very fast and not visible on this zoom level in both cases, purple marks collectDependencies
mvn3:
mvn4:
second observation:
The reason for that seems to be the install stage. More time is spent in InstallMojo:
m3:
m4:
zooming in shows the difference, the whole RepositoryUtils.toArtifact path is new and is the cause for the extraDefaultRepositorySystem.collectDependencies call.
m3:
m4:
third observation:
more time is spent computing hash codes, purple highlights methods containing "hash".
mvn3:
mvn4:
lets zoom into a single collectDependencies call
mvn3:
mvn4:
full flame graphs as html (compressed since github doesn't like html):
..my profiling show around 15% performance part is from getPluginsAsMap
will try refine it this weekend.
update: https://github.com/apache/maven/pull/2482
1.2% + 0.7% be in the slow string format function. will refine it also.
update: https://github.com/apache/maven/pull/2483
log flush: https://github.com/apache/maven/pull/2478
delete several outdated classes(and their generated copies) https://github.com/apache/maven/pull/2484
17.32% at the thread start in maven-resolver.
seems there be some thread management problems.
so yes I have to have a look at the maven-resolver repo also(never did this before)
~~update: https://github.com/apache/maven-resolver/pull/1490~~ update: https://github.com/apache/maven-resolver/pull/1492
I think the current raised PRs solve most problems.
The remaining one is the resolver dependency tracker (bf consumes more resources than df but provides correct results).
@gnodet feel free to close whenever you see fit. I didn't have time to follow up on this unfortunately.
@gnodet feel free to close whenever you see fit. I didn't have time to follow up on this unfortunately.
Yeah, I'll keep it opened until the PRs are merged:
- #2506
- #2495
- #2518
- #2526
- #2527
- #2528
- #2529
- #2530
I tried to use https://github.com/apache/maven-resolver/pull/1508 and as I can see the problem is still there. Can you please confirm you have a better results for project generated via https://github.com/maven-turbo-reactor/maven-multiproject-generator ?
I tried to use apache/maven-resolver#1508 and as I can see the problem is still there. Can you please confirm you have a better results for project generated via https://github.com/maven-turbo-reactor/maven-multiproject-generator ?
apache-maven-4.1.0-SNAPSHOT-bin.zip @seregamorph if you trust me you can try this package and see if it be helpful. on my local machines this package works better than maven-rc3...... well, we be improving it. still, maybe it is slower than maven3 , because of the injection layer. but things will be better when plugins start to use maven4 interfaces instead(which reduces the injection cost)
Also, one big difference between mvn3 and mvn4 is transitive dep mgr in mvn4. In this respect, mvn4 will always be "a bit behind" maven3 and is okay, as long it is bearable overhead. To turn it off (and POM transformation) use -Dmaven.maven3Personality with Maven4.
@mbien would love if you could ~~test~~ measure (am not at desktop, am on faint linux laptop) this Resolver PR: https://github.com/apache/maven-resolver/pull/1539
(build/install PR, build/install maven master w/ diff):
diff --git a/pom.xml b/pom.xml
index f7ca6ed0d5..8a259f2943 100644
--- a/pom.xml
+++ b/pom.xml
@@ -163,7 +163,7 @@ under the License.
<plexusInterpolationVersion>1.28</plexusInterpolationVersion>
<plexusTestingVersion>1.5.0</plexusTestingVersion>
<plexusXmlVersion>4.1.0</plexusXmlVersion>
- <resolverVersion>2.0.10</resolverVersion>
+ <resolverVersion>2.0.11-SNAPSHOT</resolverVersion>
<securityDispatcherVersion>4.1.0</securityDispatcherVersion>
<sisuVersion>0.9.0.M4</sisuVersion>
<slf4jVersion>2.0.17</slf4jVersion>
Am interested in change.
Seems master suffers from https://github.com/apache/maven/issues/10995
Applied PR https://github.com/apache/maven/pull/10997 locally + Resolver change (above) and tested with that.
3.9.11: Total time: 01:28 min (Wall Clock)
master: Total time: 02:53 min (Wall Clock)
But am interested in memory consumption as well, but more in Resolver master vs Resolver PR one...
mvn4 master + SmartProjectCompactor fix + resolver 2.0.11-SNAPSHOT
Total time: 02:10 min (Wall Clock)
my original runs as reference:
m3
Total time: 01:03 min (Wall Clock)
m4
Total time: 03:08 min (Wall Clock)
My latest numbers: Maven 3.9.11: (for ref): 1:18 min (wall clock) Maven master + Resolver 2.0.11-SNAP + local fixes: 1:23 (with RRF and all niceties enabled!)
So we are getting there 😄
Note: what I tested is NOT yet available anywhere, as code is not yet done, but in short: Resolver master has it almost all, except for one thing: https://github.com/apache/maven-resolver/issues/1547
As it turned out, the problem with conflict resolver is really emphasized by the shape of generated projects (all my local tests were done by using this cool tool https://github.com/maven-turbo-reactor/maven-multiproject-generator).
This bit is last piece of code that is 100% same between Maven 3.9.x (Resolver 1.9.x) and Maven 4 (Resolver 2.0.x), BUT as Maven 4 performs "one extra" collection (on install, to produce the consumer POM), it creates this execution diff between 3 and 4. By the way, current conflict resolver code is O(N^2) as "worst case", and the project that generator generates is worst case for current code.
Hence, I am trying to rework conflict resolver to not perform as many graph walks as many "conflict partitions exists" (and again, in generated case conflict partitions == module count), but basically derive the "conflict resolution" is two passes (I don't think is possible in less than two passes, but maybe?)....
Still, I kinda feel this is "symptom treatment", as what in reality happens, is that Maven resolves in 99% (almost) same thing over and over again: it collects module N, do something, then collects module N+1 (and most trivial part, the "tail" of project shape, see generated project shape on link above), is really same as previous module with current Nth module appended.... but this is like it since Maven 3.0, so no behavior change in here, but this is something we may want to address later. Another issues is that same thing is calculated within one module: on project start project deps are collected, then Enforcer does same thing, and finally install (in Maven 4) does same thing.... @gnodet
Awesome. I don't have much capacity to contribute there, but I'm open for validating prototypes.
So, finally sat down and instead just looking at it, I swallowed the frog 😄
Number: 3.9.11: 1:20 min master: 1:30 min (still with RRF enabled))
Unlike before, where I tried to hack and hack around my first hacks (hence code for post above never got published), I tried to rewrite conflict resolution, result is here (and this was tested as well): https://github.com/apache/maven-resolver/pull/1558
In short: I am building the tree/graph from root "by level" advancing down (with small detours), and not as before iterating over and over on the graph. The rewrite is still not 100% (it is 100% for "normal" usage, when verbosity=NONE), but the two Verbose modes are not yet behaving as before expected (hence the Verbose related UT changes, I plan to fix that and undo them -- but was just curious about the change).
(new hardware, hence different numbers!)
This is where we are:
- resolver master (2.0.11-SNAPSHOT)
- maven https://github.com/apache/maven/pull/11043
Results: https://gist.github.com/cstamas/144c10d0282fbaa2705310b57a4d3707
The PR merely updates Resolver, there is nothing done yet (ie caching) for "one extra" collect that happens during install phase in Maven 4.
I am now at
m4
Total time: 01:18 min (Wall Clock)
with the changes from https://github.com/apache/maven/issues/2481#issuecomment-3245232274
which is only marginally slower than m3 but given that the original measurements with m4 RC3 I took where nearly 3x slower compared to m3 - this issue is essentially solved. Great job!
@cstamas https://github.com/apache/maven/pull/11043 is targeted to master which is 4.1.x. Do you plan to backport it for 4.0.x? I tried to use head 4.0.x and update the maven-resolver dependency to 2.0.11/2.0.12-SNAPSHOT, but the compilation fails (back compatibility issue).
Yes, we work against master and backport to 4.0.x. so it is currently where it was, as even master was not merged.
Or, you can locally try to apply resolver-2.0.11 to 4.0.x?
https://github.com/apache/maven/pull/11043