[MCOMPILER-515] This plugin is not "incremental"
Drop the "incremental" bit of this plugin, make it just dumb always redo everything. The incremental bits just introduced bugs that exist for over 10 years, and just make us look bad.
So just let "incremental" part go away.
This change renders 2 ITs broken:
[ERROR] The following builds failed:
[ERROR] * MCOMPILER-500-package-info-incr/pom.xml
[ERROR] * default-incremental-disable/pom.xml
That IMHO should be dropped as well, if this PR goes in.
https://issues.apache.org/jira/browse/MCOMPILER-515
As expected, the two ITs did fail on GH CI.
As expected, without the two IT it all pass OK.
-1 I'm happy to not have to recompile everything all the time. If you think there is a bug when changing the configuration it will be better to fix it correctly (e.g comparing inputs between 2 runs).
Also -1 from me.
Maybe it is not incremental compile - but detecting if module should be recompiled.
Detecting on change for project configuration should be fixed instead.
@olamy, good, am really happy for you. So, there is "nothing to be seen here", everything is cool, is it?
I have to admit, I was unaware of MCOMPILER project state: "Fun" starts that this issue was reported 10 years ago, but was closed due inactivity: https://issues.apache.org/jira/browse/MCOMPILER-213 But even more "fun" can be seen when one search for "MCOMPILER compiler incremental" in JIRA: https://issues.apache.org/jira/issues/?jql=project%20%3D%20MCOMPILER%20AND%20text%20~%20%22compiler%20incremental%22%20order%20by%20lastViewed%20DESC
There are a TON of issues, that are mostly neglected. And all is about "incremental...". There are (relatively) young issues, only 4 years without feedback https://issues.apache.org/jira/browse/MCOMPILER-345 but there are old issues as well, that got closed, while users commented on them "this is still an issue" https://issues.apache.org/jira/browse/MCOMPILER-187 There are REAL issues where users are baffled WHAT AND HOW this plugin is meant to work at all https://issues.apache.org/jira/browse/MCOMPILER-209 -- this one is really funny, and all this is just the tip of the iceberg. Seems nobody understands this "incremental" feature.
And now about the sad part: this "feature" is present in ONLY ONE class, that today has 2000 LOC. This PR removes cca 400 LOC from it (roughly 20%), and it "still works", no UT affected at all, and only two ITs out of 60 were affected. I hope we do understand what this means?
Maybe not aware, but incremental compilation (and incremental build for that matter) has been solved for Maven years ago. There are situations, where things go so bad, that really just a "let's redo it from scratch" can help, Everything else would just potentially worsen the already bad situation. And this plugin (should be) one of the "crux" of Maven (like m-install-p/m-deploy-p). It's maybe me, who is not satisfied with "current state of affairs" and want to change something about it? Maybe.
By vetoing this PR, you choose really the only possible bad outcome: nothing to happen, and continue this "status quo" of broken state. A bit better, but still bad outcome will be that someone will dedicate his time to "fix this", but we all know that solution is about "inputs and outputs", and this plugin and this "incremental" has no clue about that, so wasted time. I just say "let's burn it".
And one more thing: are there some measurements, some numbers, perf tests, whatever, that prooves the gain (seemingly at cost of correctness) Maven "wins" with this "incremental" feature?
maybe incremental is not the right name for this but yes on a 300 modules project not recompiling everything all the time makes a big difference at the end of the day
think there are multiple points mixed there:
- Default behavior (incremental=true) is to not recompile if upstream didn't change -> we must keep it IMHO
- Toggle behavior (incremental=false) is to only recompile what changed (if you have 10 source files and modified 1 then you only recompile this last one). This mode is broken I agree for a lot of reasons in java ecosystem (annot proc being an example).
So overall I think it makes sense to keep at least the default behavior
As everyone talks how this is "must to have" but nobody provided any number to support this claim, I tried to convince myself this is "must to have". So...
Subject build:
Jetty branch jetty-10.0.x (6e82e70edf6a106b9fd08ccfacd840efef8a95f9) with 172 modules from https://github.com/eclipse/jetty.project
Preparations
Built this checked out project "as usual" (by the book) using -DskipTests as README recommends to prime local repo. After local repo "primed" (to have no remote fetch happens), spotted that asciidoctor-maven-plugin always goes remote, so removed documentation module that uses it, to prevent huge variance (as it goes directly to central, not local MRM).
Then, as 2nd (w/o clean) invocation consistently failed, had to remove tests-integration module as well, to be able to test "incremental" use case.
This leaves us with 170 modules.
Finally,. changed jetty to use locally built 3.11.0-SNAPSHOT of maven-compile-plugin:
- master (5be316391ea7c575ee65c3fe83b4c7b30e1174d0)
- vs PR (7b59074b400974b57e9c4cd84269acd39d7a7f9a)
Full diff of my local changes: https://gist.github.com/cstamas/b637f4bce0967836cf835cf21e105ab9
Used Invocations
I used following 3 invocations to time:
1st invocation: mvn clean install -DskipTests
2nd invocation: mvn install -DskipTests
3rd invocation: mvn verify -DskipTests
Used Maven "total time" output as measure.
With m-c-p:master
1st master: 02:10 min 2nd master: 01:39 min 3rd master: 01:35 min
With m-c-p:PR
1st PR: 02:11 min 2nd PR: 01:35 min 3rd PR: 01:36 min
Conclusion
Please, somebody, anybody, repeat these (the numbers cannot be "cross" compared, only can be compared among runs on same HW/env), and as always, "this is not a proper benchmark" type of notice... but.
So far, it seems Jetty project (170) modules is not "large" enough to produce noticeable difference?
as you are using a project which need install this doesn't work.... because some artifacts are part of the reactor and rebuild (e.g re package) so they are detected as new dependency and so re compiled.... just look the output and it will say everything is recompiled... I have some work locally which handle this case but never pushed this.
Nope, I removed bits needing install, 3rd invocation is mvn verify, please reread.
Here is a simple project showing the usage:
rmannibucau@rmannibucau-yupiik:/opt/rmannibucau/dev/demo-sunny $ mvn clean
[INFO] Scanning for projects...
[INFO]
[INFO] -----------------------< org.example:demo-sunny >-----------------------
[INFO] Building demo-sunny 1.0-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ demo-sunny ---
[INFO] Deleting /opt/rmannibucau/dev/demo-sunny/target
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 0.158 s
[INFO] Finished at: 2022-12-13T12:21:00+01:00
[INFO] ------------------------------------------------------------------------
rmannibucau@rmannibucau-yupiik:/opt/rmannibucau/dev/demo-sunny $ mvn compile
[INFO] Scanning for projects...
[INFO]
[INFO] -----------------------< org.example:demo-sunny >-----------------------
[INFO] Building demo-sunny 1.0-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ demo-sunny ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 0 resource
[INFO]
[INFO] --- maven-compiler-plugin:3.10.1:compile (default-compile) @ demo-sunny ---
[INFO] Changes detected - recompiling the module!
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1.391 s
[INFO] Finished at: 2022-12-13T12:21:04+01:00
[INFO] ------------------------------------------------------------------------
rmannibucau@rmannibucau-yupiik:/opt/rmannibucau/dev/demo-sunny $ mvn compile
[INFO] Scanning for projects...
[INFO]
[INFO] -----------------------< org.example:demo-sunny >-----------------------
[INFO] Building demo-sunny 1.0-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ demo-sunny ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 0 resource
[INFO]
[INFO] --- maven-compiler-plugin:3.10.1:compile (default-compile) @ demo-sunny ---
[INFO] Nothing to compile - all classes are up to date
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 0.572 s
[INFO] Finished at: 2022-12-13T12:21:07+01:00
[INFO] ------------------------------------------------------------------------
as you can see gain is quite huge
@rmannibucau i hope you understand your example does not make sense? Can you show me a real life example, a full build? What total gain are we talking about here?
Nope, I removed bits needing install, 3rd invocation is
mvn verify, please reread.
read my comment especially on the reactor and rebuild of artifacts.
what do you see in the output
Changes detected - recompiling the module!
or
Nothing to compile - all classes are up to date
because for every build (except the module jetty-slf4j-impl everything is recompiled)
if not please post the output.
mvn clean install -DskipTests
for jetty-10.0.x branch you have 02:10 min?? really?? you have a very very beefy machine so recompiling or not doesn't make any difference for you.... I know this build and 2:10m for clean install means you have a very beefy machine.
Full logs of mvn verify -DskipTests are here (they are huge) https://drive.google.com/drive/folders/1SIaAbOLegBxCLu87jB3CFtfe3cHmid2P?usp=sharing
(fixed link, sorry)
My test
- Classes in target prod and unit: 4883
- Maven modules: 326
- JDK 1.8
- Os: MacOs 12.6.1 - 2,6 GHz 6-Core Intel Core i7, 32 GB 2667 MHz DDR4
- m-c-p 3.10.1
Scenario 1
mvn test-compile
mvn test-compile | grep -e "Nothing to compile\|Changes detected\|Total time" > 1.txt
Scenario 2
mvn clean
mvn test-compile | grep -e "Nothing to compile\|Changes detected\|Total time" > 2.txt
Result 1
grep "Changes detected" 1.txt | wc -l
1
grep "Nothing to compile" 1.txt | wc -l
313
[INFO] Total time: 01:02 min
Result 2
grep "Changes detected" 2.txt | wc -l
314
grep "Nothing to compile" 2.txt | wc -l
0
[INFO] Total time: 05:59 min
@rmannibucau a bit of simple math: your "full" compile: 1391ms your "incremental" compile: 572ms difference is: 819ms On "big" build (a 300 module build was mentioned): 300 x 819ms = 245700ms = 245,7s = 4,095min
So, the diff you showed would produce on 300 module build total time difference of 4 minutes? Am interested in how much percentage these 4 minutes are of the full build time....30%? 10%? 1%? 0.05%? Let's find out!
@slawekjaranowski please compare m-c-p master (or any released) vs this PRl :smile: As your test really compares m-c-p against m-c-p, not m-c-p master/released against PR.
@cstamas as Olivier mentionned, your build does not use this feature so you compare twice the same things. Also my example shows the real gain. I tested on a work project - sorry, there I can't share the full outputs but it is similar to the demo I shared but with 31 modules: 17s for the first build, 10s for the second using the command mvn package -pl '-documentation' -DskipTests - note that install does not change this +-0.5s - and validating Nothing to compile - all classes are up to date is actually used.
Finally, doing a bit of simple math the 4mn are ~50% of the build as you can see yourself, now if you add tests and your tests take 10mn I agree with you but it is another problem. If your point is that compilation is always a small amount of time of a full build then this is highly wrong cause you think as a CI but a dev does not always run the full project build but working you often have 'shortcutted' builds to go faster and there, +-1mn is something - human time - so yes 4mn is a lot, means you can build x2 so likely produce way more.
@rmannibucau could you please build locally this PR and try your private project with that same steps? Just interested in times.... (the "31 modules: 17s for the first build, 10s for the second using the command mvn package -pl '-documentation' -DskipTests")
@cstamas 16s first time then always ~14s so overall the diff logic is worth it even if it costs some time - guess we can make it faster there too tracking files more efficiently in the project but overall this PR has a negative impact on the overall build time.
@rmannibucau ok, thanks for doing this. As I thought: PR is clearly better at "full recompile" case of m-c-p master
As I thought: PR is clearly better at "full recompile" case of m-c-p master
@cstamas you mean slower and produces the same output? master does by default a full rebuild in 10s whereas your PR does a full rebuild in 14s
Ok, tried with another project, Maven this time:
maven master: 984f73dc7ca2515ee520fbded3382820f62116e9
Invocations
1st: mvn clean install -DskipTests
2nd: mvn verify -DskipTests
With master
1st: 49.868s 2nd: 36.577s
With PR
1st: 48.978s 2nd: 41.383s
And again, the feature w/ master did not work (!) as I understand, as there are inter-dependencies in reactor?
What's the point of multi module build if this feature does NOT work with it? Am I right that this feature works on multi module build ONLY if there are no inter project dependencies? I just realized what @olamy was saying....
@cstamas you mean slower and produces the same output? master does by default a full rebuild in 10s whereas your PR does a full rebuild in 14s
The I don't understand this: "17s for the first build, 10s for the second", isn't first the "full"?
The I don't understand this: "17s for the first build, 10s for the second", isn't first the "full"?
@cstamas the first is the run after a clean which means this PR or master will do the same - this is also why it runs in the same time. All consecutive builds are package/install without clean - exact same command than the first run, dev case - and there your PR is significantly slower so you should compare the > 1 build durations and not the first one which is globally the same without the state checking (but this one is almost free as you can see - at least in the error rate).
Recap what we have so far:
- "incremental" feature may bring 4 minute gain on 300 module build, unless
- there are inter-module dependencies, as in that case this feature would work only on "root" modules?
Is this right?
"incremental" feature may bring 4 minute gain on 300 module build, unless
If we ignore the figures right - highly depends your build, tomee container/openejb-core module for ex, it is 14s -> 4s (1 module so if you do the math you almost save 1h for your example ;)).
there are inter-module dependencies, as in that case this feature would work only on "root" modules?
Kind of, "magic" happens there https://github.com/apache/maven-compiler-plugin/blob/master/src/main/java/org/apache/maven/plugin/compiler/AbstractCompilerMojo.java#L897 and agree the rule can be enhanced to better manage multi-reactor builds, can be a great enhancement/PR to do.
If we ignore the figures right - highly depends your build, tomee container/openejb-core module for ex, it is 14s -> 4s (1 module so if you do the math you almost save 1h for your example ;)).
Sure, we can come up with numbers as we like, but I will shut up in a moment this very feature saves 1h on a real life project out there.
Kind of, "magic" happens there https://github.com/apache/maven-compiler-plugin/blob/master/src/main/java/org/apache/maven/plugin/compiler/AbstractCompilerMojo.java#L897 and agree the rule can be enhanced to better manage multi-reactor builds, can be a great enhancement/PR to do.
Sure, but this also means that almost no multi module build benefits from this feature, as the sole idea of multi module build is usually to "depend on each other". I can imagine some "lab" project where you have 300 modules that do not depend on each other, but I guess this cannot be called "majority", unless am mistaken.
Master
commit 5be316391ea7c575ee65c3fe83b4c7b30e1174d0
Result 1
- next execution - rebuild is skipped
[INFO] Total time: 01:12 min
Result 2
- after clean
[INFO] Total time: 05:42 min
PR 160
commit 7b59074b400974b57e9c4cd84269acd39d7a7f9a
Only scenario 2, many times the same command - mvn test-compile on the same workspace,
all modules are recompiled everytime, sample of time:
[INFO] Total time: 06:25 min
[INFO] Total time: 05:34 min
[INFO] Total time: 05:24 min
take similar time like mvn clean ... in my case
so difference from 1 to 5 minutes is important for project which I work at usual day, and many time I need to rebuild
that do not depend on each other, but I guess this cannot be called "majority", unless am mistaken.
Depends how you build the multi-modules projects. Often you have some "core" bottleneck in the build graph then a tons of children (think camel if you want, core then most of the modules are children), then if you run a command bypassing the core - built manually before with install - then you get the benefit from it. Fully agree it would be better to not have to handle the commands manually but as of today it works and saves time - at least on some big projects I'm working on, agree it can maybe not be the most ones but since code is there I would rather go in the direction to enhance it instead of dropping it otherwise we drop most of the plugin code and plexus abstraction to purely leverage tool provider and be it for the plugin at the end - can be an option but at the opposite of current design which is to make most cases working by default/convention.