nxrocks
[Bug] Maven commands fail randomly in CI/CD workflow
Plugin Name
@nxrocks/nx-spring-boot
Plugin Version
4.1.0
Nx Version
14.4.3
Expected Behaviour
./mvnw commands should complete successfully on the first execution.
Actual Behaviour
This project provides the script ./mvnw to run Maven commands when executing project targets (e.g. nx build <project>). In my main CI/CD workflow (GH Action), targets that rely on ./mvnw randomly fail.
Here are the high-level commands that may fail in my CI/CD workflow.
- run: yarn nx affected --target=build --parallel --max-parallel=3
- run: yarn nx run-many --all --target=test --parallel --max-parallel=2
Here is an example of an error that randomly makes my CI/CD workflow fail.
Error: Error executing Maven.
java.lang.NullPointerException
at java.base/java.util.concurrent.ConcurrentHashMap.putVal(ConcurrentHashMap.java:1011)
at java.base/java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1006)
at java.base/java.util.Properties.put(Properties.java:1301)
at java.base/java.util.Properties.setProperty(Properties.java:229)
at org.apache.maven.cli.MavenCli.populateProperties(MavenCli.java:1656)
at org.apache.maven.cli.MavenCli.properties(MavenCli.java:612)
at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:282)
at org.apache.maven.cli.MavenCli.main(MavenCli.java:196)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:282)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:225)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:406)
at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:347)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.apache.maven.wrapper.BootstrapMainStarter.start(BootstrapMainStarter.java:47)
at org.apache.maven.wrapper.WrapperExecutor.execute(WrapperExecutor.java:156)
at org.apache.maven.wrapper.MavenWrapperMain.main(MavenWrapperMain.java:72)
> nx run shared-java-challenge-util:build
Executing command: ./mvnw package
Failed to execute command: ./mvnw package
Error: Command failed: ./mvnw package
at checkExecSyncError (node:child_process:828:11)
at execSync (node:child_process:899:15)
at runBuilderCommand (/__w/challenge-registry/challenge-registry/node_modules/@nxrocks/common/src/lib/core/jvm/utils.js:19:38)
at runBootPluginCommand (/__w/challenge-registry/challenge-registry/node_modules/@nxrocks/nx-spring-boot/src/utils/boot-utils.js:15:43)
at /__w/challenge-registry/challenge-registry/node_modules/@nxrocks/nx-spring-boot/src/executors/build/executor.js:10:62
at Generator.next (<anonymous>)
at /__w/challenge-registry/challenge-registry/node_modules/tslib/tslib.js:117:75
at new Promise (<anonymous>)
at Object.__awaiter (/__w/challenge-registry/challenge-registry/node_modules/tslib/tslib.js:113:16)
at buildExecutor (/__w/challenge-registry/challenge-registry/node_modules/@nxrocks/nx-spring-boot/src/executors/build/executor.js:8:20)
Error
at /__w/challenge-registry/challenge-registry/node_modules/@nxrocks/nx-spring-boot/src/executors/build/executor.js:12:19
at Generator.next (<anonymous>)
at /__w/challenge-registry/challenge-registry/node_modules/tslib/tslib.js:117:75
at new Promise (<anonymous>)
at Object.__awaiter (/__w/challenge-registry/challenge-registry/node_modules/tslib/tslib.js:113:16)
at buildExecutor (/__w/challenge-registry/challenge-registry/node_modules/@nxrocks/nx-spring-boot/src/executors/build/executor.js:8:20)
at /__w/challenge-registry/challenge-registry/node_modules/@nrwl/tao/src/commands/run.js:147:23
at Generator.next (<anonymous>)
at /__w/challenge-registry/challenge-registry/node_modules/tslib/tslib.js:117:75
at new Promise (<anonymous>)
at __awaiter (/__w/challenge-registry/challenge-registry/node_modules/tslib/tslib.js:113:16)
at runExecutorInternal (/__w/challenge-registry/challenge-registry/node_modules/@nrwl/tao/src/commands/run.js:127:34)
at Object.<anonymous> (/__w/challenge-registry/challenge-registry/node_modules/@nrwl/tao/src/commands/run.js:219:54)
at Generator.next (<anonymous>)
at /__w/challenge-registry/challenge-registry/node_modules/tslib/tslib.js:117:75
at new Promise (<anonymous>)
Restarting the failed job in GH Actions may lead to either a successful run or another failed run (it seems random).
Based on the log, could it be that Maven or the Maven wrapper (./mvnw) cannot be reliably run in parallel? I run 2-3 tasks in parallel in my CI/CD workflow. My next troubleshooting step would be to test without running targets concurrently.
@tinesoft Have you ever observed this behavior with ./mvnw?
Steps to reproduce the behaviour
- Fork https://github.com/Sage-Bionetworks/challenge-registry.
- The GH workflow .github/workflows/ci.yml may randomly fail.
Hi @tschaffter
@tinesoft Have you ever observed this behavior with ./mvnw?
No, I haven't, sorry.
Based on the log, could it be that maven or the maven wrapper (./mvnw) executions can not be reliably run in parallel? I run 2-3 tasks in parallel in my CI/CD workflow.
I would say so too, yes. It has nothing to do with the plugin per se, but rather with how well the projects being run cope with concurrent execution.
The next troubleshooting step would be for me to test without running targets concurrently.
Yes, that would be a good test indeed. You could also try to pinpoint, from the nx affected command output, which of your projects were actually built, and try to restrict the scope to these specific projects.
Then, creating a simple test that builds those x projects in parallel, by calling for example:
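import { execSync } from 'child_process';

it('test parallel build', async () => {
  // Repeat the parallel build to raise the odds of hitting the race condition.
  for (let i = 0; i < 20; i++) {
    // project1, project2, projectx are placeholders for the affected projects.
    execSync(`nx run-many --target=build --projects=project1,project2,projectx --parallel=3`);
  }
});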
in a loop, could help reproduce the issue locally...
Looking at the source code of Maven itself can also help: https://github.com/apache/maven/blob/9b656c72d54e5bacbed989b64718c159fe39b537/maven-embedder/src/main/java/org/apache/maven/cli/MavenCli.java
I spent some time exploring the issue and came up with a solution. The issue stems from the fact that concurrent executions of Maven are not safe. For example, two or more concurrent executions of Maven may attempt to download the same dependency at the same time and save it to ~/.m2/repository, which I've seen result in errors (Issue A).
A similar error occurs when multiple project targets that rely on @nxrocks/nx-spring-boot are executed in parallel, which is the default when running an nx run-many command. By default, @nxrocks/nx-spring-boot relies on the Maven wrapper (mvnw) for the sake of portability. Any execution of ./mvnw will trigger the download of the mvn binary if it is not yet installed, which is then installed in ~/.m2/wrapper. In this context, errors may arise when parallel executions of @nxrocks/nx-spring-boot targets attempt to download the same version of the Maven binary at the same time (Issue B). Below is the stack trace of an occurrence of this issue.
$ nx run-many --all --target=build
...
Exception in thread "main" java.util.zip.ZipException: zip END header not found
at java.base/java.util.zip.ZipFile$Source.findEND(ZipFile.java:1469)
at java.base/java.util.zip.ZipFile$Source.initCEN(ZipFile.java:1477)
at java.base/java.util.zip.ZipFile$Source.<init>(ZipFile.java:1315)
at java.base/java.util.zip.ZipFile$Source.get(ZipFile.java:1277)
at java.base/java.util.zip.ZipFile$CleanableResource.<init>(ZipFile.java:709)
at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:243)
at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:172)
at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:186)
Exception in thread "main" java.nio.file.NoSuchFileException: /root/.m2/wrapper/dists/apache-maven-3.8.6-bin/67568434/apache-maven-3.8.6-bin.zip.part
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
at org.apache.maven.wrapper.Installer.unzip(Installer.java:207)
at org.apache.maven.wrapper.Installer.createDist(Installer.java:110)
at org.apache.maven.wrapper.WrapperExecutor.execute(WrapperExecutor.java:151)
at org.apache.maven.wrapper.MavenWrapperMain.main(MavenWrapperMain.java:76)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:429)
at java.base/sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:266)
at java.base/java.nio.file.Files.move(Files.java:1432)
at org.apache.maven.wrapper.Installer.createDist(Installer.java:95)
at org.apache.maven.wrapper.WrapperExecutor.execute(WrapperExecutor.java:151)
at org.apache.maven.wrapper.MavenWrapperMain.main(MavenWrapperMain.java:76)
Exception in thread "main" java.nio.file.NoSuchFileException: /root/.m2/wrapper/dists/apache-maven-3.8.6-bin/67568434/apache-maven-3.8.6-bin.zip.part
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:429)
at java.base/sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:266)
at java.base/java.nio.file.Files.move(Files.java:1432)
at org.apache.maven.wrapper.Installer.createDist(Installer.java:95)
at org.apache.maven.wrapper.WrapperExecutor.execute(WrapperExecutor.java:151)
at org.apache.maven.wrapper.MavenWrapperMain.main(MavenWrapperMain.java:76)
The above issues become increasingly likely to occur as the number of Maven-based projects in the Nx workspace grows. Issue B can be solved by using a globally-installed Maven binary, which can be achieved by passing the option --ignoreWrapper to the @nxrocks/nx-spring-boot executor.
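The failure mode behind Issue B can be reproduced in miniature. The following is a minimal sketch, assuming (as the Installer.createDist and Files.move frames in the trace suggest) that each wrapper process downloads to a shared .part file and then moves it into place. The function name and file names are hypothetical, and the two "processes" are simulated sequentially inside one Node.js process.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Simulates two installers that share one temporary ".part" file.
// Returns the error code seen by the second installer, or null if none.
function simulateSharedPartRace() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'wrapper-race-'));
  const partFile = path.join(dir, 'dist.zip.part');
  const finalFile = path.join(dir, 'dist.zip');

  // Both "processes" finish writing to the same .part file.
  fs.writeFileSync(partFile, 'payload from process A');
  fs.writeFileSync(partFile, 'payload from process B');

  // Process A moves the .part file into place first.
  fs.renameSync(partFile, finalFile);

  // Process B attempts the same move, but the .part file no longer exists.
  let errorCode = null;
  try {
    fs.renameSync(partFile, finalFile);
  } catch (err) {
    errorCode = err.code;
  }

  fs.rmSync(dir, { recursive: true, force: true });
  return errorCode;
}

console.log(simulateSharedPartRace());
```

The second move fails because its source has already been consumed, which is the same shape as the NoSuchFileException on the .zip.part file in the trace above.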
Ultimately, I decided to find a solution to Issue A and Issue B that still enables projects to use different versions of Maven for enhanced project isolation. First, I added the following target to my Java projects:
"prepare-java": {
"executor": "@nrwl/workspace:run-commands",
"options": {
"commands": [
"./mvnw dependency:go-offline -DexcludeGroupIds=org.sagebionetworks.challenge || true"
],
"cwd": "apps/challenge-api-gateway"
}
},
This target downloads Maven (via mvnw) and all the dependencies of the project. Because I have Java projects that depend on a shared local library, the command ./mvnw dependency:go-offline would fail because it cannot find my local library, which has not been built at this stage. I may have been able to solve this using Nx project dependencies, but this would complicate this "preparation" stage. Instead, I'm specifying the option -DexcludeGroupIds=org.sagebionetworks.challenge to prevent Maven from throwing an error when it attempts to download my shared local libraries. Unfortunately, the command mvn dependency:go-offline has several bugs, and one of them is that the option -DexcludeGroupIds is not evaluated. The workaround I found was to add || true to silence any errors that mvn dependency:go-offline may generate. I really don't like doing that, but since a failure of this prepare-java target should not affect subsequent targets (lint, build, etc.), I can live with this shortcoming.
In my CI workflow, I run the following command to ultimately 1) install Maven and 2) install all the project dependencies (minus my shared local libraries) sequentially for the affected projects.
- run: yarn nx affected --target=prepare-java --parallel=1
- run: yarn nx affected --target=lint
- run: yarn nx affected --target=build
- run: yarn nx affected --target=test
This solves Issue B while maintaining Java project isolation and, most importantly, solves Issue A too.
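The two-phase ordering used here (a serial warm-up, then parallel targets) can be sketched in isolation. The snippet below is only an illustration and is not how Nx schedules tasks: runWithLimit, fakeTarget and the project names are hypothetical, and the targets are simulated with timers instead of real ./mvnw calls.

```javascript
// Runs async task functions with at most `limit` of them in flight at once.
async function runWithLimit(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }
  const workers = Array.from({ length: Math.min(limit, tasks.length) }, () => worker());
  await Promise.all(workers);
  return results;
}

// Simulates the CI ordering and records the peak concurrency of each phase.
async function simulateCiPhases() {
  let active = 0;
  let peak = 0;
  // Hypothetical stand-in for a target that touches shared Maven state.
  const fakeTarget = (name) => async () => {
    active++;
    peak = Math.max(peak, active);
    await new Promise((resolve) => setTimeout(resolve, 5));
    active--;
    return name;
  };
  const projects = ['api-gateway', 'user-service', 'shared-util'];

  // Phase 1: prepare-java, one project at a time (like --parallel=1).
  await runWithLimit(projects.map((p) => fakeTarget(`${p}:prepare-java`)), 1);
  const peakPrepare = peak;

  // Phase 2: build, up to three at once, after the shared caches are warm.
  peak = 0;
  await runWithLimit(projects.map((p) => fakeTarget(`${p}:build`)), 3);
  return { peakPrepare, peakBuild: peak };
}

simulateCiPhases().then((summary) => console.log(summary));
```

The point of the design is that only the phase that populates shared state is serialized; everything after it can keep its parallelism.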
The release notes of the latest version of Maven mention improvements for concurrent builds. I have not evaluated the impact of these improvements on the reported issues, but it seemed like a good time to update the version of Maven used by the wrapper, and to update the wrapper itself at the same time.
For each Java project, I update the .mvn/wrapper/maven-wrapper.properties file with the reference to the latest version of Maven and the Maven wrapper:
distributionUrl=https://repo.maven.apache.org/maven2/org/apache/maven/apache-maven/3.8.6/apache-maven-3.8.6-bin.zip
wrapperUrl=https://repo.maven.apache.org/maven2/org/apache/maven/wrapper/maven-wrapper/3.1.1/maven-wrapper-3.1.1.jar
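When many projects need the same edit, it could be scripted. Below is a minimal sketch, assuming the properties file uses simple key=value lines as shown above; updateWrapperProperties is a hypothetical helper, not part of the plugin or of the Maven wrapper, and it only transforms the file content (reading and writing each project's file is left out).

```javascript
// Hypothetical helper: returns the content of maven-wrapper.properties with
// distributionUrl and wrapperUrl pointing at the requested versions.
function updateWrapperProperties(content, mavenVersion, wrapperVersion) {
  const base = 'https://repo.maven.apache.org/maven2/org/apache/maven';
  const entries = {
    distributionUrl: `${base}/apache-maven/${mavenVersion}/apache-maven-${mavenVersion}-bin.zip`,
    wrapperUrl: `${base}/wrapper/maven-wrapper/${wrapperVersion}/maven-wrapper-${wrapperVersion}.jar`,
  };
  const seen = new Set();

  // Replace the value of known keys and keep every other line untouched.
  const lines = content.split(/\r?\n/).map((line) => {
    const key = line.split('=')[0].trim();
    if (Object.prototype.hasOwnProperty.call(entries, key)) {
      seen.add(key);
      return `${key}=${entries[key]}`;
    }
    return line;
  });

  // Append any key that was missing from the original content.
  for (const key of Object.keys(entries)) {
    if (!seen.has(key)) {
      lines.push(`${key}=${entries[key]}`);
    }
  }
  return lines.join('\n');
}

console.log(updateWrapperProperties('distributionUrl=old\nwrapperUrl=old', '3.8.6', '3.1.1'));
```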
Once again, anyone using this plugin is likely to face Issue B at some point as the number of Java projects and targets in the workspace increases. The same goes for Issue A, unless Maven finds a way to safely handle the case where concurrent executions attempt to modify the same file at the same time. Because of the pseudo-random nature of the issue, it may be worth adding a note about it to the README of this plugin.
Hi Thomas,
Thanks for your continuing interest in the plugin and for sharing the results of your investigation in such an exhaustive way! The issues you create on my repo are always very well detailed; for that, I officially name you my "n°1 issue reporter"!
I will look into a way to implement (and document) your workaround, with:
- a new go-offline executor (that will download everything locally)
My only concern so far, is the compatibility when using Gradle (instead of Maven) as build system...
There is an offline mode in Gradle too, but it works differently from Maven's, as it requires the --offline parameter to be passed along with each command, whereas for Maven the offline mode can be activated once (via mvnw dependency:go-offline) and then benefit subsequent commands.
I'm still researching the best way to achieve that in a way that would be transparent for end users of the plugin, whether they use Maven or Gradle.