7.0.6 takes longer to deploy (delaying start up and triggering a 60 seconds timeout)
Environment Details
- GlassFish Version (and build number): 7.0.6
- JDK version: 11.0.19+7
- OS: OmniOS 151038
- Database: postgresql 15.3
Problem Description
Start takes significantly longer with 7.0.6 than 7.0.5
This would not be a problem, although a minor inconvenience, were is not for the 60 seconds timeout for start which is exceeded.
Glassfish 7.0.6
[ Jul 2 12:11:29 Executing start method ("/opt/XXXX/glassfish/bin/asadmin start-domain"). ]
Waiting for domain1 to start ............................................................
Waiting finished after 60,000 ms.
No response from the Domain Administration Server (domain1) after 60 seconds.
The command is either taking too long to complete or the server has failed.
Please see the server log files for command status.
Please start with the --verbose option in order to see early messages.
Command start-domain failed.
Glassfish 7.0.5
[ Jul 2 12:14:37 Executing start method ("/opt/XXXX/glassfish/bin/asadmin start-domain"). ]
Waiting for domain1 to start .....................................
Waiting finished after 36,895 ms.
Successfully started the domain : domain1
domain Location: /XXXX/glassfish/domain1
Log File: /XXXX/glassfish/domain1/logs/server.log
Admin Port: 4,848
Command start-domain executed successfully.
Steps to reproduce
Update to 7.0.6
Impact of Issue
Slow start triggers a 60 second limit which flags the start up as failed. Glassfish actually starts but it says it doesn't, return value from asadmin I presume. This causes the SMF status to go to maintenance.
Solution is working out why 7.0.6 takes twice as long to start however it also should be possible to extend the start time out - maybe it is, I can't see how.
To extend the start timeout, try the following env (default 60_000L):
export AS_START_TIMEOUT=600_000L
export AS_START_TIMEOUT=600000
with no underscore nor L worked. Thank you very much for this.
Waiting for domain1 to start Environment property AS_START_TIMEOUT is set to a value 10_000L which cannot be parsed to long.
Set low so I can see it is controlling the time:
[ Jul 3 09:10:34 Executing start method ("/opt/XXXX/glassfish/bin/asadmin start-domain"). ]
Waiting for domain1 to start ..........
Waiting finished after 10,000 ms.
No response from the Domain Administration Server (domain1) after 10 seconds.
Yes it works, now to extend and test 7.0.6
7.0.5:
[ Jul 3 09:09:31 Executing start method ("/opt/XXXX/glassfish/bin/asadmin start-domain"). ]
Waiting for domain1 to start ..................................
Waiting finished after 31,529 ms.
Successfully started the domain : domain1
domain Location: /XXXX/glassfish/domain1
Log File: /XXXX/glassfish/domain1/logs/server.log
Admin Port: 4,848
Command start-domain executed successfully.
[ Jul 3 09:10:05 Method "start" exited with status 0. ]
7.0.6 with AS_START_TIMEOUT=600000
[ Jul 3 09:13:51 Executing start method ("/opt/XXXX/glassfish/bin/asadmin start-domain"). ]
Waiting for domain1 to start ..............................................................
Waiting finished after 61,293 ms.
Successfully started the domain : domain1
domain Location: /XXXX/glassfish/domain1
Log File: /XXXX/glassfish/domain1/logs/server.log
Admin Port: 4,848
Command start-domain executed successfully.
[ Jul 3 09:14:55 Method "start" exited with status 0. ]
The timeout extension prevents the return value making the OS think it failed.
7.0.6 takes twice the time of 7.0.5. Otherwise identical. ears, wars, data, etc.
Now with the timeout fixed I can progressively undeploy and prune to find a cause. I find that that including the Square payment dependencies makes a difference. In pom.xml:
<dependency>
<groupId>com.squareup</groupId>
<artifactId>square</artifactId>
<version>30.0.1.20230608</version>
</dependency>
This increases the size of my WAR from 400k to 10MB and increases the deploy time . On GF 7.0.5 it increases from 2 to 5 seconds, on GF 7.0.6 this increases from 2 seconds to 30. So although Square adds a lot of dependency files (probably needlessly, I guess a lot are never used) 7.0.6 is taking 6 times longer to process.
Note I have removed all the Square code so to reproduce just add the dependency. I try another project and just add the dependency, Deploy times adding dependency. 6 times longer with 7.0.6:
7.0.5 1.7 -> 5.1
7.0.6 1.7 -> 30.0
I have a small script to print the stack trace of the thread that's running the deploy command. I use it when I want to know what's happening during deployment when the deployment is slow. Maybe it will help you investigate what's happening:
jcmd `cat glassfish/domains/domain1/config/pid` Thread.print | tac | sed -n '/DeployCommand/,/^$/p' | tac
if you run it regularly during deployment, you'll see what is taking so long. I suspect it's CDI scanning (Weld), since the JAR file probably has a lot of classes to scan for bean-defining annotations.
Thank you for the suggestion. I have run the above command during deployment. In the phase it appears to take the time this is the output:
"admin-listener(1)" #68 daemon prio=5 os_prio=64 cpu=28557.02ms elapsed=187.88s tid=0x0000000008029000 nid=0x5b runnable [0xfffffc7fbefc7000]
java.lang.Thread.State: RUNNABLE
at java.util.AbstractCollection.addAll([[email protected]/AbstractCollection.java:352](mailto:[email protected]/AbstractCollection.java:352))
at org.glassfish.weld.connector.WeldUtils.isCDIEnablingAnnotation(WeldUtils.java:323)
at org.glassfish.weld.connector.WeldUtils.isCDIEnablingAnnotation(WeldUtils.java:336)
at org.glassfish.weld.connector.WeldUtils.isCDIEnablingAnnotation(WeldUtils.java:336)
at org.glassfish.weld.connector.WeldUtils.isCDIEnablingAnnotation(WeldUtils.java:336)
at org.glassfish.weld.connector.WeldUtils.isCDIEnablingAnnotation(WeldUtils.java:336)
at org.glassfish.weld.connector.WeldUtils.isCDIEnablingAnnotation(WeldUtils.java:336)
at org.glassfish.weld.connector.WeldUtils.isCDIEnablingAnnotation(WeldUtils.java:336)
at org.glassfish.weld.connector.WeldUtils.isCDIEnablingAnnotation(WeldUtils.java:309)
at org.glassfish.weld.connector.WeldUtils.hasCDIEnablingAnnotations(WeldUtils.java:206)
at org.glassfish.weld.connector.WeldUtils.hasCDIEnablingAnnotations(WeldUtils.java:185)
at org.glassfish.weld.connector.WeldUtils.isImplicitBeanArchive(WeldUtils.java:172)
at org.glassfish.weld.connector.WeldUtils.isImplicitBeanArchive(WeldUtils.java:160)
at org.glassfish.weld.BeanDeploymentArchiveImpl.populate(BeanDeploymentArchiveImpl.java:469)
at org.glassfish.weld.BeanDeploymentArchiveImpl.<init>(BeanDeploymentArchiveImpl.java:146)
at org.glassfish.weld.RootBeanDeploymentArchive.createModuleBda(RootBeanDeploymentArchive.java:68)
at org.glassfish.weld.RootBeanDeploymentArchive.<init>(RootBeanDeploymentArchive.java:64)
at org.glassfish.weld.RootBeanDeploymentArchive.<init>(RootBeanDeploymentArchive.java:56)
at org.glassfish.weld.DeploymentImpl.createModuleBda(DeploymentImpl.java:721)
at org.glassfish.weld.DeploymentImpl.<init>(DeploymentImpl.java:141)
at org.glassfish.weld.WeldDeployer.load(WeldDeployer.java:228)
at org.glassfish.weld.WeldDeployer.load(WeldDeployer.java:115)
at org.glassfish.internal.data.ModuleInfo.load(ModuleInfo.java:182)
at org.glassfish.internal.data.ApplicationInfo.load(ApplicationInfo.java:317)
at com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:531)
at com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:259)
at org.glassfish.deployment.admin.DeployCommand.execute(DeployCommand.java:467)
at com.sun.enterprise.v3.admin.CommandRunnerImpl$2$1.run(CommandRunnerImpl.java:570)
at com.sun.enterprise.v3.admin.CommandRunnerImpl$2$1.run(CommandRunnerImpl.java:566)
at java.security.AccessController.doPrivileged([[email protected]/Native](mailto:[email protected]/Native) Method)
at javax.security.auth.Subject.doAs([[email protected]/Subject.java:361](mailto:[email protected]/Subject.java:361))
at com.sun.enterprise.v3.admin.CommandRunnerImpl$2.execute(CommandRunnerImpl.java:565)
at com.sun.enterprise.v3.admin.CommandRunnerImpl$3.run(CommandRunnerImpl.java:596)
at com.sun.enterprise.v3.admin.CommandRunnerImpl$3.run(CommandRunnerImpl.java:588)
at java.security.AccessController.doPrivileged([[email protected]/Native](mailto:[email protected]/Native) Method)
at javax.security.auth.Subject.doAs([[email protected]/Subject.java:361](mailto:[email protected]/Subject.java:361))
at com.sun.enterprise.v3.admin.CommandRunnerImpl.doCommand(CommandRunnerImpl.java:587)
at com.sun.enterprise.v3.admin.CommandRunnerImpl.doCommand(CommandRunnerImpl.java:1478)
at com.sun.enterprise.v3.admin.CommandRunnerImpl$ExecutionContext.execute(CommandRunnerImpl.java:1847)
at com.sun.enterprise.v3.admin.CommandRunnerImpl$ExecutionContext.execute(CommandRunnerImpl.java:1723)
at org.glassfish.deployment.admin.ReDeployCommand.execute(ReDeployCommand.java:107)
So yes, CDI and Weld is there. We presume the Square dependencies give it a lot of work and this is on a 1W per core processor but neither accounts for why 7.0.6 takes so much longer.
Change for 7.0.6: https://github.com/eclipse-ee4j/glassfish/pull/24480
Just for the record, without any deployment on my machine (Xeon, 64GB RAM):
Eclipse GlassFish 7.0.5 (branch: master, commit: 7b49f4eaf669e9a3b4f4648254f9016872455236, timestamp: 2023-04-02T15:44:03Z)
startup time: Embedded (6 462 ms), startup services (3 234 ms), total (9 696 ms)]]
Eclipse GlassFish 7.0.6 (commit: 9e82903b2afcab9478003fb5adb7bd6a466e4982)
startup time: Embedded (6 347 ms), startup services (2 672 ms), total (9 019 ms)]]
It seems the issue is really in deployment.
Deployment is a part of useful startup which was tripping a timeout causing failure, but yes, we have narrowed it to deployment and I have amended the title accordingly.
You could try to downgrade Weld to 5.1.0 by copying https://mvnrepository.com/artifact/org.jboss.weld/weld-osgi-bundle/5.1.0.Final to [glassfish root]/glassfish/modules/weld-osgi-bundle.jar and see if that makes a difference.
I haven't tried this, but I believe the downgrade should be possible.
@oxdrove any update?
I swapped the weld jars from 5 and 6, no change.
I have tried to build from source with a view to using Weld 5.1.0 not 5.1.1. I have read README.md specifically "## Building". I run "mvn clean install" and it complains:
$ mvn clean install
Error: Could not find or load main class org.codehaus.plexus.classworlds.launcher.Launcher
Caused by: java.lang.ClassNotFoundException: org.codehaus.plexus.classworlds.launcher.Launcher
How to build?
I swamped the weld jars from 5 and 6, no change.
Thanks for testing that.
In the latest commit I just updated Weld to 5.1.1.SP1. You could try that one too:
https://ci.eclipse.org/glassfish/job/glassfish_build-and-test-using-jenkinsfile/job/PR-24498/1/artifact/bundles/glassfish.zip
Error: Could not find or load main class org.codehaus.plexus.classworlds.launcher.Launcher
That looks like a Maven environment not correctly configured. Specifically the M2_HOME may not be correct. @dmatej, any idea?
What maven version and java do you use? mvn -version You should use the latest maven 3 and jdk17
$ java -version
openjdk version "11.0.19" 2023-04-18
OpenJDK Runtime Environment (build 11.0.19+7-omnios-151038)
OpenJDK 64-Bit Server VM (build 11.0.19+7-omnios-151038, mixed mode)
From README.md:
### Prerequisites
* JDK11+, we strongly recommend using JDK17 to build GlassFish with all
mvn works for other projects, There appears to be something odd about maven on the build machine, I am investigating
Maven: My fault. The build server has to have an environment that uses the freshly built software before the installed packages, that is how a new package links to a library before the library is installed, if it were installed the removal of the old version would stop installed packages working.
mvn is working, the build now report failure, I will investigate before reporting here.
7.0.7 is as slow as 7.0.6.
Building gives errors,
$ mvn -X clean install -Pfastest
...
[ERROR] Failed to execute goal io.github.git-commit-id:git-commit-id-maven-plugin:5.0.0:revision (get-the-git-infos) on project glassfish-common: .git directory is not found! Please specify a valid [dotGitDirectory] in your pom.xml -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal io.github.git-commit-id:git-commit-id-maven-plugin:5.0.0:revision (get-the-git-infos) on project glassfish-common: .git directory is not found! Please specify a valid [dotGitDirectory] in your pom.xml
...
...no, I'm trying to build it not commit it.
$ mvn -X clean compile -Pfastest
...
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-dependency-plugin:3.6.0:unpack (unpack) on project glassfish: Artifact has not been packaged yet. When used on reactor artifact, unpack should be executed after packaging: see MDEP-98. -> [Help 1]
$ mvn -X clean package -Pfastest
...
[ERROR] Failed to execute goal io.github.git-commit-id:git-commit-id-maven-plugin:5.0.0:revision (get-the-git-infos) on project glassfish-common: .git directory is not found! Please specify a valid [dotGitDirectory] in your pom.xml -> [Help 1]
no, I just want to build it from source.
It would help if it was clearer on what "your" pom.xml file is.
$ find . -name pom.xml | wc
562 562 27066
Plenty from which to choose and none mentions dotGitDirectory.
$ find . -name pom.xml | xargs grep -l dotGitDirectory | wc
0 0 0
$ mvn -X clean package -Pfastest ... [ERROR] Failed to execute goal io.github.git-commit-id:git-commit-id-maven-plugin:5.0.0:revision (get-the-git-infos) on project glassfish-common: .git directory is not found! Please specify a valid [dotGitDirectory] in your pom.xml -> [Help 1]no, I just want to build it from source.
Maven tries to resolve the git commit id which is used in GlassFish internally.
...and fails. I either fudge it, get a git commit id or give up. I only know how to do the later.
...and fails. I either fudge it, get a git commit id or give up. I only know how to do the later.
git clone [email protected]:eclipse-ee4j/glassfish.git
cd glassfish
mvn clean install -Pfastest -pl 'org.glassfish.main.distributions:glassfish' -am -T4C
$ git clone [email protected]:eclipse-ee4j/glassfish.git
Cloning into 'glassfish'...
The authenticity of host 'github.com (140.82.121.4)' can't be established.
ED25519 key fingerprint is SHA256:+DiY3wvvV6TuJJhbpZisF/zLDA0zPMSvHdkr4UvCOqU.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'github.com' (ED25519) to the list of known hosts.
[email protected]: Permission denied (publickey).
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
Sorry I need to go and do something useful. Stepping back from build, the problem points to Weld, isn't Weld a dependent project? So the change is is Weld not Glassfish and we should be looking at Weld and not Glassfish?
No, it's not related to Weld. It's used by the GlassFish distribution artifact to retrieve the git commit ID and add it to the GlassFish ZIP.
You can fix this message and skip running the git plugin with this:
- Open
nucleus/parent/pom.xmlin GF source code - Find "git-commit-id-maven-plugin" plugin
- In the "configuration" section of the plugin, add the following line:
<failOnNoGitDirectory>false</failOnNoGitDirectory>
Unfortunately, it's not possible to configure this via a system property on the command line, you need to modify the file.
I already tried the failOnNoGitDirectory but had another error and have been working on it.
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:3.1.0:run (do stuff) on project glassfish: An Ant BuildException has occured: The following error occurred while executing this line:
[ERROR] /scratch/build/glassfish/64/appserver/distributions/glassfish/target/antrun/build-main.xml:9: The archive microprofile-jwt-auth-api.jar doesn't exist
[ERROR] around Ant part ...<jarupdate basedir="/scratch/build/glassfish/64/appserver/distributions/glassfish/src/main/patches/microprofile-jwt-auth-api" destfile="/scratch/build/glassfish/64/appserver/distributions/glassfish/target/stage/glassfish7/glassfish/modules/microprofile-jwt-auth-api.jar" includes="META-INF/MANIFEST.MF" />... @ 17:310 in /scratch/build/glassfish/64/appserver/distributions/glassfish/target/antrun/build-main.xml
It is trying to make this file with a macro so it will not exist until it makes it itself. It does not complain it can not make the file. The line above, in the pom.xml, that makes xmlsec.jar does work. I can not see why.
Summery of progress:
- It appears not to be the Weld version
- workaround: install Payara
Longer version:
I have install java 17, this lets it build - documentation is in error. This wasted a not inconsiderable amount of time because glassfish required the non intuitive (from the reported error)
asadmin create-jvm-options "--add-opens jdk.naming.dns/com.sun.jndi.dns=ALL-UNNAMED
I compiled 7.0.6 with "Integrate Weld 5.1.1 #24480" in reverse. Same slow load.
Some change between 7.0.5 and 7.0.6 has caused this.
The logical this to do is compile 7.0.5 and incrementally add the changes that created 7.0.6. Sadly 7.0.5 compiled does not run as 7.0.5 binary version [for me]. It just hangs on start. I can take 7.0.5 and 7.0.6 and diff the two to play spot-the-difference but this is taking me too long.
The sensible thing to do it for someone that actually understands the build and the changes that took place to test this.
It just hangs on start.
Actually it usually doesn't hang, but the initialization failed for some reason, so it cannot log anything. The first log is logged by the start-domain command, not by the server, that confused me years ago too. To see what happened you can try to execute asadmin start-domain --verbose which allocates the console, so you can see the standard output. It should contain some stacktrace.
asadmin create-jvm-options "--add-opens jdk.naming.dns/com.sun.jndi.dns=ALL-UNNAMED
I did not find any usage of this module in GlassFish. Can you send us the stacktrace?
asadmin create-jvm-options "--add-opens jdk.naming.dns/com.sun.jndi.dns=ALL-UNNAMED
I did not find any usage of this module in GlassFish.
It's not. Expanding my "Glassfish requires" Glassfissh requires this option in order to pass to Java so it can run my applications. Glassfish being a means to and end and for non Glassfish developers not an end itself - I don't require Glassfish at all. Note too it's not java as such that needs it, java 17 runs my library methods perfectly well from the command line leading to my confusion as to what the opaque error message meant.
This issue has been marked as inactive and old and will be closed in 7 days if there is no further activity. If you want the issue to remain open please add a comment
I can't see what we could improve with this now until we receive the asadmin start-domain --verbose output at least. @oxdrove , I have doubts about your results too, because as I did maybe 200 tests of this in last 10 days, I noticed the influence of the laptop heat on these results. The worst result is when I use a script to unpack the zip file and start the domain immediately after that. The best result is when I close some browsers and I don't compile anything else and I just run asadmin start-domain --verbose.
(worst+best results) 7.0.5 Total startup time including CLI: 3 489 ms Total startup time including CLI: 2 684 ms
7.0.6 Total startup time including CLI: 3 484 ms Total startup time including CLI: 2 684 ms
7.0.18 Total startup time including CLI: 3 458 ms Total startup time including CLI: 2 689 ms
7.0.19-SNAPSHOT with #25183 Total startup time including CLI: 3 454 ms Total startup time including CLI: 2 677 ms
Your results are meaningless unless you indicate the presence of the square dependency.
You need to perform 4 tests: with w/o square times 7.0.5 or 7.0.6.
During investigation the question has narrowed from "7.0.6 takes longer to deploy" to the more specific "Why does 7.0.6 take longer to load Square than 7.0.5".