jetty.project
Embedded Jetty becomes unresponsive - though the server says it's started successfully, the netstat shows no port tied to the server
Jetty version - Initially discovered in 9.4.33, then in 9.4.36 and now in 9.4.39 as well.
Java version - JDK 1.8
OS type/version - Linux SUSE, RHEL
Description
Team,
Our product currently ships Jetty server v9.4.26, which has been working fine in the field without any issues for the last 3 years, since we upgraded to this version.
Since December 2020 we have been trying to upgrade to v9.4.33, and we are seeing strange behavior. Our regression framework runs around 20k tests covering several different protocols, and a majority of the tests use an HTTP client and an HTTP server (embedded Jetty). The tests run fine up to a certain point, say about 13,500 tests, and then hang when trying to reach the embedded Jetty server; when the clients receive no response, the tests fail. We see this exact same behavior with v9.4.36, and as a test we upgraded to v9.4.39 hoping the issue would be resolved there, but it is not.
My observation goes like this -
- The Http Server (embedded jetty to be precise) works fine for a certain duration and then becomes totally unresponsive.
- Any request to the applications hosted on this server is timing out.
- When a request is sent from the URL, it is not hitting the jetty server.
While researching this behavior, I noticed that when the server becomes unresponsive, the server status shows it is still up and running, but the netstat command on the port used by the server gives no output, showing it as a free port. Once the server goes unresponsive, no matter how many times we restart the application, the server always remains in the same state - up and running but not listening. Occasionally, the server port shows as listening and the server can be reached normally, and then after at most 2-3 minutes the server goes back into the unresponsive, non-listening mode.
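The netstat check described above can also be done from inside the test framework. The following is only a minimal sketch, using nothing but the JDK: the class name `PortProbe`, the method `isListening` and the timeout value are hypothetical names chosen for illustration, not part of Jetty or of our product.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class PortProbe {

    /**
     * Returns true if a TCP connection to host:port can be established
     * within timeoutMillis, false if it is refused or times out.
     */
    public static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, unreachable host, etc.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo only: bind an ephemeral port so there is something to probe.
        try (ServerSocket listener = new ServerSocket(0)) {
            int port = listener.getLocalPort();
            System.out.println("port " + port + " listening: "
                    + isListening("127.0.0.1", port, 1000));
        }
    }
}
```

Calling such a probe right after `server.start()` returns, and again when a test times out, would show whether the port was ever bound or whether it was bound and later lost.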
I also came across another ticket against v9.4.33 which sounded similar to our problem - https://github.com/eclipse/jetty.project/issues/6059.
As part of the troubleshooting, I tried to dump the Jetty server status -
- When the server is started once it becomes unresponsive
- When the server is stopped
- Tried to dump the jetty server state when the http requests all fail to hit the embedded jetty server - but couldn't see anything going into our logs.
I have implemented both these in my code -
QueuedThreadPool threadPool = new QueuedThreadPool(numOfmaxThread, numOfminThread);
threadPool.setDetailedDump(true); // newly added code to extract the detailed dump
org.eclipse.jetty.server.Server server = new Server(threadPool);
server.dump(); // newly added code to dump the jetty server state - at start, stop and destroy
Please find the attached log file. It contains a lot of data, so please search for the keyword "SWATHI" to reach the area where the Jetty server dump is available.
Please note that we have discovered many security vulnerabilities in v9.4.26 and need to upgrade to at least v9.4.33 to be compliant with the security standards and do business with our customers. We have been chasing this unresponsive-server problem for 6 months now without success, so any help with my case is highly appreciated.
Regards Swathi BN noapp.txt
You have a custom server connector JettyPsServerConnector, so we cannot tell exactly what's going on - likely it's a problem in this class.
Apparently you configure acceptors=0 and the selector threads are in WAITING state, so you must be doing something really wrong with your connector.
Can you post the JettyPsServerConnector code?
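To check thread states like the WAITING selector threads mentioned above without a full server dump, a sketch along these lines could be used. It relies only on the JDK; the class `ThreadStateReport` and its method are hypothetical, and the name fragment to filter on is an assumption (a `QueuedThreadPool` names its threads with a `qtp` prefix by default, but a custom pool name would differ).

```java
import java.util.Map;
import java.util.TreeMap;

public class ThreadStateReport {

    /**
     * Returns the current state of every live thread whose name contains
     * nameFragment, keyed by thread name (sorted for readable output).
     */
    public static Map<String, Thread.State> statesByName(String nameFragment) {
        Map<String, Thread.State> result = new TreeMap<>();
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            if (thread.getName().contains(nameFragment)) {
                result.put(thread.getName(), thread.getState());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "qtp" is an assumed fragment; pass the pool name actually in use.
        String fragment = args.length > 0 ? args[0] : "qtp";
        statesByName(fragment).forEach(
                (name, state) -> System.out.println(name + " -> " + state));
    }
}
```

This only reports states at one instant; it does not replace `server.dump()`, which also shows what each selector is registered for.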
Hi Simone,
Thanks for your quick response.
I spoke to my management; they recall working with you on another Jetty issue in the recent past and have given me consent to share the JettyPsServerConnector code with you. However, since this forum is public and our code is proprietary to the company, I am allowed to share the code only by email, to restrict access. Could you please share your email address so that I can send the details to you directly? Your email address will be kept confidential. My official address is [email protected], to which you can write to share the details.
Regards Swathi BN
Hi Simone,
After getting clearance from my team, I am uploading the source code for the JettyPsServerConnector class. Please review it and let us know if this is causing the Jetty server to go unresponsive.
NOTE - The same code works just fine with jetty v9.4.26.
Regards Swathi BN JettyPsServerConnector.txt
Hi Simone,
After carefully reviewing the issue, our team architect was able to resolve the problem by setting SelectorMaxThreads=1 at the below highlighted line in the JettyPsServerConnector class -
public PSJettySelectorProvider(Executor executor, Scheduler scheduler, int selectors) {
    //super(executor, scheduler, 1);
    super(executor, scheduler, getSelectorConfiguredThreads());
}
In the above piece of code, getSelectorConfiguredThreads() reads the property SelectorMaxThreads from one of our application's properties files and passes the value to the constructor. Out of the box we do not set this property, so the value defaults to Jetty's own computation. To resolve the server going unresponsive, we set SelectorMaxThreads=1 - more tests are to be conducted with this setting.
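For readers without access to the attachment, the lookup described above might look roughly like the sketch below. This is not the actual getSelectorConfiguredThreads() implementation: the class `SelectorConfig`, the fallback constant and the file handling are hypothetical, and the sketch assumes that a non-positive selector count tells Jetty to compute its own default.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class SelectorConfig {

    /** Property name taken from the description above. */
    public static final String KEY = "SelectorMaxThreads";

    /** Assumption: a non-positive value lets Jetty pick its default. */
    public static final int USE_JETTY_DEFAULT = -1;

    /** Returns the configured selector count, or USE_JETTY_DEFAULT if unset or invalid. */
    public static int parseSelectorThreads(Properties props) {
        String raw = props.getProperty(KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return USE_JETTY_DEFAULT;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : USE_JETTY_DEFAULT;
        } catch (NumberFormatException e) {
            // Malformed value: fall back rather than fail server startup.
            return USE_JETTY_DEFAULT;
        }
    }

    /** Loads the properties file at the given (hypothetical) path and parses the key. */
    public static int loadSelectorThreads(Path file) throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        }
        return parseSelectorThreads(props);
    }
}
```

With this shape, a properties file containing `SelectorMaxThreads=1` yields 1, and a file without the key yields the fallback value.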
Please note that we currently have Jetty v9.4.26 in the field, and 7-8 customers have reported the exact same problem of the HTTP/Jetty server going unresponsive. We have had all of those customers set the property SelectorMaxThreads=1 to resolve the issue. So the HTTP/Jetty server going unresponsive is not seen only with 9.4.33 to 9.4.39 but with 9.4.26 as well, and setting SelectorMaxThreads=1 has been our rescue all along.
Can you analyze our JettyPsServerConnector class and let us know what you infer from it, what could potentially be causing the issue, and how setting acceptors=0 and selectors=1 is able to resolve the issue for us?
Regards Swathi BN
Make sure you use the correct SslContextFactory for your usage: there is a SslContextFactory.Server and a SslContextFactory.Client, and you should use the one that matches the specific side / behavior.
This change was introduced in 9.4.16.
Your code isn't using the standard Java selector, instead it looks like you have overridden the selector behavior with a custom selector class.
Why? And have you kept that class up to date with the changes in the JVM? (there were several BIG changes to the networking layer in Java 1.8 over the past 3 years, in the form of backports of behaviors from later Java releases to 1.8, such as ALPN). Also, do you have a custom version for each JVM type? (example: Oracle release, vs OpenJDK release, vs IBM release) all are different behaviors on the selectors. If you use different flavors of OpenJDK 1.8 (eg: AdoptOpenJDK, vs AWS, vs Azul) then you'll even see selector differences due to other selective backports (TLS/1.3 backport present in Azul and AWS flavors of 1.8)
@swathikumar4 precisely setting selector=1 should not "solve" the problem.
You have a custom java.nio.Selector implementation via psSelector - I guess that it is broken, but I cannot tell as there is no source for that.
Why do you need a custom java.nio.Selector implementation?
This issue has been automatically marked as stale because it has been a full year without activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been closed due to it having no activity.