HTTP ERROR 500 org.eclipse.jetty.io.EofException
we are seeing frequent EOFException in Trino:
Error fetching http://<ip>:8080/v1/task/20250401_213932_04859_2vpcz.1.12.3/results/4/0: Expected response code to be 200, but was 500:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 500 org.eclipse.jetty.io.EofException</title>
</head>
<body>
<h2>HTTP ERROR 500 org.eclipse.jetty.io.EofException</h2>
<table>
<tr><th>URI:</th><td>http://<ip>/v1/task/20250401_213932_04859_2vpcz.1.12.3/results/4/0</td></tr>
<tr><th>STATUS:</th><td>500</td></tr>
<tr><th>MESSAGE:</th><td>org.eclipse.jetty.io.EofException</td></tr>
</table>
<h3>Caused by:</h3><pre>org.eclipse.jetty.io.EofException
at org.eclipse.jetty.server.internal.HttpChannelState.onClose(HttpChannelState.java:506)
at org.eclipse.jetty.server.internal.HttpConnection.close(HttpConnection.java:623)
at org.eclipse.jetty.util.IO.close(IO.java:615)
at org.eclipse.jetty.util.IO.close(IO.java:630)
at org.eclipse.jetty.io.ManagedSelector$CloseConnections.update(ManagedSelector.java:1000)
at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processUpdates(ManagedSelector.java:577)
at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:544)
at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produceTask(AdaptiveExecutionStrategy.java:514)
at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:258)
at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:201)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:311)
at org.eclipse.jetty.util.thread.MonitoredQueuedThreadPool$1.run(MonitoredQueuedThreadPool.java:73)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:979)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1209)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1164)
at java.base/java.lang.Thread.run(Thread.java:1575)
</pre>
</body>
</html>
We are seeing PAGE_TRANSPORT_ERROR as error code Trino version: 472.
We also saw this error. We upgraded from 461 -> 474. Before 474 we didn't saw it at all, no config, or queries changes
we downgraded trino from 472 -> 467, we still see this EOFException with jetty error.
I was seeing these errors too. I think most of them have gone away by adding this to the config:
internal-communication.http2.enabled=false
I'm seeing other unexplainable 500 errors in Trino 472 though. But the Expected response code to be 200, but was 500 ones have gone away after setting that config.
We are seeing similar errors on 469 with internal-communication.http2.enabled=false. The Trino cluster runs fine for a few hours after start up and then a lot of workers end up in a stuck state.
StackTrace
[2025-05-02 18:12:25.862307] 2025-05-02T18:12:25.861Z ERROR async-http-response-5254 org.glassfish.jersey.server.ServerRuntime$Responder Error while closing the output stream in order to commit response.
[2025-05-02 18:12:25.862343] java.io.UncheckedIOException: org.eclipse.jetty.io.EofException[2025-05-02 18:12:25.862359] at org.glassfish.jersey.message.internal.OutboundMessageContext.close(OutboundMessageContext.java:570)[2025-05-02 18:12:25.862372] at org.glassfish.jersey.server.ContainerResponse.close(ContainerResponse.java:406)[2025-05-02 18:12:25.862403] at org.glassfish.jersey.server.ServerRuntime$Responder.writeResponse(ServerRuntime.java:763)[2025-05-02 18:12:25.862415] at org.glassfish.jersey.server.ServerRuntime$Responder.processResponse(ServerRuntime.java:398)[2025-05-02 18:12:25.862426] at org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:388)[2025-05-02 18:12:25.862437] at org.glassfish.jersey.server.ServerRuntime$AsyncResponder$3.run(ServerRuntime.java:926) [2025-05-02 18:12:25.862447] at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
[2025-05-02 18:12:25.862457] at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)[2025-05-02 18:12:25.862467] at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
[2025-05-02 18:12:25.862493] at org.glassfish.jersey.internal.Errors.process(Errors.java:274)[2025-05-02 18:12:25.862507] at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
[2025-05-02 18:12:25.862518] at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:266)
[2025-05-02 18:12:25.862529] at org.glassfish.jersey.server.ServerRuntime$AsyncResponder.resume(ServerRuntime.java:963)
[2025-05-02 18:12:25.862543] at org.glassfish.jersey.server.ServerRuntime$AsyncResponder.resume(ServerRuntime.java:914)
[2025-05-02 18:12:25.862555] at io.airlift.jaxrs.AsyncResponseHandler$1.onSuccess(AsyncResponseHandler.java:99)
[2025-05-02 18:12:25.862566] at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1139)
[2025-05-02 18:12:25.862585] at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:79)[2025-05-02 18:12:25.862599] at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
[2025-05-02 18:12:25.862611] at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
[2025-05-02 18:12:25.862619] at java.base/java.lang.Thread.run(Thread.java:1575)
[2025-05-02 18:12:25.862628] Caused by: org.eclipse.jetty.io.EofException
[2025-05-02 18:12:25.862640] at org.eclipse.jetty.io.SocketChannelEndPoint.flush(SocketChannelEndPoint.java:118)
[2025-05-02 18:12:25.862649] at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422)
[2025-05-02 18:12:25.862658] at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:275)
[2025-05-02 18:12:25.862668] at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:254)
[2025-05-02 18:12:25.862679] at org.eclipse.jetty.io.AbstractEndPoint.write(AbstractEndPoint.java:368)
[2025-05-02 18:12:25.862692] at org.eclipse.jetty.server.internal.HttpConnection$SendCallback.process(HttpConnection.java:834)
[2025-05-02 18:12:28.440831] at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:262)[2025-05-02 18:12:28.440843] at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:243)[2025-05-02 18:12:28.440856] at org.eclipse.jetty.server.internal.HttpConnection$HttpStreamOverHTTP1.send(HttpConnection.java:1427)[2025-05-02 18:12:28.440869] at org.eclipse.jetty.server.HttpStream$Wrapper.send(HttpStream.java:185)[2025-05-02 18:12:28.440883] at org.eclipse.jetty.server.internal.HttpChannelState$ChannelResponse.write(HttpChannelState.java:1300)[2025-05-02 18:12:28.440895] at org.eclipse.jetty.server.Response$Wrapper.write(Response.java:768)[2025-05-02 18:12:28.440908] at org.eclipse.jetty.server.handler.EventsHandler$EventsResponse.write(EventsHandler.java:353)[2025-05-02 18:12:28.440920] at org.eclipse.jetty.server.Response$Wrapper.write(Response.java:768)[2025-05-02 18:12:28.440932] at org.eclipse.jetty.server.handler.ContextResponse.write(ContextResponse.java:56)[2025-05-02 18:12:28.440946] at org.eclipse.jetty.ee10.servlet.ServletContextResponse.write(ServletContextResponse.java:288)[2025-05-02 18:12:28.440958] at org.eclipse.jetty.ee10.servlet.HttpOutput.channelWrite(HttpOutput.java:215)[2025-05-02 18:12:28.440972] at org.eclipse.jetty.ee10.servlet.HttpOutput.channelWrite(HttpOutput.java:200)[2025-05-02 18:12:28.440984] at org.eclipse.jetty.ee10.servlet.HttpOutput.write(HttpOutput.java:848)[2025-05-02 18:12:28.440997] at org.glassfish.jersey.servlet.internal.ResponseWriter$NonCloseableOutputStreamWrapper.write(ResponseWriter.java:298)[2025-05-02 18:12:28.441010] at java.base/java.io.ByteArrayOutputStream.writeTo(ByteArrayOutputStream.java:170)[2025-05-02 18:12:28.441023] at org.glassfish.jersey.message.internal.CommittingOutputStream.flushBuffer(CommittingOutputStream.java:286)[2025-05-02 18:12:28.441036] at org.glassfish.jersey.message.internal.CommittingOutputStream.commit(CommittingOutputStream.java:235)
[2025-05-02 18:12:28.441048] at org.glassfish.jersey.message.internal.CommittingOutputStream.close(CommittingOutputStream.java:250)
[2025-05-02 18:12:28.441060] at org.glassfish.jersey.message.internal.OutboundMessageContext.close(OutboundMessageContext.java:568)
[2025-05-02 18:12:28.441065] ... 19 more
[2025-05-02 18:12:28.441074] Caused by: java.io.IOException: Broken pipe
[2025-05-02 18:12:28.441084] at java.base/sun.nio.ch.SocketDispatcher.$$YJP$$writev0(Native Method)
[2025-05-02 18:12:28.441094] at java.base/sun.nio.ch.SocketDispatcher.writev0(SocketDispatcher.java)
[2025-05-02 18:12:28.441112] at java.base/sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:66)
[2025-05-02 18:12:28.441123] at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:227)
[2025-05-02 18:12:28.441131] at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:158)
[2025-05-02 18:12:28.441141] at java.base/sun.nio.ch.SocketChannelImpl.implWrite(SocketChannelImpl.java:597)
[2025-05-02 18:12:28.441156] at java.base/sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:631)
[2025-05-02 18:12:28.441169] at java.base/java.nio.channels.SocketChannel.write(SocketChannel.java:660)
[2025-05-02 18:12:28.441185] at org.eclipse.jetty.io.SocketChannelEndPoint.flush(SocketChannelEndPoint.java:112)
[2025-05-02 18:12:28.441192] ... 43 more
[2025-05-02 18:12:28.441199]
[2025-05-02 18:12:28.441203]
The TCP connections 1000X over time finally leading to FD exhaustion
Majority of the TCP connections are between workers and are all in a CLOSING state (which tracks with the stacktrace above that there seems to be an issue in closing the jetty connections).
We further observed that these org.eclipse.jetty.io.EofException with connection close is happening when pods are scaling up when autoscale is enable in trino workers.
same problem, I upgrade Trino version from 472 to 474, if I modify config internal-communication.http2.enabled=true, will this problem be fixed ? thank you
cc: @wendigo
This error has stopped after upgrading trino to 476 version.