remoting
remoting copied to clipboard
fix: add exponential backoff
trafficstars
I faced with DNS/timeout issues during agent initialization WebSocket mode
io.jenkins.remoting.shaded.javax.websocket.DeploymentException: Connection failed.
at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer$1.call(JdkClientContainer.java:187)
at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer$1.call(JdkClientContainer.java:107)
at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer.openClientSocket(JdkClientContainer.java:192)
at io.jenkins.remoting.shaded.org.glassfish.tyrus.client.ClientManager$3$1.run(ClientManager.java:647)
at io.jenkins.remoting.shaded.org.glassfish.tyrus.client.ClientManager$3.run(ClientManager.java:696)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at io.jenkins.remoting.shaded.org.glassfish.tyrus.client.ClientManager$SameThreadExecutorService.execute(ClientManager.java:849)
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
at io.jenkins.remoting.shaded.org.glassfish.tyrus.client.ClientManager.connectToServer(ClientManager.java:493)
at io.jenkins.remoting.shaded.org.glassfish.tyrus.client.ClientManager.connectToServer(ClientManager.java:337)
at hudson.remoting.Engine.runWebSocket(Engine.java:656)
at hudson.remoting.Engine.run(Engine.java:495)
Caused by: java.nio.channels.UnresolvedAddressException
at sun.nio.ch.Net.checkAddress(Net.java:104)
at sun.nio.ch.UnixAsynchronousSocketChannelImpl.implConnect(UnixAsynchronousSocketChannelImpl.java:302)
at sun.nio.ch.AsynchronousSocketChannelImpl.connect(AsynchronousSocketChannelImpl.java:210)
at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.TransportFilter.handleConnect(TransportFilter.java:184)
at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.Filter.connect(Filter.java:80)
at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.Filter.connect(Filter.java:83)
at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.Filter.connect(Filter.java:83)
at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.ClientFilter.connect(ClientFilter.java:99)
at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer.connectSynchronously(JdkClientContainer.java:326)
at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer.access$700(JdkClientContainer.java:58)
at io.jenkins.remoting.shaded.org.glassfish.tyrus.container.jdk.client.JdkClientContainer$1.call(JdkClientContainer.java:156)
... 12 more
TCP socket mode
java.io.IOException: Failed to connect to http://jenkins-url/tcpSlaveAgentListener/: connect timed out
at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:214)
at hudson.remoting.Engine.innerRun(Engine.java:733)
at hudson.remoting.Engine.run(Engine.java:539)
Caused by: java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
at sun.net.www.http.HttpClient.New(HttpClient.java:339)
at sun.net.www.http.HttpClient.New(HttpClient.java:357)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1228)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990)
at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:211)
... 2 more
I'm trying to detect exact issue. Seems it relates to some K8s/JDK networking corner case. Anyway the PR introduces exponential backoff workaround which resolves the issue on source code level. It can help to avoid potential network issues.
Merge conflict, possibly with #603 etc. The idea is fine but this is probably a candidate for closing unless the author still has time and interest in cleaning it up.