Maven builds sometimes fail with connection resets

Open briandealwis opened this issue 3 years ago • 5 comments

Seen on Travis surprisingly often:

        [builder] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on project hello-spring-boot: Execution default-test of goal org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test failed: Plugin org.apache.maven.plugins:maven-surefire-plugin:2.21.0 or one of its dependencies could not be resolved: The following artifacts could not be resolved: org.apache.maven.surefire:surefire-api:jar:2.21.0, org.apache.maven.surefire:surefire-logger-api:jar:2.21.0: Could not transfer artifact org.apache.maven.surefire:surefire-api:jar:2.21.0 from/to central (https://repo.maven.apache.org/maven2): Transfer failed for https://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-api/2.21.0/surefire-api-2.21.0.jar: Connection reset -> [Help 1]

Similar complaints abound on the internet. The suspicion is that Maven Central's CDN times out connections, which results in a connection reset.

Some relevant reading:

briandealwis avatar Jan 20 '21 04:01 briandealwis

Still being seen, so setting the connection TTL isn't sufficient.

        [builder] [ERROR] Failed to execute goal org.springframework.boot:spring-boot-maven-plugin:2.0.5.RELEASE:repackage (default) on project hello-spring-boot: Execution default of goal org.springframework.boot:spring-boot-maven-plugin:2.0.5.RELEASE:repackage failed: Plugin org.springframework.boot:spring-boot-maven-plugin:2.0.5.RELEASE or one of its dependencies could not be resolved: Could not transfer artifact org.apache.maven:maven-archiver:jar:2.6 from/to central (https://repo.maven.apache.org/maven2): Transfer failed for https://repo.maven.apache.org/maven2/org/apache/maven/maven-archiver/2.6/maven-archiver-2.6.jar: Connection reset -> [Help 1]

briandealwis avatar Jan 21 '21 19:01 briandealwis

Maven Wagon HTTP defaults:

  • maven.wagon.http.retryHandler.class = standard
  • maven.wagon.http.retryHandler.requestSentEnabled = false
  • maven.wagon.http.retryHandler.count = 3

The default retry handler (DefaultHttpRequestRetryHandler) defaults to not retrying the following classes: InterruptedIOException, UnknownHostException, ConnectException, SSLException
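To make the interplay of these settings concrete, here is a minimal sketch of the retry decision as described above. It is a simplified model, not HttpClient's or Wagon's actual code: the class name RetryDecisionSketch and the method shouldRetry are hypothetical, and the sketch leaves out other checks that the real handler may perform.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.net.ConnectException;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.List;
import javax.net.ssl.SSLException;

// Hypothetical, simplified model of the retry decision; NOT the library's code.
class RetryDecisionSketch {
    // The exception types listed above as non-retryable by default.
    static final List<Class<? extends IOException>> NON_RETRYABLE = List.of(
            InterruptedIOException.class,
            UnknownHostException.class,
            ConnectException.class,
            SSLException.class);

    final int retryCount;             // models maven.wagon.http.retryHandler.count
    final boolean requestSentEnabled; // models maven.wagon.http.retryHandler.requestSentEnabled

    RetryDecisionSketch(int retryCount, boolean requestSentEnabled) {
        this.retryCount = retryCount;
        this.requestSentEnabled = requestSentEnabled;
    }

    // executionCount is the number of attempts made so far (1 after the first failure).
    boolean shouldRetry(IOException failure, int executionCount, boolean requestWasSent) {
        if (executionCount > retryCount) {
            return false; // retry budget exhausted
        }
        for (Class<? extends IOException> type : NON_RETRYABLE) {
            if (type.isInstance(failure)) {
                return false; // this failure type is never retried
            }
        }
        // Assumption of this sketch: a request that already went out on the wire
        // is only retried when requestSentEnabled is true.
        return !requestWasSent || requestSentEnabled;
    }

    public static void main(String[] args) {
        IOException reset = new SocketException("Connection reset");
        RetryDecisionSketch defaults = new RetryDecisionSketch(3, false);
        RetryDecisionSketch tuned = new RetryDecisionSketch(3, true);

        System.out.println(defaults.shouldRetry(reset, 1, true)); // false
        System.out.println(tuned.shouldRetry(reset, 1, true));    // true
        System.out.println(tuned.shouldRetry(reset, 4, true));    // false
    }
}
```

In this model, a java.net.SocketException is not an instance of any class on the default list, so whether a reset is retried comes down to the request-sent flag and the remaining retry count.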

briandealwis avatar Jan 21 '21 19:01 briandealwis

Placing the following into <top-level>/.mvn/jvm.config, and ensuring you're using Maven ≥ 3.6.1 (which has fixes for retry behaviour), seems to do the trick:

-Dmaven.wagon.httpconnectionManager.ttlSeconds=120 -Dmaven.wagon.http.retryHandler.requestSentEnabled=true

This changes the Wagon HTTP connection pool to close connections after 2 minutes of inactivity, but more importantly causes Wagon HTTP to retry even if the request seemed to be successfully sent. I'm no expert here, but I believe the remote is silently dropping the connection. Since most HTTP GET requests will fit into a single packet, the request will appear to be successful, and the subsequent RST from the other side will result in the connection reset.
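The failure mode described above (the write succeeds, then the peer's RST surfaces on the next read) can be reproduced locally. The following is a rough sketch under stated assumptions: it uses a loopback server that reads the request and then closes with SO_LINGER set to 0, which makes close() send an RST on common platforms. It does not involve Maven or Maven Central, and the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical local demo: the peer resets the connection after the request was sent.
class ConnectionResetDemo {
    // Returns the exception the client sees when reading, or null if the read succeeded.
    static IOException readAfterPeerReset() {
        byte[] request = "GET / HTTP/1.1\r\nHost: localhost\r\n\r\n"
                .getBytes(StandardCharsets.US_ASCII);
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread peer = new Thread(() -> {
                try (Socket s = server.accept()) {
                    // Wait until the whole request has arrived, so the client's write succeeds.
                    s.getInputStream().readNBytes(request.length);
                    // With linger time 0, closing the socket sends RST instead of FIN.
                    s.setSoLinger(true, 0);
                } catch (IOException ignored) {
                    // Nothing to do in this demo; the client side reports what happened.
                }
            });
            peer.start();

            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                OutputStream out = client.getOutputStream();
                out.write(request);
                out.flush();  // the request appears to have been sent successfully
                peer.join();  // by now the peer has dropped the connection
                try {
                    client.getInputStream().read();
                    return null;
                } catch (IOException e) {
                    return e; // the reset only becomes visible here
                }
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException("demo setup failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterPeerReset());
    }
}
```

On typical JVMs the returned exception is a java.net.SocketException, which matches what the stack traces in this issue suggest; the exact message text may vary by platform.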

If you are using the retry handler named default (which, despite its name, is not the one used by default; that is standard), you may need to explicitly set the maven.wagon.http.retryHandler.nonRetryableClasses property, as stack traces suggest the connection reset is raised as a java.net.SocketException.

briandealwis avatar Jan 22 '21 17:01 briandealwis

Still happening in the buildpacks tests (log from #5279)

briandealwis avatar Jan 23 '21 04:01 briandealwis

@briandealwis I think the standard retryHandler simply uses the DefaultHttpRequestRetryHandler and so won't retry on the same list of non-retryable classes. You probably need to set maven.wagon.http.retryHandler.nonRetryableClasses to something like the following to allow retries on java.net.ConnectException?

java.io.InterruptedIOException,java.net.UnknownHostException,java.net.NoRouteToHostException,javax.net.ssl.SSLException
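As a rough illustration of what such a list changes, the sketch below parses a comma-separated value like the one above and checks exceptions against it with an instance-of test. How Wagon actually parses and applies the property is an assumption here; NonRetryableClassesSketch and its methods are hypothetical names used only for this example.

```java
import java.net.ConnectException;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a non-retryable class list; not Wagon's implementation.
class NonRetryableClassesSketch {
    // Turns "a.b.C,d.e.F" into the corresponding Class objects.
    static List<Class<?>> parse(String property) {
        List<Class<?>> classes = new ArrayList<>();
        for (String name : property.split(",")) {
            try {
                classes.add(Class.forName(name.trim()));
            } catch (ClassNotFoundException e) {
                throw new IllegalArgumentException("Unknown class: " + name, e);
            }
        }
        return classes;
    }

    // A failure is retryable if it is not an instance of any listed class.
    static boolean isRetryable(Throwable failure, List<Class<?>> nonRetryable) {
        return nonRetryable.stream().noneMatch(type -> type.isInstance(failure));
    }

    public static void main(String[] args) {
        List<Class<?>> defaults = parse("java.io.InterruptedIOException,"
                + "java.net.UnknownHostException,java.net.ConnectException,"
                + "javax.net.ssl.SSLException");
        List<Class<?>> proposed = parse("java.io.InterruptedIOException,"
                + "java.net.UnknownHostException,java.net.NoRouteToHostException,"
                + "javax.net.ssl.SSLException");

        Throwable timeout = new ConnectException("Connection timed out");
        Throwable reset = new SocketException("Connection reset");

        System.out.println(isRetryable(timeout, defaults)); // false
        System.out.println(isRetryable(timeout, proposed)); // true
        System.out.println(isRetryable(reset, proposed));   // true
    }
}
```

Because the check is by instance, subclasses of a listed class are excluded too: java.net.ConnectException extends java.net.SocketException, but a plain SocketException is not matched by a ConnectException entry.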

Update: I just realized the issue here is about connection resets. My comment above was in fact about a connection timeout issue I'm investigating. It has a different stack trace:

Caused by: java.net.ConnectException: Connection timed out (Connection timed out)
at java.net.PlainSocketImpl.socketConnect (Native Method)
at java.net.AbstractPlainSocketImpl.doConnect (AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress (AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect (AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect (SocksSocketImpl.java:392)
at java.net.Socket.connect (Socket.java:607)

(Feel free to delete my off-topic comment.)

hligit avatar Aug 07 '22 12:08 hligit