spring-native icon indicating copy to clipboard operation
spring-native copied to clipboard

WebClient fails with handshake timed out after 10000ms after system restart even tho program terminates in 2369ms

Open thekalinga opened this issue 1 year ago • 21 comments

I am seeing a strange issue with native image when using webclient. Its consistently happening in my system

  1. If I run the program without native build (i.e on JVM), the program doesnt throw exception
  2. It doesnt fail if I build the native image & run executable i.e immediately after build but before system restart
  3. But if I restart the system & run native executable (thats built before system restart), it fails with this exception, while jar based version doesnt fail
time /home/x/dev/spring-native-netty-timeout-bug/build/native/nativeCompile/spring-native-netty-timeout-bug 
2022-08-21 14:55:44.016  INFO 3812 --- [           main] o.s.nativex.NativeListener               : AOT mode enabled
2022-08-21 14:55:44.017  INFO 3812 --- [           main] com.acme.DemoApplication                 : Starting DemoApplication using Java 17.0.4 on playground with PID 3812 (/home/x/dev/spring-native-netty-timeout-bug/build/native/nativeCompile/spring-native-netty-timeout-bug started by x in /home/x)
2022-08-21 14:55:44.017 DEBUG 3812 --- [           main] com.acme.DemoApplication                 : Running with Spring Boot v2.7.1, Spring v5.3.21
2022-08-21 14:55:44.017  INFO 3812 --- [           main] com.acme.DemoApplication                 : No active profile set, falling back to 1 default profile: "default"
2022-08-21 14:55:44.048  INFO 3812 --- [           main] com.acme.DemoApplication                 : Started DemoApplication in 0.045 seconds (JVM running for 0.047)
2022-08-21 14:55:44.056 DEBUG 3812 --- [ctor-http-nio-2] com.acme.ApiAccessor                     : [d8c063ba] REGISTERED
2022-08-21 14:55:44.075 DEBUG 3812 --- [ctor-http-nio-2] com.acme.ApiAccessor                     : [d8c063ba] CONNECT: httpbin.org/52.87.105.151:443
2022-08-21 14:55:44.329 DEBUG 3812 --- [ctor-http-nio-2] com.acme.ApiAccessor                     : [d8c063ba, L:/192.168.1.7:48060 - R:httpbin.org/52.87.105.151:443] ACTIVE
2022-08-21 14:55:44.329 DEBUG 3812 --- [ctor-http-nio-2] com.acme.ApiAccessor                     : [d8c063ba, L:/192.168.1.7:48060 - R:httpbin.org/52.87.105.151:443] USER_EVENT: SslHandshakeCompletionEvent(io.netty.handler.ssl.SslHandshakeTimeoutException: handshake timed out after 10000ms)
2022-08-21 14:55:44.330  WARN 3812 --- [ctor-http-nio-2] r.netty.http.client.HttpClientConnect    : [d8c063ba, L:/192.168.1.7:48060 - R:httpbin.org/52.87.105.151:443] The connection observed an error

io.netty.handler.ssl.SslHandshakeTimeoutException: handshake timed out after 10000ms
	at io.netty.handler.ssl.SslHandler$7.run(SslHandler.java:2115) ~[na:na]
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) ~[na:na]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[na:na]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[na:na]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[na:na]
	at java.lang.Thread.run(Thread.java:833) ~[spring-native-netty-timeout-bug:na]
	at com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:705) ~[spring-native-netty-timeout-bug:na]
	at com.oracle.svm.core.posix.thread.PosixPlatformThreads.pthreadStartRoutine(PosixPlatformThreads.java:202) ~[na:na]

2022-08-21 14:55:44.330 DEBUG 3812 --- [ctor-http-nio-2] com.acme.ApiAccessor                     : [d8c063ba, L:/192.168.1.7:48060 ! R:httpbin.org/52.87.105.151:443] USER_EVENT: SslCloseCompletionEvent(java.nio.channels.ClosedChannelException)
2022-08-21 14:55:44.330 DEBUG 3812 --- [ctor-http-nio-2] com.acme.ApiAccessor                     : [d8c063ba, L:/192.168.1.7:48060 ! R:httpbin.org/52.87.105.151:443] INACTIVE
2022-08-21 14:55:44.331 DEBUG 3812 --- [ctor-http-nio-2] com.acme.ApiAccessor                     : [d8c063ba, L:/192.168.1.7:48060 ! R:httpbin.org/52.87.105.151:443] UNREGISTERED
2022-08-21 14:55:44.331 ERROR 3812 --- [           main] o.s.boot.SpringApplication               : Application run failed

java.lang.IllegalStateException: Failed to execute ApplicationRunner
	at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:765) ~[spring-native-netty-timeout-bug:2.7.1]
	at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:752) ~[spring-native-netty-timeout-bug:2.7.1]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:315) ~[spring-native-netty-timeout-bug:2.7.1]
	at org.springframework.boot.builder.SpringApplicationBuilder.run(SpringApplicationBuilder.java:164) ~[na:na]
	at com.acme.DemoApplication.main(DemoApplication.java:22) ~[spring-native-netty-timeout-bug:na]
Caused by: org.springframework.web.reactive.function.client.WebClientRequestException: handshake timed out after 10000ms; nested exception is io.netty.handler.ssl.SslHandshakeTimeoutException: handshake timed out after 10000ms
	at org.springframework.web.reactive.function.client.ExchangeFunctions$DefaultExchangeFunction.lambda$wrapException$9(ExchangeFunctions.java:141) ~[na:na]
	Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException: 
Error has been observed at the following site(s):
	*__checkpoint ⇢ Request to GET https://httpbin.org/get [DefaultWebClient]
Original Stack Trace:
		at org.springframework.web.reactive.function.client.ExchangeFunctions$DefaultExchangeFunction.lambda$wrapException$9(ExchangeFunctions.java:141) ~[na:na]
		at reactor.core.publisher.MonoErrorSupplied.subscribe(MonoErrorSupplied.java:55) ~[na:na]
		at reactor.core.publisher.Mono.subscribe(Mono.java:4397) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:103) ~[na:na]
		at reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222) ~[na:na]
		at reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222) ~[na:na]
		at reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222) ~[na:na]
		at reactor.core.publisher.MonoNext$NextSubscriber.onError(MonoNext.java:93) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain.onError(MonoFlatMapMany.java:204) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.SerializedSubscriber.onError(SerializedSubscriber.java:124) ~[na:na]
		at reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.whenError(FluxRetryWhen.java:225) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.FluxRetryWhen$RetryWhenOtherSubscriber.onError(FluxRetryWhen.java:274) ~[na:na]
		at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.drain(FluxConcatMap.java:415) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.onNext(FluxConcatMap.java:251) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.EmitterProcessor.drain(EmitterProcessor.java:537) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.EmitterProcessor.tryEmitNext(EmitterProcessor.java:343) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.SinkManySerialized.tryEmitNext(SinkManySerialized.java:100) ~[na:na]
		at reactor.core.publisher.InternalManySink.emitNext(InternalManySink.java:27) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.onError(FluxRetryWhen.java:190) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.MonoCreate$DefaultMonoSink.error(MonoCreate.java:201) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.netty.http.client.HttpClientConnect$MonoHttpConnect$ClientTransportSubscriber.onError(HttpClientConnect.java:308) ~[na:na]
		at reactor.core.publisher.MonoCreate$DefaultMonoSink.error(MonoCreate.java:201) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.netty.resources.DefaultPooledConnectionProvider$DisposableAcquire.onUncaughtException(DefaultPooledConnectionProvider.java:204) ~[na:na]
		at reactor.netty.resources.DefaultPooledConnectionProvider$PooledConnection.onUncaughtException(DefaultPooledConnectionProvider.java:454) ~[spring-native-netty-timeout-bug:1.0.20]
		at reactor.netty.channel.ChannelOperationsHandler.exceptionCaught(ChannelOperationsHandler.java:129) ~[na:na]
		at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:273) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireExceptionCaught(CombinedChannelDuplexHandler.java:424) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.channel.ChannelHandlerAdapter.exceptionCaught(ChannelHandlerAdapter.java:92) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.channel.CombinedChannelDuplexHandler$1.fireExceptionCaught(CombinedChannelDuplexHandler.java:145) ~[na:na]
		at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:143) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.channel.CombinedChannelDuplexHandler.exceptionCaught(CombinedChannelDuplexHandler.java:231) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:273) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at reactor.netty.tcp.SslProvider$SslReadHandler.userEventTriggered(SslProvider.java:850) ~[na:na]
		at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:346) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:332) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:324) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.handler.logging.LoggingHandler.userEventTriggered(LoggingHandler.java:222) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:346) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:332) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:324) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.handler.ssl.SslUtils.handleHandshakeFailure(SslUtils.java:443) ~[na:na]
		at io.netty.handler.ssl.SslHandler$7.run(SslHandler.java:2119) ~[na:na]
		at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) ~[na:na]
		at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[na:na]
		at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[na:na]
		at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[na:na]
		at java.lang.Thread.run(Thread.java:833) ~[spring-native-netty-timeout-bug:na]
		at com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:705) ~[spring-native-netty-timeout-bug:na]
		at com.oracle.svm.core.posix.thread.PosixPlatformThreads.pthreadStartRoutine(PosixPlatformThreads.java:202) ~[na:na]
	Suppressed: java.lang.Exception: #block terminated with an error
		at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:99) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.Mono.block(Mono.java:1707) ~[spring-native-netty-timeout-bug:3.4.19]
		at com.acme.DemoApplication.lambda$onInit$0(DemoApplication.java:27) ~[spring-native-netty-timeout-bug:na]
		at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:762) ~[spring-native-netty-timeout-bug:2.7.1]
		at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:752) ~[spring-native-netty-timeout-bug:2.7.1]
		at org.springframework.boot.SpringApplication.run(SpringApplication.java:315) ~[spring-native-netty-timeout-bug:2.7.1]
		at org.springframework.boot.builder.SpringApplicationBuilder.run(SpringApplicationBuilder.java:164) ~[na:na]
		at com.acme.DemoApplication.main(DemoApplication.java:22) ~[spring-native-netty-timeout-bug:na]
Caused by: io.netty.handler.ssl.SslHandshakeTimeoutException: handshake timed out after 10000ms
	at io.netty.handler.ssl.SslHandler$7.run(SslHandler.java:2115) ~[na:na]
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) ~[na:na]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[na:na]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[na:na]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[na:na]
	at java.lang.Thread.run(Thread.java:833) ~[spring-native-netty-timeout-bug:na]
	at com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:705) ~[spring-native-netty-timeout-bug:na]
	at com.oracle.svm.core.posix.thread.PosixPlatformThreads.pthreadStartRoutine(PosixPlatformThreads.java:202) ~[na:na]

   0.04s  user 0.05s system 3% cpu 2.346 total

When I run same program with jar, I get following output

java -jar /home/x/dev/spring-native-netty-timeout-bug/build/libs/spring-native-netty-timeout-bug-0.0.1-SNAPSHOT.jar
2022-08-21 15:10:27.820  INFO 5865 --- [           main] o.s.nativex.NativeListener               : AOT mode disabled
2022-08-21 15:10:27.940  INFO 5865 --- [           main] com.acme.DemoApplication                 : Starting DemoApplication using Java 17.0.4 on playground with PID 5865 (/home/x/dev/spring-native-netty-timeout-bug/build/libs/spring-native-netty-timeout-bug-0.0.1-SNAPSHOT.jar started by x in /home/x/dev/spring-native-netty-timeout-bug)
2022-08-21 15:10:27.940 DEBUG 5865 --- [           main] com.acme.DemoApplication                 : Running with Spring Boot v2.7.1, Spring v5.3.21
2022-08-21 15:10:27.942  INFO 5865 --- [           main] com.acme.DemoApplication                 : No active profile set, falling back to 1 default profile: "default"
2022-08-21 15:10:29.256  INFO 5865 --- [           main] com.acme.DemoApplication                 : Started DemoApplication in 2.044 seconds (JVM running for 2.725)
2022-08-21 15:10:30.209 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b] REGISTERED
2022-08-21 15:10:30.283 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b] CONNECT: httpbin.org/34.227.213.82:443
2022-08-21 15:10:30.607 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b, L:/192.168.1.7:60608 - R:httpbin.org/34.227.213.82:443] ACTIVE
2022-08-21 15:10:30.859 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b, L:/192.168.1.7:60608 - R:httpbin.org/34.227.213.82:443] READ COMPLETE
2022-08-21 15:10:30.964 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b, L:/192.168.1.7:60608 - R:httpbin.org/34.227.213.82:443] READ COMPLETE
2022-08-21 15:10:31.216 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b, L:/192.168.1.7:60608 - R:httpbin.org/34.227.213.82:443] USER_EVENT: SslHandshakeCompletionEvent(SUCCESS)
2022-08-21 15:10:31.247 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b-1, L:/192.168.1.7:60608 - R:httpbin.org/34.227.213.82:443] WRITE: 99B GET /get HTTP/1.1
user-agent: ReactorNetty/1.0.20
host: httpbin.org
Accept: application/json


2022-08-21 15:10:31.248 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b-1, L:/192.168.1.7:60608 - R:httpbin.org/34.227.213.82:443] FLUSH
2022-08-21 15:10:31.252 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b-1, L:/192.168.1.7:60608 - R:httpbin.org/34.227.213.82:443] WRITE: 0B
2022-08-21 15:10:31.252 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b-1, L:/192.168.1.7:60608 - R:httpbin.org/34.227.213.82:443] FLUSH
2022-08-21 15:10:31.253 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b-1, L:/192.168.1.7:60608 - R:httpbin.org/34.227.213.82:443] READ COMPLETE
2022-08-21 15:10:31.500 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b-1, L:/192.168.1.7:60608 - R:httpbin.org/34.227.213.82:443] READ: 506B HTTP/1.1 200 OK
Date: Sun, 21 Aug 2022 09:40:31 GMT
Content-Type: application/json
Content-Length: 276
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true

{
  "args": {}, 
  "headers": {
    "Accept": "application/json", 
    "Host": "httpbin.org", 
    "User-Agent": "ReactorNetty/1.0.20", 
    "X-Amzn-Trace-Id": "Root=1-6301fd8f-0d26170512f1dca7432ea37a"
  }, 
  "origin": "122.171.23.69", 
  "url": "https://httpbin.org/get"
}

2022-08-21 15:10:31.629 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b, L:/192.168.1.7:60608 - R:httpbin.org/34.227.213.82:443] READ COMPLETE
GetResponse[headers=Headers[accept=application/json]]
2022-08-21 15:10:31.650 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b, L:/192.168.1.7:60608 ! R:httpbin.org/34.227.213.82:443] CLOSE
2022-08-21 15:10:31.653 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b, L:/192.168.1.7:60608 ! R:httpbin.org/34.227.213.82:443] USER_EVENT: SslCloseCompletionEvent(java.nio.channels.ClosedChannelException)
2022-08-21 15:10:31.653 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b, L:/192.168.1.7:60608 ! R:httpbin.org/34.227.213.82:443] INACTIVE
2022-08-21 15:10:31.653 DEBUG 5865 --- [or-http-epoll-2] com.acme.ApiAccessor                     : [3266e03b, L:/192.168.1.7:60608 ! R:httpbin.org/34.227.213.82:443] UNREGISTERED

I see this consistently happening after I restart the system. The binary doesnt work even after multiple runs. I even tried running it after I ran jar version on JVM (to see if DNS resolution at OS level caused this issue — my uneducated guess work as I am not sure if netty does DNS resolution by itself/delegates to OS :D). But it ran successfully when I created the image initally using ./gradlew nativeCompile before I restarted the system.

Version info

  1. Java
openjdk version "17.0.4" 2022-07-19
OpenJDK Runtime Environment Temurin-17.0.4+8 (build 17.0.4+8)
OpenJDK 64-Bit Server VM Temurin-17.0.4+8 (build 17.0.4+8, mixed mode, sharing)
  1. OS Ubuntu 20.04.4 LTS
  2. Graal VM 22.2.r17-grl
  3. Spring & Gradle versions can be found at this minimum reproducible sample

thekalinga avatar Aug 21 '22 09:08 thekalinga

Hey, interesting problem. Can you please put

logging.level.io.netty=TRACE

in your application.properties? Maybe the netty logs will tell us the cause of this.

mhalbritter avatar Aug 22 '22 09:08 mhalbritter

So, i can reproduce it on a Fedora 35 Linux.

After i added a ton of hints to clean the netty trace output:

@TypeHint(typeNames= {
		"io.netty.channel.DefaultChannelPipeline$TailContext",
		"io.netty.channel.DefaultChannelPipeline$HeadContext",
		"io.netty.handler.ssl.SslHandler",
		"io.netty.handler.codec.http.HttpClientCodec",
		"io.netty.resolver.dns.DnsNameResolver$1",
		"io.netty.resolver.dns.DnsNameResolver$3",
		"io.netty.handler.codec.dns.DatagramDnsQueryEncoder",
		"io.netty.resolver.dns.DnsNameResolver$DnsResponseHandler",
		"reactor.netty.resources.DefaultPooledConnectionProvider$PooledConnectionAllocator$PooledConnectionInitializer",
		"reactor.netty.transport.TransportConfig$TransportChannelInitializer",
		"reactor.netty.transport.logging.ReactorNettyLoggingHandler",
		"reactor.netty.channel.ChannelOperationsHandler",
		"reactor.netty.tcp.SslProvider$SslReadHandler"
}, access = { TypeAccess.PUBLIC_METHODS })

I now get this log after building the image, verifying that it works until reboot, reboot and then it fails with:

2022-08-22 11:36:16.774  INFO 2504 --- [           main] o.s.nativex.NativeListener               : AOT mode enabled
2022-08-22 11:36:16.775  INFO 2504 --- [           main] com.acme.DemoApplication                 : Starting DemoApplication using Java 17.0.4 on fedora with PID 2504 (/home/moe/Downloads/issue-projects/gh-1698/build/native/nativeCompile/spring-native-netty-timeout-bug started by moe in /home/moe/Downloads/issue-projects/gh-1698)
2022-08-22 11:36:16.775 DEBUG 2504 --- [           main] com.acme.DemoApplication                 : Running with Spring Boot v2.7.1, Spring v5.3.21
2022-08-22 11:36:16.775  INFO 2504 --- [           main] com.acme.DemoApplication                 : No active profile set, falling back to 1 default profile: "default"
2022-08-22 11:36:16.796  INFO 2504 --- [           main] com.acme.DemoApplication                 : Started DemoApplication in 0.033 seconds (JVM running for 0.081)
2022-08-22 11:36:16.800 TRACE 2504 --- [           main] io.netty.channel.nio.NioEventLoop        : instrumented a special java.util.Set into: sun.nio.ch.EPollSelectorImpl@1b6f9dda
2022-08-22 11:36:16.800 TRACE 2504 --- [           main] io.netty.channel.nio.NioEventLoop        : instrumented a special java.util.Set into: sun.nio.ch.EPollSelectorImpl@451553f0
2022-08-22 11:36:16.800 TRACE 2504 --- [           main] io.netty.channel.nio.NioEventLoop        : instrumented a special java.util.Set into: sun.nio.ch.EPollSelectorImpl@230754b
2022-08-22 11:36:16.800 TRACE 2504 --- [           main] io.netty.channel.nio.NioEventLoop        : instrumented a special java.util.Set into: sun.nio.ch.EPollSelectorImpl@518815f9
2022-08-22 11:36:16.800 TRACE 2504 --- [           main] io.netty.channel.nio.NioEventLoop        : instrumented a special java.util.Set into: sun.nio.ch.EPollSelectorImpl@758d51aa
2022-08-22 11:36:16.800 TRACE 2504 --- [           main] io.netty.channel.nio.NioEventLoop        : instrumented a special java.util.Set into: sun.nio.ch.EPollSelectorImpl@6f122867
2022-08-22 11:36:16.800 TRACE 2504 --- [           main] io.netty.channel.nio.NioEventLoop        : instrumented a special java.util.Set into: sun.nio.ch.EPollSelectorImpl@11731641
2022-08-22 11:36:16.800 TRACE 2504 --- [           main] io.netty.channel.nio.NioEventLoop        : instrumented a special java.util.Set into: sun.nio.ch.EPollSelectorImpl@272b7d46
2022-08-22 11:36:16.800 TRACE 2504 --- [           main] io.netty.channel.nio.NioEventLoop        : instrumented a special java.util.Set into: sun.nio.ch.EPollSelectorImpl@472d096e
2022-08-22 11:36:16.801 DEBUG 2504 --- [           main] io.netty.channel.DefaultChannelId        : -Dio.netty.processId: 2504 (auto-detected)
2022-08-22 11:36:16.802 DEBUG 2504 --- [           main] io.netty.channel.DefaultChannelId        : -Dio.netty.machineId: 00:0c:29:ff:fe:8a:cd:55 (auto-detected)
2022-08-22 11:36:16.802 DEBUG 2504 --- [           main] io.netty.buffer.PooledByteBufAllocator   : -Dio.netty.allocator.numHeapArenas: 18
2022-08-22 11:36:16.802 DEBUG 2504 --- [           main] io.netty.buffer.PooledByteBufAllocator   : -Dio.netty.allocator.numDirectArenas: 18
2022-08-22 11:36:16.802 DEBUG 2504 --- [           main] io.netty.buffer.PooledByteBufAllocator   : -Dio.netty.allocator.pageSize: 8192
2022-08-22 11:36:16.802 DEBUG 2504 --- [           main] io.netty.buffer.PooledByteBufAllocator   : -Dio.netty.allocator.maxOrder: 9
2022-08-22 11:36:16.802 DEBUG 2504 --- [           main] io.netty.buffer.PooledByteBufAllocator   : -Dio.netty.allocator.chunkSize: 4194304
2022-08-22 11:36:16.802 DEBUG 2504 --- [           main] io.netty.buffer.PooledByteBufAllocator   : -Dio.netty.allocator.smallCacheSize: 256
2022-08-22 11:36:16.802 DEBUG 2504 --- [           main] io.netty.buffer.PooledByteBufAllocator   : -Dio.netty.allocator.normalCacheSize: 64
2022-08-22 11:36:16.802 DEBUG 2504 --- [           main] io.netty.buffer.PooledByteBufAllocator   : -Dio.netty.allocator.maxCachedBufferCapacity: 32768
2022-08-22 11:36:16.802 DEBUG 2504 --- [           main] io.netty.buffer.PooledByteBufAllocator   : -Dio.netty.allocator.cacheTrimInterval: 8192
2022-08-22 11:36:16.802 DEBUG 2504 --- [           main] io.netty.buffer.PooledByteBufAllocator   : -Dio.netty.allocator.cacheTrimIntervalMillis: 0
2022-08-22 11:36:16.802 DEBUG 2504 --- [           main] io.netty.buffer.PooledByteBufAllocator   : -Dio.netty.allocator.useCacheForAllThreads: false
2022-08-22 11:36:16.802 DEBUG 2504 --- [           main] io.netty.buffer.PooledByteBufAllocator   : -Dio.netty.allocator.maxCachedByteBuffersPerChunk: 1023
2022-08-22 11:36:16.803 DEBUG 2504 --- [           main] io.netty.buffer.ByteBufUtil              : -Dio.netty.allocator.type: pooled
2022-08-22 11:36:16.803 DEBUG 2504 --- [           main] io.netty.buffer.ByteBufUtil              : -Dio.netty.threadLocalDirectBufferSize: 0
2022-08-22 11:36:16.803 DEBUG 2504 --- [           main] io.netty.buffer.ByteBufUtil              : -Dio.netty.maxThreadLocalCharBufferSize: 16384
2022-08-22 11:36:16.804 DEBUG 2504 --- [ctor-http-nio-2] com.acme.ApiAccessor                     : [e9657467] REGISTERED
2022-08-22 11:36:16.804 DEBUG 2504 --- [ctor-http-nio-1] i.n.util.ResourceLeakDetectorFactory     : Loaded default ResourceLeakDetector: io.netty.util.ResourceLeakDetector@16f0ef6
2022-08-22 11:36:16.805 DEBUG 2504 --- [ctor-http-nio-1] io.netty.resolver.dns.DnsQueryContext    : [id: 0x7fa99b43] WRITE: UDP, [12958: /127.0.0.53:53], DefaultDnsQuestion(httpbin.org. IN A)
2022-08-22 11:36:16.807 DEBUG 2504 --- [ctor-http-nio-1] io.netty.resolver.dns.DnsQueryContext    : [id: 0x7fa99b43] WRITE: UDP, [32900: /127.0.0.53:53], DefaultDnsQuestion(httpbin.org. IN AAAA)
2022-08-22 11:36:16.831 DEBUG 2504 --- [ctor-http-nio-1] io.netty.resolver.dns.DnsNameResolver    : [id: 0x7fa99b43] RECEIVED: UDP [32900: /127.0.0.53:53], DatagramDnsResponse(from: /127.0.0.53:53, to: /[0:0:0:0:0:0:0:0]:33579, 32900, QUERY(0), NoError(0), RD RA)
	DefaultDnsQuestion(httpbin.org. IN AAAA)
	DefaultDnsRawRecord(httpbin.org. 153 IN SOA 70B)
	DefaultDnsRawRecord(OPT flags:0 udp:65494 0B)
2022-08-22 11:36:16.831 DEBUG 2504 --- [ctor-http-nio-2] com.acme.ApiAccessor                     : [e9657467] CLOSE
2022-08-22 11:36:16.831 DEBUG 2504 --- [ctor-http-nio-2] com.acme.ApiAccessor                     : [e9657467] UNREGISTERED
2022-08-22 11:36:16.833 ERROR 2504 --- [           main] o.s.boot.SpringApplication               : Application run failed

java.lang.IllegalStateException: Failed to execute ApplicationRunner
	at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:765) ~[spring-native-netty-timeout-bug:2.7.1]
	at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:752) ~[spring-native-netty-timeout-bug:2.7.1]
	at org.springframework.boot.SpringApplication.run(SpringApplication.java:315) ~[spring-native-netty-timeout-bug:2.7.1]
	at org.springframework.boot.builder.SpringApplicationBuilder.run(SpringApplicationBuilder.java:164) ~[na:na]
	at com.acme.DemoApplication.main(DemoApplication.java:37) ~[spring-native-netty-timeout-bug:na]
Caused by: org.springframework.web.reactive.function.client.WebClientRequestException: Failed to resolve 'httpbin.org' and search domain query for configured domains failed as well: [.]; nested exception is io.netty.resolver.dns.DnsResolveContext$SearchDomainUnknownHostException: Failed to resolve 'httpbin.org' and search domain query for configured domains failed as well: [.]
	at org.springframework.web.reactive.function.client.ExchangeFunctions$DefaultExchangeFunction.lambda$wrapException$9(ExchangeFunctions.java:141) ~[na:na]
	Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException: 
Error has been observed at the following site(s):
	*__checkpoint ⇢ Request to GET https://httpbin.org/get [DefaultWebClient]
Original Stack Trace:
		at org.springframework.web.reactive.function.client.ExchangeFunctions$DefaultExchangeFunction.lambda$wrapException$9(ExchangeFunctions.java:141) ~[na:na]
		at reactor.core.publisher.MonoErrorSupplied.subscribe(MonoErrorSupplied.java:55) ~[na:na]
		at reactor.core.publisher.Mono.subscribe(Mono.java:4397) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:103) ~[na:na]
		at reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222) ~[na:na]
		at reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222) ~[na:na]
		at reactor.core.publisher.FluxPeek$PeekSubscriber.onError(FluxPeek.java:222) ~[na:na]
		at reactor.core.publisher.MonoNext$NextSubscriber.onError(MonoNext.java:93) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain.onError(MonoFlatMapMany.java:204) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.SerializedSubscriber.onError(SerializedSubscriber.java:124) ~[na:na]
		at reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.whenError(FluxRetryWhen.java:225) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.FluxRetryWhen$RetryWhenOtherSubscriber.onError(FluxRetryWhen.java:274) ~[na:na]
		at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.drain(FluxConcatMap.java:415) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.onNext(FluxConcatMap.java:251) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.EmitterProcessor.drain(EmitterProcessor.java:537) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.EmitterProcessor.tryEmitNext(EmitterProcessor.java:343) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.SinkManySerialized.tryEmitNext(SinkManySerialized.java:100) ~[na:na]
		at reactor.core.publisher.InternalManySink.emitNext(InternalManySink.java:27) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.onError(FluxRetryWhen.java:190) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.MonoCreate$DefaultMonoSink.error(MonoCreate.java:201) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.netty.http.client.HttpClientConnect$MonoHttpConnect$ClientTransportSubscriber.onError(HttpClientConnect.java:308) ~[na:na]
		at reactor.core.publisher.MonoCreate$DefaultMonoSink.error(MonoCreate.java:201) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.netty.resources.DefaultPooledConnectionProvider$DisposableAcquire.onError(DefaultPooledConnectionProvider.java:158) ~[na:na]
		at reactor.netty.internal.shaded.reactor.pool.AbstractPool$Borrower.fail(AbstractPool.java:475) ~[spring-native-netty-timeout-bug:1.0.20]
		at reactor.netty.internal.shaded.reactor.pool.SimpleDequePool.lambda$drainLoop$9(SimpleDequePool.java:431) ~[spring-native-netty-timeout-bug:1.0.20]
		at reactor.core.publisher.FluxDoOnEach$DoOnEachSubscriber.onError(FluxDoOnEach.java:186) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.MonoCreate$DefaultMonoSink.error(MonoCreate.java:201) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.netty.resources.DefaultPooledConnectionProvider$PooledConnectionAllocator$PooledConnectionInitializer.onError(DefaultPooledConnectionProvider.java:548) ~[spring-native-netty-timeout-bug:1.0.20]
		at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondError(MonoFlatMap.java:192) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.MonoFlatMap$FlatMapInner.onError(MonoFlatMap.java:259) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:106) ~[na:na]
		at reactor.core.publisher.Operators.error(Operators.java:198) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.MonoError.subscribe(MonoError.java:53) ~[na:na]
		at reactor.core.publisher.Mono.subscribe(Mono.java:4397) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:103) ~[na:na]
		at reactor.netty.transport.TransportConnector$MonoChannelPromise.tryFailure(TransportConnector.java:534) ~[spring-native-netty-timeout-bug:1.0.20]
		at reactor.netty.transport.TransportConnector.lambda$doResolveAndConnect$11(TransportConnector.java:341) ~[na:na]
		at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:552) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.setFailure(DefaultPromise.java:109) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.resolver.InetSocketAddressResolver$2.operationComplete(InetSocketAddressResolver.java:86) ~[na:na]
		at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:571) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:550) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.resolver.dns.DnsResolveContext$1.operationComplete(DnsResolveContext.java:236) ~[na:na]
		at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:552) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.resolver.dns.DnsResolveContext.finishResolve(DnsResolveContext.java:1055) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.resolver.dns.DnsResolveContext.tryToFinishResolve(DnsResolveContext.java:1000) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.resolver.dns.DnsResolveContext.query(DnsResolveContext.java:418) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.resolver.dns.DnsResolveContext.access$600(DnsResolveContext.java:66) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.resolver.dns.DnsResolveContext$2.operationComplete(DnsResolveContext.java:467) ~[na:na]
		at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:571) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:550) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.resolver.dns.DnsQueryContext.tryFailure(DnsQueryContext.java:256) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.resolver.dns.DnsQueryContext$4.run(DnsQueryContext.java:208) ~[na:na]
		at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[spring-native-netty-timeout-bug:4.1.78.Final]
		at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) ~[na:na]
		at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[na:na]
		at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[na:na]
		at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[na:na]
		at java.lang.Thread.run(Thread.java:833) ~[spring-native-netty-timeout-bug:na]
		at com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:705) ~[spring-native-netty-timeout-bug:na]
		at com.oracle.svm.core.posix.thread.PosixPlatformThreads.pthreadStartRoutine(PosixPlatformThreads.java:202) ~[na:na]
	Suppressed: java.lang.Exception: #block terminated with an error
		at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:99) ~[spring-native-netty-timeout-bug:3.4.19]
		at reactor.core.publisher.Mono.block(Mono.java:1707) ~[spring-native-netty-timeout-bug:3.4.19]
		at com.acme.DemoApplication.lambda$onInit$0(DemoApplication.java:42) ~[spring-native-netty-timeout-bug:na]
		at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:762) ~[spring-native-netty-timeout-bug:2.7.1]
		at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:752) ~[spring-native-netty-timeout-bug:2.7.1]
		at org.springframework.boot.SpringApplication.run(SpringApplication.java:315) ~[spring-native-netty-timeout-bug:2.7.1]
		at org.springframework.boot.builder.SpringApplicationBuilder.run(SpringApplicationBuilder.java:164) ~[na:na]
		at com.acme.DemoApplication.main(DemoApplication.java:37) ~[spring-native-netty-timeout-bug:na]
Caused by: io.netty.resolver.dns.DnsResolveContext$SearchDomainUnknownHostException: Failed to resolve 'httpbin.org' and search domain query for configured domains failed as well: [.]
	at io.netty.resolver.dns.DnsResolveContext.finishResolve(DnsResolveContext.java:1047) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.resolver.dns.DnsResolveContext.tryToFinishResolve(DnsResolveContext.java:1000) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.resolver.dns.DnsResolveContext.query(DnsResolveContext.java:418) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.resolver.dns.DnsResolveContext.access$600(DnsResolveContext.java:66) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.resolver.dns.DnsResolveContext$2.operationComplete(DnsResolveContext.java:467) ~[na:na]
	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:571) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:550) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.resolver.dns.DnsQueryContext.tryFailure(DnsQueryContext.java:256) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.resolver.dns.DnsQueryContext$4.run(DnsQueryContext.java:208) ~[na:na]
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[spring-native-netty-timeout-bug:4.1.78.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) ~[na:na]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[na:na]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[na:na]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[na:na]
	at java.lang.Thread.run(Thread.java:833) ~[spring-native-netty-timeout-bug:na]
	at com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:705) ~[spring-native-netty-timeout-bug:na]
	at com.oracle.svm.core.posix.thread.PosixPlatformThreads.pthreadStartRoutine(PosixPlatformThreads.java:202) ~[na:na]
Caused by: io.netty.resolver.dns.DnsNameResolverTimeoutException: [/127.0.0.53:53] query via UDP timed out after 5000 milliseconds (no stack trace available)

2022-08-22 11:36:16.834 DEBUG 2504 --- [ctor-http-nio-1] io.netty.resolver.dns.DnsNameResolver    : [id: 0x7fa99b43] RECEIVED: UDP [12958: /127.0.0.53:53], DatagramDnsResponse(from: /127.0.0.53:53, to: /[0:0:0:0:0:0:0:0]:33579, 12958, QUERY(0), NoError(0), RD RA)
	DefaultDnsQuestion(httpbin.org. IN A)
	DefaultDnsRawRecord(httpbin.org. 59 IN A 4B)
	DefaultDnsRawRecord(httpbin.org. 59 IN A 4B)
	DefaultDnsRawRecord(httpbin.org. 59 IN A 4B)
	DefaultDnsRawRecord(httpbin.org. 59 IN A 4B)
	DefaultDnsRawRecord(OPT flags:0 udp:65494 0B)
2022-08-22 11:36:16.834 DEBUG 2504 --- [ctor-http-nio-1] io.netty.resolver.dns.DnsNameResolver    : Received a DNS response with an unknown ID: UDP [[id: 0x7fa99b43]: 12958]
2022-08-22 11:36:18.844 DEBUG 2504 --- [ctor-http-nio-1] io.netty.buffer.PoolThreadCache          : Freed 3 thread-local buffer(s) from thread: reactor-http-nio-1

Strange.

mhalbritter avatar Aug 22 '22 09:08 mhalbritter

I'll check if this still happens with SB3 / SF6. I have a suspicion that this is caused by netty build time init.

mhalbritter avatar Aug 22 '22 09:08 mhalbritter

Whats SB3 & SF6?

PS: I'll try the log options & let you know what I see in my system.

thekalinga avatar Aug 22 '22 09:08 thekalinga

Okay. Got it. Spring Boot & Spring Framework :)

thekalinga avatar Aug 22 '22 09:08 thekalinga

Yes, sorry, you are correct.

mhalbritter avatar Aug 22 '22 09:08 mhalbritter

I think I have to restart my machine. Binary is working now :D

Something has changed in system state that made this executable run without errors now. I'll restart and let you know what I see.

thekalinga avatar Aug 22 '22 10:08 thekalinga

I get same error after system restart. I have attached the log for

spring-native-netty-timeout-bug --logging.level.io.netty=TRACE

debug.log

thekalinga avatar Aug 22 '22 10:08 thekalinga

Strangely, now same executable is running without exception (I dint restart since I shared last log). Not sure if browser/some other application changed system state in ways that make this error go away.

Here is the new log.

debug.log

thekalinga avatar Aug 22 '22 13:08 thekalinga

I dint access httpbin.org in browser/thru any other mechanism (I dint run the jar version aswell). So, the issue is most likely not related to domain resolution but something to do with the System state that netty/webflux is relying on.

thekalinga avatar Aug 22 '22 13:08 thekalinga

It looks like there are some hints missing in Netty which lead to the DNS response being dropped. I saw something similar with Redis. There's quite some work going on related to Netty and SB3 / SF6. We can backport the fix when we have a working solution.

mhalbritter avatar Aug 22 '22 13:08 mhalbritter

Sure. Thanks.

thekalinga avatar Aug 22 '22 13:08 thekalinga

Couple of qns

  1. If they are missing, shouldnt they fail every time whether we restart the system/not?
  2. The events being logged by Netty are at DEBUG level (which I assume are harmless given the severity they are being logged at)

Thanks for the prompt responses.

thekalinga avatar Aug 22 '22 13:08 thekalinga

  1. My guess (!!!) is that there are some parts of netty which are initialized at build time and baked into the native image. They capture some parts of your system. When you restart your system, the baked-in parts in the image don't match your system anymore and weird stuff happens.
  2. Yeah, one would think so. That you have to ask the Netty authors. The main problem here is that netty relies heavily on reflection. It has not be designed for a runtime which has no reflection available. So it fails in unpredictable ways when reflection is not available (like in native-image without metadata).

What you can do to get this running: attach the native-image agent to your application, run it in JVM mode and try if those captured hints fix the problem until we have found the root cause.

mhalbritter avatar Aug 22 '22 13:08 mhalbritter

Thanks. Will try.

thekalinga avatar Aug 22 '22 13:08 thekalinga

@thekalinga did you find out which hints are needed? I'm having the same issue

steincodabox avatar Aug 24 '22 12:08 steincodabox

Currently using 0.11.5 with spring boot 2.6.7 instead of 0.12.1 with 2.7.1 seems to bypass the issue

steincodabox avatar Aug 24 '22 13:08 steincodabox

It's a variation of this bug: https://github.com/oracle/graal/issues/4809

There are hints for Netty handlers missing. Run your application with the native-image agent attached and look in the JSON for ChannelHandler classes. These are the classes Netty uses in their pipeline for your use case (i think they even depend on your operating system, but not sure). You can then use those classes to add @TypeHint in your code.

mhalbritter avatar Aug 24 '22 14:08 mhalbritter

@steincodabox Sorry. No, not yet. I havent yet tried (it requires a decent amount of trial & error) as I am stuck with some other work.

thekalinga avatar Aug 26 '22 17:08 thekalinga

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

spring-projects-issues avatar Sep 06 '22 12:09 spring-projects-issues

Shouldnt this issue be kept open till oracle fixed the issue from their end & netty + spring + graalvm integration works without issues?

thekalinga avatar Sep 10 '22 17:09 thekalinga

The related issue has been fixed in GraalVM 22.3.0 which will be used as of Spring Boot 3 RC1, so please test with Spring Boot 3 RC1 + GraalVM 22.3.0 once available. If you still face an issue with those, please create an issue on https://github.com/spring-projects/spring-boot/issues.

sdeleuze avatar Sep 23 '22 08:09 sdeleuze

Currently using 0.11.5 with spring boot 2.6.7 instead of 0.12.1 with 2.7.1 seems to bypass the issue

This saved my day

surabura avatar Nov 02 '22 13:11 surabura