grpc-spring

Race condition when finding an available port in GrpcServerProperties

Open • amirhadadi opened this issue 1 year ago • 1 comment

The context

In our CI build we use port 0 for the grpc server, allowing the grpc server to determine an available port automatically. However, we sometimes encounter an issue where the port is already bound when the grpc server starts.

The bug

This appears to be caused by the port assignment not being atomic:

  1. GrpcServerProperties::getPort finds an available port (since the port is 0).
  2. The socket used to find that port is closed.
  3. The port is bound by some other thread / process on the machine.
  4. When the grpc server starts and attempts to bind to the port, it fails since the port is already bound.
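The four steps above can be reproduced deterministically with plain `java.net` sockets. This is a minimal sketch, not the grpc-spring code: `PortRaceDemo`, `findFreePort`, and `reproduceRace` are hypothetical names, and the "intruder" socket stands in for whatever other thread or process takes the port in step 3.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical demo class; it only mimics the probe-then-bind pattern described above.
class PortRaceDemo {

    // Steps 1 and 2: ask the OS for a free port, then close the probing socket.
    // Once this method returns, nothing reserves the port any more.
    static int findFreePort() throws IOException {
        try (ServerSocket probe = new ServerSocket(0)) {
            return probe.getLocalPort();
        }
    }

    // Returns true if a listening socket could NOT be bound to the given port.
    static boolean bindFails(int port) {
        try (ServerSocket server = new ServerSocket()) {
            server.bind(new InetSocketAddress(port));
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    // Runs steps 1 to 4 in order and reports whether the final bind failed.
    static boolean reproduceRace() throws IOException {
        int port = findFreePort();                       // steps 1 and 2
        try (ServerSocket intruder = new ServerSocket()) {
            intruder.bind(new InetSocketAddress(port));  // step 3: someone else takes the port
            return bindFails(port);                      // step 4: the "server" bind fails
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Late bind failed: " + reproduceRace());
    }
}
```

In a real build the intruder is not under your control, so the failure only shows up intermittently, which matches the sporadic CI failures described here.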

This is exacerbated by GrpcServerMetricAutoConfiguration::grpcInfoContributor, which calls GrpcServerProperties::getPort, being loaded long before GrpcServerLifecycle::start is called. This leaves a long window during which another thread or process on the machine can bind the port.
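One way to avoid the window entirely is to bind to port 0 directly and read the assigned port from the bound socket afterwards, so the port is never released between discovery and use. The sketch below shows the idea with a plain `ServerSocket`; `BindOnceDemo` and `bindEphemeral` are hypothetical names, and this is not a proposed patch for grpc-spring.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical demo class illustrating "bind first, ask for the port later".
class BindOnceDemo {

    // Binds to port 0 so the OS picks the port as part of the bind call itself.
    // The returned socket stays open, so the port remains reserved for the caller.
    static ServerSocket bindEphemeral() throws IOException {
        ServerSocket server = new ServerSocket();
        server.bind(new InetSocketAddress(0));
        return server;
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = bindEphemeral()) {
            // The actual port is only known after binding.
            System.out.println("Listening on port " + server.getLocalPort());
        }
    }
}
```

For a gRPC server, the equivalent would presumably be to pass 0 through to the server builder and query the port only after startup. In grpc-java, `io.grpc.Server.getPort()` reports the bound port once the server has started. Whether that fits how GrpcServerProperties is used elsewhere is an open question.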

Stacktrace and logs

[09:37:35][Step 2/3] Caused by: java.lang.IllegalStateException: Failed to start the grpc server
[09:37:35][Step 2/3] 	at net.devh.boot.grpc.server.serverfactory.GrpcServerLifecycle.start(GrpcServerLifecycle.java:74)
[09:37:35][Step 2/3] 	at org.springframework.context.support.DefaultLifecycleProcessor.doStart(DefaultLifecycleProcessor.java:178)
[09:37:35][Step 2/3] 	... 76 more
[09:37:35][Step 2/3] Caused by: java.io.IOException: Failed to bind to address 0.0.0.0/0.0.0.0:38840
[09:37:35][Step 2/3] 	at io.grpc.netty.NettyServer.start(NettyServer.java:328)
[09:37:35][Step 2/3] 	at io.grpc.internal.ServerImpl.start(ServerImpl.java:184)
[09:37:35][Step 2/3] 	at io.grpc.internal.ServerImpl.start(ServerImpl.java:93)
[09:37:35][Step 2/3] 	at net.devh.boot.grpc.server.serverfactory.GrpcServerLifecycle.createAndStartGrpcServer(GrpcServerLifecycle.java:113)
[09:37:35][Step 2/3] 	at net.devh.boot.grpc.server.serverfactory.GrpcServerLifecycle.start(GrpcServerLifecycle.java:72)
[09:37:35][Step 2/3] 	... 77 more
[09:37:35][Step 2/3] Caused by: io.netty.channel.unix.Errors$NativeIoException: bind(..) failed: Address already in use

The application's environment

  • Spring (boot): 2.7.9
  • grpc-spring-boot-starter: 2.13.1.RELEASE

amirhadadi, Apr 02 '23 08:04

I'll have a look.

ST-DDT, Apr 02 '23 21:04