@RunOnVirtualThread is not working properly with use-separate-server=false
Describe the bug
In my gRPC service I wanted to use the @RunOnVirtualThread annotation, but under load (for example, in my integration tests, which send many requests in a short time) some requests fail with this error:
io.grpc.StatusRuntimeException: INTERNAL: Half-closed without a request
    at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:268)
    at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:249)
    at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:167)
Expected behavior
All requests should succeed, even under load.
Actual behavior
Under load, some requests fail with the error: INTERNAL: Half-closed without a request
How to Reproduce?
I used the Quarkus hello project: I created a new Quarkus project with gRPC and the example code, then added the following to application.properties:
quarkus.grpc.server.use-separate-server=false
quarkus.grpc.server.enable-reflection-service=true
Then I put some load on the service with ghz (https://github.com/bojand/ghz):
ghz --insecure \
  --proto ./src/main/proto/hello.proto \
  --call hello.HelloGrpc.SayHello \
  -d '{"name": "Test"}' \
  -c 10 -n 1000 \
  localhost:8080
This is the output:
Summary:
  Count:        1000
  Total:        20.24 s
  Slowest:      106.02 ms
  Fastest:      0.17 ms
  Average:      102.18 ms
  Requests/sec: 49.40

Response time histogram:
  0.171   [1]   |
  10.755  [982] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  21.340  [0]   |
  31.924  [0]   |
  42.509  [0]   |
  53.093  [0]   |
  63.677  [0]   |
  74.262  [0]   |
  84.846  [0]   |
  95.431  [0]   |
  106.015 [10]  |

Latency distribution:
  10 % in 0.42 ms
  25 % in 0.57 ms
  50 % in 0.91 ms
  75 % in 1.42 ms
  90 % in 2.15 ms
  95 % in 3.02 ms
  99 % in 103.97 ms

Status code distribution:
  [OK]               993 responses
  [Internal]         2 responses
  [DeadlineExceeded] 5 responses

Error distribution:
  [2] rpc error: code = Internal desc = Half-closed without a request
  [5] rpc error: code = DeadlineExceeded desc = context deadline exceeded
With use-separate-server=true, the same load test succeeds:
ghz --insecure \
  --proto ./src/main/proto/hello.proto \
  --call hello.HelloGrpc.SayHello \
  -d '{"name": "Test"}' \
  -c 10 -n 1000 \
  localhost:9000
Summary:
  Count:        1000
  Total:        223.26 ms
  Slowest:      15.65 ms
  Fastest:      0.58 ms
  Average:      2.08 ms
  Requests/sec: 4479.15

Response time histogram:
  0.576  [1]   |
  2.084  [789] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  3.591  [153] |∎∎∎∎∎∎∎∎
  5.099  [29]  |∎
  6.607  [16]  |∎
  8.115  [2]   |
  9.622  [0]   |
  11.130 [0]   |
  12.638 [0]   |
  14.146 [0]   |
  15.654 [10]  |∎

Latency distribution:
  10 % in 1.42 ms
  25 % in 1.60 ms
  50 % in 1.76 ms
  75 % in 1.99 ms
  90 % in 2.56 ms
  95 % in 3.77 ms
  99 % in 7.86 ms

Status code distribution:
  [OK] 1000 responses
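For a quick side-by-side of the two runs, the ghz figures above can be reduced to a failure rate and a throughput ratio. The following is only a sketch in plain Java: the class and method names are hypothetical, and the inputs are copied from the two ghz summaries.

```java
public class GhzRunComparison {

    /** Percentage of requests that did not finish with status OK. */
    public static double failurePercent(int total, int ok) {
        return 100.0 * (total - ok) / total;
    }

    /** How many times higher the faster throughput is than the slower one. */
    public static double throughputRatio(double slowerRps, double fasterRps) {
        return fasterRps / slowerRps;
    }

    public static void main(String[] args) {
        // use-separate-server=false: 993 OK out of 1000, 49.40 requests/sec
        double failuresFalse = failurePercent(1000, 993);
        // use-separate-server=true: 1000 OK out of 1000, 4479.15 requests/sec
        double failuresTrue = failurePercent(1000, 1000);
        double ratio = throughputRatio(49.40, 4479.15);

        System.out.println("failure % (use-separate-server=false): " + failuresFalse);
        System.out.println("failure % (use-separate-server=true):  " + failuresTrue);
        System.out.println("throughput ratio (true / false):       " + ratio);
    }
}
```

With these inputs, about 0.7% of requests fail when use-separate-server=false, and the run with use-separate-server=true reaches roughly 90 times the throughput.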
Output of uname -a or ver
No response
Output of java -version
java version "21" 2023-09-19
Quarkus version or git rev
3.24.3
Build tool (ie. output of mvnw --version or gradlew --version)
No response
Additional information
No response
/cc @cescoffier (virtual-threads), @ozangunalp (virtual-threads)
Any news on this?
No, we didn't have time to look at it. When using separate servers, we have a lot less control over the execution model. I would even say: it is likely "working" sometimes by accident, and we may disable virtual threads altogether when using this server.
@cescoffier Sorry, I mixed up the boolean in the title. It is not working properly with use-separate-server=false.
Ok, we will need to investigate.
Is there an update on this bug? I’m facing the same issue and had to switch to the worker thread pool with @Blocking to avoid request failures.
I’m using Quarkus 3.30.2 and setting use-separate-server=false.
No, no progress yet.