
All gRPC requests to the Triton server are timing out, but HTTP requests are functioning normally.

Open · SunnyGhj opened this issue 1 year ago • 14 comments

Description All gRPC requests to the Triton server are timing out, but HTTP requests are functioning normally.

Triton Information 23.10

Are you using the Triton container or did you build it yourself? container from NGC

To Reproduce When using the TensorRT backend, I often encounter a large number of connection timeouts with gRPC, while HTTP requests work fine. This suggests the problem is not with the model but with the gRPC layer. After restarting the service, gRPC requests return to normal.
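A minimal client sketch of the pattern (the model name "my_model", input name "INPUT0", and the shape/dtype are placeholders and must be adapted to the actual TensorRT model): the same request is sent over gRPC, which times out, and over HTTP, which succeeds.

```python
# Sketch with placeholder model/input names: compare a gRPC infer call
# (port 8001, the path that hangs) with the same request over HTTP (port 8000).
import numpy as np
import tritonclient.grpc as grpcclient
import tritonclient.http as httpclient
from tritonclient.utils import InferenceServerException

data = np.zeros((1, 3, 224, 224), dtype=np.float32)

# gRPC path: this call hangs until it hits the client-side timeout.
grpc_client = grpcclient.InferenceServerClient(url="localhost:8001")
grpc_input = grpcclient.InferInput("INPUT0", list(data.shape), "FP32")
grpc_input.set_data_from_numpy(data)
try:
    grpc_client.infer("my_model", inputs=[grpc_input], client_timeout=10.0)
    print("gRPC infer OK")
except InferenceServerException as e:
    print(f"gRPC infer failed or timed out: {e}")

# HTTP path: the identical request keeps working.
http_client = httpclient.InferenceServerClient(url="localhost:8000")
http_input = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
http_input.set_data_from_numpy(data)
http_client.infer("my_model", inputs=[http_input])
print("HTTP infer OK")
```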

Expected behavior

SunnyGhj avatar Feb 21 '24 04:02 SunnyGhj

After analyzing a TCP packet capture, gRPC port 8001 looks normal: it is confirmed that the requests reach tritonserver, but they eventually time out.
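Alongside the packet capture, a quick client-side probe (standard library only, assuming the server is at localhost:8001) confirms that TCP connections to the gRPC port are still accepted while infer calls hang:

```python
# Sketch: verify that the gRPC port still accepts TCP connections while
# inference RPCs are hanging. Host/port are assumed to be localhost:8001.
import socket

try:
    with socket.create_connection(("localhost", 8001), timeout=5) as sock:
        print("TCP connect to :8001 succeeded ->", sock.getpeername())
except OSError as e:
    print("TCP connect to :8001 failed:", e)
```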

SunnyGhj avatar Feb 22 '24 07:02 SunnyGhj

@tanmayv25 @Tabrizian @CoderHam Sincerely asking for help!

SunnyGhj avatar Feb 22 '24 07:02 SunnyGhj

I'm hitting a similar issue. I'm deploying tritonserver in a Docker container on a T4 and running inference via the HTTP/gRPC endpoints. The models are TensorRT engines converted from ONNX. After the server starts, everything works fine, but at some point gRPC inference blocks and the statistics show that no requests are performed. If I switch to the HTTP client, inference is OK. It seems the gRPC infer path is blocking; maybe the request is never passed to the core engine?
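For reference, this is roughly how the "no request is performed" observation can be double-checked from the HTTP side while gRPC hangs (the model name below is a placeholder):

```python
# Sketch: while a gRPC infer call is stuck, query liveness and per-model
# statistics over HTTP to see whether any inference actually reached the core.
# "my_model" is a placeholder for the real TensorRT model name.
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")
print("server live:", client.is_server_live())
print("server ready:", client.is_server_ready())

stats = client.get_inference_statistics(model_name="my_model")
for model_stats in stats.get("model_stats", []):
    print(model_stats["name"],
          "success count:", model_stats["inference_stats"]["success"]["count"])
```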

biaochen avatar Feb 22 '24 11:02 biaochen


After setting log_verbose_level=2, I found more information (screenshot attached). It seems the request cannot be fetched from the cq. Hope this finding helps the investigation.

biaochen avatar Feb 23 '24 06:02 biaochen

Thank you for reporting this. I filed a ticket for our team to investigate: 6211

oandreeva-nv avatar Feb 24 '24 01:02 oandreeva-nv

If the service state is mistakenly judged as shutdown and there are no new requests in the completion queue (cq), will that block all RPC requests? (screenshot attached)
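For what it's worth, here is a sketch of what I could check while the hang is happening (assuming the default ports and a placeholder model name): if non-infer RPCs such as the liveness call still respond on port 8001 while ModelInfer hangs, that would point at the infer handler / completion-queue path rather than the connection itself.

```python
# Sketch: while an infer request is hanging, probe the gRPC endpoint with
# non-infer RPCs to see whether the whole gRPC frontend is stuck or only
# the ModelInfer handlers. Assumes the server is reachable at localhost:8001.
import tritonclient.grpc as grpcclient
from tritonclient.utils import InferenceServerException

client = grpcclient.InferenceServerClient(url="localhost:8001")
try:
    print("gRPC server live:", client.is_server_live())
    print("gRPC server ready:", client.is_server_ready())
    print("model metadata:", client.get_model_metadata("my_model"))  # placeholder model name
except InferenceServerException as e:
    print("gRPC control-plane call failed:", e)
```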

SunnyGhj avatar Feb 26 '24 07:02 SunnyGhj

> I'm hitting a similar issue. I'm deploying tritonserver in a Docker container on a T4 and running inference via the HTTP/gRPC endpoints. The models are TensorRT engines converted from ONNX. After the server starts, everything works fine, but at some point gRPC inference blocks and the statistics show that no requests are performed. If I switch to the HTTP client, inference is OK. It seems the gRPC infer path is blocking; maybe the request is never passed to the core engine?

I am using the HTTP client in the same environment you described, but I'm facing occasional timeouts: it seems the client can't connect to the server, yet after one occurrence the rest of the requests are fine until the next time the problem occurs. Do you hit this problem too?
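In case it's useful, this is roughly how I bound and log the timeouts on my side; the connection_timeout/network_timeout arguments are the ones I understand the Python HTTP client exposes (adjust if your tritonclient version differs), and the model/input names are placeholders.

```python
# Sketch: bound the HTTP client's connect/read time and retry once, logging
# how long each attempt took, so the occasional timeouts show up clearly.
# Model "my_model" and input "INPUT0" are placeholders.
import time
import numpy as np
import tritonclient.http as httpclient
from tritonclient.utils import InferenceServerException

client = httpclient.InferenceServerClient(
    url="localhost:8000",
    connection_timeout=5.0,   # seconds to establish the connection
    network_timeout=30.0,     # seconds for the request/response itself
)

data = np.zeros((1, 3, 224, 224), dtype=np.float32)
inp = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
inp.set_data_from_numpy(data)

for attempt in range(2):
    start = time.time()
    try:
        client.infer("my_model", inputs=[inp])
        print(f"attempt {attempt}: OK in {time.time() - start:.2f}s")
        break
    except (InferenceServerException, OSError) as e:
        print(f"attempt {attempt}: failed after {time.time() - start:.2f}s: {e}")
```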

secain avatar Feb 28 '24 06:02 secain


> I am using the HTTP client in the same environment you described, but I'm facing occasional timeouts: it seems the client can't connect to the server, yet after one occurrence the rest of the requests are fine until the next time the problem occurs. Do you hit this problem too?

No, I haven't encountered this problem.

SunnyGhj avatar Feb 28 '24 11:02 SunnyGhj

> If the service state is mistakenly judged as shutdown and there are no new requests in the completion queue (cq), will that block all RPC requests?


SunnyGhj avatar Feb 29 '24 09:02 SunnyGhj

> Thank you for reporting this. I filed a ticket for our team to investigate: 6211

Hi, Andreeva. Is there any progress?

SunnyGhj avatar Mar 04 '24 05:03 SunnyGhj

The issue is being looked at.

oandreeva-nv avatar Mar 04 '24 17:03 oandreeva-nv

> The issue is being looked at.

Thanks, looking forward to your reply.

SunnyGhj avatar Mar 05 '24 10:03 SunnyGhj