server
Exception serializing request - dealing with large input
Hey everyone!
I am trying to deploy a Whisper model, and I used the Python backend for that. We pass the audio to the Triton server as a NumPy array, and I set the dims in the model config to -1 so the user can determine the input shape dynamically (Whisper accepts that too).
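For reference, the input block in my config.pbtxt looks roughly like this (the input name and data type are placeholders, but the dims are exactly what I use):

input [
  {
    name: "AUDIO"          # placeholder name
    data_type: TYPE_FP32   # we send float32 audio samples
    dims: [ -1 ]           # variable-length input
  }
]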
When I try to run inference on a very long audio clip (around 24 hours, with a shape of 1,382,393,499; very large, I know), I get this error:
Traceback (most recent call last):
  File "/media/ssdraid/training/triton/triton-test/whisper_client.py", line 145, in <module>
    results = triton_client.infer(
  File "/home/tensorflow/venvs/triton/lib/python3.10/site-packages/tritonclient/grpc/__init__.py", line 1322, in infer
    raise_error_grpc(rpc_error)
  File "/home/tensorflow/venvs/triton/lib/python3.10/site-packages/tritonclient/grpc/__init__.py", line 62, in raise_error_grpc
    raise get_error_grpc(rpc_error) from None
tritonclient.utils.InferenceServerException: [StatusCode.INTERNAL] Exception serializing request!
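For context, the client side is essentially the following sketch (the model and input names are placeholders; the real script does more, but this is the shape of the failing call). Note the payload size: 1,382,393,499 float32 samples at 4 bytes each is roughly 5.5 GB.

import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="localhost:8001")

# ~1.38e9 float32 samples is roughly 5.5 GB on the wire
audio = np.zeros(1_382_393_499, dtype=np.float32)

# "AUDIO" and "whisper" are placeholder names
inp = grpcclient.InferInput("AUDIO", list(audio.shape), "FP32")
inp.set_data_from_numpy(audio)

results = client.infer(model_name="whisper", inputs=[inp])

If my math is right, that is well beyond what a single protobuf message can hold (hard limit of 2 GB), which may be why the request fails at the serialization step.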
Can anyone help me find a solution, please?
Thanks in advance!