modelmesh-serving

Triton on-wire compression of request/response on HTTP

Open andreapairon opened this issue 3 years ago • 0 comments

Is it possible to use Triton client (https://github.com/triton-inference-server/client) features, such as on-wire compression of HTTP requests and responses, with the current /infer endpoint? (https://github.com/triton-inference-server/server/blob/main/docs/inference_protocols.md#compression)

If not, will it be implemented in a future ModelMesh release?
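
For context, here is a minimal sketch of what on-wire compression looks like at the HTTP level, using only the Python standard library. It assumes the server accepts gzip-compressed bodies signalled through the standard `Content-Encoding` and `Accept-Encoding` headers and exposes a KServe v2 style `/v2/models/<name>/infer` path. The host, port, model name, and input tensor are hypothetical placeholders, and the request is only built, not sent.

```python
import gzip
import json
import urllib.request


def build_compressed_infer_request(base_url, model_name, payload):
    """Build (but do not send) a gzip-compressed inference request."""
    # Serialize the inference payload to JSON and gzip the resulting bytes.
    body = gzip.compress(json.dumps(payload).encode("utf-8"))
    return urllib.request.Request(
        url=f"{base_url}/v2/models/{model_name}/infer",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Content-Encoding": "gzip",  # the request body is gzip-compressed
            "Accept-Encoding": "gzip",   # ask for a gzip-compressed response
        },
    )


# Hypothetical host, model name, and tensor; none come from a real deployment.
payload = {
    "inputs": [
        {
            "name": "input-0",
            "shape": [1, 4],
            "datatype": "FP32",
            "data": [0.1, 0.2, 0.3, 0.4],
        }
    ]
}
request = build_compressed_infer_request(
    "http://localhost:8008", "example-model", payload
)
```

With the Triton Python HTTP client, the same behaviour is requested through the `request_compression_algorithm` and `response_compression_algorithm` arguments of `InferenceServerClient.infer`. Whether the ModelMesh endpoint honours these headers is exactly what this issue is asking.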

andreapairon • Jan 24 '22 16:01