MLServer
Add support for raw_input_contents field
Add support for the top-level raw_input_contents field in the gRPC spec. This field was initially introduced to work around some performance issues within gRPC.
You can find more details in these issues:
- https://github.com/triton-inference-server/server/issues/1821
- https://github.com/kubeflow/kfserving/pull/998
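For reference, a rough sketch of what a client request using raw_input_contents could look like, assuming Python stubs generated from the V2 dataplane proto (the `dataplane_pb2` module names, model name and input name below are illustrative, not MLServer's actual packaging):

```python
# Sketch: send tensor data via the top-level raw_input_contents field
# instead of the typed InferTensorContents message.
import struct

import grpc
import dataplane_pb2 as pb            # generated from the V2 dataplane.proto (assumed name)
import dataplane_pb2_grpc as pb_grpc  # generated gRPC stubs (assumed name)

channel = grpc.insecure_channel("localhost:8081")  # MLServer's default gRPC port
stub = pb_grpc.GRPCInferenceServiceStub(channel)

request = pb.ModelInferRequest(model_name="my-model")

# Describe the input tensor as usual, but leave its `contents` empty...
inp = request.inputs.add()
inp.name = "input-0"
inp.datatype = "FP32"
inp.shape.extend([1, 3])

# ...and append the raw little-endian bytes at the top level instead.
# raw_input_contents[i] carries the data for inputs[i].
request.raw_input_contents.append(struct.pack("<3f", 1.0, 2.0, 3.0))

response = stub.ModelInfer(request)
```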
Any updates? I'm running into the same problem.
Hey @LuBingtan, this is prioritised on our internal roadmap, but we haven't gotten around to it yet.
Is this currently a blocker for you? Would be good to learn more about how it's affecting you.
Hi @adriangonz, can we pass raw input JSON to MLServer instead of the predefined inference request format via a Seldon Core deployment, similar to the predict_raw method in Seldon Core that accepts raw JSON?
Hey @divyadilip91, it's possible to define custom endpoints within a custom inference runtime. This is not currently documented, but you can see an example in the MLflow runtime.
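For anyone with the same question, here is a minimal sketch of what such a custom endpoint could look like, loosely following the pattern used in the MLflow runtime. The `custom_handler` decorator, its `rest_path` argument and the endpoint path below should be checked against the MLServer version you're running; treat this as illustrative rather than the documented API:

```python
from mlserver import MLModel
from mlserver.handlers import custom_handler
from fastapi import Request
from fastapi.responses import JSONResponse


class MyCustomRuntime(MLModel):
    async def load(self) -> bool:
        # Load model artefacts here; the usual V2 predict() can be
        # implemented alongside this as normal.
        return True

    # Extra REST endpoint that accepts an arbitrary JSON body,
    # bypassing the V2 inference request schema.
    @custom_handler(rest_path="/raw-predict")
    async def raw_predict(self, request: Request) -> JSONResponse:
        raw_json = await request.json()
        # ... run inference on the raw payload here ...
        return JSONResponse({"received": raw_json})
```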
Just for extra context though, this issue is not about custom payloads, but about allowing the user to send a compact binary representation of their V2 payload (which can reduce latency in some use cases).
Thank you @adriangonz