
Add support for raw_input_contents field

Open adriangonz opened this issue 5 years ago • 5 comments

Add support for the top-level raw_input_contents field from the gRPC spec. This field was originally introduced to work around some performance issues within gRPC (a client-side sketch of the field in use follows the links below).

You can find more details on these issues:

  • https://github.com/triton-inference-server/server/issues/1821
  • https://github.com/kubeflow/kfserving/pull/998
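
For context, here is roughly what the field looks like from a client's point of view. The sketch below is a non-authoritative illustration: it assumes Python stubs generated from the V2 dataplane .proto (the stub module names, model name and port are placeholders), and it targets any V2-compatible server that already implements the field, such as Triton:

```python
# Sketch of a V2 gRPC client using raw_input_contents. The stub module
# names below are placeholders for whatever your protoc invocation emits.
import grpc
import numpy as np

import dataplane_pb2
import dataplane_pb2_grpc

payload = np.arange(12, dtype=np.float32).reshape(3, 4)

request = dataplane_pb2.ModelInferRequest(model_name="my-model")

# Tensor metadata is declared exactly as in a regular V2 request...
input_tensor = request.inputs.add()
input_tensor.name = "input-0"
input_tensor.datatype = "FP32"
input_tensor.shape.extend(payload.shape)

# ...but instead of filling inputs[0].contents element by element, the
# data travels as one opaque bytes blob: raw_input_contents[i] holds the
# row-major bytes of inputs[i], skipping the per-element protobuf
# encoding that caused the performance issues linked above.
request.raw_input_contents.append(payload.tobytes())

with grpc.insecure_channel("localhost:8081") as channel:
    stub = dataplane_pb2_grpc.GRPCInferenceServiceStub(channel)
    response = stub.ModelInfer(request)

# Servers that support the field reply in kind via raw_output_contents.
output = np.frombuffer(response.raw_output_contents[0], dtype=np.float32)
print(output.reshape(tuple(response.outputs[0].shape)))
```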

adriangonz avatar Oct 19 '20 16:10 adriangonz

Any updates? We're hitting the same problem.

LuBingtan avatar Nov 24 '21 09:11 LuBingtan

Hey @LuBingtan, this is prioritised on our internal roadmap, but we haven't gotten around to it yet.

Is this currently a blocker for you? Would be good to learn more about how it's affecting you.

adriangonz avatar Nov 24 '21 09:11 adriangonz

Hi @adriangonz, can we pass a raw input JSON to the ML server instead of the predefined inference request format via a Seldon Core deployment, similar to the predict_raw method in Seldon Core that accepts a raw JSON?

divyadilip91 avatar Feb 15 '22 17:02 divyadilip91

Hey @divyadilip91, it's possible to define custom endpoints within a custom inference runtime. This is not currently documented, but you can see an example in the MLflow runtime.
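
For illustration, here is a minimal sketch of that pattern, modelled on how the MLflow runtime wires up its /invocations route. The decorator import, its rest_path argument and the handler signature are assumptions drawn from that runtime, so double-check them against the MLflow runtime source for your MLServer version:

```python
import json

from fastapi import Request, Response

from mlserver import MLModel
from mlserver.handlers import custom_handler


class MyCustomRuntime(MLModel):
    async def load(self) -> bool:
        # Load your model artefacts here.
        self.ready = True
        return self.ready

    # Registers an extra REST route next to the standard V2 endpoints,
    # mirroring how the MLflow runtime exposes /invocations. The path is
    # just an example.
    @custom_handler(rest_path="/my-raw-endpoint")
    async def my_raw_endpoint(self, raw_request: Request) -> Response:
        # Arbitrary JSON in, arbitrary JSON out -- no V2 schema involved.
        raw_json = await raw_request.json()
        prediction = {"echo": raw_json}  # replace with real inference logic
        return Response(
            content=json.dumps(prediction), media_type="application/json"
        )
```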

Just for extra context though, this issue is not about custom payloads, but about allowing the user to send a compact binary representation of their V2 payload (which can reduce latency in some use cases).

adriangonz avatar Feb 16 '22 15:02 adriangonz

Thank you @adriangonz

divyadilip91 avatar Feb 24 '22 09:02 divyadilip91