Can't specify params when running inference on MLServer
When deploying an MLflow model using MLServer, we can't specify params as part of the request.
Here is the error returned by the server:
{"error":"mlflow.utils.proto_json_utils.MlflowInvalidInputException: Invalid input. One of \"instances\" and \"inputs\" must be specified (not both or any other keys).Received: ['inputs', 'params']"}%
Here is how I'm running the server:
mlflow models serve -m runs:/<run_id>/model -p 5000 --enable-mlserver
Here is an example of the request:
curl -X POST http://0.0.0.0:8080/invocations \
-H "Content-Type: application/json" \
-d '{"inputs": [{
"sex": "MALE"
}],
"params": {"capture": true}}'
It seems that MLServer doesn't support params as part of the request, which breaks compatibility with MLflow models that require these parameters to work.
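For context, here is a minimal sketch of the kind of model affected, assuming an MLflow pyfunc model whose predict accepts params; the CaptureModel class and the capture parameter are illustrative, not taken from the original report:

import mlflow
import pandas as pd
from mlflow.models import infer_signature

# Illustrative pyfunc model whose behaviour depends on `params`
# ("capture" is a made-up parameter, not part of MLflow itself).
class CaptureModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input, params=None):
        params = params or {}
        if params.get("capture"):
            pass  # e.g. persist the raw input for later inspection
        return model_input

with mlflow.start_run():
    signature = infer_signature(
        model_input=pd.DataFrame([{"sex": "MALE"}]),
        params={"capture": False},  # declares params in the model signature
    )
    mlflow.pyfunc.log_model(
        artifact_path="model",
        python_model=CaptureModel(),
        signature=signature,
    )

A model logged with params in its signature like this cannot be called correctly through an endpoint that rejects the "params" key.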
Thanks @svpino, there is a PR with a proposed fix here: https://github.com/SeldonIO/MLServer/pull/1921
@svpino this has been fixed for the standard MLflow runtime /infer endpoint as part of: https://github.com/SeldonIO/MLServer/pull/1921
Would it be possible to use this endpoint instead of /invocations? Otherwise, we would welcome contributions similar to the above.
Note that the /invocations endpoint is a custom endpoint in MLServer and lacks support for many features compared to /infer.
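For reference, a sketch of what the same request might look like against the V2 /infer endpoint after that PR; the model name "model", the BYTES encoding, and the exact way parameters are forwarded to the model's params are assumptions based on the Open Inference Protocol and the PR above:

import requests

# Open Inference Protocol (V2) request to MLServer's /infer endpoint.
# The model name "model" is assumed; adjust to the name MLServer reports.
payload = {
    "inputs": [
        {
            "name": "sex",
            "shape": [1],
            "datatype": "BYTES",  # string data in the V2 protocol
            "data": ["MALE"],
        }
    ],
    # Top-level V2 parameters; per PR #1921 the MLflow runtime forwards
    # these to the model's `params`.
    "parameters": {"capture": True},
}

response = requests.post("http://0.0.0.0:8080/v2/models/model/infer", json=payload)
print(response.json())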
Unfortunately, MLflow requires an /invocations endpoint, so supporting params only on /infer will not solve the compatibility issue.
@svpino is this something that you would like to fix in a PR? We encourage contributions from the community.
I'd love to. I'll try to get to this at some point, but I can't make any promises for now.