InferenceRequests with tensors of tensors do not decode properly
Hello,
First of all thank you for the project, and the time spent on it.
We have a very simple MLflow pyfunc model that is pickled and loaded in MLServer.
When loading the MLflow model in MLServer (we download the artifacts and run `mlserver start`), we see that the InferenceRequest is not properly parsed.
The code we use for the model:
```python
import mlflow
import pandas as pd


class Model(mlflow.pyfunc.PythonModel):
    def __init__(self) -> None:
        pass

    def predict(
        self,
        context,
        model_input: pd.DataFrame,
        params=None,
    ) -> pd.DataFrame:
        print(model_input)
        return model_input
```
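For context, this is roughly how we package and serve it (a sketch: the local `mlflow.pyfunc.save_model` call and the paths are illustrative stand-ins for our registry download, assuming the `mlserver-mlflow` runtime):

```python
import mlflow

# Save the pyfunc model locally. In our setup the artifacts are downloaded from the
# MLflow registry instead, but a plain local save produces the same layout.
mlflow.pyfunc.save_model(path="model", python_model=Model())

# The saved model is then served with:
#   mlserver start .
# using a model-settings.json whose "implementation" is "mlserver_mlflow.MLflowRuntime"
# and whose parameters.uri points at the saved model directory.
```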
We load the model in MLServer and invoke it using two endpoints:
- We invoke the v2 infer endpoint (`/v2/models/model/infer`) with a payload that looks like this:

```python
import pandas as pd
import requests
from mlserver.codecs import PandasCodec

df = pd.DataFrame({'test': [['test', 'test']], 'input_1': ['1']})
request = requests.post(
    'http://localhost:8080/v2/models/model/infer',
    data=PandasCodec.encode_request(df).json(),
)
```
Logs from the server. Actual:

```
INFO: 192.168.65.1:21457 - "POST /v2/models/item-knn/infer HTTP/1.1" 500 Internal Server Error
INFO:namespace.models.item_knn: input_1 test
0 1
```

What we would expect:

```
INFO: 192.168.65.1:21457 - "POST /v2/models/item-knn/infer HTTP/1.1" 500 Internal Server Error
INFO:namespace.models.item_knn: input_1 test
0 1 ['test', 'test']
```
(Note that we also discovered a bug in the codec and/or the docs: we need to use `data=` and not `json=`, as this encoding would otherwise produce Python byte strings that can't be parsed as JSON. I will open a separate issue, with all the code we used and exact reproduction steps, if you think this is a bug rather than simply bad usage on our end.)
When calling this endpoint we would expect the dataframe to contain `1, ['test', 'test']`, but it actually contains `1, None`. We tried multiple ways of encoding this dataframe, with shapes defined differently, but none of them work.
- We use the `/invocations` endpoint from MLflow directly; this works as expected with all the encodings proposed in their documentation (see the sketch after this list).
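A minimal sketch of how we call it (assuming MLflow 2.x's `dataframe_records` payload format; the port is illustrative):

```python
import requests

# MLflow scoring server started separately, e.g. `mlflow models serve -m model -p 5001`.
response = requests.post(
    "http://localhost:5001/invocations",
    json={"dataframe_records": [{"test": ["test", "test"], "input_1": "1"}]},
)
# Here the nested list reaches the model intact, unlike with the v2 infer endpoint.
print(response.status_code, response.json())
```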
We looked in a lot of places and realized that one of the issues is that the InferenceProtocol decode function does not look at the shape. So even if we flatten the array so that `['test', 'test']` is the representation, the array is still treated as several one-element entries and therefore cannot be parsed correctly.
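A minimal way to reproduce what we mean, without going over HTTP at all (a sketch assuming the same dataframe as above):

```python
import pandas as pd
from mlserver.codecs import PandasCodec

df = pd.DataFrame({"test": [["test", "test"]], "input_1": ["1"]})

# Round-trip the dataframe through the codec only.
request = PandasCodec.encode_request(df)
decoded = PandasCodec.decode_request(request)

print(df)
print(decoded)
# In our tests the round-trip does not preserve the nested list in the "test" column,
# which matches what the model ends up seeing on the server side.
```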
Our questions:
- Is this a bug? If yes, is it known, and is there any workaround to pass an array of values to a model (whether an MLflow pyfunc or any other type, as the model itself does work)?
- Or is this simply bad usage of the InferenceProtocol API on our end that could be resolved by reading some documentation?