MLServer

An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more

Results: 304 MLServer issues, sorted by recently updated

Nice! We could probably use this to read other things within MLServer as well, like `model-settings.json` files. _Originally posted by @adriangonz in https://github.com/SeldonIO/MLServer/pull/720#discussion_r965667311_
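
For context, reading such a file is just a JSON parse; a minimal sketch (the `read_settings` helper and the folder layout are hypothetical, not part of the PR):

```python
import json
from pathlib import Path


def read_settings(model_folder: str) -> dict:
    """Hypothetical helper: load and parse a model-settings.json file."""
    settings_path = Path(model_folder) / "model-settings.json"
    with settings_path.open() as settings_file:
        return json.load(settings_file)


# e.g. {"name": "my-model", "implementation": "..."}
settings = read_settings("./models/my-model")
print(settings["name"])
```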

If this is duplicated from other folders within the root `tests/` package, feel free to move it to the base `conftest.py` BTW _Originally posted by @adriangonz in https://github.com/SeldonIO/MLServer/pull/720#discussion_r965696527_
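
As a rough illustration of that suggestion (the fixture below is hypothetical): fixtures defined in the base `conftest.py` are discovered by pytest for every test folder underneath it, so duplicate copies can simply be deleted.

```python
# tests/conftest.py
# Fixtures here are available to all test modules under tests/,
# without any explicit import in the test files.
import pytest


@pytest.fixture
def model_settings_path(tmp_path):
    """Hypothetical shared fixture: a minimal model-settings.json on disk."""
    settings_file = tmp_path / "model-settings.json"
    settings_file.write_text('{"name": "dummy-model"}')
    return settings_file
```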

I'm new to Seldon. When I added some debug statements (using `print()` and also the `logging` module) to the model, I found that they work in `load()` but not...
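
A minimal custom-runtime sketch of the setup being described (the `DebugModel` class and its output are illustrative, not the reporter's actual model):

```python
import logging

from mlserver import MLModel
from mlserver.codecs import StringCodec
from mlserver.types import InferenceRequest, InferenceResponse

logger = logging.getLogger(__name__)


class DebugModel(MLModel):
    async def load(self) -> bool:
        logger.info("load() called")  # reportedly shows up in the logs
        return True

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        logger.info("predict() called")  # reportedly does not show up
        return InferenceResponse(
            model_name=self.name,
            outputs=[StringCodec.encode_output("echo", ["ok"])],
        )
```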

HuggingFace runtime has a `batch_size` variable in its settings. This should be checked against the MLServer `max_batch_size` setting for consistency.

```json
{
  "name": "transformer",
  "implementation": "mlserver_huggingface.HuggingFaceRuntime",
  "max_batch_size": 5,
  "max_batch_time": 1,
  ...
```
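
A rough sketch of the kind of consistency check being requested (the function and the warning text are hypothetical, not existing MLServer behaviour):

```python
import warnings


def check_batch_settings(max_batch_size: int, hf_batch_size: int) -> None:
    """Hypothetical check: flag a mismatch between MLServer's adaptive
    batching limit and the HuggingFace pipeline's own batch_size."""
    if hf_batch_size > max_batch_size:
        warnings.warn(
            f"HuggingFace batch_size ({hf_batch_size}) exceeds MLServer "
            f"max_batch_size ({max_batch_size}); adaptive batches can "
            f"never fill the pipeline's batch size"
        )


check_batch_settings(max_batch_size=5, hf_batch_size=8)
```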

huggingface_runtime output JSON serializer does not support NumPy basic datatypes when the data is a dict value
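
For reference, the usual workaround for this class of failure is a NumPy-aware `json.JSONEncoder`; a general sketch, not the runtime's actual serializer:

```python
import json

import numpy as np


class NumpyEncoder(json.JSONEncoder):
    """Convert NumPy scalars and arrays into plain Python types."""

    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        if isinstance(obj, np.floating):
            return float(obj)
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return super().default(obj)


# A dict value holding a NumPy scalar, as described in the issue:
payload = {"score": np.float32(0.98)}
# json.dumps(payload) raises TypeError; with the encoder it serializes:
print(json.dumps(payload, cls=NumpyEncoder))
```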

Good morning! I noticed that the request queue management in the REST server changed from 1.0.0 to 1.1.0, with Python queues added in the latter. I would like to know...

As mentioned in https://github.com/SeldonIO/MLServer/pull/727#discussion_r972003311, the converter from gRPC output is not implemented, so the following snippet does not work:

```python
from mlserver.grpc.converters import ModelInferResponseConverter
from mlserver.codecs.string import StringRequestCodec

inference_response = ModelInferResponseConverter.to_types(response)
...
```
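
For context, the `response` in that snippet would come from a gRPC `ModelInfer` call along these lines (a sketch adapted from the custom-JSON docs example; the model name, port, and payload are illustrative):

```python
import json

import grpc
import mlserver.grpc.converters as converters
import mlserver.grpc.dataplane_pb2_grpc as dataplane
import mlserver.types as types

# Illustrative JSON payload, encoded as a single BYTES tensor
inputs_bytes = json.dumps({"message": "hello"}).encode("UTF-8")

inference_request = types.InferenceRequest(
    inputs=[
        types.RequestInput(
            name="echo_request",
            shape=[len(inputs_bytes)],
            datatype="BYTES",
            data=[inputs_bytes],
            parameters=types.Parameters(content_type="str"),
        )
    ]
)
inference_request_g = converters.ModelInferRequestConverter.from_types(
    inference_request, model_name="json-model", model_version=None
)

grpc_channel = grpc.insecure_channel("localhost:8081")
grpc_stub = dataplane.GRPCInferenceServiceStub(grpc_channel)
response = grpc_stub.ModelInfer(inference_request_g)  # fed into the snippet above
```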

Hi, we would like to expose the number of elements in the pool's request queue as a metric, for performance monitoring. It would be a good idea to get this data...
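
A minimal sketch of the kind of metric being asked for, using `prometheus_client` directly (the gauge name and the queue object are hypothetical, not MLServer internals):

```python
from queue import Queue

from prometheus_client import Gauge

request_queue: Queue = Queue()  # stand-in for the pool's internal queue

queue_size_gauge = Gauge(
    "request_queue_size",
    "Number of elements waiting in the request queue",
)


def record_queue_size() -> None:
    # qsize() is approximate under concurrency, but fine for a gauge
    queue_size_gauge.set(request_queue.qsize())
```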

There is no information in the [documentation](https://mlserver.readthedocs.io/en/latest/examples/custom-json/README.html) about how to retrieve the raw dictionary data from MLServer output. I will open a pull request for discussion.
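
For the REST path, closing the round-trip could look roughly like this (a sketch based on the custom-JSON example; the model name and payload are illustrative, and it assumes the model echoes the JSON back as a single string output):

```python
import json

import requests
from mlserver.codecs.string import StringRequestCodec
from mlserver.types import InferenceResponse

# Illustrative JSON payload, sent as a single BYTES tensor
payload = json.dumps({"message": "hello"})
inference_request = {
    "inputs": [
        {
            "name": "echo_request",
            "shape": [len(payload)],
            "datatype": "BYTES",
            "data": [payload],
        }
    ]
}
endpoint = "http://localhost:8080/v2/models/json-model/infer"
response = requests.post(endpoint, json=inference_request)

# Decode the V2 response back into the raw dictionary
inference_response = InferenceResponse.parse_raw(response.text)
raw_json = StringRequestCodec.decode_response(inference_response)
output = json.loads(raw_json[0])
print(output)
```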

If the two variables `max_batch_time` and `max_batch_size` are defined in the `model-settings.json`:

```json
{
  "name": "node-1",
  "implementation": "models.NodeOne",
  "max_batch_size": 5,
  "max_batch_time": 1,
  "parameters": {
    "uri": "./fakeuri"
  }
}
```

Then...