MLServer
An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more
Trace each inference step within MLServer. These traces can be pushed to Jaeger or similar OpenTracing backends.
Watch for changes in the model repository to automatically refresh models whenever new versions of the model artifacts appear.
Seldon Core currently supports providing a “reward signal” as feedback on a model's predictions. This is received by the model as a request sent to a `/feedback` endpoint. Since this modifies...
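For context, a minimal sketch of what sending such a reward signal could look like. The payload shape loosely follows Seldon Core's Feedback message (response plus scalar reward), and the host/port are placeholders; only the `/feedback` path comes from the snippet above:

```python
import requests

# Hypothetical reward-signal payload, loosely modelled on Seldon Core's
# Feedback message (the prediction being scored plus a scalar reward).
# Adjust the shape to whatever your deployment actually expects.
feedback = {
    "response": {"data": {"ndarray": [[0.9, 0.1]]}},
    "reward": 1.0,  # higher reward = better prediction
}

# Assumed host/port; `/feedback` is the endpoint mentioned above.
resp = requests.post("http://localhost:8080/feedback", json=feedback)
resp.raise_for_status()
```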
The MLServer HuggingFace runtime cannot work with speech models in batched mode, as the pipeline accepts a list of arrays [(request1), (request2), (request3), (request4), (request5)] where the type of each...
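A rough illustration of the per-element type mismatch, using made-up waveform shapes rather than the runtime's actual internals:

```python
import numpy as np

# What a speech pipeline can consume: a list where each element is
# ONE waveform array (one utterance per request).
good_batch = [
    np.random.randn(16000).astype(np.float32),  # request1
    np.random.randn(16000).astype(np.float32),  # request2
]

# What merging batched requests can produce instead: several requests
# folded into a single nested structure, so each "element" is no longer
# a plain 1-D waveform.
merged_batch = [np.stack(good_batch)]  # shape (2, 16000) inside one element

def element_types(batch):
    return [(type(x).__name__, getattr(x, "shape", None)) for x in batch]

print(element_types(good_batch))    # [('ndarray', (16000,)), ('ndarray', (16000,))]
print(element_types(merged_batch))  # [('ndarray', (2, 16000))] -- wrong per-element type
```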
Cleaned up the model-versioning part of [registry.py](https://github.com/SeldonIO/MLServer/blob/master/mlserver/registry.py) and added semantic versioning for the model's version (the TODO comment).
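As an illustrative sketch (not necessarily the approach taken in registry.py), semantic ordering of model version strings could be done with `packaging.version`:

```python
from packaging.version import InvalidVersion, Version

def sort_model_versions(versions: list[str]) -> list[str]:
    """Sort model version strings semantically, newest last.

    Versions that don't parse as semantic versions sort to the front,
    so valid releases always win.
    """
    def key(v: str):
        try:
            return (1, Version(v))
        except InvalidVersion:
            return (0, Version("0"))

    return sorted(versions, key=key)

print(sort_model_versions(["1.10.0", "1.2.0", "v1.9.1", "dev"]))
# ['dev', '1.2.0', 'v1.9.1', '1.10.0']
```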
As per https://github.com/SeldonIO/MLServer/pull/740#discussion_r981259626 it would be possible to merge the HuggingFace batch variables into the MLServer batch variable, reducing redundancy in the `model-settings.json` file.
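For illustration, this is roughly what a merged configuration could look like, with batching driven solely by MLServer's own `max_batch_size`/`max_batch_time` settings; the model name and task value are placeholders:

```json
{
  "name": "transformer-model",
  "implementation": "mlserver_huggingface.HuggingFaceRuntime",
  "max_batch_size": 8,
  "max_batch_time": 0.5,
  "parameters": {
    "extra": {
      "task": "text-classification"
    }
  }
}
```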
There are three different places where the path to the models (i.e. the model repository) can be specified: 1. CLI argument to the `mlserver start` command....
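For reference, the CLI form (the first option) looks like this; the environment-variable spelling in the second command is an assumption based on MLServer's `MLSERVER_`-prefixed settings, not a confirmed flag:

```bash
# 1. As a CLI argument, pointing at the folder that holds the models:
mlserver start ./models

# 2. Alternatively via settings -- the env-var name below is assumed
#    from the MLSERVER_ settings prefix; check the Settings class:
MLSERVER_MODEL_REPOSITORY_ROOT=./models mlserver start .
```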
Hi, I am having some problems with PandasCodec and gRPC requests. On the client side, I have the following code to send a pandas df:

```python
data = pd.DataFrame(data={'col1':...
```
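One way this typically works is to encode the DataFrame into a V2 `InferenceRequest` with `PandasCodec` and convert it to a gRPC message. A sketch assuming a model named `my-model` on MLServer's default gRPC port 8081 (both placeholders):

```python
import grpc
import pandas as pd

from mlserver.codecs import PandasCodec
from mlserver.grpc.converters import ModelInferRequestConverter
import mlserver.grpc.dataplane_pb2_grpc as dataplane

# Illustrative DataFrame standing in for the truncated one above.
data = pd.DataFrame(data={"col1": [1, 2], "col2": ["a", "b"]})

# Encode into a V2 inference request, then into its gRPC representation.
inference_request = PandasCodec.encode_request(data)
grpc_request = ModelInferRequestConverter.from_types(
    inference_request, model_name="my-model", model_version=None
)

with grpc.insecure_channel("localhost:8081") as channel:
    stub = dataplane.GRPCInferenceServiceStub(channel)
    response = stub.ModelInfer(grpc_request)
```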
Triton exposes an API that provides detailed statistics on model usage, see [here](https://github.com/triton-inference-server/server/blob/main/docs/protocol/extension_statistics.md): `rpc ModelStatistics(ModelStatisticsRequest)`. It would be good to consider something similar for MLServer. Specifically I like `queue` vs...
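As a rough sketch of the kind of per-model data this could expose, loosely mirroring Triton's statistics extension (all names made up):

```python
import time
from dataclasses import dataclass

@dataclass
class ModelStats:
    # Cumulative per-model counters: request count, time spent queued,
    # and time spent computing, in nanoseconds.
    success_count: int = 0
    queue_ns: int = 0
    compute_ns: int = 0

    def record(self, enqueued_at: float, started_at: float, finished_at: float):
        self.success_count += 1
        self.queue_ns += int((started_at - enqueued_at) * 1e9)
        self.compute_ns += int((finished_at - started_at) * 1e9)

stats = ModelStats()
t0 = time.monotonic()  # request enqueued
t1 = t0 + 0.002        # inference starts (simulated 2 ms queueing)
t2 = t1 + 0.015        # inference ends (simulated 15 ms compute)
stats.record(t0, t1, t2)
print(stats)
```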