MLServer
An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more
## How to reproduce:

```
docker run -it python:3.9 bash
>>> pip install mlserver
>>> python -c "import mlserver"
```

Package versions:

```
> pip list | grep mlserver
mlserver...
```
I am interested in using MLServer with HuggingFace models. To unlock the full potential of these models, it is essential to be able to modify the generation parameters (https://huggingface.co/docs/transformers/main_classes/text_generation). I have tried to...
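For context, a minimal sketch of the kind of generation parameters in question, using the transformers pipeline API directly; the model name and parameter values below are illustrative, not taken from this report:

```python
from transformers import pipeline

# Illustrative only: "gpt2" and the parameter values are placeholders.
generator = pipeline("text-generation", model="gpt2")

# These generation parameters (documented at the link above) are what
# one would want to control through MLServer's HuggingFace runtime.
outputs = generator(
    "Once upon a time",
    max_new_tokens=50,   # cap on the number of generated tokens
    temperature=0.7,     # sampling temperature
    top_p=0.9,           # nucleus sampling threshold
    do_sample=True,      # sample instead of greedy decoding
)
print(outputs[0]["generated_text"])
```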
I am trying to run a transformer model using parallel inference on 4 workers on a machine that has 4 GPUs. The 4 workers are able to load the model...
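For reference, the size of MLServer's inference pool is configured through the server-level `settings.json`; a minimal sketch requesting the four workers described above (all other settings left at their defaults):

```json
{
  "parallel_workers": 4
}
```

Note that pinning each worker to a specific GPU (e.g. via `CUDA_VISIBLE_DEVICES`) is a separate concern and is not covered by this setting.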
Hi, I observed some weird behavior when using the REST API with adaptive batching enabled. When sending a **single** request to the v2 REST endpoint `/v2/models/{model-name}/infer`, the Parameters within the...
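For context, adaptive batching is enabled per model in `model-settings.json` via the `max_batch_size` and `max_batch_time` settings; a minimal sketch, with illustrative values:

```json
{
  "name": "my-model",
  "max_batch_size": 8,
  "max_batch_time": 0.5
}
```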
As git isn't installed on the mlserver images, any reference to a git path in your requirements.txt will cause `mlserver build` commands to fail.

Reproduce:
- Specify a git path in your requirements.txt...
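For illustration, the kind of requirements.txt entry that triggers this; the package name and repository URL are hypothetical:

```
# Hypothetical example of a git-based requirement, which pip resolves by
# invoking git -- and therefore fails if git is absent from the image.
mypackage @ git+https://github.com/example/mypackage.git@main
```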
MLServer only supports Pydantic V1, which is a problem for us as we would like to move to Pydantic V2 for all our services using Pydantic. Do you think MLServer...
Hello, first of all, thank you for the project and the time spent on it. We have a very simple MLflow pyfunc model that is being pickled and loaded in...
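For context, a minimal sketch of the kind of pyfunc model in question; the class name and output path are hypothetical:

```python
import mlflow.pyfunc


class EchoModel(mlflow.pyfunc.PythonModel):
    """A trivial pyfunc model that returns its input unchanged."""

    def predict(self, context, model_input):
        return model_input


# Serialize the model so it can later be loaded by a serving runtime
# such as mlserver-mlflow.
mlflow.pyfunc.save_model(path="echo_model", python_model=EchoModel())
```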
Resolves #1506
If using a `model-settings.json` of the following form:

```json
{
  "name": "my-model",
  "implementation": "mlserver_huggingface.HuggingFaceRuntime",
  "parameters": {
    "extra": {
      "task": "text-generation",
      "pretrained_model": "model/path",
      "model_kwargs": {
        "load_in_8bit": true
      }
    }
  }
}
```

...
Support a heterogeneous pool of workers with a variable number of model replicas per worker