An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more

304 MLServer issues

When using `model-settings.json` to define the inputs, it appears that the client inference request [doesn't need to define](https://mlserver.readthedocs.io/en/latest/user-guide/content-type.html#model-metadata) the `content_type`, but still needs to define the `datatype` and/or `shape`. It...
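A minimal sketch of the pattern being described, assuming a model whose metadata (including content types) is declared in `model-settings.json`; the model name `my-model`, tensor name, and values are hypothetical:

```python
import requests

# V2 inference request: `datatype` and `shape` must still be set per input,
# even though the content type can be inherited from the model metadata
# declared in model-settings.json.
payload = {
    "inputs": [
        {
            "name": "input-0",
            "datatype": "FP32",  # still required on the request
            "shape": [1, 4],     # still required on the request
            # no "parameters": {"content_type": ...} -- taken from metadata
            "data": [0.1, 0.2, 0.3, 0.4],
        }
    ]
}

response = requests.post(
    "http://localhost:8080/v2/models/my-model/infer", json=payload
)
print(response.json())
```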

Currently we only expose the `predict` method. However, some orchestration frameworks like SC support the use of other "inference steps", like routing or aggregation. It would be good to explore...
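For context, a custom runtime today only overrides `predict`; the sketch below shows the current extension point, with the routing-style hook shown purely as a hypothetical comment, not an existing API:

```python
from mlserver import MLModel
from mlserver.types import InferenceRequest, InferenceResponse


class MyRuntime(MLModel):
    async def load(self) -> bool:
        # Load weights / artefacts here.
        return True

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # `predict` is currently the only inference method MLServer exposes.
        return InferenceResponse(model_name=self.name, outputs=[])

    # Hypothetical: an orchestrator-facing step such as routing would need
    # a new, currently non-existent hook, e.g.:
    # async def route(self, payload: InferenceRequest) -> str: ...
```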

MLServer now has built-in support to unpack and activate a [conda-pack](https://conda.github.io/conda-pack/) tarball. This feature could be leveraged to run the custom environment defined in the `conda.yaml` file usually present in MLflow...
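As a rough illustration of the workflow, conda-pack's Python API can serialise an environment (e.g. one built from an MLflow model's `conda.yaml` via `conda env create`) into the kind of tarball MLServer unpacks; the environment name and output path here are assumptions:

```python
import conda_pack

# Pack a pre-built conda environment into a portable tarball that
# MLServer's conda-pack support could then unpack and activate.
conda_pack.pack(
    name="mlflow-model-env",      # hypothetical environment name
    output="environment.tar.gz",  # tarball for MLServer to unpack
)
```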

Intel has an accelerator (for both training and inference) for scikit-learn models: https://github.com/intel/scikit-learn-intelex. It would be interesting to see if we could use this in our scikit-learn runtime. It's unclear...
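For reference, the extension works by patching scikit-learn before estimators are imported; a minimal standalone sketch (whether and how this could be toggled inside the runtime is the open question):

```python
from sklearnex import patch_sklearn

# Swap in Intel-optimised implementations for supported estimators.
# Must run before the scikit-learn estimators are imported.
patch_sklearn()

from sklearn.svm import SVC  # noqa: E402  (import after patching)
from sklearn.datasets import make_classification  # noqa: E402

X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)
model = SVC().fit(X, y)
print(model.score(X, y))
```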

https://github.com/SeldonIO/MLServer/blob/master/docs/examples/custom/README.md In the example above, I see that in the training phase the data is standardized (via a lambda function), but this doesn't happen at inference time, so the model would...
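One common way to avoid this kind of training/serving skew, sketched here as a general pattern rather than a fix for that specific example, is to bundle the preprocessing into the persisted artefact so inference applies the same transform:

```python
import joblib
from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Bundling the scaler into the pipeline means the exact same
# standardisation runs at inference time, not just during training.
model = make_pipeline(StandardScaler(), SVC())
model.fit(X, y)

joblib.dump(model, "model.joblib")  # the single artefact the runtime loads
```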

Using the built-in logger is convenient because it's auto-configured: https://github.com/SeldonIO/MLServer/blob/743778766be536865c847135af93fedbcc89ba96/mlserver/logging.py#L23 However, it falls short when trying to debug messages coming from multiple different models - all of these will be assigned...
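A minimal sketch of one way to attach per-model context, using a stdlib `LoggerAdapter` rather than anything MLServer currently provides (the model name is hypothetical):

```python
import logging

logger = logging.getLogger("mlserver")


class ModelLoggerAdapter(logging.LoggerAdapter):
    """Prefix every record with the model name so interleaved logs
    from multiple models can be told apart."""

    def process(self, msg, kwargs):
        return f"[{self.extra['model_name']}] {msg}", kwargs


# Hypothetical usage inside a runtime, where `self.name` would supply
# the model's name:
model_logger = ModelLoggerAdapter(logger, {"model_name": "my-model"})
model_logger.info("loaded successfully")
```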

As a follow-up to the initial PR that introduced the HuggingFace Optimum Runtime (#4081), we have identified a set of follow-up tasks to improve the server:
* [ ] Extend...

To better support use cases where `mlserver` is used as a library, we shouldn't restrict dependency versions too much. Instead, we should look into adding some sort of lockfile...

Inputs are a list of tensors with a `name` entry; however, it's not possible to use this to select tensors by name. Instead one must either resort to index-based selection...
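A small sketch of the kind of name-based lookup the issue is asking for, written as a standalone helper rather than an existing MLServer API:

```python
from typing import Optional

from mlserver.types import InferenceRequest, RequestInput


def get_input_by_name(
    request: InferenceRequest, name: str
) -> Optional[RequestInput]:
    """Linear scan over request.inputs, matching on the `name` field --
    the alternative to index-based selection described in the issue."""
    for tensor in request.inputs:
        if tensor.name == name:
            return tensor
    return None
```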

add catboost support (a rough runtime sketch follows below):
* add a new runtime by heavily copying the lightgbm runtime
* add a test that builds a model, sends an inference request, and validates the prediction
* add example...
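A rough sketch of what such a runtime might look like, modelled loosely on the pattern other MLServer runtimes follow; the `CatBoost` model class, codec choice, and API details here are assumptions that may need adjusting to the repo's conventions and MLServer version:

```python
from catboost import CatBoost

from mlserver import MLModel
from mlserver.codecs import NumpyCodec
from mlserver.types import InferenceRequest, InferenceResponse
from mlserver.utils import get_model_uri


class CatboostModel(MLModel):
    async def load(self) -> bool:
        # Resolve the model artefact path, as other runtimes do.
        model_uri = await get_model_uri(self._settings)
        self._model = CatBoost()
        self._model.load_model(model_uri)
        return True

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # Decode the first input tensor to a numpy array, predict, re-encode.
        decoded = self.decode(payload.inputs[0], default_codec=NumpyCodec)
        prediction = self._model.predict(decoded)
        return InferenceResponse(
            model_name=self.name,
            outputs=[NumpyCodec.encode_output(name="predict", payload=prediction)],
        )
```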