MLServer
An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more
## Issue

The `mlserver_xgboost` runtime loads models as follows:

https://github.com/SeldonIO/MLServer/blob/6864a2ddf90bc3c81e9bf178b1baeba3931de28a/runtimes/xgboost/mlserver_xgboost/xgboost.py#L24-L34

Whilst the `mlserver_lightgbm` runtime does:

https://github.com/SeldonIO/MLServer/blob/6864a2ddf90bc3c81e9bf178b1baeba3931de28a/runtimes/lightgbm/mlserver_lightgbm/lightgbm.py#L22

The result is that for xgboost we end up with a sklearn API model,...
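For reference, the difference in what gets loaded can be seen with the plain xgboost / lightgbm APIs; a minimal sketch (the model file names are hypothetical):

```python
import xgboost as xgb
import lightgbm as lgb

# xgboost: loading through the sklearn wrapper yields an XGBClassifier/XGBRegressor;
# the underlying raw Booster has to be retrieved via get_booster().
sklearn_model = xgb.XGBClassifier()
sklearn_model.load_model("model.json")   # hypothetical path
raw_booster = sklearn_model.get_booster()  # xgboost.Booster

# lightgbm: loading a saved model directly yields the raw Booster object.
lgb_booster = lgb.Booster(model_file="model.bst")  # hypothetical path
```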
The `TreeSHAP` explainer needs access to the underlying model instance. Therefore, for `TreeSHAP` to work out-of-the-box, we'll need to ensure the following libraries are present in the Alibi Explain runtime...
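As a rough illustration of why the raw model instance matters, `shap.TreeExplainer` is constructed directly from the fitted tree model; a self-contained toy example (the synthetic data is only for demonstration):

```python
import numpy as np
import shap
import xgboost as xgb

# TreeSHAP needs the fitted tree model itself (e.g. an xgboost or lightgbm model),
# not just a predict function.
X = np.random.rand(100, 4)
y = (X[:, 0] > 0.5).astype(int)
model = xgb.XGBClassifier(n_estimators=10).fit(X, y)

explainer = shap.TreeExplainer(model)   # works because the model instance is available
shap_values = explainer.shap_values(X)  # per-feature attributions
```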
Following #1234, a future enhancement is to add the cancelled requests back to the queue when a worker dies, so that another worker can pick them up. _Originally posted by...
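A minimal sketch of the idea, assuming a generic asyncio-style dispatcher with a shared request queue (the names below are hypothetical, not MLServer's internals):

```python
import asyncio


async def supervise(worker_task: asyncio.Task, in_flight: list, queue: asyncio.Queue):
    """If the worker dies, push its in-flight requests back onto the shared queue."""
    try:
        await worker_task
    except Exception:
        # Worker crashed: re-enqueue everything it had picked up but not finished,
        # so another worker can process those requests instead of cancelling them.
        for request in in_flight:
            await queue.put(request)
```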
Add support for Triton's new [`ModelStreamInfer` RPC](https://github.com/triton-inference-server/server/blob/8e6628f4f5a9e3dc8f4c718282dc4e76c3587477/docs/protocol/extension_sequence.md?plain=1#L134) extension to the Open Inference Protocol.
First of all: Thank you for all the work that went into version 1.3.x. With version 1.2.4, I used to navigate to `http://localhost:8080/docs` and check which models were loaded, see...
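For what it's worth, one way to check which models are loaded is the model repository index endpoint of the V2 protocol; a rough sketch using `requests`, assuming the default HTTP port and that the repository extension is enabled on your server:

```python
import requests

# List the models known to the server, together with their state.
index = requests.post("http://localhost:8080/v2/repository/index", json={})
for model in index.json():
    print(model["name"], model.get("state"))

# Readiness of a single model can also be checked via the V2 endpoints, e.g.:
# GET /v2/models/{model_name}/ready
```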
In some cases, people may structure their custom inference runtimes as "isolated" Python packages (e.g. with their own `setup.py` / `pyproject.toml`). In these cases, to make sure your local packages...
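For reference, a runtime packaged this way usually just exposes an `MLModel` subclass that the model's `model-settings.json` can point to; a minimal, hypothetical sketch (the package and class names are made up):

```python
# my_runtime/runtime.py  (inside a package installed with e.g. `pip install -e .`)
from mlserver import MLModel
from mlserver.types import InferenceRequest, InferenceResponse, ResponseOutput


class MyCustomRuntime(MLModel):
    async def load(self) -> bool:
        # Load your actual model artefacts here.
        self._model = lambda x: x
        return True

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # Echo back the first input as a trivial example.
        first_input = payload.inputs[0]
        return InferenceResponse(
            model_name=self.name,
            outputs=[
                ResponseOutput(
                    name="echo",
                    shape=first_input.shape,
                    datatype=first_input.datatype,
                    data=first_input.data,
                )
            ],
        )
```

The corresponding `model-settings.json` would then set `"implementation": "my_runtime.runtime.MyCustomRuntime"`, with the package installed into the same environment as MLServer.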
Following up on the root cause of #1130, it seems that MLServer metrics not working with `parallel_workers=0` was the reason. As a result, [Custom metrics](https://mlserver.readthedocs.io/en/latest/user-guide/metrics.html#custom-metrics) seem to not be working...
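For context, the custom-metrics flow in the linked docs boils down to registering a metric at load time and logging values per request; a rough sketch, assuming the `mlserver.register` / `mlserver.log` helpers described there:

```python
import mlserver
from mlserver import MLModel
from mlserver.types import InferenceRequest, InferenceResponse


class InstrumentedRuntime(MLModel):
    async def load(self) -> bool:
        # Register the custom metric once, when the model is loaded.
        mlserver.register("my_custom_metric", "A counter incremented per request")
        return True

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # Log a value for the custom metric on every inference call.
        mlserver.log(my_custom_metric=1)
        return InferenceResponse(model_name=self.name, outputs=[])
```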
Currently `TensorDictCodec` is specific to the MLflow runtime, but it is useful beyond just this runtime. Consider moving it one level up so that other runtimes can access it without having to...
In some cases we need to be able to change some of the configuration of deployed models, such as the batch size, on the fly without reloading the model. I...
These predictions are fairly demanding and take ~20s to finish. If many requests hit the microservice concurrently, it starts becoming significantly slower (40s, 50s, 60s...