MLServer

An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more

Results: 304 MLServer issues

After updating images to new versions of MLServer (version `1.3.0.dev4`), the `batch_request_queue` and `parallel_request_queue` metrics are not showing up in my K8s Prometheus when deploying on SeldonDeployments V1. Other metrics like `model_infer_request_success`...

When using custom environments, the model can't even be imported within the main process. Therefore, the custom handlers hook is not able to inspect the model to look for custom...

Hi, I'm trying to improve the throughput of my server, which runs on MLServer, and I learned that I can set `parallel_workers > 1` to enable parallel inference. Hence...
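
For reference, a minimal sketch of what enabling the parallel inference pool might look like; `parallel_workers` is MLServer's setting name, but the other settings.json contents here are illustrative only:

```python
# Sketch: writing a minimal settings.json that enables the parallel
# inference pool. Values other than parallel_workers are illustrative.
import json

settings = {
    "debug": False,
    "parallel_workers": 4,  # > 1 spawns worker processes for parallel inference
}

with open("settings.json", "w") as f:
    json.dump(settings, f, indent=2)
```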

There is currently no built-in way to validate an input against its defined schema. For example, this schema defines an input array named `array` with shape `[-1, 1, 28, 28]`,...
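
A minimal sketch of the kind of shape check the issue asks for; `shape_matches` is a hypothetical helper, not part of MLServer, and treats `-1` as a wildcard dimension:

```python
# Hypothetical validator: MLServer does not ship one today, which is
# what this issue requests. -1 in the declared shape is a wildcard.
from typing import List

def shape_matches(declared: List[int], actual: List[int]) -> bool:
    if len(declared) != len(actual):
        return False
    return all(d == -1 or d == a for d, a in zip(declared, actual))

declared = [-1, 1, 28, 28]  # the schema for the input named `array`
assert shape_matches(declared, [32, 1, 28, 28])
assert not shape_matches(declared, [32, 3, 28, 28])
```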

Hi, I'm using MLServer with KServe, and found that the gRPC proto descriptors collide between the two:

```
File ~/.cache/pypoetry/virtualenvs/example-mlflow-lZ2hGP5g-py3.10/lib/python3.10/site-packages/mlserver/__init__.py:2
      1 from .version import __version__
----> 2 from...
```

Feature request - Since running an optimization step before deploying a deep learning model is becoming very common in machine learning deployment, out-of-the-box support in MLServer could be beneficial. Some examples...

The `starlette_exporter` middleware we use in the REST server seems to cap buckets at 10s (which matches the default buckets used in Prometheus: https://github.com/prometheus/client_python/blob/4f994ece6dcfd1905726d18e2a6899cc4474ac3d/prometheus_client/metrics.py#L544). In some cases, requests' latency...
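
For context, `starlette_exporter`'s `PrometheusMiddleware` does accept a `buckets` argument; a sketch of overriding the defaults on a bare Starlette app (the bucket values themselves are illustrative):

```python
# Sketch: extending the histogram buckets past the 10s default cap so
# slow requests still land in a resolved bucket rather than +Inf.
from starlette.applications import Starlette
from starlette_exporter import PrometheusMiddleware, handle_metrics

app = Starlette()
app.add_middleware(
    PrometheusMiddleware,
    buckets=[0.1, 0.5, 1, 5, 10, 30, 60, 120],  # illustrative values
)
app.add_route("/metrics", handle_metrics)
```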

As a follow-up to #1020, it would be good to support other types of metrics under the "simplified" interface (e.g. `Counter`, `Summary`, `Gauge`, etc.). Or, alternatively, under our own...
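
The metric types in question come straight from `prometheus_client`; a minimal sketch of each, with placeholder names (this is not MLServer's simplified interface):

```python
# The raw prometheus_client types the issue asks to expose through a
# simplified interface. Metric names and values are placeholders.
from prometheus_client import Counter, Gauge, Summary

requests = Counter("my_requests", "Total requests handled")  # exported as my_requests_total
queue_size = Gauge("my_queue_size", "Items currently queued")
latency = Summary("my_latency_seconds", "Request latency in seconds")

requests.inc()
queue_size.set(3)
latency.observe(0.42)
```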

If an explainer (e.g. anchors) does not allow batched explanation, then the runtime could loop over each element in the batch and explain each item individually. This...
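
A sketch of that fallback loop; `explain_batch` is hypothetical, and `explainer.explain` stands in for e.g. an alibi anchors explainer that only accepts single instances:

```python
# Hypothetical fallback: when the explainer can't handle batches, loop
# over the batch and explain each instance individually.
import numpy as np

def explain_batch(explainer, batch: np.ndarray) -> list:
    # Re-add the leading batch dimension so each call sees a batch of one.
    return [explainer.explain(item[np.newaxis, ...]) for item in batch]
```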

Currently we allow users to pass runtime explain parameters by setting `explain_parameters` in the `parameters` field of the explanation payload. This allows only basic types. In some cases the...
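
For reference, a sketch of how `explain_parameters` rides along in the `parameters` field of an explain payload; the parameter names are illustrative, and the basic-types-only restriction is exactly the limitation being raised:

```python
# Illustrative explain payload: explain_parameters can only carry basic
# JSON types here, which is the limitation this issue raises.
payload = {
    "inputs": [
        {"name": "data", "datatype": "FP32", "shape": [1, 4], "data": [0.1, 0.2, 0.3, 0.4]}
    ],
    "parameters": {
        "explain_parameters": {"threshold": 0.95},  # basic types only
    },
}
```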