MLServer
An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more
Nice! We could probably use this to read other things within MLServer as well, like `model-settings.json` files. _Originally posted by @adriangonz in https://github.com/SeldonIO/MLServer/pull/720#discussion_r965667311_
If this is duplicated from other folders within the root `tests/` package, feel free to move it to the base `conftest.py` BTW _Originally posted by @adriangonz in https://github.com/SeldonIO/MLServer/pull/720#discussion_r965696527_
I'm new to Seldon. When I added some debug statements (using `print()` and also the `logging` module) to the model, I found that they work in `load()` but not...
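For context, a minimal sketch of a custom runtime with debug statements in both hooks; the class name, logger setup and echo response are illustrative, not the issue author's actual model:

```python
import logging

from mlserver import MLModel
from mlserver.types import InferenceRequest, InferenceResponse, ResponseOutput

logger = logging.getLogger(__name__)


class DebugModel(MLModel):
    """Illustrative custom runtime with debug statements in both hooks."""

    async def load(self) -> bool:
        # Statements here are the ones reported to show up in the server logs...
        logger.debug("load() called")
        print("load() called")
        self.ready = True
        return self.ready

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # ...while statements here are the ones reported not to appear.
        logger.debug("predict() called with %d inputs", len(payload.inputs))
        print("predict() called")
        output = ResponseOutput(name="echo", shape=[1], datatype="BYTES", data=["ok"])
        return InferenceResponse(model_name=self.name, outputs=[output])
```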
The HuggingFace runtime has a `batch_size` variable in its settings. This should be checked against the MLServer `max_batch_size` setting for consistency.

```json
{
  "name": "transformer",
  "implementation": "mlserver_huggingface.HuggingFaceRuntime",
  "max_batch_size": 5,
  "max_batch_time": 1,
  ...
```
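A sketch of the kind of consistency check the issue asks for, assuming the HuggingFace-specific batch size is passed through `parameters.extra["batch_size"]` (the truncated config above does not show where it lives); this is not the runtime's actual validation code:

```python
from mlserver.settings import ModelSettings


def check_batch_sizes(model_settings: ModelSettings) -> None:
    # Hypothetical helper: compare the HuggingFace pipeline batch size against
    # MLServer's adaptive-batching max_batch_size and flag any mismatch.
    extra = {}
    if model_settings.parameters and model_settings.parameters.extra:
        extra = model_settings.parameters.extra

    hf_batch_size = extra.get("batch_size")
    if hf_batch_size and model_settings.max_batch_size:
        if hf_batch_size != model_settings.max_batch_size:
            raise ValueError(
                f"HuggingFace batch_size ({hf_batch_size}) does not match "
                f"MLServer max_batch_size ({model_settings.max_batch_size})"
            )
```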
The `huggingface_runtime` output JSON serializer does not support basic NumPy datatypes when the data is a dict value.
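A minimal reproduction of the underlying limitation: Python's standard `json` encoder cannot serialize NumPy scalar types nested inside a dict, so any runtime output that keeps raw NumPy values fails at serialization time. The payload below is made up for illustration:

```python
import json

import numpy as np

# A dict value holding a NumPy scalar, as a pipeline might return it.
payload = {"label": "positive", "score": np.float32(0.98)}

try:
    json.dumps(payload)
except TypeError as exc:
    # Object of type float32 is not JSON serializable
    print(exc)
```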
Good morning! I noticed that (in the REST server) request queue management changed from 1.0.0 to 1.1.0, with the latter adding Python queues. I would like to know...
As mentioned in https://github.com/SeldonIO/MLServer/pull/727#discussion_r972003311, the converter from gRPC output is not implemented. The following isn't working:

```python
from mlserver.grpc.converters import ModelInferResponseConverter
from mlserver.codecs.string import StringRequestCodec

inference_response = ModelInferResponseConverter.to_types(response)
...
```
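For context, a sketch of how the `response` object in the snippet above is typically obtained over gRPC, following the style of the custom-json gRPC example; the model name, payload and port are illustrative:

```python
import json

import grpc

from mlserver.grpc import converters, dataplane_pb2_grpc
from mlserver.types import InferenceRequest, Parameters, RequestInput

# Illustrative JSON payload sent as a single BYTES input.
inputs_bytes = json.dumps({"message": "hello"}).encode("UTF-8")

inference_request = InferenceRequest(
    inputs=[
        RequestInput(
            name="echo_request",
            shape=[len(inputs_bytes)],
            datatype="BYTES",
            data=[inputs_bytes],
            parameters=Parameters(content_type="str"),
        )
    ]
)
request = converters.ModelInferRequestConverter.from_types(
    inference_request, model_name="json-hello-world", model_version=None
)

# 8081 is MLServer's default gRPC port.
with grpc.insecure_channel("localhost:8081") as channel:
    stub = dataplane_pb2_grpc.GRPCInferenceServiceStub(channel)
    response = stub.ModelInfer(request)
```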
Hi, we would like to expose the number of elements in the pool's request queue as a metric, for performance reasons. It would be good to get this data...
There is no information in the [documentation](https://mlserver.readthedocs.io/en/latest/examples/custom-json/README.html) about how to retrieve the raw dictionary data back from the MLServer output. I will add a pull request for discussion.
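A sketch of the kind of client-side snippet the missing documentation could cover, based on the V2 inference protocol used by the custom-json example; the model name, URL and payload are illustrative:

```python
import json

import requests

# Build a V2 inference request whose single BYTES input carries a JSON string.
request_payload = {
    "inputs": [
        {
            "name": "echo_request",
            "shape": [1],
            "datatype": "BYTES",
            "data": [json.dumps({"message": "hello"})],
        }
    ]
}

response = requests.post(
    "http://localhost:8080/v2/models/json-hello-world/infer",
    json=request_payload,
)

# The raw dictionary comes back as a JSON string in the first output's data.
raw = response.json()["outputs"][0]["data"][0]
output = json.loads(raw)
print(output)
```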
If the two variables `max_batch_time` and `max_batch_size` are defined in the `model-settings.json`:

```json
{
  "name": "node-1",
  "implementation": "models.NodeOne",
  "max_batch_size": 5,
  "max_batch_time": 1,
  "parameters": {
    "uri": "./fakeuri"
  }
}
```

Then...