Saeid Ghafouri
I think a page with an MLServer example should be added to the docs on the [Servers](https://docs.seldon.io/projects/seldon-core/en/latest/nav/config/servers.html) page. I also think that there should be somewhere in the doc (maybe in...
## Describe the bug

As per our conversation over [Slack](https://seldondev.slack.com/archives/C03DQFTFXMX/p1661526920540159?thread_ts=1661438086.861089&cid=C03DQFTFXMX), the metadata example in the documentation does not have the expected behavior and results in a microservice error.

## To...
Academic systems like [Rim](https://dl.acm.org/doi/abs/10.1145/3450268.3453521) and [GrandSLAm](https://dl.acm.org/doi/10.1145/3302424.3303958) can share a model across multiple pipelines. As there are use cases in which a single model could be used as...
The MLServer HuggingFace runtime cannot work with speech models in batched mode, as the pipeline accepts a list of arrays `[(request1), (request2), (request3), (request4), (request5)]` where the type of each...
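To make the failure mode concrete, here is a minimal sketch of why a list of per-request audio arrays does not collapse into a single batched array (the waveform shapes and the stacking step are illustrative assumptions, not taken from the issue):

```python
import numpy as np

# Each request carries one variable-length waveform (hypothetical sizes).
request_1 = np.random.rand(16000)  # ~1 second of 16 kHz audio
request_2 = np.random.rand(24000)  # ~1.5 seconds

# The HuggingFace pipeline accepts a list of arrays, one per request...
batch_for_pipeline = [request_1, request_2]

# ...but stacking them into a single batch tensor fails for
# variable-length audio, so naive tensor batching cannot apply here.
try:
    np.stack(batch_for_pipeline)
except ValueError as err:
    print(err)  # all input arrays must have the same shape
```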
As per https://github.com/SeldonIO/MLServer/pull/740#discussion_r981259626, it would be possible to merge the HuggingFace batch variable into the MLServer batch variable for less redundancy in the `model-settings.json` file.
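As a hypothetical sketch of the proposal, the runtime could derive the pipeline's batch size from MLServer's own adaptive-batching settings, so `model-settings.json` states it only once. The `parameters.extra` layout follows the HuggingFace runtime's settings format; the exact merged shape is an assumption:

```json
{
  "name": "transformer",
  "implementation": "mlserver_huggingface.HuggingFaceRuntime",
  "max_batch_size": 5,
  "max_batch_time": 1,
  "parameters": {
    "extra": {
      "task": "text-generation"
    }
  }
}
```

Here the runtime would read `max_batch_size` for its pipeline batch size instead of requiring a separate `batch_size` entry under `parameters.extra`.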
The HuggingFace runtime has a `batch_size` variable in its settings. This should be checked against the MLServer `max_batch_size` setting for consistency.

```json
{
  "name": "transformer",
  "implementation": "mlserver_huggingface.HuggingFaceRuntime",
  "max_batch_size": 5,
  "max_batch_time": 1,
  ...
}
```
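A minimal sketch of the proposed consistency check, assuming the HuggingFace `batch_size` lives under `parameters.extra` as in the runtime's settings format (the check itself is illustrative, not MLServer code):

```python
import json

with open("model-settings.json") as f:
    settings = json.load(f)

mlserver_batch = settings.get("max_batch_size", 0)
hf_batch = settings.get("parameters", {}).get("extra", {}).get("batch_size", 0)

# Warn or fail when both are set but disagree, since the two batchers
# would otherwise silently work against each other.
if mlserver_batch and hf_batch and mlserver_batch != hf_batch:
    raise ValueError(
        f"HuggingFace batch_size ({hf_batch}) does not match "
        f"MLServer max_batch_size ({mlserver_batch})"
    )
```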
As mentioned in https://github.com/SeldonIO/MLServer/pull/727#discussion_r972003311, the converter from gRPC output is not implemented. The following lines therefore don't work:

```python
from mlserver.grpc.converters import ModelInferResponseConverter
from mlserver.codecs.string import StringRequestCodec

inference_response = ModelInferResponseConverter.to_types(response)
...
```
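For context, this is a sketch of the round trip the snippet belongs to, following the pattern in MLServer's custom-json example; the model name, input name, and port are assumptions, and the last three lines are the part blocked by the unimplemented converter:

```python
import json

import grpc

from mlserver import types
from mlserver.codecs.string import StringRequestCodec
from mlserver.grpc.converters import (
    ModelInferRequestConverter,
    ModelInferResponseConverter,
)
import mlserver.grpc.dataplane_pb2_grpc as dataplane

# Build a request whose payload is a JSON document encoded as a string.
inputs_bytes = json.dumps({"hello": "world"}).encode("UTF-8")
inference_request = types.InferenceRequest(
    inputs=[
        types.RequestInput(
            name="echo_request",
            shape=[len(inputs_bytes)],
            datatype="BYTES",
            data=[inputs_bytes],
            parameters=types.Parameters(content_type="str"),
        )
    ]
)
request = ModelInferRequestConverter.from_types(
    inference_request, model_name="json-model", model_version=None
)

# Call the model over gRPC (default MLServer gRPC port).
channel = grpc.insecure_channel("localhost:8081")
stub = dataplane.GRPCInferenceServiceStub(channel)
response = stub.ModelInfer(request)

# Convert the gRPC response back to MLServer types and decode the JSON.
inference_response = ModelInferResponseConverter.to_types(response)
raw_json = StringRequestCodec.decode_response(inference_response)
print(json.loads(raw_json[0]))
```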
There is no information in the [documentation](https://mlserver.readthedocs.io/en/latest/examples/custom-json/README.html) about how to retrieve the raw dictionary data back from MLServer output. I will open a pull request for discussion.
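As a sketch of what that documentation could show, the raw dictionary can be recovered by parsing the string payload in the response; the endpoint, model name, and input name below are assumptions based on the custom-json example's setup:

```python
import json

import requests

# Send a JSON document encoded as a string input (Open Inference Protocol).
inputs_string = json.dumps({"hello": "world"})
inference_request = {
    "inputs": [
        {
            "name": "echo_request",
            "shape": [len(inputs_string)],
            "datatype": "BYTES",
            "data": [inputs_string],
        }
    ]
}
endpoint = "http://localhost:8080/v2/models/json-model/infer"
response = requests.post(endpoint, json=inference_request)

# The output data holds the JSON string; parse it back into a dict.
raw = response.json()["outputs"][0]["data"][0]
output_dict = json.loads(raw)
print(output_dict)
```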
If the two variables `max_batch_time` and `max_batch_size` are defined in `model-settings.json`:

```json
{
  "name": "node-1",
  "implementation": "models.NodeOne",
  "max_batch_size": 5,
  "max_batch_time": 1,
  "parameters": {
    "uri": "./fakeuri"
  }
}
```

Then...
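For reference, a minimal sketch of what the `models.NodeOne` runtime behind those settings could look like (the implementation is assumed, not taken from the issue). With the settings above, MLServer's adaptive batcher coalesces up to 5 requests arriving within 1 second into a single `predict()` call:

```python
from mlserver import MLModel
from mlserver.codecs import NumpyCodec
from mlserver.types import InferenceRequest, InferenceResponse


class NodeOne(MLModel):
    async def load(self) -> bool:
        return True

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        # With adaptive batching enabled, `payload` may already contain
        # several coalesced requests stacked along the batch dimension.
        data = NumpyCodec.decode_input(payload.inputs[0])
        return InferenceResponse(
            model_name=self.name,
            outputs=[NumpyCodec.encode_output(name="output-0", payload=data)],
        )
```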