MLServer issues

ML Server with GRPC

Hi, I recently started using MLServer for a new project, but I want to set it up using gRPC. Is there any example or documentation on how to set it up?...

Haritima-Manchanda

gRPC fails with inferred f16 numpy array

1

I think I discovered a bug in the current gRPC code in mlserver. I have a model that returns float16 arrays and I tried to get predictions via gRPC. I...

sauerburger
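Some background that may help when reading this report: in the V2 inference protocol's gRPC schema, the typed `InferTensorContents` message has no field for 16-bit floats, so `FP16` tensors are generally carried as raw bytes instead. The excerpt above is truncated and does not state the root cause, so the following is only a minimal sketch of that raw-bytes encoding using Python's standard library. The helper names `pack_fp16` and `unpack_fp16` are hypothetical and are not part of MLServer.

```python
import struct


def pack_fp16(values):
    """Pack floats as little-endian IEEE 754 half-precision bytes (2 bytes each)."""
    return struct.pack(f"<{len(values)}e", *values)


def unpack_fp16(raw):
    """Inverse of pack_fp16: read back one float per 2 bytes of raw data."""
    count = len(raw) // 2
    return list(struct.unpack(f"<{count}e", raw))


# Values chosen to be exactly representable in float16.
values = [1.0, 0.5, -2.0]
raw = pack_fp16(values)
restored = unpack_fp16(raw)
```

Values that are not exactly representable in half precision would come back rounded, which is expected for this datatype.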

Adding more metrics to MLServer Prometheus endpoint

1

The metrics currently implemented in MLServer are all pure counts of the number of requests: ![mlserver-metrics](https://user-images.githubusercontent.com/6298780/197811560-bd0b6276-a160-4c12-8d4e-0e558554093e.png) Compared with similar platforms like [Triton Server](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/metrics.md), many other metrics could be added...

saeid93

Examples/clarification over using OpenTelemetry tracing in MLServer

Hi there - the docs give very little information on how to incorporate OpenTelemetry tracing in MLServer, added in this [PR](https://github.com/SeldonIO/MLServer/pull/1281). If for instance I’m deploying this model server within...

danielsoutar

Install `fugashi`, `unidic`, `unidic-lite`, and `ipadic` as dependencies to MLServer HuggingFace to support hosting Japanese language models

3

Because Japanese mixes phonetic scripts and Chinese characters, special algorithms and dictionaries are needed to run tokenizers for these models. A popular example of this...

jbauer2718

Allow payload request to support extra inference method kwargs

6

```python
from transformers import LlamaForCausalLM, AutoTokenizer, TextGenerationPipeline

model = LlamaForCausalLM.from_pretrained("daryl149/llama-2-7b-hf", load_in_8bit=True)
tokenizer = AutoTokenizer.from_pretrained("daryl149/llama-2-7b-hf")
pipeline = TextGenerationPipeline(model, tokenizer)
pipeline("Once upon a time,", max_new_tokens=100, return_full_text=False)
```

`max_new_tokens` and `return_full_text` are extra arguments we...

nanbo-liu
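To make the request above concrete, here is a minimal sketch of how extra keyword arguments could travel alongside the input text in a V2-style JSON payload. It assumes the request-level `parameters` object is used as the carrier. The `extra` key, the input name `args`, and the helpers `build_request` and `split_request` are hypothetical illustrations, not MLServer's actual API or the design the issue settled on.

```python
import json


def build_request(text, **extra_kwargs):
    """Build a V2-style inference request dict carrying optional extra kwargs."""
    request = {
        "inputs": [
            # Input name "args" is an illustrative choice.
            {"name": "args", "shape": [1], "datatype": "BYTES", "data": [text]}
        ]
    }
    if extra_kwargs:
        # Hypothetical convention: nest kwargs under parameters["extra"].
        request["parameters"] = {"extra": extra_kwargs}
    return request


def split_request(request):
    """Server-side counterpart: recover the text and the extra kwargs."""
    text = request["inputs"][0]["data"][0]
    kwargs = dict(request.get("parameters", {}).get("extra", {}))
    return text, kwargs


payload = build_request(
    "Once upon a time,", max_new_tokens=100, return_full_text=False
)
wire = json.dumps(payload)  # what would be sent over HTTP
text, kwargs = split_request(json.loads(wire))
# A runtime could then call: pipeline(text, **kwargs)
```

Keeping the kwargs in a single nested object, rather than mixing them with other request parameters, makes it easy for a runtime to forward them unchanged to the underlying pipeline call.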

Ability to serve swagger doc dependencies from the mlserver instance or use a custom CDN

The swagger docs don't work when the browser cannot access the CDN for swagger js dependencies. Is it possible to configure a custom CDN or serve the dependencies using static...

ichbinjakes

tritonclient dependency change - cuda-python cause poetry installation failure in MacOS

6

Hi there, I ran into an issue running `poetry install` for mlserver. It happens because tritonclient 2.37+ now depends on cuda-python, which blocks the mlserver installation if the machine does...

liusha-H

Add support for `CatBoostRegressor`

Following https://github.com/SeldonIO/MLServer/pull/1403, it would be great to also support the `CatBoostRegressor` (and subsequently `CatBoostRanker`) model types. From the linked PR: Q: >Looking ahead to adding support for the Regressor and...

krishanbhasin-gc

Enable PandasCodec.decode_request to restore the exact dataframe

# What

When a dataframe is encoded by `PandasCodec.encode_request(use_bytes=True)`, `PandasCodec.decode_request()` cannot restore the exact dataframe.

client code

```py
X = pd.DataFrame(
    dict(
        int_col=[1, 2, 3],
        str_col=["s1", "s2", "s3"],
    )
)
...
```

ysk24ok
