Support Integration with KServe

Open • indranilr opened this issue 1 year ago • 1 comment

Feature request

KServe is a Kubernetes-based engine for serving predictive and generative AI models. It provides an abstraction over popular model servers such as Hugging Face TEI (https://github.com/kserve/kserve/pull/3743), TensorFlow, and PyTorch. This is a request to support Infinity as a model serving engine in KServe as well.

Motivation

Many organizations that rely on OSS already use KServe for predictive model deployment and are trying to use it for embedding and generative model deployment as well. Supporting Infinity as a model serving engine in KServe would avoid the need for a separate Infinity deployment. Ref: #314.

Your contribution

NA

indranilr • Sep 06 '24 19:09

@indranilr I like the goal and mission of the KServe project. That said, I have not worked with it extensively in the past. I would be happy to assist anyone (such as you) who has questions while working on the integration.

michaelfeil • Sep 06 '24 19:09