llm-engine
Create guide for how to deploy an existing Hugging Face model on self-hosted LLM Engine
We're currently wrapping up some testing for a self-contained helm install on your own EKS cluster. Once that's ready, we'll ship the docs too.
Also, to clarify: https://github.com/scaleapi/llm-engine/pull/153 addresses part, but not all, of this ask. It shows how to deploy a self-hosted endpoint for a model that already exists in our Model Zoo, which covers only a subset of Hugging Face models. It does not show how to add a model to the Model Zoo, i.e. build an endpoint from an arbitrary Hugging Face model. That will require some follow-up work.
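For anyone following along, here's a rough sketch of what querying a Model Zoo model through a self-hosted install might look like with the Python client. Treat this as an assumption-laden illustration, not official docs: the env var name `LLM_ENGINE_BASE_PATH`, the model name `llama-2-7b`, and the local default URL are placeholders for whatever your own install exposes.

```python
import os


def gateway_base_path(default: str = "http://localhost:5000") -> str:
    """Resolve the gateway URL for a self-hosted LLM Engine install.

    Assumption: the client honors an LLM_ENGINE_BASE_PATH env var to point
    at a self-hosted gateway instead of the hosted Scale API.
    """
    return os.environ.get("LLM_ENGINE_BASE_PATH", default)


def complete(prompt: str, model: str = "llama-2-7b") -> str:
    """Send a completion request to an existing Model Zoo endpoint.

    Assumes `pip install scale-llm-engine` and a reachable gateway; the
    model must already be deployed (this does NOT add new models to the
    Model Zoo).
    """
    os.environ.setdefault("LLM_ENGINE_BASE_PATH", gateway_base_path())
    from llmengine import Completion

    response = Completion.create(
        model=model,
        prompt=prompt,
        max_new_tokens=64,
    )
    return response.output.text
```

Until the helm chart and docs land, the exact env var and URL shape may differ, so check the self-hosting guide once it ships.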