llm-engine
Create guide for how to deploy an existing Hugging Face model on self-hosted LLM Engine
We're currently wrapping up some testing for a self-contained helm install on your own EKS cluster. Once that's ready, we'll ship the docs too.
Also, to clarify: https://github.com/scaleapi/llm-engine/pull/153 addresses part, but not all, of this ask. It shows how to deploy a self-hosted endpoint for a model that already exists in our Model Zoo, which covers only a subset of Hugging Face models. It does not show how to add a model to the Model Zoo, i.e. build an endpoint from an arbitrary Hugging Face model. That will require some follow-up work.
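For anyone following along, here's a rough sketch of what querying a Model Zoo model through a self-hosted install might look like with the Python client. Treat this as an assumption-laden illustration, not official docs: the env var name `LLM_ENGINE_BASE_PATH`, the model name `llama-2-7b`, and the local default URL are placeholders for whatever your own install exposes.

```python
import os


def gateway_base_path(default: str = "http://localhost:5000") -> str:
    """Resolve the gateway URL for a self-hosted LLM Engine install.

    Assumption: the client honors an LLM_ENGINE_BASE_PATH env var to point
    at a self-hosted gateway instead of the hosted Scale API.
    """
    return os.environ.get("LLM_ENGINE_BASE_PATH", default)


def complete(prompt: str, model: str = "llama-2-7b") -> str:
    """Send a completion request to an existing Model Zoo endpoint.

    Assumes `pip install scale-llm-engine` and a reachable gateway; the
    model must already be deployed (this does NOT add new models to the
    Model Zoo).
    """
    os.environ.setdefault("LLM_ENGINE_BASE_PATH", gateway_base_path())
    from llmengine import Completion

    response = Completion.create(
        model=model,
        prompt=prompt,
        max_new_tokens=64,
    )
    return response.output.text
```

Until the helm chart and docs land, the exact env var and URL shape may differ, so check the self-hosting guide once it ships.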