Add a HuggingFace backend
As you are probably aware, huggingface-hub lets users upload a model together with its implementation as a model repo, and others can then load it via AutoModel and the like. For example:
https://huggingface.co/microsoft/Florence-2-large/tree/main
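For reference, loading such a repo looks roughly like the sketch below. It assumes `transformers` is installed and follows the usual `trust_remote_code=True` pattern. The helper name `load_from_hub` is made up for illustration, and the right `Auto*` class depends on the repo.

```python
def load_from_hub(repo_id: str = "microsoft/Florence-2-large"):
    """Load a model whose modeling code ships inside its Hub repo.

    `load_from_hub` is a hypothetical helper, not part of any library.
    """
    # Imported lazily so defining the helper does not require transformers.
    from transformers import AutoModelForCausalLM, AutoProcessor

    # trust_remote_code=True allows transformers to execute the Python
    # files stored in the repo, so only use it with repos you trust.
    processor = AutoProcessor.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
    return model, processor
```

Calling `load_from_hub()` downloads the weights and the custom modeling code on first use.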
This is great because model uploaders don't have to wait for a PR to be merged before sharing a model with the public in a form compatible with the Hugging Face ecosystem, and I think something similar could be done for sglang.
IMO sglang is a great, very underappreciated tool that could really shine with such an ecosystem.
@cloneofsimo Thanks for the suggestions.
Currently, the SGLang backend provides many speed optimizations (e.g., radix attention, continuous batching), so a model has to be re-implemented with SGLang layers to get these benefits.
It seems what you need is a Hugging Face backend, which can directly load models from the official Hugging Face repos or third-party repos. Given our limited bandwidth, our team will not work on this, but community support is welcome!
To get started, you can try adding a HuggingFace backend under https://github.com/sgl-project/sglang/tree/main/python/sglang/backend; you can then call any Hugging Face model from the sglang frontend.
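As a rough starting point, such a backend could be a thin wrapper around a `transformers` model and tokenizer. The sketch below is only an illustration under assumptions: the class name `HuggingFaceBackend` and its `generate` method are hypothetical and do not implement sglang's actual backend interface, which should be checked in the directory linked above. It also runs the stock Hugging Face generation loop, so it would not benefit from radix attention or continuous batching.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class HuggingFaceBackend:
    """Hypothetical backend wrapping a transformers model and tokenizer."""

    model: Any
    tokenizer: Any

    @classmethod
    def from_pretrained(cls, repo_id: str, trust_remote_code: bool = False):
        # Imported lazily so the class can be defined without transformers.
        from transformers import AutoModelForCausalLM, AutoTokenizer

        tokenizer = AutoTokenizer.from_pretrained(
            repo_id, trust_remote_code=trust_remote_code
        )
        model = AutoModelForCausalLM.from_pretrained(
            repo_id, trust_remote_code=trust_remote_code
        )
        return cls(model=model, tokenizer=tokenizer)

    def generate(self, prompt: str, max_new_tokens: int = 32) -> str:
        # Tokenize the prompt into a batch of size one.
        inputs = self.tokenizer(prompt, return_tensors="pt")
        prompt_len = len(inputs["input_ids"][0])
        # Plain Hugging Face generation, no SGLang-specific optimizations.
        output_ids = self.model.generate(**inputs, max_new_tokens=max_new_tokens)
        # Drop the prompt tokens and decode only the continuation.
        return self.tokenizer.decode(
            output_ids[0][prompt_len:], skip_special_tokens=True
        )
```

Usage would then be along the lines of `HuggingFaceBackend.from_pretrained("some-org/some-model", trust_remote_code=True).generate("Hello")`, where the repo id is a placeholder. Keeping the model and tokenizer as injected fields makes the wrapper easy to test with stand-in objects.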
This issue has been automatically closed due to inactivity. Please feel free to reopen it if needed.