Add a HuggingFace backend

Open cloneofsimo opened this issue 1 year ago • 1 comments

As you are probably aware, huggingface-hub lets users upload a model together with its implementation as a model repo, and others can then use it via AutoModel and similar classes. For example: https://huggingface.co/microsoft/Florence-2-large/tree/main
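
For reference, a minimal sketch of that loading path with the `transformers` library. The helper name `load_hub_model` is made up for illustration, and the choice of `AutoModelForCausalLM` plus `AutoProcessor` is an assumption that fits Florence-2 style repos; other repos may need a different Auto class.

```python
def load_hub_model(repo_id: str, trust_remote_code: bool = True):
    """Load a model whose implementation lives in its Hub repo.

    Hypothetical helper, for illustration only.
    """
    # Imported lazily so the sketch can be read and imported without
    # transformers installed; calling it does require transformers.
    from transformers import AutoModelForCausalLM, AutoProcessor

    # trust_remote_code=True lets transformers run the modeling code
    # shipped inside the repo instead of code bundled with the library.
    model = AutoModelForCausalLM.from_pretrained(
        repo_id, trust_remote_code=trust_remote_code
    )
    processor = AutoProcessor.from_pretrained(
        repo_id, trust_remote_code=trust_remote_code
    )
    return model, processor


# Usage (downloads weights, so it is left as a comment):
# model, processor = load_hub_model("microsoft/Florence-2-large")
```

Since `trust_remote_code=True` executes code from the repo, it should only be used with repos you trust.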

This is great because model uploaders don't have to wait for a PR to be merged before sharing a model with the public in a form compatible with the Hugging Face ecosystem. I think something similar could be done for sglang.

IMO sglang is a great, very underappreciated tool that could really shine with such an ecosystem.

cloneofsimo avatar Jul 04 '24 06:07 cloneofsimo

@cloneofsimo Thanks for the suggestions.

Currently, the sglang backend provides many speed optimizations (e.g., radix attention, continuous batching), so a model has to be re-implemented with SGLang layers to get these benefits.

It seems what you need is a Hugging Face backend that can load models directly from official or third-party Hugging Face repos. Given our limited bandwidth, our team will not work on this, but community support is welcome!

To get started, you can try adding a HuggingFace backend under https://github.com/sgl-project/sglang/tree/main/python/sglang/backend; you can then call any Hugging Face model from the sglang frontend.
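
As a rough starting point, such a backend might look like the sketch below. Everything here is an assumption for illustration: the class name `HuggingFaceBackend`, its `generate` method, and the injectable `generate_fn` are hypothetical and do not reflect sglang's actual backend interface, which a real implementation would need to follow. Only the `transformers` calls are real library APIs.

```python
from typing import Callable, Optional


class HuggingFaceBackend:
    """Hypothetical backend that delegates generation to transformers.

    Not part of sglang; it does not implement sglang's backend interface.
    """

    def __init__(
        self,
        model_name: str,
        generate_fn: Optional[Callable[[str, int], str]] = None,
    ):
        self.model_name = model_name
        # A generate function can be injected (e.g. for tests); otherwise
        # it is built from transformers on first use.
        self._generate_fn = generate_fn

    def _build_generate_fn(self) -> Callable[[str, int], str]:
        # Lazy import so the class can be defined without transformers.
        from transformers import AutoModelForCausalLM, AutoTokenizer

        tokenizer = AutoTokenizer.from_pretrained(
            self.model_name, trust_remote_code=True
        )
        model = AutoModelForCausalLM.from_pretrained(
            self.model_name, trust_remote_code=True
        )

        def _generate(prompt: str, max_new_tokens: int) -> str:
            inputs = tokenizer(prompt, return_tensors="pt")
            output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
            # Drop the prompt tokens and decode only the continuation.
            new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
            return tokenizer.decode(new_tokens, skip_special_tokens=True)

        return _generate

    def generate(self, prompt: str, max_new_tokens: int = 32) -> str:
        if self._generate_fn is None:
            self._generate_fn = self._build_generate_fn()
        return self._generate_fn(prompt, max_new_tokens)
```

A backend like this runs the model through plain `transformers` generation, so it would not get the radix attention or continuous batching optimizations mentioned above.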

merrymercy avatar Jul 09 '24 08:07 merrymercy

This issue has been automatically closed due to inactivity. Please feel free to reopen it if needed.

github-actions[bot] avatar Sep 08 '24 01:09 github-actions[bot]