OpenLLM

feat: support LMDeploy backend

Open zhyncs opened this issue 1 year ago • 1 comments

Feature request

@aarnphm @ssheng @parano Hi OpenLLM team, thank you for your exceptional work. OpenLLM currently supports two backends, vLLM and PyTorch. Usability is good, but there is still room for improvement in performance. LMDeploy strikes a good balance between performance and usability: recently, Llama 3 8B showed a 1.8x performance improvement on LMDeploy over vLLM. Performance is crucial, especially once user requirements are met and the need for large-scale deployment arises. Meituan already uses LMDeploy widely internally. I strongly recommend that OpenLLM consider integrating LMDeploy and making it the default backend. You can refer to the documentation at https://lmdeploy.readthedocs.io/en/latest/ during research and integration. Thanks.

Motivation

No response

Other

No response

zhyncs avatar Apr 25 '24 00:04 zhyncs

cc @lvhan028 @AllentDan

zhyncs avatar Apr 25 '24 02:04 zhyncs

Reminder: @aarnphm @ssheng @parano

zhyncs avatar Jun 05 '24 02:06 zhyncs

I recently noticed your blog post https://bentoml.com/blog/benchmarking-llm-inference-backends, which mentions that LMDeploy does not yet support a wide variety of models. Model support will continue to expand, including upcoming support for MoE models on TurboMind and decoupling of model and batch. The performance advantage will be maintained as well, and I look forward to hearing about your future plans. Cheers.

zhyncs avatar Jun 07 '24 10:06 zhyncs

Closing this issue, since there has been no response for a long time.

zhyncs avatar Jul 10 '24 08:07 zhyncs

Since OpenLLM 0.6, this should be simple to implement. Just contribute bentos / templates to https://github.com/bentoml/openllm-models

bojiang avatar Jul 13 '24 05:07 bojiang

@bojiang @aarnphm May I take the liberty of asking what the differences are between OpenLLM and SkyPilot, and what advantages each has? Thanks.

zhyncs avatar Jul 13 '24 08:07 zhyncs

IMO SkyPilot follows a different design pattern. It tries to do everything for users, while OpenLLM just focuses on running a model and getting a service.

bojiang avatar Jul 15 '24 05:07 bojiang