FastChat Add lmdeploy integration

Open AllentDan opened this issue 3 months ago • 1 comment

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

As an alternative inference tool integration, LMDeploy reaches 14.42 queries per second (qps) on an A100 for the Llama 7B model, according to this.

Mar 20 '24 09:03 AllentDan

Hi @merrymercy, could you please help review this PR?

Apr 11 '24 08:04 AllentDan