
Add lmdeploy integration

Open · AllentDan opened this issue 3 months ago • 1 comment

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

As an alternative inference backend to integrate, LMDeploy achieves 14.42 qps on an A100 for the Llama 7B model, according to this.

AllentDan, Mar 20 '24 09:03

Hi @merrymercy, would you please help review this PR?

AllentDan, Apr 11 '24 08:04