DeepSpeed
[BUG] chatglm-6b cannot use DeepSpeed inference
Tensor-parallel inference requires (model size × number of GPUs) of memory, and it does not reduce latency. Are there any plans to support this model?
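The memory claim above can be sketched with some rough arithmetic (assuming fp16 weights at 2 bytes per parameter and ~6B parameters for chatglm-6b; the numbers are illustrative, not measured):

```python
def total_weight_memory_gb(num_params, bytes_per_param, num_gpus):
    # If the weights are fully replicated on every GPU (instead of
    # being sharded across them), total memory grows linearly with
    # the number of GPUs -- the behavior described in this report.
    per_gpu_gb = num_params * bytes_per_param / 1024**3
    return per_gpu_gb * num_gpus

# chatglm-6b: ~6e9 parameters, fp16 (2 bytes each)
print(f"1 GPU : {total_weight_memory_gb(6e9, 2, 1):.1f} GB")  # ~11.2 GB
print(f"4 GPUs: {total_weight_memory_gb(6e9, 2, 4):.1f} GB")  # ~44.7 GB total
```

With proper tensor-parallel sharding one would instead expect the per-GPU footprint to shrink as GPUs are added, which is what the report says is not happening here.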