
chatglm-6b cannot use DeepSpeed inference [BUG]

Open zuocebianpingmao opened this issue 1 year ago • 0 comments

With chatglm-6b, tensor-parallel inference requires memory equal to the model size times the number of GPUs, and it does not reduce latency. Are there any plans to support it?
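To make the memory observation concrete, here is a rough back-of-the-envelope sketch. It assumes about 6e9 parameters stored in fp16 (2 bytes each) and counts weights only; the function names are hypothetical and the numbers are estimates, not measurements from this issue.

```python
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the weights alone, in GB (1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def total_memory_gb(model_gb: float, n_gpus: int, sharded: bool) -> float:
    """Total weight memory summed over all GPUs.

    sharded=True:  ideal tensor parallelism, each GPU holds 1/n_gpus of the weights.
    sharded=False: every GPU holds a full copy (model size * number of GPUs),
                   which is the behaviour described above.
    """
    per_gpu = model_gb / n_gpus if sharded else model_gb
    return per_gpu * n_gpus


# Assumption: ~6e9 parameters in fp16; activations and KV cache are ignored.
model_gb = weight_memory_gb(6e9)
for n_gpus in (1, 2, 4):
    ideal = total_memory_gb(model_gb, n_gpus, sharded=True)
    observed = total_memory_gb(model_gb, n_gpus, sharded=False)
    print(f"{n_gpus} GPU(s): ideal total {ideal:.1f} GB, replicated total {observed:.1f} GB")
```

With working tensor parallelism the total should stay roughly constant as GPUs are added; growth proportional to the GPU count suggests the weights are being replicated rather than split.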

zuocebianpingmao, Jun 06 '23 07:06