Add mixtral support

Open · Chillee opened this issue 1 year ago · 1 comment

Some preliminary perf numbers:

TP=8, fp16, 163.69 tok/s

Chillee · Dec 17 '23 09:12

@Chillee Hi there, did you get the fp8 version from somewhere, or are you currently working on the fp8 quant in this PR?

siriusctrl · Dec 18 '23 16:12