GPT-NeoX-20B takes about 1 minute to generate 100 tokens when loaded with 4-bit quantization. How can I reduce the generation time for 100 or more tokens?
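For reference, here is a minimal sketch of how such a setup is typically loaded and timed with Hugging Face `transformers` and `bitsandbytes`. The model name and generation parameters are assumptions based on the question, not a confirmed reproduction of the original setup; running this requires a CUDA GPU and the `transformers`, `accelerate`, `bitsandbytes`, and `torch` packages. Common levers for speed, under those assumptions, are keeping the whole model on GPU (`device_map="auto"` with enough VRAM, since any CPU offload is very slow), setting a fp16/bf16 compute dtype for the 4-bit layers, enabling the KV cache, and batching prompts.

```python
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantization config; bnb_4bit_compute_dtype matters for speed,
# because matmuls are done in this dtype after dequantization.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # fp16 compute is much faster than fp32
    bnb_4bit_quant_type="nf4",
)

model_id = "EleutherAI/gpt-neox-20b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # ideally the full model fits on GPU; CPU offload is slow
)

inputs = tokenizer("The quick brown fox", return_tensors="pt").to(model.device)

start = time.perf_counter()
output = model.generate(
    **inputs,
    max_new_tokens=100,
    use_cache=True,  # reuse the KV cache instead of recomputing past tokens
    do_sample=False,
)
elapsed = time.perf_counter() - start

n_new = output.shape[1] - inputs["input_ids"].shape[1]
print(f"{n_new} tokens in {elapsed:.1f}s ({n_new / elapsed:.2f} tokens/s)")
```

As a sanity check: 100 tokens in 60 seconds is about 1.7 tokens/s, which usually points to CPU offload or fp32 compute rather than the quantization itself. If throughput is still too low after the settings above, a dedicated inference engine (e.g. vLLM or text-generation-inference) is the usual next step.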