LsEmpire

Results: 3 comments by LsEmpire

> If you're running inference on CPU, you should expect slower speed; if you're running on a GPU, generation is much faster.

Thanks, thanks for your help and...

> Thanks, thanks for your reply. I see the code is from the generate_stream function in the file inference.py. Could you please help again to check if it is the 0 -...

Thanks

> Hi, as you are already aware, if you use the worker for the generation, you get the streaming output. If you need all the outputs at once, you...