
All output is "!"

Open HysenX-LI opened this issue 1 year ago • 5 comments

image As shown in the picture, every token produced by inference is "!".

I tried several approaches and found that the problem goes away only when I build with -O0. But the program runs too slowly at -O0, and the problem occurs at -O1, -O2, and -O3.

Has anyone else had the same problem? How can it be solved?

HysenX-LI avatar Sep 04 '24 07:09 HysenX-LI

This is very weird. What CPU/OS?

b4rtaz avatar Sep 04 '24 08:09 b4rtaz

Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz, Ubuntu 18.04

HysenX-LI avatar Sep 04 '24 08:09 HysenX-LI

image Even with -O0, it may still produce meaningless output.

HysenX-LI avatar Sep 04 '24 08:09 HysenX-LI

Which model?

b4rtaz avatar Sep 04 '24 08:09 b4rtaz

dllama_model_llama3_8b_q40.m

HysenX-LI avatar Sep 04 '24 08:09 HysenX-LI

Could you check again on version 0.12.1? Many memory leaks have been fixed. Please re-download the model and tokenizer.

b4rtaz avatar Feb 14 '25 15:02 b4rtaz
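
For anyone debugging a similar symptom, one possible check (a minimal sketch, not taken from the distributed-llama source) is whether the logits contain NaN or Inf before sampling. Comparisons against NaN are always false, so a greedy argmax written in the usual way returns index 0 when every logit is NaN. If token id 0 decodes to "!" in the tokenizer in use (an assumption, not confirmed in this thread), that would produce exactly this kind of output. The function names `argmaxIndex` and `countNonFinite` are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical greedy argmax, written the usual way: start at index 0 and
// move only when a strictly larger value is found. Any comparison with NaN
// is false, so an all-NaN input returns 0.
std::size_t argmaxIndex(const std::vector<float>& logits) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < logits.size(); ++i) {
        if (logits[i] > logits[best]) {
            best = i;
        }
    }
    return best;
}

// Hypothetical diagnostic: counts NaN and Inf entries in a buffer.
// Calling it on the logits (or on intermediate activations) can show
// where the values first stop being finite.
std::size_t countNonFinite(const std::vector<float>& values) {
    std::size_t count = 0;
    for (float v : values) {
        if (!std::isfinite(v)) {
            ++count;
        }
    }
    return count;
}
```

Note that if the diagnostic itself is compiled with `-ffast-math` (or `-ffinite-math-only`), the compiler may assume values are always finite and optimize the `std::isfinite` check away, so it is safer to build this check without those flags.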