Hyesung Jeon
I'm also having a similar problem with LLaMA 33B on two NVIDIA A100 80GB GPUs, even with a micro-batch size of 1. It is really confusing, because when I execute...
It seems lm-eval-harness can reproduce the LLaMA (paper v1) results on HellaSwag, but some issues remain on other tasks. The LLaMA-30B model gives 82.65% acc_norm, while the paper reports 82.9%.