Yezhen Wang
Perhaps they ran with several random seeds and reported the average, or they simply reported the highest score.
> Hello, thank you for your interest in LLaDA. We plan to open-source the evaluation metrics for the LLaDA Base model using the lm-evaluation-harness library. This may take some time...