Yezhen Wang

Results: 2 comments by Yezhen Wang

Perhaps they ran with several different random seeds and reported the average result, or they simply reported the highest score.
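
To make the difference between the two reporting conventions concrete, here is a minimal sketch. The scores are made-up placeholders, not actual LLaDA results, and `summarize_runs` is a hypothetical helper, not part of any library.

```python
from statistics import mean


def summarize_runs(scores):
    """Summarize per-seed benchmark scores under two reporting conventions.

    `scores` is a list of floats, one per random seed (hypothetical values).
    """
    return {
        "mean": mean(scores),  # average over seeds
        "best": max(scores),   # highest single-seed score
    }


# Hypothetical scores from three seeds, for illustration only.
summary = summarize_runs([61.0, 63.0, 65.0])
print(summary)
```

With the same runs, the "best" number is never lower than the "mean" number, so a gap between a paper and a reproduction could come from the reporting convention alone.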

> Hello, thank you for your interest in LLaDA. We plan to open-source the evaluation metrics for the LLaDA Base model using the lm-evaluation-harness library. This may take some time...