LLaDA Any plan for releasing the evaluation code?

Great work, and sincere congratulations!

I'm very interested in your work. Would it be possible for your team to also release the evaluation code you used for the benchmark testing? For example, if I want to test LLaDA through lm-evaluation-harness, what should I do?

Mar 04 '25 07:03 BIGKnight

I also found it challenging to integrate the loglikelihood and generate functions into lm-evaluation-harness. If the authors could open-source the relevant lm-evaluation-harness code, it would be a great help to me.

Mar 04 '25 08:03 1773226512

Hello, thank you for your interest in LLaDA. We plan to open-source the evaluation metrics for the LLaDA Base model using the lm-evaluation-harness library. This may take some time to organize the code and go through the open-source process.

Mar 05 '25 17:03 Monohydroxides

This is really good to know, sincere thanks. Looking forward to it : )

Mar 06 '25 09:03 BIGKnight

Thank you for your attention. We have released the evaluation code, which uses the open-source library.

Mar 08 '25 13:03 nieshenx
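
For readers who hit the same loglikelihood integration problem raised earlier in this thread, here is a minimal sketch of one way a masked diffusion model could score a (prompt, response) pair with a Monte Carlo estimate. It is an illustration under stated assumptions, not the released evaluation code: `MASK`, `logprob_fn`, and `toy_logprob_fn` are hypothetical names, and the real model call would replace the toy function. In lm-evaluation-harness, a score like this would typically be returned from a model's `loglikelihood` method as the first element of a `(logprob, is_greedy)` pair.

```python
import math
import random
from typing import Callable, List, Sequence

MASK = -1  # hypothetical mask token id, chosen only for this sketch

# Hypothetical model interface: given the prompt, a partially masked response,
# and the clean response, return log p(clean[i] | prompt, masked) for every i.
LogProbFn = Callable[[Sequence[int], Sequence[int], Sequence[int]], List[float]]


def mc_loglikelihood(
    prompt: Sequence[int],
    response: Sequence[int],
    logprob_fn: LogProbFn,
    n_samples: int = 32,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of log p(response | prompt) for a masked model.

    Each sample masks l response tokens (l drawn uniformly from 1..L),
    sums the log-probs of the masked tokens, and reweights by L / l.
    """
    rng = random.Random(seed)
    length = len(response)
    if length == 0:
        return 0.0
    total = 0.0
    for _ in range(n_samples):
        n_masked = rng.randint(1, length)  # inclusive on both ends
        positions = rng.sample(range(length), n_masked)  # without replacement
        noisy = list(response)
        for i in positions:
            noisy[i] = MASK
        logps = logprob_fn(prompt, noisy, response)
        total += (length / n_masked) * sum(logps[i] for i in positions)
    return total / n_samples


def toy_logprob_fn(prompt, masked, clean):
    # Stand-in for a real model: every token gets probability 0.5.
    return [math.log(0.5)] * len(clean)


score = mc_loglikelihood([1, 2], [3, 4, 5], toy_logprob_fn)
```

With the toy function, every sample contributes exactly `len(response) * log(0.5)`, which makes the reweighting by `L / l` easy to sanity-check before plugging in a real model.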