
The result from open_llm_leaderboard is not as expected.

chi2liu opened this issue 2 years ago • 7 comments

open_llm_leaderboard has updated the results for open-llama-3b and open-llama-7b.

[image: leaderboard results for open-llama-3b and open-llama-7b]

These results are much worse than llama-7b's and do not match expectations. Is it because of the fast tokenizer issue mentioned in the documentation?

chi2liu avatar Jun 13 '23 07:06 chi2liu

Relative scores compared to llama-7b:

[image: relative scores compared to llama-7b]

There's a clear performance hit on the multi-shot tasks compared to llama-7b.

gjmulder avatar Jun 13 '23 07:06 gjmulder

This is likely an issue with the auto-converted fast tokenizer. I've created an issue here.

young-geng avatar Jun 13 '23 09:06 young-geng

@young-geng looks like the issue in that repo was fixed last week. I'm assuming this could be retried now? (@chi2liu)

c0bra avatar Jun 22 '23 16:06 c0bra

@c0bra There has not yet been a new release of huggingface/transformers since the fix has been merged: https://github.com/huggingface/transformers/releases. I assume we still need to wait for this.

The existing OpenLLaMA entries on the leaderboard also disappeared around a week ago. Maybe there is a connection: the leaderboard maintainers may have removed the results because they learned of the bug and are now waiting for the next release of huggingface/transformers. That's just my guess, though.

codesoap avatar Jun 22 '23 17:06 codesoap

@codesoap Yeah, I've contacted the leaderboard maintainers with a re-evaluation request, and the model should be in the queue right now.

young-geng avatar Jun 22 '23 23:06 young-geng

open-llama-7b-open-instruct is pending evaluation in open_llm_leaderboard. They confirmed that they fine-tuned with `use_fast=False`.
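For reference, forcing the original (slow) SentencePiece tokenizer instead of the auto-converted fast one is a one-argument change when loading the model. This is only an illustrative sketch: `load_slow_tokenizer` is a hypothetical helper, and the model id assumes the OpenLLaMA repo on the Hub.

```python
from transformers import AutoTokenizer


def load_slow_tokenizer(model_id: str):
    """Load a tokenizer with the original SentencePiece implementation.

    use_fast=False avoids the auto-converted fast tokenizer, which produced
    incorrect tokenizations for OpenLLaMA in affected transformers versions.
    """
    return AutoTokenizer.from_pretrained(model_id, use_fast=False)


if __name__ == "__main__":
    # Hypothetical usage; downloads the tokenizer files from the Hub.
    tok = load_slow_tokenizer("openlm-research/open_llama_7b")
    print(tok.tokenize("The quick brown fox"))
```

The same `use_fast=False` argument applies anywhere the tokenizer is constructed, including fine-tuning scripts, which is what the open-instruct authors confirmed above.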

gjmulder avatar Jun 28 '23 15:06 gjmulder

The OpenLLaMA 3B result is not pending. Is there any reason?

HeegyuKim avatar Jul 03 '23 03:07 HeegyuKim