lm-evaluation-harness

Published 20 hours ago

Add long context evaluation benchmarks such as LongBench and LEval.

Open • txchen-USTC opened this issue 6 months ago • 2 comments

Add long context evaluation benchmarks such as LongBench and LEval.

Aug 05 '24 08:08 txchen-USTC