lm-evaluation-harness
Evaluation fails when all samples are cached
When all samples are already cached, the subsequent run errors out instead of skipping to the metric calculation, because there are no requests left to pass on to the LM.
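
To illustrate the failure mode and one possible guard, here is a minimal sketch. It is not the harness's actual code: `run_with_cache`, `strict_model`, and the plain `dict` cache are hypothetical names introduced for this example. The assumption is that the model call rejects an empty batch, so the wrapper should only call it when there are cache misses.

```python
from typing import Callable, Dict, List


def run_with_cache(
    requests: List[str],
    cache: Dict[str, str],
    model_fn: Callable[[List[str]], List[str]],
) -> List[str]:
    """Return one response per request, calling model_fn only for cache misses.

    Hypothetical helper for illustration; not part of lm-evaluation-harness.
    """
    # Collect uncached requests, deduplicated, in first-seen order.
    misses = list(dict.fromkeys(r for r in requests if r not in cache))

    # Guard: skip the model call entirely when everything is already cached.
    if misses:
        outputs = model_fn(misses)
        cache.update(zip(misses, outputs))

    # Every request is now in the cache, so results can be assembled from it.
    return [cache[r] for r in requests]


def strict_model(batch: List[str]) -> List[str]:
    """Stand-in model (an assumption) that fails on an empty batch."""
    if not batch:
        raise ValueError("no requests to process")
    return [text.upper() for text in batch]


# All samples cached: the model is never called and the cached results are returned.
fully_cached = {"a": "A", "b": "B"}
results = run_with_cache(["a", "b"], fully_cached, strict_model)
```

Without the `if misses:` guard, the fully cached case would call `strict_model([])` and raise, which mirrors the behavior described above.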