lm-evaluation-harness Generator Error when evaluating GLUE and superGLUE

The error information shows that:

And the corresponding code is:

I suspect the error is caused by the parameter <n_reordered_requests>, which is a generator and is assigned by:

So I think the change in #1197 introduced this error. When I check out the version from two weeks ago, everything works normally.
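To illustrate the kind of failure I mean, here is a minimal sketch that is independent of the harness. The function `reorder_requests` is a hypothetical stand-in, not the actual lm-evaluation-harness code, and the real traceback may differ. It only shows why passing a generator where a list is expected tends to break: a generator has no length and can be consumed only once.

```python
from typing import Iterable, Iterator, List


def reorder_requests(requests: Iterable[str]) -> Iterator[str]:
    """Hypothetical stand-in: lazily yields requests sorted by length."""
    yield from sorted(requests, key=len)


n_reordered = reorder_requests(["bb", "a", "ccc"])

# A generator has no length, so any code that calls len() on it fails.
try:
    len(n_reordered)
    len_error = None
except TypeError as exc:
    len_error = str(exc)

# A generator can also be consumed only once.
first_pass = list(n_reordered)
second_pass = list(n_reordered)  # already exhausted, so this is empty

# Materializing into a list up front avoids both problems.
materialized: List[str] = list(reorder_requests(["bb", "a", "ccc"]))

print(len_error)
print(first_pass, second_pass, len(materialized))
```

If the harness code downstream needs the length or a second pass over the requests, wrapping the generator in `list(...)` at the assignment would be one possible fix, at the cost of holding all requests in memory.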

Jan 03 '24 03:01 shiweijiezero

This should have been fixed in #1229. Are you on the latest commit?

Jan 03 '24 07:01 baberabb

Yes, the version I used was commit #1238

Sorry, it seems to be ok in #1238, and I didn't realize the bug was fixed yesterday. Thank you!

Jan 03 '24 08:01 shiweijiezero

hmm. Can you provide the full command? The previous bug occurred only when using batch "auto".

Jan 03 '24 08:01 baberabb

Yeah,

lm_eval --model hf \
    --model_args pretrained=gpt2-xl,trust_remote_code=true,dtype=bfloat16 \
    --tasks glue,gsm8k,super-glue-lm-eval-v1 \
    --batch_size auto \
    --output_path ./eval_out/gpt2-xl \
    --device cuda:0

btw, the following command triggered another error

lm_eval --model hf \
    --model_args pretrained=Qwen/Qwen-14B-Chat,trust_remote_code=true,dtype=bfloat16 \
    --tasks glue,gsm8k,super-glue-lm-eval-v1 \
    --batch_size auto \
    --output_path ./eval_out/qwen-14b \
    --device cuda:0

Jan 03 '24 08:01 shiweijiezero

The second one looks like a tokenizer bug. @haileyschoelkopf

Jan 03 '24 08:01 baberabb

Even worse, the first command cannot run properly

lm_eval --model hf \
    --model_args pretrained=gpt2-xl,trust_remote_code=true,dtype=bfloat16 \
    --tasks glue,gsm8k,super-glue-lm-eval-v1 \
    --batch_size auto \
    --output_path ./eval_out/gpt2-xl \
    --device cuda:0

Jan 03 '24 09:01 shiweijiezero

Furthermore, I traced the error to the super-glue-lm-eval-v1 evaluation. Any ideas on how to fix it?
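For anyone who wants to reproduce the isolation step, a rough sketch of how the task groups can be split into separate runs follows. It only builds and prints the command lines, using the same flags as my command above. The per-task output directory is my own assumption, not something the harness requires.

```python
import shlex
from typing import List

# Task groups from the failing command, run one at a time.
TASKS = ["glue", "gsm8k", "super-glue-lm-eval-v1"]


def build_command(task: str) -> List[str]:
    """Build the lm_eval invocation for a single task group."""
    return [
        "lm_eval", "--model", "hf",
        "--model_args", "pretrained=gpt2-xl,trust_remote_code=true,dtype=bfloat16",
        "--tasks", task,
        "--batch_size", "auto",
        # Assumption: a separate output directory per task, for easier comparison.
        "--output_path", f"./eval_out/gpt2-xl/{task}",
        "--device", "cuda:0",
    ]


commands = [build_command(task) for task in TASKS]
for cmd in commands:
    print(shlex.join(cmd))
```

Running the printed commands one by one shows which task group triggers the failure.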

Jan 03 '24 16:01 shiweijiezero

Taking a look right now!

Jan 03 '24 16:01 haileyschoelkopf
