
Generator error when evaluating GLUE and SuperGLUE

Open shiweijiezero opened this issue 1 year ago • 8 comments

The error information is shown here: [screenshot]

And the corresponding code is: [screenshot]

I guess the error is caused by the variable `n_reordered_requests`, which is a generator, assigned here: [screenshot]

So I think commit #1197 introduced this change and caused the error. When I check out the version from two weeks ago, everything works fine.
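For illustration, a generator cannot be sized or re-iterated the way a list can, so downstream code written against a list will fail once the variable becomes a generator. A minimal sketch of that failure mode (the variable name is taken from the report; the surrounding harness code is not reproduced here):

```python
# Sketch: why swapping a list for a generator breaks code
# that expects a sized, re-iterable sequence.
requests = ["req_a", "req_b", "req_c"]

# As a list, both operations are fine.
assert len(requests) == 3

# As a generator (as described in the issue), len() raises TypeError...
n_reordered_requests = (r for r in requests)
try:
    len(n_reordered_requests)
except TypeError as e:
    print(f"TypeError: {e}")

# ...and a generator is exhausted after one pass, so a second
# iteration silently yields nothing.
first_pass = list(n_reordered_requests)
second_pass = list(n_reordered_requests)
print(first_pass)   # all three requests
print(second_pass)  # empty: the generator is already consumed
```

This matches the symptom of the bug: any code path that measures or walks the reordered requests more than once only works if the value is materialized (e.g. with `list(...)`) first.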

shiweijiezero avatar Jan 03 '24 03:01 shiweijiezero

This should have been fixed in #1229. Are you on the latest commit?

baberabb avatar Jan 03 '24 07:01 baberabb

This should have been fixed in #1229. Are you on the latest commit?

Yes, the version I used was commit #1238.

Sorry, it does seem to be OK in #1238; I hadn't realized the bug was fixed yesterday. Thank you!

shiweijiezero avatar Jan 03 '24 08:01 shiweijiezero

Hmm. Can you provide the full command? The previous bug occurred only when using `--batch_size auto`.

baberabb avatar Jan 03 '24 08:01 baberabb

Hmm. Can you provide the full command? The previous bug occurred only when using `--batch_size auto`.

Yeah,

lm_eval --model hf \
    --model_args pretrained=gpt2-xl,trust_remote_code=true,dtype=bfloat16 \
    --tasks glue,gsm8k,super-glue-lm-eval-v1 \
    --batch_size auto \
    --output_path ./eval_out/gpt2-xl \
    --device cuda:0

By the way, the following command triggered another error: [screenshot]

lm_eval --model hf \
    --model_args pretrained=Qwen/Qwen-14B-Chat,trust_remote_code=true,dtype=bfloat16 \
    --tasks glue,gsm8k,super-glue-lm-eval-v1 \
    --batch_size auto \
    --output_path ./eval_out/qwen-14b \
    --device cuda:0

shiweijiezero avatar Jan 03 '24 08:01 shiweijiezero

The second one looks like a tokenizer bug. @haileyschoelkopf

baberabb avatar Jan 03 '24 08:01 baberabb

Even worse, the first command still cannot run properly:

lm_eval --model hf \
    --model_args pretrained=gpt2-xl,trust_remote_code=true,dtype=bfloat16 \
    --tasks glue,gsm8k,super-glue-lm-eval-v1 \
    --batch_size auto \
    --output_path ./eval_out/gpt2-xl \
    --device cuda:0

[screenshot]

shiweijiezero avatar Jan 03 '24 09:01 shiweijiezero

Even worse, the first command still cannot run properly:

lm_eval --model hf \
    --model_args pretrained=gpt2-xl,trust_remote_code=true,dtype=bfloat16 \
    --tasks glue,gsm8k,super-glue-lm-eval-v1 \
    --batch_size auto \
    --output_path ./eval_out/gpt2-xl \
    --device cuda:0

[screenshot]

Furthermore, I have traced the error to the super-glue-lm-eval-v1 evaluation. Any ideas on how to solve it?

shiweijiezero avatar Jan 03 '24 16:01 shiweijiezero

Taking a look right now!

haileyschoelkopf avatar Jan 03 '24 16:01 haileyschoelkopf