lm-evaluation-harness HellaSwag with UnicodeDecodeError


When I tried to evaluate HellaSwag using:

    lm_eval --model hf --model_args pretrained=HuggingFaceH4/zephyr-7b-beta,dtype="bfloat16" --tasks hellaswag --device cuda:0 --num_fewshot 10 --batch_size auto --trust_remote_code

I got the following error:

    File "/root/miniconda3/envs/lm_eval/lib/python3.10/site-packages/datasets/load.py", line 2587, in load_dataset
      builder_instance = load_dataset_builder(
    File "/root/miniconda3/envs/lm_eval/lib/python3.10/site-packages/datasets/load.py", line 2259, in load_dataset_builder
      dataset_module = dataset_module_factory(
    File "/root/miniconda3/envs/lm_eval/lib/python3.10/site-packages/datasets/load.py", line 1910, in dataset_module_factory
      raise e1 from None
    File "/root/miniconda3/envs/lm_eval/lib/python3.10/site-packages/datasets/load.py", line 1862, in dataset_module_factory
      can_load_config_from_parquet_export = "DEFAULT_CONFIG_NAME" not in f.read()
    File "/root/miniconda3/envs/lm_eval/lib/python3.10/codecs.py", line 322, in decode
      (result, consumed) = self._buffer_decode(data, self.errors, final)
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 1: invalid start byte

How can I solve this error?

Apr 27 '24 08:04 Hua-rookie
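
For readers unfamiliar with this exception, here is a minimal sketch of what the last traceback frame is reporting. It does not reproduce the lm_eval failure itself. The helper name `try_decode_utf8` and the sample bytes are hypothetical, chosen only so that the second byte is 0xb5 as in the report above.

```python
def try_decode_utf8(data: bytes):
    """Return (text, None) on success or (None, error) if data is not valid UTF-8."""
    try:
        return data.decode("utf-8"), None
    except UnicodeDecodeError as err:
        return None, err


# Hypothetical sample: byte 0 is plain ASCII, byte 1 is 0xb5.
# In UTF-8, 0xb5 is only valid as a continuation byte, so it cannot start a character.
sample = b"a\xb5bc"
text, err = try_decode_utf8(sample)

if err is not None:
    # err.start is the offset of the offending byte; err.reason is the short explanation.
    print(f"offset={err.start} byte={hex(sample[err.start])} reason={err.reason}")
```

An error at position 1 means the file that was read is not UTF-8 text from its second byte onward, which usually points to binary, compressed, or otherwise corrupted content rather than a problem with the task definition. Whether that is what happened here is not established in this thread.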

same problem

Apr 27 '24 10:04 zjuruizhechen

I encountered the same issue in my local environment.

Apr 29 '24 08:04 PotatoBearP

same issue

Apr 30 '24 10:04 Shuizhimei

same issue

May 02 '24 05:05 huangwei021230

Same issue. Update: it is working now.

May 05 '24 02:05 cs32963

I can't seem to replicate this on a fresh HF cache, though perhaps I did something wrong. Is the connection to the HF Hub working for those facing this problem?

May 06 '24 14:05 haileyschoelkopf
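
Since the failure does not show up on a fresh HF cache, one way to check a local cache is to look for cached files that are not valid UTF-8. The following is only a sketch: the function `find_non_utf8_files` is hypothetical, not part of `datasets` or lm_eval, and the idea that a stale or corrupted cached file is involved is an assumption, not something confirmed in this thread.

```python
import tempfile
from pathlib import Path


def find_non_utf8_files(root, pattern="*.py"):
    """Return (path, error) pairs for files under root that fail to decode as UTF-8."""
    failures = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        try:
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError as err:
            failures.append((path, err))
    return failures


# Demo on a throwaway directory with one valid and one invalid file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "good.py").write_text("DEFAULT_CONFIG_NAME = 'x'\n", encoding="utf-8")
    Path(tmp, "bad.py").write_bytes(b"a\xb5bc")  # 0xb5 at position 1, as in the report
    demo_failures = [(p.name, e.start) for p, e in find_non_utf8_files(tmp)]

print(demo_failures)
```

To inspect a real cache, pass its directory instead of the temporary one. The location depends on your setup; a common default is under `~/.cache/huggingface`, but that path is an assumption here and can be changed through environment variables such as `HF_HOME`.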

> Cannot initially seem to replicate on a fresh HF cache... perhaps did something wrong though? Is the connection to the HF Hub working for those facing this problem?

That does not seem to be the problem; the connection to the HF Hub works fine on my machine.

May 06 '24 16:05 Hua-rookie

Same problem. Update: it is working now.