Bug report: configuring the eval dataset with mme raises LocalTokenNotFoundError

Open XiaotaoChen opened this issue 1 year ago • 1 comments

backend

I'm trying to quantize InternVL2-2B with the eval dataset configured as mme, but I hit an error named LocalTokenNotFoundError. It seems to be trying to download the dataset, even though the dataset is available locally.

analysis

According to the crash stack, the crash is raised at llmc/eval/eval_vqa.py:98, task_dict = get_task_dict(tasks, task_manager). This line tries to download the dataset, which requires a token.
The dataset is configured at /home/chenxiaotao03/Reposities/llmc/resource/data/llm_dataset/text/eval/MME, but self.eval_dataset_path in llmc/eval/eval_vqa.py is not used. Maybe that causes the bug.
I'm not familiar with LLM eval, so please help me solve the problem. Thanks.

config

base:
    seed: &seed 42
model:
    type: InternVL2
    path: /home/chenxiaotao03/Reposities/llmc/resource/model/custom_model/InternVL2-2b-custom
    tokenizer_mode: slow
    torch_dtype: auto
calib:
    name: custom_mm
    download: False
    path: /home/chenxiaotao03/Reposities/llmc/resource/data/general_custom_data
    apply_chat_template: True
    add_answer: True # Default is False. If set to True, calib data will add answers.
    n_samples: 8
    bs: -1
    seq_len: 512
    padding: True
    seed: *seed
eval:
    eval_pos: [pretrain, fake_quant]
    type: vqa
    name: mme
    download: False
    path: /home/chenxiaotao03/Reposities/llmc/resource/data/llm_dataset/text/eval/MME
    bs: 1
    inference_per_block: False
# eval:
#     eval_pos: [transformed, fake_quant]
#     name: wikitext2
#     download: False
#     path: /home/chenxiaotao03/Reposities/llmc/resource/data/llm_dataset/text/eval/wikitext2
#     seq_len: 2048
#     # For 7B / 13B model eval, bs can be set to "1", and inference_per_block can be set to "False".
#     # For 70B model eval, bs can be set to "20", and inference_per_block can be set to "True".
#     bs: 1
#     inference_per_block: False
quant:
    language:
        method: Awq
        weight:
            bit: 4
            symmetric: False
            granularity: per_group
            group_size: 64
        special:
            trans: True
            # The options for "trans_version" include "v1" and "v2".
            # But their results don't differ significantly.
            trans_version: v2
            weight_clip: True
            # For 2-bit quantization, setting "clip_sym: False" will yield better results.
            clip_sym: True
save:
    save_trans: True
    save_fake: False
    save_path: /home/chenxiaotao03/Reposities/llmc/output/quant/awq/InternVL2

crash stack

Traceback (most recent call last):
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/tenacity/__init__.py", line 470, in __call__
    result = fn(*args, **kwargs)
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/lmms_eval/api/task.py", line 1043, in download
    self.dataset = datasets.load_dataset(
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/datasets/load.py", line 2523, in load_dataset
    builder_instance = load_dataset_builder(
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/datasets/load.py", line 2195, in load_dataset_builder
    dataset_module = dataset_module_factory(
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/datasets/load.py", line 1846, in dataset_module_factory
    raise e1 from None
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/datasets/load.py", line 1791, in dataset_module_factory
    raise e
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/datasets/load.py", line 1765, in dataset_module_factory
    dataset_info = hf_api.dataset_info(
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 2433, in dataset_info
    headers = self._build_hf_headers(token=token)
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 9072, in _build_hf_headers
    return build_hf_headers(
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/huggingface_hub/utils/_headers.py", line 124, in build_hf_headers
    token_to_send = get_token_to_send(token)
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/huggingface_hub/utils/_headers.py", line 158, in get_token_to_send
    raise LocalTokenNotFoundError(
huggingface_hub.errors.LocalTokenNotFoundError: Token is required (`token=True`), but no token found. You need to provide a token or be logged in to Hugging Face with `huggingface-cli login` or `huggingface_hub.login`. See https://huggingface.co/settings/tokens.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/chenxiaotao03/Reposities/llmc/llmc/__main__.py", line 246, in <module>
    main(config)
  File "/home/chenxiaotao03/Reposities/llmc/llmc/__main__.py", line 33, in main
    eval_model(model, None, eval_list, eval_pos='pretrain')
  File "/home/chenxiaotao03/Reposities/llmc/llmc/eval/utils.py", line 87, in eval_model
    res = eval_class.eval(model)
  File "/home/chenxiaotao03/Reposities/llmc/llmc/eval/eval_vqa.py", line 98, in eval
    task_dict = get_task_dict(tasks, task_manager)
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/lmms_eval/tasks/__init__.py", line 558, in get_task_dict
    task_name_from_string_dict = task_manager.load_task_or_group(
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/lmms_eval/tasks/__init__.py", line 372, in load_task_or_group
    all_loaded_tasks = dict(collections.ChainMap(*map(self._load_individual_task_or_group, task_list)))
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/lmms_eval/tasks/__init__.py", line 289, in _load_individual_task_or_group
    return _load_task(task_config, task=name_or_config)
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/lmms_eval/tasks/__init__.py", line 259, in _load_task
    task_object = ConfigurableTask(config=config, model_name=self.model_name)
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/lmms_eval/api/task.py", line 718, in __init__
    self.download(self.config.dataset_kwargs)
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/tenacity/__init__.py", line 330, in wrapped_f
    return self(f, *args, **kw)
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/tenacity/__init__.py", line 467, in __call__
    do = self.iter(retry_state=retry_state)
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/tenacity/__init__.py", line 368, in iter
    result = action(retry_state)
  File "/home/chenxiaotao03/miniconda3/envs/python310-llm/lib/python3.10/site-packages/tenacity/__init__.py", line 411, in exc_check
    raise retry_exc from fut.exception()
tenacity.RetryError: RetryError[<Future at 0x7307fb93a6e0 state=finished raised LocalTokenNotFoundError>]
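Note on reading this trace: the final tenacity.RetryError only wraps the real failure, which is linked as its direct cause (LocalTokenNotFoundError). A minimal sketch of unwrapping such a chain in a debugger or a try/except block is below. The two exception classes are stand-ins defined here for illustration only, not the real tenacity or huggingface_hub classes, and root_cause is a hypothetical helper name.

```python
def root_cause(exc: BaseException) -> BaseException:
    """Follow __cause__ (explicit) or __context__ (implicit) links to the innermost exception."""
    seen = set()
    while id(exc) not in seen:
        seen.add(id(exc))  # guard against cyclic chains
        nxt = exc.__cause__ or exc.__context__
        if nxt is None:
            break
        exc = nxt
    return exc


class LocalTokenNotFoundError(Exception):
    """Stand-in for huggingface_hub.errors.LocalTokenNotFoundError (illustration only)."""


class RetryError(Exception):
    """Stand-in for tenacity.RetryError (illustration only)."""


def failing_download():
    """Mimic the shape of the trace above: a retry wrapper re-raising from the real error."""
    try:
        raise LocalTokenNotFoundError("Token is required (`token=True`), but no token found.")
    except LocalTokenNotFoundError as err:
        raise RetryError("retries exhausted") from err


if __name__ == "__main__":
    try:
        failing_download()
    except RetryError as outer:
        inner = root_cause(outer)
        print(type(inner).__name__, "-", inner)
```

In the real run, the same idea applies to the exception raised from get_task_dict: the innermost exception is the one that says a token is required.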

Jan 17 '25 07:01 XiaotaoChen

In addition, this error can be worked around by logging in to Hugging Face, for example by writing the private token into ~/.huggingface/token, but I'm not sure what effect the self.eval_dataset_path variable has. The eval results are as follows:

2025-01-17 17:24:11.499 | INFO     | llmc.eval.eval_vqa:eval:82 - Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
2025-01-17 17:24:19.947 | INFO     | llmc.models.internvl2:__init__:436 - Using 1 devices with tensor parallelism
2025-01-17 17:24:19.947 | WARNING  | llmc.eval.eval_vqa:_adjust_config:148 - Overwriting default num_fewshot of mme                                            from None to 0
2025-01-17 17:24:19.947 | WARNING  | llmc.eval.eval_vqa:_adjust_config:148 - Overwriting default num_fewshot of mme                                            from None to 0
2025-01-17 17:24:19.947 | INFO     | lmms_eval.evaluator_utils:from_taskdict:91 - No metadata found in task config for mme, using default n_shot=0
2025-01-17 17:24:19.948 | INFO     | lmms_eval.api.task:build_all_requests:425 - Building contexts for mme on rank 0...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2374/2374 [00:00<00:00, 167526.59it/s]
2025-01-17 17:24:46.659 | INFO     | lmms_eval.evaluator:evaluate:446 - Running generate_until requests
Model Responding: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2374/2374 [07:29<00:00,  5.28it/s]
Postprocessing: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2374/2374 [00:20<00:00, 114.93it/s]
2025-01-17 17:32:36.888 | INFO     | utils:mme_aggregate_results:124 - code_reasoning: 95.00
2025-01-17 17:32:36.888 | INFO     | utils:mme_aggregate_results:124 - numerical_calculation: 40.00
2025-01-17 17:32:36.888 | INFO     | utils:mme_aggregate_results:124 - text_translation: 155.00
2025-01-17 17:32:36.888 | INFO     | utils:mme_aggregate_results:124 - commonsense_reasoning: 112.86
2025-01-17 17:32:36.889 | INFO     | utils:mme_aggregate_results:124 - artwork: 142.25
2025-01-17 17:32:36.889 | INFO     | utils:mme_aggregate_results:124 - celebrity: 115.88
2025-01-17 17:32:36.889 | INFO     | utils:mme_aggregate_results:124 - count: 133.33
2025-01-17 17:32:36.889 | INFO     | utils:mme_aggregate_results:124 - color: 153.33
2025-01-17 17:32:36.889 | INFO     | utils:mme_aggregate_results:124 - position: 155.00
2025-01-17 17:32:36.889 | INFO     | utils:mme_aggregate_results:124 - OCR: 87.50
2025-01-17 17:32:36.889 | INFO     | utils:mme_aggregate_results:124 - landmark: 152.00
2025-01-17 17:32:36.889 | INFO     | utils:mme_aggregate_results:124 - scene: 155.50
2025-01-17 17:32:36.889 | INFO     | utils:mme_aggregate_results:124 - existence: 195.00
2025-01-17 17:32:36.889 | INFO     | utils:mme_aggregate_results:124 - posters: 120.41
2025-01-17 17:32:37.007 | INFO     | llmc.eval.utils:eval_model:90 - EVAL: vqa on mme is 
|Tasks|Version|Filter|n-shot|       Metric       |   |  Value  |   |Stderr|
|-----|-------|------|-----:|--------------------|---|--------:|---|------|
|mme  |Yaml   |none  |     0|mme_cognition_score |↑  | 402.8571|±  |   N/A|
|mme  |Yaml   |none  |     0|mme_perception_score|↑  |1410.2072|±  |   N/A|
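For anyone hitting the same error, a quick way to check whether a token is visible locally before starting a long run is sketched below. It uses only the standard library. The locations it checks (the HF_TOKEN and HUGGING_FACE_HUB_TOKEN environment variables, a token file under HF_HOME or ~/.cache/huggingface, and the older ~/.huggingface/token path mentioned above) are my assumptions about where huggingface_hub looks; the exact set can differ between versions. find_local_hf_token_source is a hypothetical helper name, and it returns only where a token was found, never the token itself.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_local_hf_token_source(
    env: Mapping[str, str] = os.environ,
    home: Optional[Path] = None,
) -> Optional[str]:
    """Return a description of where a Hugging Face token was found, or None.

    The checked locations are assumptions and may not match every
    huggingface_hub version.
    """
    home = Path(home) if home is not None else Path.home()

    # 1) Environment variables (assumed names).
    for var in ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN"):
        if env.get(var, "").strip():
            return f"environment variable {var}"

    # 2) Token files: assumed default cache location, then the legacy path.
    hf_home = Path(env.get("HF_HOME") or home / ".cache" / "huggingface")
    for path in (hf_home / "token", home / ".huggingface" / "token"):
        if path.is_file() and path.read_text().strip():
            return f"file {path}"

    return None


if __name__ == "__main__":
    source = find_local_hf_token_source()
    print(source or "no local token found; a token=True request would fail")
```

This only tells you whether the login workaround is in place. It does not answer whether eval_vqa.py should be reading the local dataset path instead.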

Jan 17 '25 09:01 XiaotaoChen