
[BUG] Fine-tuning with llama-factory: running the example script raises an error

Open everwind opened this issue 9 months ago • 4 comments

Is there an existing issue / discussion for this?

  • [x] I have searched the existing issues / discussions

Is there an existing answer for this in FAQ?

  • [x] I have searched FAQ

Current Behavior

$ CUDA_VISIBLE_DEVICES=0 llamafactory-cli train configs/minicpmo_2_6_lora_sft.yaml
[2025-03-17 15:42:30,515] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[INFO|2025-03-17 15:42:33] llamafactory.hparams.parser:380 >> Process rank: 0, world size: 1, device: cuda:0, distributed training: False, compute dtype: torch.bfloat16
[INFO|tokenization_utils_base.py:2050] 2025-03-17 15:42:54,036 >> loading file vocab.json from cache at /root/.cache/huggingface/hub/models--openbmb--MiniCPM-o-2_6/snapshots/9a8db9d033b8e61fa1f1a9f387895237c3de98a2/vocab.json
[INFO|tokenization_utils_base.py:2050] 2025-03-17 15:42:54,036 >> loading file merges.txt from cache at /root/.cache/huggingface/hub/models--openbmb--MiniCPM-o-2_6/snapshots/9a8db9d033b8e61fa1f1a9f387895237c3de98a2/merges.txt
[INFO|tokenization_utils_base.py:2050] 2025-03-17 15:42:54,036 >> loading file tokenizer.json from cache at /root/.cache/huggingface/hub/models--openbmb--MiniCPM-o-2_6/snapshots/9a8db9d033b8e61fa1f1a9f387895237c3de98a2/tokenizer.json
[INFO|tokenization_utils_base.py:2050] 2025-03-17 15:42:54,036 >> loading file added_tokens.json from cache at /root/.cache/huggingface/hub/models--openbmb--MiniCPM-o-2_6/snapshots/9a8db9d033b8e61fa1f1a9f387895237c3de98a2/added_tokens.json
[INFO|tokenization_utils_base.py:2050] 2025-03-17 15:42:54,036 >> loading file special_tokens_map.json from cache at /root/.cache/huggingface/hub/models--openbmb--MiniCPM-o-2_6/snapshots/9a8db9d033b8e61fa1f1a9f387895237c3de98a2/special_tokens_map.json
[INFO|tokenization_utils_base.py:2050] 2025-03-17 15:42:54,036 >> loading file tokenizer_config.json from cache at /root/.cache/huggingface/hub/models--openbmb--MiniCPM-o-2_6/snapshots/9a8db9d033b8e61fa1f1a9f387895237c3de98a2/tokenizer_config.json
[INFO|tokenization_utils_base.py:2050] 2025-03-17 15:42:54,036 >> loading file chat_template.jinja from cache at None
[INFO|tokenization_utils_base.py:2313] 2025-03-17 15:42:54,296 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|2025-03-17 15:43:06] llamafactory.data.template:143 >> Add <|im_end|> to stop words.
[INFO|2025-03-17 15:43:06] llamafactory.data.loader:143 >> Loading dataset mllm_demo.json...
num_proc must be <= 6. Reducing num_proc to 6 for dataset of size 6.
Converting format of dataset (num_proc=6): 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 49.62 examples/s]
num_proc must be <= 6. Reducing num_proc to 6 for dataset of size 6.
Running tokenizer on dataset (num_proc=6): 0%| | 0/6 [00:00<?, ? examples/s]
multiprocess.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/root/anaconda3/envs/llamafactory/lib/python3.10/site-packages/multiprocess/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/root/anaconda3/envs/llamafactory/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 678, in _write_generator_to_queue
    for i, result in enumerate(func(**kwargs)):
  File "/root/anaconda3/envs/llamafactory/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3519, in _map_single
    for i, batch in iter_outputs(shard_iterable):
  File "/root/anaconda3/envs/llamafactory/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3469, in iter_outputs
    yield i, apply_function(example, i, offset=offset)
  File "/root/anaconda3/envs/llamafactory/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3392, in apply_function
    processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
  File "/root/paddlejob/workspace2/LLaMA-Factory/src/llamafactory/data/processor/supervised.py", line 99, in preprocess_dataset
    input_ids, labels = self._encode_data_example(
  File "/root/paddlejob/workspace2/LLaMA-Factory/src/llamafactory/data/processor/supervised.py", line 43, in _encode_data_example
    messages = self.template.mm_plugin.process_messages(prompt + response, images, videos, audios, self.processor)
  File "/root/paddlejob/workspace2/LLaMA-Factory/src/llamafactory/data/mm_plugin.py", line 616, in process_messages
    self._validate_input(processor, images, videos, audios)
  File "/root/paddlejob/workspace2/LLaMA-Factory/src/llamafactory/data/mm_plugin.py", line 163, in _validate_input
    raise ValueError("Processor was not found, please check and update your processor config.")
ValueError: Processor was not found, please check and update your processor config.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/anaconda3/envs/llamafactory/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/root/paddlejob/workspace2/LLaMA-Factory/src/llamafactory/cli.py", line 120, in main
    run_exp()
  File "/root/paddlejob/workspace2/LLaMA-Factory/src/llamafactory/train/tuner.py", line 103, in run_exp
    _training_function(config={"args": args, "callbacks": callbacks})
  File "/root/paddlejob/workspace2/LLaMA-Factory/src/llamafactory/train/tuner.py", line 68, in _training_function
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/root/paddlejob/workspace2/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 52, in run_sft
    dataset_module = get_dataset(template, model_args, data_args, training_args, stage="sft", **tokenizer_module)
  File "/root/paddlejob/workspace2/LLaMA-Factory/src/llamafactory/data/loader.py", line 302, in get_dataset
    dataset = _get_preprocessed_dataset(
  File "/root/paddlejob/workspace2/LLaMA-Factory/src/llamafactory/data/loader.py", line 248, in _get_preprocessed_dataset
    dataset = dataset.map(
  File "/root/anaconda3/envs/llamafactory/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 562, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/root/anaconda3/envs/llamafactory/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3171, in map
    for rank, done, content in iflatmap_unordered(
  File "/root/anaconda3/envs/llamafactory/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 718, in iflatmap_unordered
    [async_result.get(timeout=0.05) for async_result in async_results]
  File "/root/anaconda3/envs/llamafactory/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 718, in <listcomp>
    [async_result.get(timeout=0.05) for async_result in async_results]
  File "/root/anaconda3/envs/llamafactory/lib/python3.10/site-packages/multiprocess/pool.py", line 774, in get
    raise self._value
ValueError: Processor was not found, please check and update your processor config.

Expected Behavior

No response

Steps To Reproduce

CUDA_VISIBLE_DEVICES=0 llamafactory-cli train configs/minicpmo_2_6_lora_sft.yaml

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):
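
The fields above were left blank in this report. As a minimal sketch (not part of the issue template), they could be collected with a short Python script; the helper names `package_version` and `collect_environment` are hypothetical, and the CUDA value is read from `torch.version.cuda` as in the template's own command.

```python
import platform
import sys
from importlib import metadata


def package_version(name: str) -> str:
    """Return the installed version of a distribution, or a placeholder."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def collect_environment() -> dict:
    """Gather the values requested by the Environment section."""
    info = {
        "OS": platform.platform(),
        "Python": sys.version.split()[0],
        "Transformers": package_version("transformers"),
        "PyTorch": package_version("torch"),
    }
    try:
        import torch  # imported lazily so the script still runs without it

        # None on CPU-only builds, which str() renders as "None".
        info["CUDA"] = str(torch.version.cuda)
    except ImportError:
        info["CUDA"] = "torch not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"- {key}: {value}")
```

Running the script prints one line per field in the same `- Name: value` layout as the list above.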

Anything else?

No response

everwind avatar Mar 17 '25 07:03 everwind

I am running into the same issue, @everwind could you explain how you solved it?

ZeeshanZulfiqarAli avatar Mar 17 '25 09:03 ZeeshanZulfiqarAli

I am running into the same issue, @everwind could you explain how you solved it?

rm -rf /root/.cache/huggingface/

I cleared the Hugging Face cache and reran the code, and it worked! It seems that the processor config was missing during the initial download.
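
A narrower alternative to wiping the whole cache might be to check only the cached MiniCPM-o-2_6 snapshot and remove just that repository. The sketch below is untested against this exact setup and rests on assumptions: that the processor config is stored as `preprocessor_config.json` (the usual Hugging Face file name, not confirmed in this thread), and that the cache lives under `~/.cache/huggingface/hub` as in the log above. The helper names are hypothetical.

```python
import shutil
from pathlib import Path

# Assumption: the processor config is stored under this file name.
REQUIRED_FILES = ("preprocessor_config.json",)


def incomplete_snapshots(repo_cache: Path, required=REQUIRED_FILES) -> list:
    """Return snapshot directories that lack any of the required files."""
    snapshots = repo_cache / "snapshots"
    if not snapshots.is_dir():
        return []
    return sorted(
        snap
        for snap in snapshots.iterdir()
        if snap.is_dir() and any(not (snap / name).exists() for name in required)
    )


def remove_repo_cache(repo_cache: Path) -> None:
    """Delete the cache of one repository so it is downloaded again on the next run."""
    shutil.rmtree(repo_cache, ignore_errors=True)


if __name__ == "__main__":
    # Cache directory taken from the log above; it differs if HF_HOME is set.
    repo_cache = Path.home() / ".cache/huggingface/hub/models--openbmb--MiniCPM-o-2_6"
    for snap in incomplete_snapshots(repo_cache):
        print(f"missing processor config in {snap}")
    # Call remove_repo_cache(repo_cache) explicitly once the report looks right.
```

The script only reports by default; deletion is left as an explicit call so nothing is removed by accident.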

everwind avatar Mar 17 '25 09:03 everwind

I have other issues using llama-factory. Could you share the versions in your environment, such as transformers, torch, and so on?

shuaijiang avatar Mar 17 '25 10:03 shuaijiang

It seems that the problem has been solved. Please allow me to close this issue. If you have any questions, feel free to ask again.

tc-mb avatar Aug 11 '25 10:08 tc-mb