
Export error after training the Phi-3 model

Open sanqiuli opened this issue 1 year ago • 1 comment

Reminder

  • [x] I have read the above rules and searched the existing issues.

System Info

2025-02-28 12:25:54.568953: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2025-02-28 12:25:54.607931: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-02-28 12:25:55.281147: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
[2025-02-28 12:25:57,490] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect)

  • llamafactory version: 0.9.2.dev0
  • Platform: Linux-4.19.91-012.ali4000.alios7.x86_64-x86_64-with-glibc2.35
  • Python version: 3.10.14
  • PyTorch version: 2.3.1+cu121 (GPU)
  • Transformers version: 4.48.3
  • Datasets version: 3.2.0
  • Accelerate version: 1.2.1
  • PEFT version: 0.12.0
  • TRL version: 0.9.6
  • GPU type: NVIDIA A10
  • GPU number: 1
  • GPU memory: 23.69GB
  • DeepSpeed version: 0.16.3
  • Bitsandbytes version: 0.45.2
  • vLLM version: 0.5.3

Reproduction

root@dsw-879798-7684864888-8vrl6:/mnt/workspace/LLaMA-Factory# llamafactory-cli export \
--model_name_or_path /mnt/workspace/.cache/modelscope/models/LLM-Research/Phi-3-mini-4k-instruct \
--adapter_name_or_path ./saves/Phi-3-mini-4k-instruct/lora/sft  \
--template phi \
--finetuning_type lora \
--export_dir megred-model-path \
--export_size 2 \
--export_device cpu \
--export_legacy_format False
2025-02-28 12:19:08.736918: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-02-28 12:19:08.776683: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-02-28 12:19:09.507474: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
[2025-02-28 12:19:11,936] [INFO] [real_accelerator.py:222:get_accelerator] Setting ds_accelerator to cuda (auto detect)
df: /root/.triton/autotune: No such file or directory
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00,  3.18it/s]
Traceback (most recent call last):
  File "/usr/local/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/mnt/workspace/LLaMA-Factory/src/llamafactory/cli.py", line 87, in main
    export_model()
  File "/mnt/workspace/LLaMA-Factory/src/llamafactory/train/tuner.py", line 131, in export_model
    model.save_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2750, in save_pretrained
    custom_object_save(self, save_directory, config=self.config)
  File "/usr/local/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 624, in custom_object_save
    for needed_file in get_relative_import_files(object_file):
  File "/usr/local/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 128, in get_relative_import_files
    new_imports.extend(get_relative_imports(f))
  File "/usr/local/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 97, in get_relative_imports
    with open(module_file, "r", encoding="utf-8") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.10/site-packages/transformers/models/phi3/..modeling_flash_attention_utils.py'

Others

I followed the introductory tutorial at https://zhuanlan.zhihu.com/p/695287607 for the preceding steps, but replaced the model used in the tutorial with Phi-3-mini-4k-instruct.

When the command above is run repeatedly, the file named in the FileNotFoundError changes between runs, but it is always under transformers/models/phi3. For example:

FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.10/site-packages/transformers/models/phi3/..generation.py'
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.10/site-packages/transformers/models/phi3/..modeling_utils.py'
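All of the missing paths share a `phi3/..` segment, i.e. a module name with leftover leading dots joined onto the package directory. A quick way to list which files under the installed phi3 package contain such multi-dot relative imports (a hypothetical diagnostic helper, not part of LLaMA-Factory or Transformers):

```python
import re
from pathlib import Path

def find_multidot_relative_imports(py_file: Path) -> list[str]:
    """Return relative imports with two or more leading dots in a Python source file."""
    src = py_file.read_text(encoding="utf-8")
    # Matches e.g. "from ...modeling_flash_attention_utils import ..."
    return re.findall(r"^\s*from\s+(\.{2,}[\w.]*)\s+import", src, flags=re.MULTILINE)

# Example: scan every .py file in the installed phi3 package directory.
phi3_dir = Path("/usr/local/lib/python3.10/site-packages/transformers/models/phi3")
for f in sorted(phi3_dir.glob("*.py")):
    hits = find_multidot_relative_imports(f)
    if hits:
        print(f.name, hits)
```

Files reported by this scan are exactly the ones whose relative imports can be mis-resolved into nonexistent `phi3/..*.py` paths like those in the tracebacks above.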

sanqiuli avatar Feb 28 '25 04:02 sanqiuli

Phi-3 uses a custom Python file for its model config (e.g., configuration_phi3.py), which triggers the get_relative_imports function in the Transformers library and leads to this path error. As a temporary workaround, try the solution described in the referenced issues. ref: #6399 #6411
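A minimal sketch of how the bad path arises, based on a simplified version of the `from .xxx import` pattern that `transformers.dynamic_module_utils.get_relative_imports` uses (the real function also handles `import .xxx` forms): a multi-dot relative import such as `from ...modeling_flash_attention_utils import ...` leaves extra dots in the captured module name, which `get_relative_import_files` then joins onto the `phi3` directory, producing exactly the nonexistent path seen in the traceback.

```python
import os
import re

# Simplified version of the relative-import pattern scanned by
# transformers.dynamic_module_utils.get_relative_imports:
PATTERN = r"^\s*from\s+\.(\S+)\s+import"

source = "from ...modeling_flash_attention_utils import _flash_attention_forward"

# The regex strips only ONE leading dot, so two dots remain in the name.
captured = re.findall(PATTERN, source, flags=re.MULTILINE)
print(captured)  # ['..modeling_flash_attention_utils']

# Joining that name onto the module directory yields a path that does not exist:
module_dir = "/usr/local/lib/python3.10/site-packages/transformers/models/phi3"
print(os.path.join(module_dir, captured[0] + ".py"))
# .../transformers/models/phi3/..modeling_flash_attention_utils.py
```

This matches the failing path in the traceback, which is why the error only appears for models whose files mix custom config code with multi-dot relative imports.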

Kuangdd01 avatar Feb 28 '25 18:02 Kuangdd01