
with safe_open(checkpoint_file, framework="pt") as f: FileNotFoundError: No such file or directory: "/root/Qwen1.5-7B/model-00001-of-00004.safetensors"

ginreedcho opened this issue · 1 comment

Reminder

  • [X] I have read the README and searched the existing issues.

Reproduction

Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]
Exception in thread Thread-6 (run_exp):
Traceback (most recent call last):
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/root/LLaMA-Factory/src/llmtuner/train/tuner.py", line 33, in run_exp
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/root/LLaMA-Factory/src/llmtuner/train/sft/workflow.py", line 33, in run_sft
    model = load_model(tokenizer, model_args, finetuning_args, training_args.do_train)
  File "/root/LLaMA-Factory/src/llmtuner/model/loader.py", line 101, in load_model
    model: "PreTrainedModel" = AutoModelForCausalLM.from_pretrained(**init_kwargs)
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4084, in _load_pretrained_model
    state_dict = load_state_dict(shard_file, is_quantized=is_quantized)
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/modeling_utils.py", line 507, in load_state_dict
    with safe_open(checkpoint_file, framework="pt") as f:
FileNotFoundError: No such file or directory: "/root/Qwen1.5-7B/model-00001-of-00004.safetensors"

Expected behavior

Initially I hit the error `with safe_open(checkpoint_file) as f: safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge`, which looked similar to issue #327. Following the suggestion in #327, I deleted every file with the .safetensors suffix from the Qwen model directory, after which the error changed to: No such file or directory: "/root/Qwen1.5-7B/model-00001-of-00004.safetensors"
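As a side note, the `HeaderTooLarge` error often means the downloaded .safetensors file is not a real weight shard at all, e.g. a git-lfs pointer stub left behind by `git clone` without git-lfs installed. Deleting the files does not fix this; the stubs need to be replaced by a full re-download. A minimal, illustrative sketch (the directory path and the 1 KB size cutoff are assumptions) for spotting such stubs before loading:

```python
import os


def find_lfs_pointers(model_dir: str) -> list[str]:
    """Return .safetensors files that look like git-lfs pointer stubs
    (a few hundred bytes of text) rather than real weight shards."""
    suspects = []
    for name in sorted(os.listdir(model_dir)):
        if not name.endswith(".safetensors"):
            continue
        path = os.path.join(model_dir, name)
        # Real shards are typically gigabytes; an LFS pointer is < 1 KB
        # of text starting with "version https://git-lfs...".
        if os.path.getsize(path) < 1024:
            with open(path, "rb") as f:
                if f.read(7) == b"version":
                    suspects.append(name)
    return suspects
```

If this returns any file names, re-download those shards (with git-lfs installed, or via the Hugging Face hub tooling) instead of deleting them.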

System Info

  • transformers version: 4.40.0
  • Platform: Linux-5.15.0-84-generic-x86_64-with-glibc2.31
  • Python version: 3.10.14
  • Huggingface_hub version: 0.22.2
  • Safetensors version: 0.4.3
  • Accelerate version: 0.29.3
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.2.2+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Others

No response

ginreedcho, Apr 18 '24 21:04

The model files are incomplete.

hiyouga, Apr 19 '24 13:04
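The reply points at incomplete model files. Since a sharded checkpoint ships a `model.safetensors.index.json` whose `weight_map` names every expected shard, the missing shards can be listed directly. A minimal sketch (the helper name is my own) under that assumption:

```python
import json
import os


def missing_shards(model_dir: str) -> list[str]:
    """List shard files named in model.safetensors.index.json
    that are not present in model_dir."""
    index_path = os.path.join(model_dir, "model.safetensors.index.json")
    with open(index_path) as f:
        index = json.load(f)
    # weight_map maps each parameter name to the shard file holding it.
    expected = set(index["weight_map"].values())
    return sorted(s for s in expected
                  if not os.path.exists(os.path.join(model_dir, s)))
```

Any shard this reports should be re-downloaded (e.g. with `huggingface-cli download Qwen/Qwen1.5-7B`) before retrying the fine-tuning run.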