Reminder
- [X] I have read the README and searched the existing issues.
System Info
- llamafactory version: 0.8.3.dev0
- Platform: Linux-4.19.36-vhulk1907.1.0.h619.eulerosv2r8.aarch64-aarch64-with-glibc2.28
- Python version: 3.10.14
- PyTorch version: 2.3.1 (NPU)
- Transformers version: 4.41.2
- Datasets version: 2.20.0
- Accelerate version: 0.32.1
- PEFT version: 0.11.1
- TRL version: 0.9.6
- NPU type: Ascend910PremiumA
- CANN version: 8.0.RC2.alpha002
Reproduction
ASCEND_RT_VISIBLE_DEVICES=0,1,2,3 llamafactory-cli chat glm4_9b_chat.yaml
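The contents of glm4_9b_chat.yaml are not included in the report; below is a minimal sketch of what such a chat config typically looks like, following LLaMA-Factory's example inference configs (the model path and template value are assumptions, not taken from the report):

```yaml
model_name_or_path: THUDM/glm-4-9b-chat  # assumed; actual path not shown in the report
template: glm4                           # chat template for GLM-4 models
infer_backend: huggingface               # HF engine, consistent with the traceback below
```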
Expected behavior
The chat session should start and load the model normally. Instead, model loading fails on the NPU with the traceback below:
Traceback (most recent call last):
  File "/root/miniconda3/envs/llm/bin/llamafactory-cli", line 8, in <module>
    sys.exit(main())
  File "/root/project/LLaMA-Factory/src/llamafactory/cli.py", line 81, in main
    run_chat()
  File "/root/project/LLaMA-Factory/src/llamafactory/chat/chat_model.py", line 125, in run_chat
    chat_model = ChatModel()
  File "/root/project/LLaMA-Factory/src/llamafactory/chat/chat_model.py", line 44, in __init__
    self.engine: "BaseEngine" = HuggingfaceEngine(model_args, data_args, finetuning_args, generating_args)
  File "/root/project/LLaMA-Factory/src/llamafactory/chat/hf_engine.py", line 58, in __init__
    self.model = load_model(
  File "/root/project/LLaMA-Factory/src/llamafactory/model/loader.py", line 153, in load_model
    model = AutoModelForCausalLM.from_pretrained(**init_kwargs)
  File "/root/miniconda3/envs/llm/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
    return model_class.from_pretrained(
  File "/root/miniconda3/envs/llm/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3754, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/root/miniconda3/envs/llm/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4214, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/root/miniconda3/envs/llm/lib/python3.10/site-packages/transformers/modeling_utils.py", line 887, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/root/miniconda3/envs/llm/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 460, in set_module_tensor_to_device
    clear_device_cache()
  File "/root/miniconda3/envs/llm/lib/python3.10/site-packages/accelerate/utils/memory.py", line 42, in clear_device_cache
    torch.npu.empty_cache()
  File "/root/miniconda3/envs/llm/lib/python3.10/site-packages/torch_npu/npu/memory.py", line 144, in empty_cache
    torch_npu._C._npu_emptyCache()
RuntimeError: unmapHandles:build/CMakeFiles/torch_npu.dir/compiler_depend.ts:400 NPU function error: aclrtSynchronizeStream(stream), error code is 107003
[ERROR] 2024-07-15-16:44:48 (PID:159190, Device:0, RankID:-1) ERR00100 PTA call acl api failed
[Error]: The stream is not in the current context.
Check whether the context where the stream is located is the same as the current context.
EE9999: Inner Error!
EE9999: 2024-07-15-16:44:48.807.354 Stream synchronize failed, stream is not in current ctx, stream_id=2.[FUNC:StreamSynchronize][FILE:api_impl.cc][LINE:1005]
TraceBack (most recent call last):
rtStreamSynchronize execute failed, reason=[stream not in current context][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
synchronize stream failed, runtime result = 107003[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
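For context, the traceback shows the failure originating in accelerate's clear_device_cache(), which calls torch.npu.empty_cache() while from_pretrained() is dispatching weights across the four visible NPUs. A minimal sketch that exercises the same call path (hypothetical, not verified to reproduce the error on this setup):

```python
import torch
import torch_npu  # Ascend adapter; registers the torch.npu.* namespace
from accelerate.utils.memory import clear_device_cache

# Hypothetical minimal repro: allocate on one NPU, then trigger the same
# cache-clearing path that set_module_tensor_to_device() hits during
# weight dispatch in from_pretrained().
torch.npu.set_device(0)
x = torch.randn(1024, 1024, device="npu:0")

# clear_device_cache() calls torch.npu.empty_cache() internally, which is
# where the report above fails with ACL error 107003
# ("stream is not in current context").
clear_device_cache()
```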
Others
No response