LLaMA-Factory
This project cannot run on Ascend devices at a Huawei computing center
Reminder
- [X] I have read the README and searched the existing issues.
Reproduction
(PyTorch-2.1.0) [ma-user LLaMA-Factory]$python src/train_web.py
/home/ma-user/anaconda3/envs/PyTorch-2.1.0/lib/python3.9/site-packages/torch_npu/dynamo/__init__.py:18: UserWarning: Register eager implementation for the 'npu' backend of dynamo, as torch_npu was not compiled with torchair.
  warnings.warn(
Warning : ASCEND_HOME_PATH environment variable is not set.
/home/ma-user/anaconda3/envs/PyTorch-2.1.0/lib/python3.9/site-packages/pydantic/_internal/_config.py:334: UserWarning: Valid config keys have changed in V2:
- 'allow_population_by_field_name' has been renamed to 'populate_by_name'
- 'validate_all' has been renamed to 'validate_default'
  warnings.warn(message, UserWarning)
/home/ma-user/anaconda3/envs/PyTorch-2.1.0/lib/python3.9/site-packages/pydantic/_internal/fields.py:160: UserWarning: Field "model_persistence_threshold" has conflict with protected namespace "model".
You may be able to resolve this warning by setting model_config['protected_namespaces'] = ().
  warnings.warn(
Traceback (most recent call last):
  File "/home/ma-user/anaconda3/envs/PyTorch-2.1.0/lib/python3.9/site-packages/trl/import_utils.py", line 176, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/home/ma-user/anaconda3/envs/PyTorch-2.1.0/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
File "
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/ma-user/work/LLaMA-Factory/src/train_web.py", line 1, in
Expected behavior
I initially installed the current latest version. However, after running the launch command, the following error was reported:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. deepspeed 0.9.2 requires pydantic<2.0.0, but you have pydantic 2.7.1 which is incompatible.
I then ran pip install --no-deps -e . and, after running the launch command, got:
RuntimeError: Failed to import trl.trainer.dpo_trainer because of the following error (look up to see its traceback): 'FieldInfo' object has no attribute 'required'
After that I tried adjusting the versions of the various packages, but it still did not work.
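Both errors above are consistent with a pydantic major-version mismatch: pip reports that deepspeed 0.9.2 requires pydantic<2.0.0 while pydantic 2.7.1 is installed. A minimal sketch for checking this in the active environment follows. The helper names `installed_versions` and `pydantic_conflict` are hypothetical, and the only constraint encoded is the one quoted in the pip message; other deepspeed releases may declare different requirements.

```python
from importlib import metadata


def installed_versions(names):
    """Return {name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def pydantic_conflict(versions):
    """Return a message if deepspeed 0.9.2 sits next to pydantic >= 2.

    Encodes only the constraint quoted in the pip error from this thread
    (deepspeed 0.9.2 requires pydantic<2.0.0); it is not a general resolver.
    """
    deepspeed = versions.get("deepspeed")
    pydantic = versions.get("pydantic")
    if deepspeed != "0.9.2" or pydantic is None:
        return None
    major = int(pydantic.split(".")[0])
    if major >= 2:
        return (
            f"deepspeed {deepspeed} requires pydantic<2.0.0, "
            f"found pydantic {pydantic}"
        )
    return None


if __name__ == "__main__":
    found = installed_versions(["deepspeed", "pydantic", "trl"])
    print(found)
    print(pydantic_conflict(found) or "no conflict detected by this check")
```

Running the script inside the affected conda environment prints the installed versions, which is also useful information to attach to the issue.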
System Info
(PyTorch-2.1.0) [ma-user LLaMA-Factory]$transformers-cli env
/home/ma-user/anaconda3/envs/PyTorch-2.1.0/lib/python3.9/site-packages/torch_npu/dynamo/__init__.py:18: UserWarning: Register eager implementation for the 'npu' backend of dynamo, as torch_npu was not compiled with torchair.
  warnings.warn(
Warning : ASCEND_HOME_PATH environment variable is not set.
Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.
- transformers version: 4.40.2
- Platform: Linux-4.19.36-vhulk1907.1.0.h619.eulerosv2r8.aarch64-aarch64-with-glibc2.28
- Python version: 3.9.18
- Huggingface_hub version: 0.23.0
- Safetensors version: 0.4.3
- Accelerate version: 0.30.1
- Accelerate config: not found
- PyTorch version (GPU?): 2.1.0 (False)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Others
No response
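Ascend users can join this group for further discussion.
After upgrading to the new version I ran into the same problem. I am currently using A100 chips, one machine with 8 cards. Has this problem been resolved?
I did not solve it. I had hoped to use the webui at the computing center, but Huawei's technical staff told me that only bare-metal servers can open the required port. Once the package conflict was resolved, I did not look into it any further.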
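On Fri, May 17, 2024 at 17:47, yx9966 @.***> wrote:
After upgrading to the new version I ran into the same problem. I am currently using A100 chips, one machine with 8 cards. Has this problem been resolved?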
View it on GitHub: https://github.com/hiyouga/LLaMA-Factory/issues/3684#issuecomment-2117167153
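Which package had the conflict in your case? Which version of pydantic did you install?
I have left work, and opening my laptop in the car is a hassle. As far as I recall, uninstalling deepspeed made it work normally.
On Fri, May 17, 2024 at 17:51, yx9966 @.***> wrote:
Which package had the conflict in your case? Which version of pydantic did you install?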
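The fix recalled above, uninstalling deepspeed, could be scripted roughly as follows. This is only a sketch: `is_installed` and `uninstall_command` are hypothetical helper names, and it assumes deepspeed is not needed for the intended training run.

```python
import importlib.util
import subprocess
import sys


def is_installed(module_name):
    """True if the top-level module can be located in this environment."""
    return importlib.util.find_spec(module_name) is not None


def uninstall_command(package):
    """Build the pip command that removes `package` for this interpreter."""
    return [sys.executable, "-m", "pip", "uninstall", "-y", package]


if __name__ == "__main__":
    if is_installed("deepspeed"):
        # Assumption: removing deepspeed is acceptable for this setup.
        subprocess.run(uninstall_command("deepspeed"), check=True)
    else:
        print("deepspeed is not installed; nothing to remove")
```

Invoking pip through `sys.executable -m pip` keeps the removal tied to the interpreter of the active conda environment rather than whichever `pip` happens to be first on the PATH.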
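Got it, thanks. I will try that too.
Could someone send me the group QR code?
Is it still possible to join the group?
Can I still join the group? @codemayq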