Unable to interrupt multi-GPU training in the webui
Reminder
- [X] I have read the README and searched the existing issues.
Reproduction
- Launch the webui with `CUDA_VISIBLE_DEVICES='0,1' llamafactory-cli webui`
- Fill in the model name or model path
- Click the **Start** button
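For context, the webui appears to launch multi-GPU runs as a separate distributed subprocess, so an interrupt has to reach every worker process, not just the parent. The following is a minimal sketch, not the actual LLaMA-Factory implementation, of starting such a run in its own process group and signaling the whole group; the exact command, the config path, and the function name are assumptions.

```python
import os
import signal
import subprocess

# Minimal sketch (not the actual webui code): start the training command in its
# own process group so every distributed worker can be signaled together.
# "train_config.yaml" is a hypothetical config path.
proc = subprocess.Popen(
    ["llamafactory-cli", "train", "train_config.yaml"],
    env={**os.environ, "CUDA_VISIBLE_DEVICES": "0,1"},
    start_new_session=True,  # new session/process group for the run
)

def interrupt_training() -> None:
    """Send SIGTERM to the whole process group so all GPU workers exit."""
    os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
```

Signaling the process group rather than only `proc.pid` matters for multi-GPU runs, since the launcher forks one worker per GPU and those workers would otherwise keep running.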
Expected behavior
Training terminates normally when interrupted.
System Info
- Current LLaMA-Factory revision (commit id): 97346c1d3d87f0bd5ddcd70ff485f6a8273244aa
- OS: Ubuntu 22.04
- `transformers` version: 4.41.1
- Platform: Linux-5.15.0-43-generic-x86_64-with-glibc2.35
- Python version: 3.10.14
- Huggingface_hub version: 0.23.2
- Safetensors version: 0.4.3
- Accelerate version: 0.30.1
- Accelerate config: not found
- PyTorch version (GPU?): 2.3.0+cu121 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: Yes
Others
No response