Training succeeds on a single 4090 GPU, but fails with an error on 3x 4090 GPUs. Why?
(lmflow_train) root@duxact:/data/projects/lmflow/LMFlow# ./scripts/run_finetune_with_lisa.sh \
  --model_name_or_path /data/guihunmodel8.8B \
  --dataset_path /data/projects/lmflow/case_report_data \
  --output_model_path /data/projects/lmflow/guihun_fintune_model \
  --lisa_activated_layers 1 \
  --lisa_interval_steps 20
[2024-05-22 14:32:20,602] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/root/anaconda3/envs/lmflow_train/lib/python3.9/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
Traceback (most recent call last):
File "/data/projects/lmflow/LMFlow/examples/finetune.py", line 61, in NCCL_P2P_DISABLE="1" and NCCL_IB_DISABLE="1" or use accelerate launch` which will do this automatically.
Thanks for your interest in LMFlow! We are currently working on full multi-GPU support for LISA. Please stay tuned for our latest update, and thanks for your understanding 🙏
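In the meantime, the error message in the log names two NCCL environment variables, `NCCL_P2P_DISABLE` and `NCCL_IB_DISABLE`. A minimal sketch of setting them in the shell before rerunning the script is shown here. It only addresses the NCCL part of the error and is not a confirmed fix for LISA on multiple GPUs. The commented-out `CUDA_VISIBLE_DEVICES` line is an assumption about how to fall back to one GPU; whether the launcher used inside the script honors it has not been verified.

```shell
# Sketch only: set the two variables named in the error message.
export NCCL_P2P_DISABLE="1"   # turn off NCCL peer-to-peer transport between GPUs
export NCCL_IB_DISABLE="1"    # turn off NCCL InfiniBand transport

# Assumption (unverified for this script): expose only GPU 0 to fall back
# to the single-GPU setup that is reported to work.
# export CUDA_VISIBLE_DEVICES=0

# Print the values so it is clear what the training process will inherit.
echo "NCCL_P2P_DISABLE=${NCCL_P2P_DISABLE} NCCL_IB_DISABLE=${NCCL_IB_DISABLE}"

# Then rerun the command from the report in the same shell:
# ./scripts/run_finetune_with_lisa.sh \
#   --model_name_or_path /data/guihunmodel8.8B \
#   --dataset_path /data/projects/lmflow/case_report_data \
#   --output_model_path /data/projects/lmflow/guihun_fintune_model \
#   --lisa_activated_layers 1 \
#   --lisa_interval_steps 20
```

Exported variables are inherited by child processes, so they need to be set in the same shell session that launches the training script.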