[BUG]: /bin/bash: line 0: export: `NPU-VISIBLE-DEVICES=0,1,2,3,4,5,6,7': not a valid identifier

Open Gera001 opened this issue 10 months ago • 2 comments

Is there an existing issue for this bug?

  • [x] I have searched the existing issues

The bug has not been fixed in the latest main branch

  • [x] I have checked the latest main branch

Do you feel comfortable sharing a concise (minimal) script that reproduces the error? :)

Yes, I will share a minimal reproducible script.

🐛 Describe the bug

When I run the command colossalai run --hostfile hostfile --nproc_per_node 8 lora_finetune.py --pretrained /home/ma-user/work/deepseek_7b/deepseek_7b --dataset lora_sft_data.jsonl --plugin moe --lr 2e-5 --max_length 256 -g --ep 8 --pp 3 --batch_size 24 --lora_rank 8 --lora_alpha 16 --num_epochs 2 --warmup_steps 8 --tensorboard_dir logs --save_dir DeepSeek-R1-bf16-lora, I get the error /bin/bash: line 0: export: `NPU-VISIBLE-DEVICES=0,1,2,3,4,5,6,7': not a valid identifier. Why does this happen?

Environment

No response

Gera001, Feb 24 '25 02:02

Is NPU-VISIBLE-DEVICES an environment variable that you set locally? The correct form should be NPU_VISIBLE_DEVICES, with underscores rather than hyphens.

ver217, Feb 24 '25 07:02

Bot detected the issue body's language is not English, translate it automatically.

Is NPU-VISIBLE-DEVICES an environment variable set locally? The correct format should be NPU_VISIBLE_DEVICES?

Issues-translate-bot, Feb 24 '25 07:02
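
For anyone hitting the same message: bash rejects the export because shell variable names may contain only letters, digits, and underscores, and may not start with a digit, so a hyphenated name such as NPU-VISIBLE-DEVICES is not a valid identifier. The snippet below is a minimal sketch for checking a name before it reaches the shell. The helper names `is_valid_env_name` and `suggest_env_name` are hypothetical and are not part of ColossalAI; the sketch also does not show where the hyphenated name was set in the reporter's environment.

```python
import re

# Shell variable names: letters, digits and underscores only,
# and the first character must not be a digit.
_VALID_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_env_name(name: str) -> bool:
    """Return True if `name` can be used with `export NAME=value` in bash."""
    return _VALID_NAME.fullmatch(name) is not None


def suggest_env_name(name: str) -> str:
    """Hypothetical helper: replace hyphens with underscores.

    This only repairs the hyphen mistake from this issue; it does not
    handle other invalid characters or a leading digit.
    """
    return name.replace("-", "_")


if __name__ == "__main__":
    for candidate in ("NPU-VISIBLE-DEVICES", "NPU_VISIBLE_DEVICES"):
        status = "valid" if is_valid_env_name(candidate) else "not a valid identifier"
        print(f"{candidate}: {status}")
    print("suggested:", suggest_env_name("NPU-VISIBLE-DEVICES"))
```

If the hyphenated name comes from a local shell profile or launch script, renaming it there to the underscore form is the change the reply above suggests.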