syf-fgnb

Results: 1 issue by syf-fgnb

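Following the documentation, I was setting up the second finetuning stage and ran sh scripts_asmv2/stage2-finetune.sh directly, which failed with the following error: ValueError: Looks like distributed multinode run but MASTER_ADDR env not set, please try exporting rank 0's hostname as MASTER_ADDR. I then changed the command to torchrun --master_port=xxxxx, and it failed with a CUDA out of memory error instead (even though I had already set the batch size to 1). My environment is A100 + DeepSpeed ZeRO-2. What is causing this?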