Missing “local-rank” when Training Classification from Scratch

Open Jaakk0F opened this issue 2 years ago • 2 comments

Dear InternImage Developer:

Thanks for reading my message. I encountered the following error when training the classification model on ImageNet-1K.

InternImage training and evaluation script: error: the following arguments are required: --local-rank

The command I ran is modified from [Training from Scratch on ImageNet-1K](https://github.com/OpenGVLab/InternImage/blob/master/classification/README.md#training-from-scratch-on-imagenet-1k). My local machine has 2 GPUs.

python -m torch.distributed.launch --nproc_per_node 2 --master_port 12345  main.py --cfg configs/without_lr_decay/internimage_t_1k_224.yaml --data-path /path/to/data

I followed the installation instructions in the classification README.

When I removed the "required" condition on the "local_rank" argument, the script got stuck while constructing ModelEma when running on multiple GPUs. However, the script runs fine on a single GPU with nproc_per_node set to 1.

Jaakk0F, Aug 24 '23 02:08

I also encountered this problem. Have you solved it?

EasonXiao-888, Apr 12 '24 16:04

@Jaakk0F I have the same question.

SpiderJack0516, Oct 22 '24 07:10
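
On the `--local-rank` error reported above: `torch.distributed.launch` passes `--local_rank` (underscore) in older PyTorch releases and `--local-rank` (hyphen) from PyTorch 2.0 onward, while `torchrun` passes no flag and sets the `LOCAL_RANK` environment variable instead. A mismatch between the spelling the launcher passes and the spelling the script requires may therefore be what triggers the error, though this thread does not confirm it. Below is a minimal sketch of a parser that tolerates all three cases. The helper name `parse_local_rank` is hypothetical and is not part of InternImage's `main.py`.

```python
import argparse
import os


def parse_local_rank(argv=None, env=None):
    """Return the local rank from CLI flags or the LOCAL_RANK environment variable.

    Hypothetical helper for illustration; not taken from InternImage.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(add_help=False)
    # Accept both spellings; argparse stores either one under `local_rank`.
    parser.add_argument(
        "--local-rank", "--local_rank", dest="local_rank", type=int, default=None
    )
    # parse_known_args ignores unrelated options such as --cfg or --data-path.
    args, _unknown = parser.parse_known_args(argv)
    if args.local_rank is not None:
        return args.local_rank
    # Launchers that pass no flag (e.g. torchrun) export LOCAL_RANK instead.
    return int(env.get("LOCAL_RANK", 0))


if __name__ == "__main__":
    print("local rank:", parse_local_rank())
```

Making the argument optional with an environment fallback only addresses the argument error. It says nothing about the hang while constructing ModelEma on multiple GPUs, which would need to be investigated separately.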