SMAP
Question about batch size
I have read your paper carefully. It mentions that the batch size is set to 32, but I only see SOLVER.IMG_PER_GPU = 2 in train.py. Is this value changed in the code for per-GPU training? Thanks a lot for your time.
The batch size in the paper is the effective batch size under multi-GPU DistributedDataParallel training: each process loads IMG_PER_GPU images, so the global batch is IMG_PER_GPU times the number of GPUs.
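Concretely (a minimal sketch, not the repository's actual code; the 16-GPU count is just what a global batch of 32 with IMG_PER_GPU = 2 implies):

```python
# Sketch: under DistributedDataParallel, each process loads IMG_PER_GPU
# images, so the global batch size is IMG_PER_GPU * number of processes.
import torch.distributed as dist

IMG_PER_GPU = 2  # per-GPU batch from the config
world_size = dist.get_world_size() if dist.is_initialized() else 1
effective_batch_size = IMG_PER_GPU * world_size
# e.g. 2 images/GPU * 16 GPUs = 32, matching the batch size in the paper
print(f"effective batch size: {effective_batch_size}")
```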
I only have one GPU. Can I just set IMG_PER_GPU = 1?
How do I solve this problem? Is it because I only have one GPU?

raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python', '-u', 'train.py', '--local_rank=0']' died with <Signals.SIGKILL: 9>.
What is your setting of "nproc_per_node"?
I'm setting nproc_per_node=1.
That may be the problem. An easy workaround is to run train.py directly rather than launching it with torch.distributed.launch.
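For illustration, here is a hypothetical sketch (not SMAP's actual code) of how an entry point can detect whether it was started by the launcher and fall back to plain single-GPU training when run directly:

```python
# Hypothetical helper: torch.distributed.launch/torchrun export WORLD_SIZE
# (and related variables) for each worker; if they are absent, assume the
# script was run directly and train on a single GPU.
import os
import torch
import torch.distributed as dist

def init_distributed():
    if int(os.environ.get("WORLD_SIZE", "1")) > 1:
        dist.init_process_group(backend="nccl")  # reads env:// settings
        # newer launchers export LOCAL_RANK; older ones pass --local_rank
        local_rank = int(os.environ.get("LOCAL_RANK", "0"))
        torch.cuda.set_device(local_rank)
        return local_rank, dist.get_world_size()
    return 0, 1  # single process on GPU 0
```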
That also fails. Is there something wrong with train.py?
2021-08-03 16:40:36 node02 root[2842] INFO using devices 0
train.sh: line 5: 2842 Killed    python train.py