How to start multi-GPU training on a single machine
Thanks for the excellent work!
I'm running into a problem starting multi-GPU training. I have 8 GPUs, but each time I run the training command, only one GPU is used.
I use this command:
python -m src.main +experiment=re10k data_loader.train.batch_size=14
Does this mean that even when training on a single node with multiple GPUs, I still need to use SLURM to launch multi-GPU training?
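For reference, a quick way to confirm how many GPUs the Python process can actually see (assuming PyTorch is installed):

```python
# Quick visibility check: if this prints 1 instead of 8, the environment
# (e.g. CUDA_VISIBLE_DEVICES) is restricting the process to a single GPU.
import torch

print(torch.cuda.device_count())
```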
Hi @kevinhuangxf, thanks for the kind words. Normally, the current setting should automatically use all available GPUs for training, so I'm not sure what is causing this issue. You could try explicitly specifying the training devices so that all GPUs are used, following the instructions here.
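For illustration, here is a minimal sketch of what explicitly requesting all GPUs looks like with PyTorch Lightning, which this codebase builds on. The exact fields exposed by mvsplat's Hydra config may be named differently, so treat the names below as assumptions:

```python
# A minimal sketch, assuming PyTorch Lightning's standard Trainer API.
# mvsplat wires these values up through its Hydra config, so the actual
# override keys in the repo may differ.
from pytorch_lightning import Trainer

trainer = Trainer(
    accelerator="gpu",
    devices=-1,       # -1 = use every GPU visible to the process
    strategy="ddp",   # one DistributedDataParallel process per GPU
)
```

If the Hydra config exposes a corresponding trainer field (hypothetical key name), the equivalent command-line override would look something like `python -m src.main +experiment=re10k trainer.devices=-1`.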