
Does it currently support distributed multi-card training?

Open chenrui17 opened this issue 3 years ago • 3 comments

Will it be supported in the future? Single-card training currently takes too much time.

chenrui17 avatar Oct 27 '22 02:10 chenrui17

I got the model to run on multiple GPUs; however, the training script in this repo is for a single GPU.

With current versions of torch / spconv / CUDA the model is a lot faster to train. I rewrote it here for that purpose (for single GPU).

L-Reichardt avatar Feb 27 '23 14:02 L-Reichardt

How do I run models on multiple GPUs?

nakatomo8899 avatar Jun 28 '23 06:06 nakatomo8899

@nakatomo8899 I wrote my own Distributed Data Parallel (DDP) pipeline for this (not open source). I used a combination of Lei Mao's cookbook, PyTorch's tutorial, and well-documented repos such as Swin in order to do this.
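
For anyone landing here, a minimal sketch of what a generic PyTorch DDP training loop tends to look like. This is not the pipeline described above (that one is not public), and it does not use Cylinder3D or spconv: `build_model`, `build_dataset`, and `train` are hypothetical names, and the tiny linear model and random tensors are placeholders you would swap for the real network and dataset.

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler


def build_model() -> nn.Module:
    # Hypothetical stand-in; replace with the actual segmentation network.
    return nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))


def build_dataset() -> TensorDataset:
    # Hypothetical random data standing in for the real point cloud dataset.
    features = torch.randn(256, 16)
    labels = torch.randint(0, 4, (256,))
    return TensorDataset(features, labels)


def train(epochs: int = 2, batch_size: int = 32) -> float:
    # torchrun normally sets these; the defaults below are an assumption
    # that lets the script also run as a single plain Python process.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29512")
    os.environ.setdefault("RANK", "0")
    os.environ.setdefault("WORLD_SIZE", "1")
    os.environ.setdefault("LOCAL_RANK", "0")

    use_cuda = torch.cuda.is_available()
    dist.init_process_group(backend="nccl" if use_cuda else "gloo")
    local_rank = int(os.environ["LOCAL_RANK"])
    if use_cuda:
        torch.cuda.set_device(local_rank)
        device = torch.device("cuda", local_rank)
    else:
        device = torch.device("cpu")

    # Wrapping in DDP makes backward() average gradients across processes.
    model = build_model().to(device)
    model = DDP(model, device_ids=[local_rank] if use_cuda else None)

    # The sampler hands each process a different shard of the dataset.
    dataset = build_dataset()
    sampler = DistributedSampler(dataset, shuffle=True)
    loader = DataLoader(dataset, batch_size=batch_size, sampler=sampler)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()

    last_loss = float("nan")
    for epoch in range(epochs):
        # Reseed the shuffle so every epoch uses a different ordering.
        sampler.set_epoch(epoch)
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
            last_loss = loss.item()

    dist.destroy_process_group()
    return last_loss
```

To try it on several GPUs, add a `train()` call at the bottom of the script and launch it with something like `torchrun --nproc_per_node=4 your_script.py` (the script name is a placeholder). Each process then trains on its own shard, and only the gradient averaging is shared.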

L-Reichardt avatar Jun 28 '23 08:06 L-Reichardt