Pangu-Weather
GPU and memory for training
Hi, from the paper, you used NVIDIA V100 GPUs for training. Was it the 16GB or 32GB V100? What is the memory footprint of the model in your implementation? Did you make use of NVLink connections, or was PCIe sufficient? Thank you
Hi,
- The V100 chips we used have 32GB memory.
- During training, around 25GB of memory is occupied on each card (batch size is 1).
- We used the default setting, PCIe.
Best
Hi, is the 25GB memory figure the result of using gradient checkpointing?
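For reference, in PyTorch, applying gradient checkpointing to a stack of transformer blocks would typically look something like the sketch below. This is a generic example, not taken from the authors' code; `CheckpointedLayer` and the `blocks` argument are placeholder names.

```python
# Generic sketch of activation/gradient checkpointing in PyTorch.
# Not the authors' implementation; block names are placeholders.
import torch
from torch.utils.checkpoint import checkpoint

class CheckpointedLayer(torch.nn.Module):
    def __init__(self, blocks):
        super().__init__()
        self.blocks = torch.nn.ModuleList(blocks)

    def forward(self, x):
        for block in self.blocks:
            # Do not store this block's activations; recompute them
            # during the backward pass, trading compute for memory.
            x = checkpoint(block, x, use_reentrant=False)
        return x
```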
Hello, I'm also interested to know whether you used a specific strategy to make the training fit in 32GB. With a straightforward implementation of the pseudocode, the training doesn't fit on an 80GB GPU. Could you give us some implementation tips? For example, would something like the mixed-precision setup sketched below be part of it?
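Below is a minimal sketch of automatic mixed precision (AMP), one common way to reduce activation memory on V100-class GPUs. It is a generic example under my own assumptions (`model`, `optimizer`, `loss_fn`, `inputs`, `targets` are placeholders), not the authors' training script.

```python
# Generic AMP training-step sketch in PyTorch; not the authors' code.
import torch

scaler = torch.cuda.amp.GradScaler()

def train_step(model, optimizer, loss_fn, inputs, targets):
    optimizer.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast():       # run the forward pass in float16 where safe
        outputs = model(inputs)
        loss = loss_fn(outputs, targets)
    scaler.scale(loss).backward()         # scale the loss to avoid gradient underflow
    scaler.step(optimizer)                # unscale gradients, then apply the optimizer step
    scaler.update()
    return loss.detach()
```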