zaplm
Thank you @sirlddf, this problem has been bothering me for a long time, and I finally managed to solve it by using your method.
@ZikangZhou Hi, Zhou! What do you mean by ensembling (a combination of 5 single models)? Are those models all QCNet, or a mix of QCNet and other models?
It didn't happen right at the beginning of training. As training progresses, the GPU memory usage increases. In the first epoch, it occupies 20 GB per GPU, but by the...
Yes, while using RTX 3090, I noticed that the program quickly ran out of GPU memory by epoch 2. However, when I switched to A100, I observed that it consumed...
Thanks for your response! I will investigate the underlying cause of the memory leak issue.
@Qingfeng800, you can also try decreasing the radius, which will lower GPU memory usage.