
CUDA out of memory

Open asher-bit opened this issue 1 year ago • 2 comments

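Hello, and first of all, thank you for your excellent work. While trying to reproduce the model, I ran into a GPU out-of-memory error. While debugging, I found that in the G-SAB module the tensor size is very large (1.5 million), which exhausts GPU memory. However, I have not modified your model. What could be the problem? I look forward to your reply!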

asher-bit avatar Apr 24 '24 08:04 asher-bit

(two images attached)

asher-bit avatar Apr 24 '24 08:04 asher-bit

150M is not too big in computational terms. What GPU are you using? Training with a batch size of 22 (the default) needs a lot of memory; you may need 2 x 40 GB of GPU memory to train it.
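
To put the sizes mentioned in this thread in perspective, here is a rough sketch of the footprint of a single tensor. It assumes float32 storage (4 bytes per element); the helper names `tensor_bytes` and `to_mib` are hypothetical and not part of SCTNet. It only covers one tensor, not the activations and gradients kept for a whole batch, which is where most of the training memory goes.

```python
# Rough size estimate for a single tensor.
# Assumption: float32 elements (4 bytes each); not taken from the SCTNet code.

def tensor_bytes(num_elements: int, bytes_per_element: int = 4) -> int:
    """Return the raw storage size of a tensor in bytes."""
    return num_elements * bytes_per_element


def to_mib(num_bytes: int) -> float:
    """Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 ** 2)


if __name__ == "__main__":
    # The two element counts mentioned in the thread: 1.5 million and 150 million.
    for n in (1_500_000, 150_000_000):
        size = tensor_bytes(n)
        print(f"{n:>11,d} elements -> {size:,d} bytes ({to_mib(size):.1f} MiB)")
```

Under that assumption, 1.5 million elements come to about 6 MB and 150 million to about 600 MB, so a single tensor of either size is small next to a 40 GB card. The batch size is the more likely thing to adjust.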

josair21 avatar Jun 24 '24 16:06 josair21