
GPU memory issue

Open • ling-k opened this issue 4 years ago • 1 comment

Hi there! I have encountered a memory issue when training on GPU. At the very beginning, GPU memory usage is about 7 GB, but it gradually grows to more than 10 GB after 30 epochs, which leads to a memory error. I wonder what the potential causes might be. Thanks.

ling-k, May 24 '21 03:05

Hi Lingzhi, thanks for reaching out!

> I wonder what the potential causes might be.

I remember we encountered this same behaviour and could not figure out why memory was leaking. The one thing that worked was downgrading PyTorch. With PyTorch 1.0.1, as given in https://github.com/jlko/STOVE/blob/master/requirements.txt, you should not encounter this issue.
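To check whether memory really grows from epoch to epoch, before or after changing the PyTorch version, one option is to log the allocated CUDA memory once per epoch. The sketch below is not part of STOVE: `gpu_memory_mib`, `EpochMemoryLog`, and the 50 MiB tolerance are hypothetical names and values chosen for illustration. It assumes `torch.cuda.memory_allocated()` is available and falls back to 0.0 when PyTorch or CUDA is missing.

```python
from typing import Callable, List


def gpu_memory_mib() -> float:
    """Return CUDA memory currently occupied by tensors, in MiB.

    Returns 0.0 when PyTorch is not installed or no CUDA device is available.
    """
    try:
        import torch
    except ImportError:
        return 0.0
    if not torch.cuda.is_available():
        return 0.0
    return torch.cuda.memory_allocated() / (1024 ** 2)


class EpochMemoryLog:
    """Hypothetical helper: stores one memory reading per epoch."""

    def __init__(self, read_mib: Callable[[], float] = gpu_memory_mib) -> None:
        # The reader is injectable so the log can be tested without a GPU.
        self.read_mib = read_mib
        self.readings: List[float] = []

    def record(self) -> float:
        """Take a reading (call this at the end of each epoch) and store it."""
        value = self.read_mib()
        self.readings.append(value)
        return value

    def growth_mib(self) -> float:
        """Last reading minus first reading; 0.0 with fewer than two readings."""
        if len(self.readings) < 2:
            return 0.0
        return self.readings[-1] - self.readings[0]

    def looks_like_leak(self, tolerance_mib: float = 50.0) -> bool:
        """True if usage never dropped between epochs and grew past the tolerance."""
        pairs = zip(self.readings, self.readings[1:])
        never_dropped = all(later >= earlier for earlier, later in pairs)
        return never_dropped and self.growth_mib() > tolerance_mib
```

In a training loop, this would mean creating `log = EpochMemoryLog()` once and calling `log.record()` after every epoch. The values will usually be lower than what `nvidia-smi` reports, because that figure also includes memory PyTorch has cached but is not currently using.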

Hope this helps, Jannik

jlko, May 24 '21 07:05