
GPU memory leak.

Open epbsb opened this issue 1 year ago • 3 comments

Hello,

I'm using pycave for a project where the data is one-dimensional, of size 8e9. The GPU option works well, and I'm splitting the data into chunks in a for loop to run the predictions. However, as the loop goes on, it takes up more and more GPU memory and eventually runs out. To work around this, I call torch.cuda.empty_cache() together with Python's garbage collector at each iteration, as shown in the code below, but this process is slow.

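import gc

import torch

def clear_gpu_memory():
    # Release cached blocks, collect unreachable Python objects that may
    # still hold tensors, then release whatever that collection freed.
    torch.cuda.empty_cache()
    gc.collect()
    torch.cuda.empty_cache()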

I've also tried pycave's built-in batching, but it runs into the same memory issue.

Is there anything I could do to fix this?

epbsb, Aug 10 '23 16:08

I haven't seen this in the past and don't currently have a GPU available for testing, unfortunately 😕

borchero, Aug 12 '23 18:08

I "fixed" the problem. When I install pycave, it forces an old installation of PyTorch (1.12) with the "torchkit" dependency. After that, I reinstalled the latest version of PyTorch (2.0.1), and there are no more memory leaks!

I can use the batch function normally!!
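
For anyone wanting to confirm which versions ended up installed after such a reinstall, here is a minimal sketch using only the standard library. The helper name `installed_version` is made up for illustration, and the list of packages to check is an assumption based on the ones mentioned in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


# Distribution names as published on PyPI (assumed to match this setup).
for name in ("torch", "pytorch-lightning", "pycave"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If `torch` still reports 1.12 after the reinstall, the newer version was probably installed into a different environment.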

epbsb, Aug 15 '23 12:08

@borchero Since my last message I noticed something. In the poetry.lock file you have this:

[package.dependencies]
numpy = ">=1.20.0,<2.0.0"
pytorch-lightning = ">=1.8,<1.13"

I believe this forces the install of the older version of PyTorch (1.12).
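
To illustrate how an upper bound like `<1.13` excludes newer releases, here is a rough sketch of a range check. It is not how Poetry or pip resolve versions: `parse` and `satisfies` are hypothetical helpers that only handle plain numeric versions, and the version strings passed in are examples chosen for illustration.

```python
import operator

# Two-character operators come first so ">=" is not mistaken for ">".
OPS = [(">=", operator.ge), ("<=", operator.le), ("==", operator.eq),
       (">", operator.gt), ("<", operator.lt)]


def parse(v):
    """Turn '1.8' into (1, 8, 0); plain numeric versions only."""
    parts = [int(p) for p in v.strip().split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def satisfies(v, spec):
    """Check a version against a comma-separated range such as '>=1.8,<1.13'."""
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in OPS:
            if clause.startswith(symbol):
                if not compare(parse(v), parse(clause[len(symbol):])):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


spec = ">=1.8,<1.13"
print(satisfies("1.9.5", spec))  # inside the range
print(satisfies("2.0.0", spec))  # excluded by the upper bound
```

Whether this cap is what actually pulls in PyTorch 1.12 depends on the PyTorch versions that the allowed pytorch-lightning releases accept, which this sketch does not check.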

epbsb, Sep 21 '23 17:09