
Memory Leak

Open · SivanKe opened this issue 7 years ago · 5 comments

Hello,

I tried to run gan_mnist.py both with the current master build of PyTorch (0.2.0+75bb50b) and with an older commit (0.2.0+c62490b).

With both versions, the memory used by the code keeps increasing on every iteration until the program dies with an out-of-memory error.

When I took only the code of calc_gradient_penalty() and integrated it into my own code, it caused the same memory leak.

Surprisingly, when a friend integrated the exact same code into CycleGAN, it did not cause a memory leak.

Do you know what the problem is, or of a specific git revision of PyTorch that has no memory leak?
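
For reference, a minimal sketch of the computation under discussion, written against the torch.autograd.grad API; the names here are illustrative, not the repo's exact calc_gradient_penalty(), and it assumes flat (batch, features) inputs as in gan_mnist.py:

import torch
from torch import autograd

def gradient_penalty(critic, real, fake, lam=10.0):
    # Random points on the lines between real and fake samples.
    alpha = torch.rand(real.size(0), 1, device=real.device).expand_as(real)
    interp = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
    score = critic(interp)
    # create_graph=True keeps this gradient differentiable, so the
    # penalty itself can later be backpropagated through the critic.
    grads = autograd.grad(outputs=score, inputs=interp,
                          grad_outputs=torch.ones_like(score),
                          create_graph=True)[0]
    # Penalize deviation of the gradient norm from 1 (WGAN-GP).
    return lam * ((grads.norm(2, dim=1) - 1) ** 2).mean()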

SivanKe avatar Oct 18 '17 08:10 SivanKe

Maybe it is other parts of your code that cause the memory leak, rather than the code of calc_gradient_penalty().

caogang avatar Oct 20 '17 06:10 caogang

I have the same issue.

It is caused by doing this:

# Train with real.
y_r = critic(_x_r)
y_r.mean().backward(mone)
# Train with fake.
y_g = critic(_x_g)
y_g.mean().backward(one)
# Train with gradient penalty.
gp = compute_gradient_penalty(critic, _x_r.data, _x_g.data)
gp.mean().backward()
optimizer.step()

Instead of

# Reset the gradients.
critic.zero_grad()
# Train with real.
y_r = critic(_x_r)
# Train with fake.
y_g = critic(_x_g)
# Train with gradient penalty.
gp = compute_gradient_penalty(critic, _x_r.data, _x_g.data)
loss = y_g - y_r + gp
loss.mean().backward()
optimizer.step()
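
(Note: by default backward() frees the graph it traverses, so the combined-loss form builds and releases exactly one graph per critic step; it also lets the sign be folded into the loss expression instead of passing the one/mone gradient tensors.)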

Ping @SivanKe @caogang

Joeri

JoeriHermans avatar Oct 20 '17 08:10 JoeriHermans

I also have a memory leak in my implementation. I think it's due to create_graph=True, but without it, the gradient of the GP part of the loss does not backprop through the entire D network. I'd be interested in a solution.
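
A sketch of the usual pattern, assuming the standard PyTorch API (D, real, fake, opt_D, and history are placeholders, and gradient_penalty is a helper like the sketch earlier in this thread): keep create_graph=True inside the penalty so its gradient flows through all of D, call backward() exactly once per iteration so the whole graph is freed, and log detached floats rather than the loss tensor itself:

D.zero_grad()
d_loss = D(fake).mean() - D(real).mean() + gradient_penalty(D, real, fake)
d_loss.backward()  # frees the graph, including the penalty's create_graph part
opt_D.step()
# Storing d_loss itself keeps the whole graph alive across iterations,
# which looks exactly like a memory leak; store a Python float instead.
history.append(d_loss.item())

The PyTorch autograd docs also warn that backward(create_graph=True) can create reference cycles between parameters and their gradients, and recommend torch.autograd.grad for building the graph instead, which is what the penalty helper does here.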

DelsinMenolascino avatar Feb 11 '21 17:02 DelsinMenolascino

> I also have a memory leak in my implementation. I think it's due to create_graph=True, but without it, the gradient of the GP part of the loss does not backprop through the entire D network. I'd be interested in a solution.

Oh, I'm now also facing this. Did you solve it?

NUS-Tim avatar Sep 08 '21 06:09 NUS-Tim

I have the same problem.

chang-1 avatar Feb 04 '23 04:02 chang-1