ada3d
CUDA memory increases with every iteration
CUDA memory usage grows as training iterates, by about 2 MB per iteration on average. I found that it is caused by the mask_bn layer. Can you tell me the reason?
I've found the reason. It comes from the following line: `self.running_var = (1. - self.momentum) * self.running_var + self.momentum * var_.view(-1)`
The computation graph of var_ is linked into self.running_var on every iteration and never released, so the graph (and the memory it holds) keeps growing. My next question is: why is it written this way?
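For reference, here is a minimal sketch of the pattern being described; the module name, shapes, and variance computation are illustrative assumptions, not the actual mask_bn code. Because var_ carries a grad_fn, assigning the blended result to running_var chains each iteration's graph onto the previous one:

```python
import torch
import torch.nn as nn

class LeakyMaskBN(nn.Module):
    """Hypothetical minimal reproduction of the leak (not the Ada3D source)."""
    def __init__(self, num_features, momentum=0.1):
        super().__init__()
        self.momentum = momentum
        self.register_buffer("running_var", torch.ones(num_features))

    def forward(self, x):
        var_ = x.var(dim=0)  # has a grad_fn whenever x requires grad
        # BUG: the result is attached to the autograd graph, and because
        # the old running_var is an input to this expression, each new
        # graph keeps every previous iteration's graph alive as well.
        self.running_var = (1. - self.momentum) * self.running_var \
                           + self.momentum * var_.view(-1)
        return x

bn = LeakyMaskBN(4)
for step in range(3):
    bn(torch.randn(8, 4, requires_grad=True))
    # grad_fn is set, and the graph behind it grows one step longer
    # per iteration -- this retained history is the leaked memory.
    print(step, bn.running_var.grad_fn)
```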
So I wrapped that line in `with torch.no_grad():`, since I believe the gradient of the running variance does not need to be tracked. I'm looking forward to your reply, thanks!
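For completeness, a sketch of that workaround on the same illustrative module as above: running the update under `torch.no_grad()` means the blended result carries no grad_fn, so no graph is retained across iterations (detaching the input first, `var_.detach()`, would work equally well):

```python
import torch
import torch.nn as nn

class FixedMaskBN(nn.Module):
    """Same illustrative module, with the running-stat update untracked."""
    def __init__(self, num_features, momentum=0.1):
        super().__init__()
        self.momentum = momentum
        self.register_buffer("running_var", torch.ones(num_features))

    def forward(self, x):
        var_ = x.var(dim=0)
        # Running statistics are bookkeeping, not part of the loss, so
        # autograd does not need to track this update.
        with torch.no_grad():
            self.running_var = (1. - self.momentum) * self.running_var \
                               + self.momentum * var_.view(-1)
        return x

bn = FixedMaskBN(4)
for _ in range(3):
    bn(torch.randn(8, 4, requires_grad=True))
assert bn.running_var.grad_fn is None  # no graph retained; memory stays flat
```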