torchdyn
How to get memory usage for the "adjoint" and "autograd" methods?
Thanks for this amazing package!
I was trying to compare the memory usage of the adjoint method against vanilla "autograd". As claimed by the authors of the original neural ODE paper, the adjoint method should use less memory. However, the output of torch.cuda.memory_summary() showed higher GPU memory usage for the adjoint method than for autograd. I wonder if I used torch.cuda.memory_summary() incorrectly; I printed it after training. If my approach was wrong, what is the correct way to measure memory usage for the "adjoint" and "autograd" methods?
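One possible pitfall: torch.cuda.memory_summary() also counts memory the caching allocator has reserved but not freed back to the driver, so reading it once after training can overstate a run and hide the difference between methods. A sketch of measuring the actual peak tensor allocation instead, using PyTorch's peak-memory counters (assumes a CUDA device; `train_step` is a hypothetical placeholder for your own forward/backward/step closure):

```python
import torch

def peak_memory_mb(train_step, device="cuda"):
    """Run `train_step()` once and return the peak allocated GPU memory in MB.

    Resetting the peak counter before the step and reading
    max_memory_allocated() after it isolates the run's true peak,
    which is what the adjoint-vs-autograd comparison is about.
    """
    torch.cuda.empty_cache()                    # release cached blocks from earlier runs
    torch.cuda.reset_peak_memory_stats(device)  # zero the peak counter
    train_step()                                # forward + backward + optimizer step
    torch.cuda.synchronize(device)              # make sure all kernels finished
    return torch.cuda.max_memory_allocated(device) / 1024**2
```

You would call this twice, once with the model built with sensitivity="adjoint" and once with "autograd", keeping batch size and integration settings identical; the adjoint variant should show a lower peak, since it reconstructs activations during the backward solve instead of storing them.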
Hey, did you happen to make progress on this? I am curious to know and can hopefully provide some benchmarks when I get my problem running as well