learning-to-learn-by-pytorch
"Learning to learn by gradient descent by gradient descent "by PyTorch -- a simple re-implementation.
bug
When I run the code I get: RuntimeError: can't retain_grad on Tensor that has requires_grad=False
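This error is raised whenever `retain_grad()` is called on a tensor that is not tracking gradients. A minimal reproduction and the usual fix (enable `requires_grad` before calling `retain_grad`) -- this is a general PyTorch sketch, not the repo's actual code:

```python
import torch

x = torch.zeros(3)           # requires_grad defaults to False
raised = False
try:
    x.retain_grad()          # fails: autograd is not tracking this tensor
except RuntimeError:
    raised = True            # "can't retain_grad on Tensor that has requires_grad=False"

x.requires_grad_(True)       # enable gradient tracking first...
x.retain_grad()              # ...then retain_grad() succeeds
```

In this re-implementation the error typically means an optimizee parameter was created (or detached) without `requires_grad=True` before the meta-training loop tried to retain its gradient.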
Why is preprocess=False in the code? If it is changed to True instead, NaN appears.
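The preprocessing in question is presumably the gradient rescaling from Appendix A of the paper, which maps each gradient g to the pair (log|g|/p, sign(g)) when |g| is large and to (-1, e^p * g) otherwise. A likely source of NaN is log(0) when a gradient is exactly zero. A hedged sketch of the preprocessing with the magnitude clamped away from zero (`preprocess_grad` and the clamp constant are my own naming/choice, not the repo's):

```python
import torch

def preprocess_grad(g, p=10.0):
    """Gradient preprocessing a la Andrychowicz et al. (2016), Appendix A.
    Clamping |g| before log() avoids log(0) -> -inf, which propagates NaN."""
    threshold = torch.exp(torch.tensor(-p))
    abs_g = g.abs()
    big = abs_g >= threshold
    # log-magnitude channel; clamp keeps the unused branch of where() finite
    mag = torch.where(big, abs_g.clamp(min=1e-38).log() / p,
                      torch.full_like(g, -1.0))
    # sign channel; small gradients are rescaled instead of sign-quantized
    sgn = torch.where(big, g.sign(), g * torch.exp(torch.tensor(p)))
    return torch.stack([mag, sgn], dim=-1)   # shape (*g.shape, 2)
```

Since `torch.where` evaluates both branches, an unclamped `log` still produces `-inf` for zero gradients even though that branch is discarded, which can poison the backward pass; that may be why the flag was left off.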
As I read the original paper and the DeepMind repo, it seems to me that the LSTM optimizer should only take one variable as input to optimize and save...
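For reference, the paper's "coordinatewise" design shares one small LSTM across every parameter coordinate by folding the coordinates into the batch dimension, so each coordinate is processed independently with shared weights. A minimal sketch of that idea (the sizes and the output `Linear` head are illustrative assumptions, not the repo's exact architecture):

```python
import torch
import torch.nn as nn

n_coords = 1000                              # total optimizee parameters, flattened
lstm = nn.LSTMCell(input_size=1, hidden_size=20)  # one tiny shared optimizer net
head = nn.Linear(20, 1)                      # maps hidden state -> update

grads = torch.randn(n_coords)                # stand-in for the optimizee gradients
inp = grads.view(-1, 1)                      # each coordinate = one batch row
h = torch.zeros(n_coords, 20)                # per-coordinate hidden state
c = torch.zeros(n_coords, 20)                # per-coordinate cell state

h, c = lstm(inp, (h, c))                     # one optimizer step for all coords
update = head(h).view(-1)                    # per-coordinate parameter update
```

So the network itself only ever sees scalar inputs; feeding whole parameter vectors into one LSTM would be a different (and much larger) optimizer than the one described in the paper.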
During training, GPU memory leaks: usage keeps increasing until out-of-memory. Can you look into this bug?
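In unrolled meta-training like this, ever-growing memory usually means the autograd graph is never truncated: each step extends the graph of the previous ones. The common fix is to `detach()` the carried state at the end of every unroll window. A hypothetical sketch of that pattern (the arithmetic stands in for one optimizer step; names are illustrative):

```python
import torch

unroll = 20
state = torch.zeros(1, 4, requires_grad=True)   # carried optimizer state

for step in range(100):
    state = state * 2 + 1                       # stand-in for one unrolled step
    if (step + 1) % unroll == 0:
        state.sum().backward()                  # meta-update on this window only
        # Cut the graph: without this, every window stays linked to all
        # previous windows and GPU memory grows without bound.
        state = state.detach().requires_grad_(True)
```

If the repo carries the LSTM's hidden/cell states (or the optimizee parameters) across windows without detaching them, that would produce exactly the steady memory growth described.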