learning-to-learn-by-pytorch
Input dimension of LSTM differs from the original paper
As I read the original paper and the DeepMind repo, it seems to me that the LSTM optimizer should take only 1 scalar per parameter coordinate as input and keep a separate LSTM state for each coordinate. In other words, with an arbitrary number of parameters, it just updates one coordinate after another using the same shared LSTM weights. In this implementation, however, the input dimension of the optimizer is fixed to the number of optimizee parameters.
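For reference, here is a minimal sketch of the coordinatewise setup I understand the paper to describe: `input_size=1`, shared LSTM weights, and a separate hidden/cell state per coordinate (kept in the batch dimension). The class and method names below are just illustrative, not taken from this repo or from DeepMind's code.

```python
import torch
import torch.nn as nn

class CoordinatewiseLSTMOptimizer(nn.Module):
    """Sketch: the LSTM sees one scalar gradient per coordinate and keeps a
    separate (h, c) state per coordinate, so the same weights work for an
    optimizee with any number of parameters."""

    def __init__(self, hidden_size=20, num_layers=2):
        super().__init__()
        # input_size=1: each coordinate's gradient is fed independently
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size,
                            num_layers=num_layers)
        self.output = nn.Linear(hidden_size, 1)
        self.hidden_size = hidden_size
        self.num_layers = num_layers

    def init_state(self, n_params, device=None):
        # one state slot per optimizee coordinate
        shape = (self.num_layers, n_params, self.hidden_size)
        return (torch.zeros(shape, device=device),
                torch.zeros(shape, device=device))

    def forward(self, grads, state):
        # grads: flat tensor of shape (n_params,)
        # every coordinate is treated as an element of the batch dimension
        x = grads.view(1, -1, 1)            # (seq_len=1, batch=n_params, input=1)
        out, state = self.lstm(x, state)
        update = self.output(out).view(-1)  # one scalar update per coordinate
        return update, state
```

Because the coordinates go through the batch dimension, the optimizer's weight shapes never depend on the size of the optimizee, which is what lets the learned optimizer transfer to networks with a different number of parameters.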