2 comments by Kelly W Zhang

Hello, I replicated this issue using the newest version of PyTorch. It seems to come down to a difference in the default dimension handling of reductions (whether the reduced dimension is kept or squeezed). I...
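To illustrate the kept-versus-squeezed difference, here is a minimal sketch. The tensor `x` and the names `kept`, `squeezed`, and `normalized` are made up for illustration and are not taken from the code in the thread. Passing `keepdim` explicitly makes the result shape independent of whatever default the installed PyTorch version uses.

```python
import torch

# Hypothetical input, only for illustration: 3 rows, 4 columns.
x = torch.ones(3, 4)

# Reduce over dim=1, keeping the reduced dimension as size 1.
kept = x.sum(dim=1, keepdim=True)        # shape (3, 1)

# Reduce over dim=1, squeezing the reduced dimension away.
squeezed = x.sum(dim=1, keepdim=False)   # shape (3,)

print(kept.shape, squeezed.shape)

# With the dimension kept, the result broadcasts row-wise against x.
normalized = x / kept                    # (3, 4) / (3, 1) -> (3, 4)
print(normalized.shape)
```

Code that relies on the kept shape for broadcasting (as in `x / kept`) may fail or broadcast along the wrong axis when the dimension is squeezed instead, so stating `keepdim` explicitly is probably the safer choice.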

The decoder is used in the loss function ``criterion``, which takes the decoder weights as input. It seems that the decoder doesn't receive gradients or get updated (except through weight...
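One way to check whether a module's parameters receive gradients is to inspect `.grad` after a backward pass. The sketch below is an assumption-laden toy: the `decoder` sizes, the inputs, and this stand-in `criterion` are all hypothetical and are not the ones from the thread. It only shows the diagnostic itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical decoder: hidden size 8, output size 5.
decoder = nn.Linear(8, 5)

def criterion(weight, bias, hidden, targets):
    # Toy stand-in for a loss that takes the decoder weights as input.
    logits = F.linear(hidden, weight, bias)
    return F.cross_entropy(logits, targets)

# Made-up batch: 4 hidden vectors and their target class indices.
hidden = torch.randn(4, 8)
targets = torch.tensor([0, 1, 2, 3])

loss = criterion(decoder.weight, decoder.bias, hidden, targets)
loss.backward()

# Parameters whose .grad is still None were not reached by backward().
missing = [name for name, p in decoder.named_parameters() if p.grad is None]
print("parameters without gradients:", missing)
```

In this toy setup the list comes back empty because the weights are used directly inside the loss. Running the same check on the real model after `loss.backward()` should show whether the decoder parameters are actually part of the graph there.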