Universal-Transformer-Pytorch
ReLU in PositionwiseFeedForward
Here i is the index into self.layers, so the condition i < len(self.layers) is always true and the ReLU/Dropout branch runs on every iteration, including the last one.
Probably you mean:
if i < len(self.layers) - 1
Then no ReLU and Dropout are applied after the last position-wise layer.
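For reference, a minimal sketch of how the fixed forward pass could look. The class layout, layer sizes, and dropout rate below are illustrative assumptions, not copied from the repository:

```python
import torch.nn as nn
import torch.nn.functional as F

class PositionwiseFeedForward(nn.Module):
    """Sketch of a position-wise feed-forward block with the proposed fix.

    Sizes and dropout are placeholder values, not the repository's defaults.
    """
    def __init__(self, d_model=512, d_ff=2048, dropout=0.1):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Linear(d_model, d_ff),
            nn.Linear(d_ff, d_model),
        ])
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        for i, layer in enumerate(self.layers):
            x = layer(x)
            # Apply ReLU and Dropout only between layers,
            # never after the final projection.
            if i < len(self.layers) - 1:
                x = self.dropout(F.relu(x))
        return x
```

With this check, the output of the final linear layer is returned without an extra non-linearity, matching the standard Transformer feed-forward definition FFN(x) = max(0, xW1 + b1)W2 + b2.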