Transformers-RL
Bug report about the memory mechanism
I found two bugs in the Transformer-XL implementation in layers.py.
- The `init_memory` function initializes the per-layer memory with an incorrect shape. https://github.com/dhruvramani/Transformers-RL/blob/337d84aebacc383cd2d1bbafdf05dce448ee9382/layers.py#L261-L268
```python
def init_memory(self, device=torch.device("cpu")):
    return [
        torch.empty(0, dtype=torch.float).to(device)
        for _ in range(self.n_layers + 1)
    ]
```
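A minimal sketch of a possible fix, assuming each layer's memory is meant to be a `(mem_len, batch_size, d_model)` tensor so that `torch.cat` along dim 0 in the update step sees consistent trailing dimensions (the dimension names and the standalone signature are my assumptions, not taken from the repo):

```python
import torch

def init_memory(n_layers, batch_size, d_model, mem_len=0,
                device=torch.device("cpu")):
    # Hypothetical fix: give every memory tensor an explicit
    # (mem_len, batch_size, d_model) shape from the start, so later
    # torch.cat([memory, hidden], dim=0) calls cannot fail on a
    # shapeless 1-D empty tensor.
    return [
        torch.zeros(mem_len, batch_size, d_model, device=device)
        for _ in range(n_layers + 1)
    ]
```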
- The calculation of the beginning index in `update_memory` is incorrect. https://github.com/dhruvramani/Transformers-RL/blob/337d84aebacc383cd2d1bbafdf05dce448ee9382/layers.py#L280-L288
```python
new_memory = []
end_idx = mem_len + seq_len
# self.mem_len is the memory retention length; it is different from
# the local mem_len, which is the current memory's length.
beg_idx = max(0, end_idx - self.mem_len)
```
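The distinction between the two lengths can be isolated in a small pure-Python sketch (the function name and parameters are hypothetical, introduced only to illustrate the index arithmetic):

```python
def memory_slice_bounds(cur_mem_len, seq_len, retain_len):
    """Compute the [beg_idx, end_idx) slice kept after concatenating
    the current memory (cur_mem_len steps) with the new hidden states
    (seq_len steps), retaining at most retain_len steps.

    retain_len plays the role of self.mem_len; cur_mem_len is the
    local mem_len variable. Conflating the two (subtracting
    cur_mem_len instead of retain_len) would keep only the last
    seq_len steps and silently discard older context.
    """
    end_idx = cur_mem_len + seq_len
    beg_idx = max(0, end_idx - retain_len)
    return beg_idx, end_idx
```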
After fixing the two bugs above, the memory mechanism still produced incorrect values: I compared the output of the transformer with and without the memory mechanism, and the results are completely different. I tried another stable-transformer implementation from this repo instead. If anyone wants to investigate further, that code is a useful reference.
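The comparison I ran can be expressed as a small harness. This is a sketch under strong assumptions: it treats the model as a callable `model(x, memory) -> (output, new_memory)` (not the repo's actual interface), and it assumes causal attention with a retention length at least as long as the sequence, so that chunked evaluation with memory should reproduce the full-sequence output:

```python
import torch

def check_memory_consistency(model, x, chunk_size, atol=1e-5):
    # Run the whole sequence at once with no initial memory.
    full_out, _ = model(x, None)
    # Run the same sequence in chunks, carrying memory forward.
    outs, memory = [], None
    for i in range(0, x.shape[0], chunk_size):
        out, memory = model(x[i:i + chunk_size], memory)
        outs.append(out)
    chunked_out = torch.cat(outs, dim=0)
    # A correct memory mechanism makes the two paths agree.
    return torch.allclose(full_out, chunked_out, atol=atol)
```

For the buggy repo, a harness like this returns False; a correct Transformer-XL memory implementation should make it return True.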