Transformers-RL
Bug report about the memory mechanism
I found two bugs in the Transformer-XL implementation in layers.py.
- The `init_memory` function initializes the per-layer memory with an incorrect shape. https://github.com/dhruvramani/Transformers-RL/blob/337d84aebacc383cd2d1bbafdf05dce448ee9382/layers.py#L261-L268
```python
def init_memory(self, device=torch.device("cpu")):
    return [
        torch.empty(0, dtype=torch.float).to(device)
        for _ in range(self.n_layers + 1)
    ]
```
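A minimal sketch of a possible fix, assuming each layer's memory is meant to be a `(mem_len, batch_size, d_model)` tensor so that `torch.cat` along dim 0 in the update step sees consistent trailing dimensions (the dimension names and the standalone signature are my assumptions, not taken from the repo):

```python
import torch

def init_memory(n_layers, batch_size, d_model, mem_len=0,
                device=torch.device("cpu")):
    # Hypothetical fix: give every memory tensor an explicit
    # (mem_len, batch_size, d_model) shape from the start, so later
    # torch.cat([memory, hidden], dim=0) calls cannot fail on a
    # shapeless 1-D empty tensor.
    return [
        torch.zeros(mem_len, batch_size, d_model, device=device)
        for _ in range(n_layers + 1)
    ]
```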
- The calculation of the beginning index in `update_memory` is incorrect. https://github.com/dhruvramani/Transformers-RL/blob/337d84aebacc383cd2d1bbafdf05dce448ee9382/layers.py#L280-L288
```python
new_memory = []
end_idx = mem_len + seq_len
# self.mem_len is the memory retention length; it is different from
# the local mem_len, which is the current memory's length.
beg_idx = max(0, end_idx - self.mem_len)
```
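The distinction between the two lengths can be isolated in a small pure-Python sketch (the function name and parameters are hypothetical, introduced only to illustrate the index arithmetic):

```python
def memory_slice_bounds(cur_mem_len, seq_len, retain_len):
    """Compute the [beg_idx, end_idx) slice kept after concatenating
    the current memory (cur_mem_len steps) with the new hidden states
    (seq_len steps), retaining at most retain_len steps.

    retain_len plays the role of self.mem_len; cur_mem_len is the
    local mem_len variable. Conflating the two (subtracting
    cur_mem_len instead of retain_len) would keep only the last
    seq_len steps and silently discard older context.
    """
    end_idx = cur_mem_len + seq_len
    beg_idx = max(0, end_idx - retain_len)
    return beg_idx, end_idx
```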
After fixing the two bugs above, the memory mechanism still produced incorrect values: I compared the output of the transformer with and without the memory mechanism, and the results are completely different. I tried another stable-transformer implementation from this repo instead. If anyone wants to investigate further, that code is a useful reference.
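The comparison I ran can be expressed as a small harness. This is a sketch under strong assumptions: it treats the model as a callable `model(x, memory) -> (output, new_memory)` (not the repo's actual interface), and it assumes causal attention with a retention length at least as long as the sequence, so that chunked evaluation with memory should reproduce the full-sequence output:

```python
import torch

def check_memory_consistency(model, x, chunk_size, atol=1e-5):
    # Run the whole sequence at once with no initial memory.
    full_out, _ = model(x, None)
    # Run the same sequence in chunks, carrying memory forward.
    outs, memory = [], None
    for i in range(0, x.shape[0], chunk_size):
        out, memory = model(x[i:i + chunk_size], memory)
        outs.append(out)
    chunked_out = torch.cat(outs, dim=0)
    # A correct memory mechanism makes the two paths agree.
    return torch.allclose(full_out, chunked_out, atol=atol)
```

For the buggy repo, a harness like this returns False; a correct Transformer-XL memory implementation should make it return True.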