TianxingHe

Results: 2 comments by TianxingHe

You can use my code for the projected softmax:

```
if compute_full_outp == True:
    out_full_logps = [head_logprob[:, :self.cutoffs[0]]]
    offset = 0
    cutoff_values = [0] + self.cutoffs
    for i in range(1,...
```
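For readers who want the general idea in runnable form, here is a minimal NumPy sketch of how full-vocabulary log-probabilities can be assembled from an adaptive softmax with a shortlist head and tail clusters. It is not the code from the comment above, which is cut off. The function names, the column layout of the head (shortlist columns first, then one column per cluster in order), and the toy sizes are all assumptions made for illustration; a real implementation may order the cluster columns differently.

```python
import numpy as np

def log_softmax(x):
    # Numerically stable log-softmax over the last axis.
    x = x - x.max(axis=-1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))

def full_log_probs(head_logits, tail_logits, cutoffs):
    """Assemble log-probabilities over the whole vocabulary.

    Assumed (hypothetical) layout: head_logits has cutoffs[0] shortlist
    columns followed by one column per tail cluster, in cluster order;
    tail_logits[i] covers tokens cutoffs[i] .. cutoffs[i + 1] - 1.
    """
    head_logprob = log_softmax(head_logits)
    shortlist = cutoffs[0]
    # Shortlist tokens read their log-probability straight from the head.
    pieces = [head_logprob[:, :shortlist]]
    for i, logits in enumerate(tail_logits):
        # Each tail cluster must match the width implied by the cutoffs.
        assert logits.shape[1] == cutoffs[i + 1] - cutoffs[i]
        # log p(token) = log p(cluster) + log p(token | cluster)
        cluster_logprob = head_logprob[:, shortlist + i][:, None]
        pieces.append(cluster_logprob + log_softmax(logits))
    return np.concatenate(pieces, axis=1)

# Toy example: 10 tokens, a shortlist of 4 and two tail clusters of 3.
rng = np.random.default_rng(0)
cutoffs = [4, 7, 10]
head = rng.normal(size=(2, 4 + 2))
tails = [rng.normal(size=(2, 3)), rng.normal(size=(2, 3))]
logps = full_log_probs(head, tails, cutoffs)
```

A quick sanity check for any such routine is that `np.exp(logps)` sums to 1 along the vocabulary axis for every row.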

Thanks for the reply! I now understand that transformer-xl doesn't need to recompute things. Are you implying that for the vanilla transformer-lm, there's large overlap between mini-batches (so that each...