Why send the whole tokens buffer to "repeat_ngram_blocker"?
Hi, in NGramRepeatBlock, ngram_to_check is defined as

ngram_to_check = cpu_tokens[bbsz_idx][ -(self.no_repeat_ngram_size - 1) : ]

The tokens being checked should all be generated ones, but cpu_tokens (copied from tokens) is a pre-allocated buffer that still contains many padding tokens past the current step.
https://github.com/facebookresearch/fairseq/blob/eba8a50d2b184a0eba8dbec8bada8a2129bbb77c/fairseq/sequence_generator.py#L416
So I think a more reasonable way to call repeat_ngram_blocker is to slice off the padded tail first:
lprobs = self.repeat_ngram_blocker(tokens[:, :step+1], lprobs, bsz, beam_size, step)
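To illustrate the point, here is a minimal sketch of the slicing behavior in plain Python. The names `PAD`, `tokens`, and `step`, and the concrete values, are assumptions for illustration only, not fairseq's actual buffer contents:

```python
# Hypothetical illustration: the decoding buffer is pre-allocated to the
# maximum length and filled with padding, so indexing from the end of the
# whole buffer picks up pad tokens instead of the last generated tokens.

PAD = 1                     # assumed padding index
no_repeat_ngram_size = 3
step = 4                    # tokens[0..step] have been generated so far

# Pre-allocated buffer: 5 generated tokens followed by padding.
tokens = [2, 7, 8, 7, 8] + [PAD] * 5

# Taking the last (n - 1) tokens of the whole buffer sees only padding,
# so the ngram check never matches a real generated ngram.
ngram_from_full_buffer = tokens[-(no_repeat_ngram_size - 1):]

# Slicing to the generated prefix first gives the intended context:
# the actual last two generated tokens.
ngram_from_prefix = tokens[:step + 1][-(no_repeat_ngram_size - 1):]

print(ngram_from_full_buffer, ngram_from_prefix)
```

With these values, `ngram_from_full_buffer` is `[1, 1]` (pure padding) while `ngram_from_prefix` is `[7, 8]`, which is what `tokens[:, :step+1]` in the proposed call would hand to the blocker.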