Yushuo Guan
So the input sentence `tgt_seq` looks like [BOS, word1, word2, ..., PAD, PAD, ..., EOS]?
Thanks for your reply. But in Models.py (https://github.com/jadore801120/attention-is-all-you-need-pytorch/blob/20f355eb655bad40195ae302b9d8036716be9a23/transformer/Models.py#L202):

```python
tgt_seq, tgt_pos = tgt_seq[:, :-1], tgt_pos[:, :-1]
```

If `tgt_seq` is like [BOS, word1, ..., PAD, ..., PAD], why should I delete...
Thanks for your clear explanation. I'm new to neural machine translation. I wonder whether it is a convention to encode `tgt_seq` as [BOS, word1, ..., PAD, ..., PAD, EOS]. I...
Because only the first len(tgt_seq) - 1 words are significant. The decoder takes [BOS, word1_gt, word2_gt, ...] as input and outputs [word1_pred, word2_pred, ...]. In other words, the maximum length...
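To make the one-position shift concrete, here is a minimal sketch of the slicing. The tensor values are hypothetical (BOS/EOS/PAD ids and word ids are made up for illustration), and it assumes the common [BOS, words, EOS, PAD, ...] convention rather than the exact encoding this repo uses:

```python
import torch

# Hypothetical ids: BOS=2, EOS=3, PAD=0; 11/12/13 stand in for word ids.
# Shape: (batch=1, max_len=7).
tgt_seq = torch.tensor([[2, 11, 12, 13, 3, 0, 0]])  # [BOS, w1, w2, w3, EOS, PAD, PAD]

dec_input = tgt_seq[:, :-1]  # [BOS, w1, w2, w3, EOS, PAD] -- fed to the decoder
gold      = tgt_seq[:, 1:]   # [w1, w2, w3, EOS, PAD, PAD] -- compared with predictions

# At step i the decoder sees dec_input[:, :i+1] and is trained to predict
# gold[:, i], i.e. the next token. Dropping the last column is safe because
# the decoder only ever needs to emit len(tgt_seq) - 1 predictions; PAD
# positions are masked out of the loss.
```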
Thanks for your comment. Since this repo is modified from https://github.com/Yuliang-Liu/TIoU-metric, I kept the same logic as Liu's original work. Maybe you should discuss this with him first. I'd appreciate it...