fairseq
ABOUT SRC_TOKENS
In the TransformerEncoderBase class, the forward() function has a parameter 'src_tokens': tokens in the source language, of shape (batch, src_len).
It's a tensor of indices. Suppose it is:
[ [10, 52, 138, ....],
[53, 108, 52, ....],
...............
[28, 82, 106, ....] ]
How can I get the word in the raw input text that corresponds to each index? For example:
[ ['I', 'want', 'to', ...],
['Today', 'I', 'have', ...],
...............
['this', 'movie', 'is', ...] ]
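To make the mapping I am after concrete, here is a minimal sketch in plain Python. The names `vocab`, `pad_index` and `indices_to_tokens` are hypothetical and exist only for this illustration; the toy list stands in for whatever index-to-symbol table fairseq actually uses (presumably the source dictionary, but that is an assumption on my part). If the data was preprocessed with BPE, I would expect the result to be subword units rather than the raw words.

```python
# Hypothetical toy vocabulary: position i holds the token string for index i.
# This is NOT fairseq's real dictionary; it only illustrates the lookup.
vocab = ["<pad>", "I", "want", "to", "Today", "have"]

# Toy stand-in for src_tokens with shape (batch, src_len).
# Index 0 is treated as padding in this sketch.
src_tokens = [
    [1, 2, 3],
    [4, 1, 5],
]


def indices_to_tokens(batch, vocab, pad_index=0):
    """Map each row of indices to its token strings, skipping padding."""
    return [[vocab[i] for i in row if i != pad_index] for row in batch]


tokens = indices_to_tokens(src_tokens, vocab)
print(tokens)  # [['I', 'want', 'to'], ['Today', 'I', 'have']]
```

What I would like is the equivalent of `indices_to_tokens` for the real `src_tokens` tensor that reaches forward().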
Thank you very much!