parseq
Hello, I have a question about the paper. In the Input Context of the decoder in Fig. 3, the character sequence is fed in during training; is this part removed during testing?
During training, all input tokens are used in the input context in order to take advantage of the parallel processing of Transformers.
For testing, only [B] (the beginning-of-sequence token) is used as the input context initially; at each subsequent decoding step, the context is extended with the tokens predicted so far.
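Roughly, a minimal PyTorch-style sketch of the two modes (the `decoder` interface, token ids, and maximum length below are illustrative assumptions, not the actual PARSeq API):

```python
import torch

BOS, EOS = 0, 1   # illustrative ids for the [B] and [E] tokens
MAX_LEN = 25      # illustrative maximum label length

def training_step(decoder, img_feats, gt_tokens):
    # Training: the full ground-truth sequence is used as the input context,
    # so all positions are decoded in one parallel forward pass.
    bos = torch.full((gt_tokens.size(0), 1), BOS, dtype=torch.long)
    context = torch.cat([bos, gt_tokens], dim=1)
    logits = decoder(context, img_feats)          # (batch, len, charset) in parallel
    return logits

@torch.no_grad()
def greedy_decode(decoder, img_feats):
    # Inference: start from [B] alone, then grow the context one token at a time
    # with the model's own predictions (autoregressive decoding).
    context = torch.full((img_feats.size(0), 1), BOS, dtype=torch.long)
    for _ in range(MAX_LEN):
        logits = decoder(context, img_feats)      # decode with the current context
        next_tok = logits[:, -1].argmax(dim=-1, keepdim=True)
        context = torch.cat([context, next_tok], dim=1)
        if (next_tok == EOS).all():               # stop once every sample emits [E]
            break
    return context[:, 1:]                         # predictions without the [B] token
```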