Transformer-based Encoder-Decoder Model Formula issue
https://github.com/huggingface/blog/blob/0876091b35bd58e096625fffd607b6bba707e661/encoder-decoder.md?plain=1#L373
Should $Y_{1:n}$ on this line be changed to $Y_{1:m}$? Thanks.