deep-learning-with-python-notebooks
Mask for `TransformerDecoder` in the end-to-end Transformer (chapter11_part04_sequence-to-sequence-learning.ipynb)
In chapter11_part04_sequence-to-sequence-learning.ipynb, the TransformerDecoder receives the mask from the PositionalEmbedding layer of the target sequence:
x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(decoder_inputs)
x = TransformerDecoder(embed_dim, dense_dim, num_heads)(x, encoder_outputs)
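For context, the mask propagated into the decoder on that line appears to be a padding mask computed from the decoder's own inputs, i.e. the target tokens, since Keras mask propagation flows from a layer's own input. A minimal sketch of that behaviour with a plain Embedding layer (toy token ids of my own, assuming padding id 0):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Toy target token ids (hypothetical); 0 is assumed to be the padding token.
decoder_inputs = tf.constant([[3, 7, 5, 0, 0]])
embedding = layers.Embedding(input_dim=100, output_dim=8, mask_zero=True)
x = embedding(decoder_inputs)
# The propagated mask marks the non-padding positions of the *target* sequence:
print(embedding.compute_mask(decoder_inputs))  # [[ True  True  True False False]]
```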
Shouldn't the mask instead be the one derived from the source sequence (the encoder inputs)?
For example, in this TF tutorial the mask from the source sequence is used instead.
Any clarification would be greatly appreciated.
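To make the comparison concrete, here is a rough sketch (my own, not the notebook's code, and not necessarily the intended fix) of a decoder block whose cross-attention explicitly receives the source padding mask, in the spirit of the TF tutorial. The class name `SimpleCrossAttentionDecoder` and the `source_mask` argument are hypothetical, and it assumes a TF/Keras version recent enough that `MultiHeadAttention` supports `use_causal_mask`:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class SimpleCrossAttentionDecoder(layers.Layer):
    """Hypothetical decoder block; not the notebook's TransformerDecoder."""

    def __init__(self, embed_dim, dense_dim, num_heads, **kwargs):
        super().__init__(**kwargs)
        self.self_attention = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
        self.cross_attention = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
        self.dense_proj = keras.Sequential(
            [layers.Dense(dense_dim, activation="relu"), layers.Dense(embed_dim)]
        )
        self.norm_1 = layers.LayerNormalization()
        self.norm_2 = layers.LayerNormalization()
        self.norm_3 = layers.LayerNormalization()

    def call(self, inputs, encoder_outputs, source_mask=None):
        # Self-attention over the target: causal masking only here (the target
        # padding mask could additionally be combined in, as the notebook does).
        self_attn = self.self_attention(
            query=inputs, value=inputs, key=inputs, use_causal_mask=True
        )
        x = self.norm_1(inputs + self_attn)

        # Cross-attention: queries are target positions, but keys/values come
        # from the encoder, so the relevant padding mask is the *source* one,
        # broadcast to shape (batch, 1, source_len).
        cross_mask = None
        if source_mask is not None:
            cross_mask = source_mask[:, tf.newaxis, :]
        cross_attn = self.cross_attention(
            query=x, value=encoder_outputs, key=encoder_outputs,
            attention_mask=cross_mask,
        )
        x = self.norm_2(x + cross_attn)
        return self.norm_3(x + self.dense_proj(x))
```

In this sketch, `source_mask` would be the boolean padding mask computed from the encoder inputs (e.g. the mask produced by the source-side PositionalEmbedding), which is what I understood the TF tutorial to be doing.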