keras-transformer

Return correct output shape for MultiHeadAttention

Open Callidior opened this issue 5 years ago • 0 comments

In contrast to MultiHeadSelfAttention, MultiHeadAttention has two inputs but only one output. The current implementation does not override compute_output_shape, which by default returns the input shapes unmodified. Instead, only the shape of the decoder input should be returned. Otherwise, model construction fails whenever the encoder and decoder sequence lengths differ.
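A minimal sketch of the intended shape logic, written in plain Python so it runs without Keras. The class name is hypothetical, and the input order (encoder first, decoder second) is an assumption that should be checked against the actual layer before porting this into it.

```python
from typing import Optional, Sequence, Tuple

Shape = Tuple[Optional[int], ...]


class MultiHeadAttentionShapeSketch:
    """Hypothetical stand-in for the layer; only the shape logic is modelled."""

    def compute_output_shape(self, input_shape: Sequence[Shape]) -> Shape:
        # Assumption: the two inputs arrive as [encoder_input, decoder_input].
        if len(input_shape) != 2:
            raise ValueError("expected exactly two input shapes")
        _encoder_shape, decoder_shape = input_shape
        # The layer has a single output, so return a single shape:
        # that of the decoder input, not the list of both input shapes.
        return tuple(decoder_shape)


# Encoder and decoder sequence lengths differ (50 vs. 20).
encoder_shape = (None, 50, 512)
decoder_shape = (None, 20, 512)

layer = MultiHeadAttentionShapeSketch()
output_shape = layer.compute_output_shape([encoder_shape, decoder_shape])
print(output_shape)  # (None, 20, 512)
```

With the default behaviour, the list of both input shapes would be returned here instead of one shape, which is what breaks model construction when the two sequence lengths differ.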

Callidior, Sep 06 '19 12:09