
Attention probabilities

Kaustubh1Verma opened this issue on Feb 11, 2019 · 1 comment

Reading your code and trying to work with different input and output lengths, I saw that in the AttentionDecoder implementation with return_probabilities=True, the shape of the returned probabilities is (None, self.timesteps, self.timesteps). So how do you get probabilities for varied input and output lengths?
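For reference, a hypothetical sketch of the setup being described, assuming this repo's AttentionDecoder layer (the import path and constructor arguments are assumptions, not verified against the repo):

```python
from keras.models import Sequential
from keras.layers import LSTM
from models.custom_recurrents import AttentionDecoder  # import path assumed

n_timesteps, n_features, n_labels = 10, 32, 5

model = Sequential()
model.add(LSTM(64, input_shape=(n_timesteps, n_features), return_sequences=True))
# With return_probabilities=True the layer returns the attention weights
# instead of the predictions; their shape is (None, timesteps, timesteps).
model.add(AttentionDecoder(64, n_labels, return_probabilities=True))
model.summary()
```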

Kaustubh1Verma · Feb 11 '19 05:02

This is possible in theory if your sentences are of different lengths, but in practice we pad our sentences so that they are all the same length, to take advantage of batched computation. We then use masking so the model only attends over the valid (unpadded) timesteps. This code was written for sequences of the same length and has no masking support.
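A minimal sketch of that standard approach in plain Keras, using zero-padding plus a Masking layer (shapes and layer sizes here are illustrative, and this uses stock Keras layers rather than this repo's AttentionDecoder):

```python
import numpy as np
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Masking, LSTM, Dense, TimeDistributed

# Three integer-encoded sequences of different lengths.
sequences = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
max_len = 4
padded = pad_sequences(sequences, maxlen=max_len, padding='post')  # (3, 4)

x = np.expand_dims(padded, -1).astype('float32')  # (3, 4, 1)

model = Sequential()
# Masking skips any timestep whose features all equal mask_value,
# i.e. the zero-padded positions.
model.add(Masking(mask_value=0.0, input_shape=(max_len, 1)))
model.add(LSTM(16, return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
model.predict(x)
```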

You can add masking support yourself, but I would highly recommend moving to a newer version of Keras.

zafarali · Feb 11 '19 16:02