keras-attention
Instability during training
I'm fairly new to this, and for some reason I'm running into instability during training: at one point the validation accuracy dropped by more than 10%.
It's a many-to-many problem similar to POS tagging (though the vocabulary is much smaller). The input is an array of 40 integers (zero-padded) and the output is an array of 40 one-hot vectors. Any idea what I'm doing wrong?
```python
from keras.layers import Input, Embedding, Bidirectional, LSTM, Dropout
from keras.models import Model
# path assumed; import AttentionDecoder from wherever it is defined in this repo
from models.custom_recurrents import AttentionDecoder

max_seqlen = 40    # input/output sequence length (zero-padded)
s_vocabsize = 17   # source vocabulary size
t_vocabsize = 124  # target vocabulary size
embed_size = 64
hidden_size = 128

input_ = Input(shape=(max_seqlen,), dtype='float32')
input_embed = Embedding(s_vocabsize, embed_size, input_length=max_seqlen, mask_zero=True)(input_)
bi_lstm = Bidirectional(LSTM(hidden_size, dropout=0.2, recurrent_dropout=0.2, return_sequences=True),
                        merge_mode='concat')(input_embed)
dropout = Dropout(0.8)(bi_lstm)
y_hat = AttentionDecoder(hidden_size, alphabet_size=t_vocabsize, embedding_dim=embed_size)(dropout)
model = Model(inputs=input_, outputs=y_hat)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
```
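In case the data layout matters, here's a minimal sketch of how the inputs and targets are shaped and fed to the model (random dummy arrays and placeholder fit arguments, not my real preprocessing or training setup):

```python
import numpy as np
from keras.utils import to_categorical

# Dummy data matching the shapes described above (placeholder names/values):
# 1000 sequences of 40 token ids in [0, s_vocabsize), zero-padded
X = np.random.randint(0, s_vocabsize, size=(1000, max_seqlen))
# matching target ids in [0, t_vocabsize), one-hot encoded to shape (1000, 40, 124)
y_ids = np.random.randint(0, t_vocabsize, size=(1000, max_seqlen))
Y = to_categorical(y_ids, num_classes=t_vocabsize)

model.fit(X, Y, batch_size=32, epochs=10, validation_split=0.1)
```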