
An operation has `None` for gradient

Open gionanide opened this issue 5 years ago • 0 comments

One sample example: Input = [1015 4 2 0 0 0 0 0 0 0], output = [65 116 2 0 0 0 0 0 0 0] (the output is one-hot encoded).
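
For clarity, padded integer targets like the one above are typically turned into one-hot sequences before training. A minimal sketch of that step, using `keras.utils.to_categorical` with an illustrative vocabulary size (these values are assumptions, not taken from the issue):

```python
# Illustrative sketch (values assumed, not from the issue): one-hot encode a
# padded integer target sequence so it matches a per-timestep softmax output.
import numpy as np
from keras.utils import to_categorical

output_vocabulary_size = 200                        # assumed vocabulary size
decoder_target = np.array([65, 116, 2, 0, 0, 0, 0, 0, 0, 0])

decoder_target_one_hot = to_categorical(decoder_target, num_classes=output_vocabulary_size)
print(decoder_target_one_hot.shape)                 # (10, 200)
```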

I have also changed the following line in the AttentionDecoder; everything else remains the same:

from:

```python
self._uxpb = _time_distributed_dense(self.x_seq, self.U_a, b=self.b_a, input_dim=self.input_dim, timesteps=self.timesteps, output_dim=self.units)
```

to:

```python
dense = Dense(self.units, weights=self.U_a, input_dim=self.input_dim, bias=self.b_a)
self._uxpb = TimeDistributed(dense)(self.x_seq)
```
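
For reference, here is a standalone sketch of the `TimeDistributed(Dense)` idiom this replacement relies on; the tensor sizes are illustrative assumptions, not values from the issue. Note that in Keras 2 `Dense` has no `bias=` keyword: the bias is toggled with `use_bias`, and pretrained weights, if any, are supplied as a list of NumPy arrays via `weights=[kernel, bias]`.

```python
# Illustrative sketch only (not the repository's code): apply the same
# Dense projection to every timestep of a sequence with TimeDistributed.
from keras.layers import Input, Dense, TimeDistributed
from keras.models import Model

timesteps, input_dim, units = 10, 64, 32                     # assumed sizes

x_seq = Input(shape=(timesteps, input_dim))
uxpb = TimeDistributed(Dense(units, use_bias=True))(x_seq)   # -> (batch, timesteps, units)

model = Model(x_seq, uxpb)
print(model.output_shape)                                    # (None, 10, 32)
```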

My model architecture is as follows:

```python
encoder_inputs = keras.layers.Input(shape=(input_max_sentence_length,), name='encoder_inputs')

encoder_embedding = keras.layers.Embedding(input_dim=input_vocabulary_size, output_dim=embedding_dimension, input_length=input_max_sentence_length, mask_zero=True, name='encoder_embedding', trainable=True)(encoder_inputs)

lstm0_output_hidden_sequence = keras.layers.Bidirectional(keras.layers.LSTM(hidden_units, dropout=0, return_sequences=True, return_state=False, name='bidirectional', trainable=True))(encoder_embedding)

lstm01_output_hidden_sequence, lstm01_output_h, lstm01_output_c = keras.layers.LSTM(hidden_units, dropout=dropout_lstm_encoder, return_sequences=True, return_state=True, name='summarization')(lstm0_output_hidden_sequence)

attention_decoder = AttentionDecoder(hidden_units, output_vocabulary_size, trainable=True)(lstm01_output_hidden_sequence)

full_model = keras.models.Model(inputs=[encoder_inputs], outputs=[attention_decoder])
```
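
For completeness, a hypothetical compile-and-fit call for this model; the optimizer, loss, batch size, and data variable names below are my assumptions for illustration, not details taken from the issue. Since the targets are one-hot sequences, categorical cross-entropy is the natural choice here:

```python
# Hypothetical training call; optimizer, loss, batch size, and the
# encoder_input_data / decoder_target_data arrays are assumptions for
# illustration, not details taken from the issue.
full_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
full_model.summary()

# encoder_input_data:  (num_samples, input_max_sentence_length) integer ids
# decoder_target_data: (num_samples, input_max_sentence_length, output_vocabulary_size) one-hot
full_model.fit(encoder_input_data, decoder_target_data, batch_size=64, epochs=10, validation_split=0.1)
```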

And I am facing the following error:

```
    raise ValueError('An operation has None for gradient. '
ValueError: An operation has None for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
```

gionanide · Jun 26 '19 08:06