introtodeeplearning
[Lab 1 Part 2] - Missing softmax argument in Dense layer
In the solution for this task the final RNN model is created as follows:
```python
def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim, batch_input_shape=[batch_size, None]),
        LSTM(rnn_units),
        tf.keras.layers.Dense(vocab_size)
    ])
    return model

model = build_model(len(vocab), embedding_dim=256, rnn_units=1024, batch_size=32)
```
Why is there no activation function (softmax) defined in the Dense layer? The task says:

> The final output of the LSTM is then fed into a fully connected Dense layer where we'll output a softmax over each character in the vocabulary, and then sample from this distribution to predict the next character.

According to the docs, the activation function defaults to `None` if not explicitly declared.
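For context, a Dense layer with `activation=None` outputs raw logits rather than probabilities. A common pattern (an assumption here, since the lab's loss function isn't quoted above) is to leave the softmax out of the layer and instead compute the cross-entropy loss directly from logits (e.g. with `from_logits=True` in Keras), which is numerically more stable and gives the same result. A minimal NumPy sketch of that equivalence, using made-up logit values:

```python
import numpy as np

# Hypothetical logits for a 4-character vocabulary (illustrative values only).
logits = np.array([2.0, 1.0, 0.1, -1.0])

def softmax(x):
    # Subtract the max before exponentiating for numerical stability.
    e = np.exp(x - x.max())
    return e / e.sum()

probs = softmax(logits)

# Cross-entropy for true class 0, computed two equivalent ways:
# 1) apply softmax first, then take the negative log-probability
loss_from_probs = -np.log(probs[0])
# 2) compute it directly from the logits (log-sum-exp form),
#    which is what a from_logits-style loss does internally
loss_from_logits = logits.max() + np.log(np.exp(logits - logits.max()).sum()) - logits[0]

assert np.isclose(loss_from_probs, loss_from_logits)
```

Under that convention the missing softmax is deliberate: the layer emits logits, and the softmax only needs to be applied explicitly (or sampled from via the logits) at generation time.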