
[Lab 1 Part 2] - Missing softmax argument in Dense layer

Open ksadura opened this issue 1 year ago • 0 comments

In the solution for this task, the final RNN model is created as follows:


def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim, batch_input_shape=[batch_size, None]),
        LSTM(rnn_units), 
        tf.keras.layers.Dense(vocab_size)
    ])

    return model

model = build_model(len(vocab), embedding_dim=256, rnn_units=1024, batch_size=32)

Why is there no activation function (softmax) defined in the Dense layer? The task says:

The final output of the LSTM is then fed into a fully connected Dense layer where we'll output a softmax over each character in the vocabulary, and then sample from this distribution to predict the next character.

According to the docs, the activation function defaults to None if not explicitly declared.
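I assume the intended pattern is that the Dense layer emits raw logits, and softmax is applied later: at sampling time, and implicitly inside the loss via `tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)`, which is more numerically stable than an explicit softmax layer. Here is a minimal NumPy sketch of why the two views give the same cross-entropy (variable names are my own, not from the lab):

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Raw scores, as a Dense layer with activation=None would produce
logits = np.array([2.0, 0.5, -1.0])

# Applying softmax afterwards gives the distribution we sample from
probs = softmax(logits)

# Cross-entropy computed directly from logits (log-sum-exp form)
# equals cross-entropy computed from the explicit probabilities,
# which is what from_logits=True does internally
target = 0
ce_from_logits = np.log(np.exp(logits).sum()) - logits[target]
ce_from_probs = -np.log(probs[target])
assert np.isclose(ce_from_logits, ce_from_probs)
```

So omitting softmax from the layer doesn't lose anything as long as the loss is constructed with `from_logits=True` and sampling treats the outputs as logits (e.g. `tf.random.categorical` expects unnormalized log-probabilities).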

ksadura · Jan 04 '24