How does regularisation affect training speed? I am getting a 100× boost.
Hello!
I am experimenting with the basic model from the tutorials.
I have noticed that after adding tf.keras.regularizers.l2(0.05) to the embedding layer, two things happen:
- Metrics start improving - understandable.
- Training becomes 50-100× faster per step. Why could this happen?

One more thing I noticed: when I add a few Dense layers without any regularizers, or add tf.keras.regularizers.l2(1) with a large weight to the embedding, all accuracies (top-1, top-3, ..., top-100) become equal to 1, and the loss is low and stable. What is happening in this case?
import tensorflow as tf

vocab = tf.keras.layers.experimental.preprocessing.StringLookup(
    vocabulary=vocabulary[slot_name])
embedding = tf.keras.layers.Embedding(
    vocab.vocabulary_size(),
    embedding_dim,
    embeddings_regularizer=tf.keras.regularizers.l2(0.05))
inputs.append(tf.keras.Sequential([
    vocab,
    embedding,
    # tf.keras.layers.Dense(32, activation="relu"),
    # tf.keras.layers.Dense(32, activation="relu"),
]))
[Image: top_100_categorical_accuracy plot. The growing lines at the top are experiments with regularisation; the others are without.]

Hi @IgorHoholko,
When you say training speed, do you mean time per step, or the number of steps required for the loss to reach a certain level?
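To make the distinction concrete, here is a minimal sketch of measuring wall-clock time per step with a Keras callback (this assumes a standard model.fit training loop; StepTimer and train_ds are just illustrative names, not part of your code):

import time
import tensorflow as tf

class StepTimer(tf.keras.callbacks.Callback):
    # Records wall-clock seconds for each training step.
    def on_train_begin(self, logs=None):
        self.step_times = []

    def on_train_batch_begin(self, batch, logs=None):
        self._t0 = time.perf_counter()

    def on_train_batch_end(self, batch, logs=None):
        self.step_times.append(time.perf_counter() - self._t0)

# Usage: model.fit(train_ds, epochs=1, callbacks=[StepTimer()])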
For your second point, there could be a few issues:
- Large L2 regularisation drives your embedding weights to zero, resulting in numerical instability.
- The final dense layer should not have an activation (see the sketch below).
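As a rough illustration, a tower along these lines avoids both pitfalls. This is a minimal sketch, not your exact model: the 32-unit width and the 1e-4 penalty are assumptions, and it reuses the vocab layer from your snippet:

import tensorflow as tf

tower = tf.keras.Sequential([
    vocab,  # the StringLookup layer from your snippet
    tf.keras.layers.Embedding(
        vocab.vocabulary_size(), 32,
        # A moderate penalty instead of l2(1), which collapses
        # the embedding weights to zero.
        embeddings_regularizer=tf.keras.regularizers.l2(1e-4)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(32),  # final projection: no activation
])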
Thank you, @patrickorlando.
Sorry for not being clear. I mean time per step in the training loop.
Then that is quite odd. Are you using multiple GPUs @IgorHoholko?
I am using a single RTX 4090.