recommenders
recommenders copied to clipboard
tfrs.models.Model "fit" and "evaluate" Return Loss Values for Last Batch Only
When I call "fit" or "evaluate" on a tfrs.model.Model the loss values (total_loss, loss, and regularization_loss) returned are only based on just the last batch (or last batch of each epoch for "fit"). I believe this is because of the way train_step and test_step are implemented.
def train_step(self, inputs):
"""Custom train step using the `compute_loss` method."""
with tf.GradientTape() as tape:
loss = self.compute_loss(inputs, training=True)
# Handle regularization losses as well.
regularization_loss = sum(self.losses)
total_loss = loss + regularization_loss
gradients = tape.gradient(total_loss, self.trainable_variables)
self.optimizer.apply_gradients(zip(gradients, self.trainable_variables))
metrics = {metric.name: metric.result() for metric in self.metrics}
metrics["loss"] = loss
metrics["regularization_loss"] = regularization_loss
metrics["total_loss"] = total_loss
return metrics
def test_step(self, inputs):
"""Custom test step using the `compute_loss` method."""
loss = self.compute_loss(inputs, training=False)
# Handle regularization losses as well.
regularization_loss = sum(self.losses)
total_loss = loss + regularization_loss
metrics = {metric.name: metric.result() for metric in self.metrics}
metrics["loss"] = loss
metrics["regularization_loss"] = regularization_loss
metrics["total_loss"] = total_loss
return metrics
From what I understand the "metrics" values returned from these functions are meant to be metrics for the entire dataset (or epoch) up to and including this batch. The values here are just the last batch. The following code reproduces the issue.
import tensorflow_recommenders as tfrs
import tensorflow as tf
class Model(tfrs.models.Model):
def __init__(self):
super().__init__()
@tf.function
def compute_loss(self, inputs, training=False) -> tf.Tensor:
return inputs[0]
dataset = tf.data.Dataset.from_tensor_slices(([1, 1, 1, 10], [0, 0, 0, 0]))
model = Model()
model.compile()
h = model.fit(dataset, epochs=5)
print(h.__dict__)
result = model.evaluate(dataset, return_dict=True)
print(result)
I get the following as output
Epoch 1/5
WARNING:tensorflow:The dtype of the target tensor must be floating (e.g. tf.float32) when calling GradientTape.gradient, got tf.int32
WARNING:tensorflow:The dtype of the target tensor must be floating (e.g. tf.float32) when calling GradientTape.gradient, got tf.int32
4/4 [==============================] - 0s 1ms/step - loss: 4.6000 - regularization_loss: 0.0000e+00 - total_loss: 4.6000
Epoch 2/5
4/4 [==============================] - 0s 1ms/step - loss: 4.6000 - regularization_loss: 0.0000e+00 - total_loss: 4.6000
Epoch 3/5
4/4 [==============================] - 0s 1ms/step - loss: 4.6000 - regularization_loss: 0.0000e+00 - total_loss: 4.6000
Epoch 4/5
4/4 [==============================] - 0s 1ms/step - loss: 4.6000 - regularization_loss: 0.0000e+00 - total_loss: 4.6000
Epoch 5/5
4/4 [==============================] - 0s 1ms/step - loss: 4.6000 - regularization_loss: 0.0000e+00 - total_loss: 4.6000
{'validation_data': None, 'model': <__main__.MeetMeModel object at 0x16212e7f0>, '_chief_worker_only': None, '_supports_tf_logs': False, 'history': {'loss': [10, 10, 10, 10, 10], 'regularization_loss': [0, 0, 0, 0, 0], 'total_loss': [10, 10, 10, 10, 10]}, 'params': {'verbose': 1, 'epochs': 5, 'steps': 4}, 'epoch': [0, 1, 2, 3, 4]}
4/4 [==============================] - 0s 1ms/step - loss: 4.6000 - regularization_loss: 0.0000e+00 - total_loss: 4.6000
{'loss': 10, 'regularization_loss': 0, 'total_loss': 10}
Yes, I agree, there should be created something like self.loss_tracker = keras.metrics.Mean(name="loss")
for each additional loss in __init__
of the Model and in train_step
and test_step
they should be updated by self.loss_tracker.update_state(loss)