[BUG] Ranking model predicts a constant value
Bug description
After a training run that appears to complete normally, the ranking model predicts the same constant value for every input.
Steps/Code to reproduce bug
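import tensorflow as tf  # needed for the tf.keras calls below
import nvtabular as nvt
import merlin.models.tf as mm
import merlin.io
from merlin.loader.tensorflow import Loader  # assumed import path; the original snippet did not show it
from merlin.models.tf.transforms.negative_sampling import InBatchNegatives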
output_path = "data/processed"
processed_train = nvt.Dataset(f"{output_path}/interactions/train/*.parquet")
processed_valid = nvt.Dataset(f"{output_path}/interactions/valid/*.parquet")
n_per_positive = 12
add_negatives = InBatchNegatives(processed_train.schema, n_per_positive, seed=42, prep_features=True, run_when_testing=True)
# schema and batch_size are defined elsewhere in the notebook (not shown)
train_ranking_loader = Loader(processed_train, schema=schema, batch_size=batch_size, shuffle=True)
valid_ranking_loader = Loader(processed_valid, schema=schema, batch_size=batch_size, shuffle=True)
model = mm.DLRMModel(
    processed_train.schema,
    embedding_dim=64,
    bottom_block=mm.MLPBlock([128, 64]),
    top_block=mm.MLPBlock([64, 128, 512]),
    prediction_tasks=mm.BinaryClassificationTask("Click"),
)
# learning_rate is defined elsewhere in the notebook (not shown)
compile_args = {
    "optimizer": tf.keras.optimizers.legacy.Adam(learning_rate=learning_rate),
    "run_eagerly": False,
    "metrics": [mm.RecallAt(10), mm.NDCGAt(10)],
    "weighted_metrics": [tf.keras.metrics.BinaryAccuracy(), tf.keras.metrics.AUC()],
}
model.compile(**compile_args)
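model.fit(
    train_ranking_loader.map(add_negatives),
    validation_data=valid_ranking_loader.map(add_negatives),
    # NOTE: Keras' Model.fit() spells this argument class_weight (singular);
    # it is kept here exactly as it was run.
    class_weights={0: 1, 1: n_per_positive},
    epochs=5,
)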
This code produces the following output:
However, when I predict with this model using ranking_scores = model.batch_predict(potential_interactions_loader, batch_size=1024), I get the following warning message:
and the prediction is constant:
I wonder whether this is caused by the second warning message shown during prediction.
N.B.: This is not caused by potential_interactions_loader, because I get the same issue when predicting with valid_ranking_loader.
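To make "the prediction is constant" easier to verify and report, here is a minimal sketch of a check on the predicted scores. It assumes the scores have already been extracted into a NumPy array; how to do that from the return value of batch_predict is not shown here. The helper name summarize_scores, the tolerance tol, and the placeholder arrays are all hypothetical and are not part of Merlin.

```python
import numpy as np


def summarize_scores(scores, tol=1e-6):
    """Summarize predicted scores and flag whether they are (nearly) identical.

    `scores` is any array-like of floats; `tol` is a hypothetical tolerance
    on the max-min spread below which the scores count as constant.
    """
    arr = np.asarray(scores, dtype=np.float64).ravel()
    spread = float(arr.max() - arr.min())
    return {
        "n": int(arr.size),
        "min": float(arr.min()),
        "max": float(arr.max()),
        "n_unique": int(np.unique(arr).size),
        "is_constant": spread <= tol,
    }


# Placeholder arrays standing in for real model outputs (not actual results).
constant_scores = np.full(1024, 0.5)
varied_scores = np.linspace(0.1, 0.9, 1024)

print(summarize_scores(constant_scores))
print(summarize_scores(varied_scores))
```

Running the same check on the scores from both potential_interactions_loader and valid_ranking_loader would show whether the model outputs are identical down to the tolerance or merely very close to each other.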
Expected behavior
The model should return click probabilities. I obtained them in the past, but I can no longer reproduce that result and have not identified why.
Environment details
- Merlin version: 23.8.0
- Platform: macOS
- Python version: 3.10.12
- Tensorflow version (GPU?): 2.12.0+nv23.6
The notebook runs in a container built from the nightly image nvcr.io/nvidia/merlin/merlin-tensorflow:nightly, in which the latest version of Merlin Models is pulled.
Thanks.