
[BUG] Negative sampling for ranking gives an AUC of 0.00 when the `sampling` layer is added to the Model class

Open rnyak opened this issue 3 years ago • 5 comments

Bug description

Negative sampling for ranking gives an AUC of 0.00 and a binary classification accuracy of 0.999 when we add the sampling layer to the mm.Model() class.

Steps/Code to reproduce bug

Please run the code below with any synthetic dataset and its corresponding schema to reproduce the issue:

# Imports assumed for this snippet (import path as used later in this thread;
# it may differ across Merlin versions).
import tensorflow as tf
import merlin.models.tf as mm
from merlin.models.tf.data_augmentation.negative_sampling import UniformNegativeSampling

# `train` is any synthetic Merlin Dataset and `schema` its corresponding schema
# (not shown here).
sampling = UniformNegativeSampling(schema, 5, seed=42)

model = mm.Model(
    mm.InputBlock(schema),
    sampling,
    mm.MLPBlock([64]),
    mm.BinaryClassificationTask("label"),
)

BATCH_SIZE = 2048
model.compile("adam", run_eagerly=False, metrics=[tf.keras.metrics.AUC()])
model.fit(train, batch_size=BATCH_SIZE)

Expected behavior

We should be getting an AUC > 0.

Environment details

  • Merlin version:
  • Platform:
  • Python version:
  • PyTorch version (GPU?):
  • Tensorflow version (GPU?):

merlin-tensorflow-training:22.05 image with the latest main branches pulled.

rnyak avatar Jul 20 '22 14:07 rnyak

@rnyak, please triage this bug.

viswa-nvidia avatar Jul 22 '22 21:07 viswa-nvidia

So it looks like this has something to do with eager/graph mode. Running with model.compile(..., run_eagerly=True) produces a non-zero AUC. Example below:

import pyarrow as pa
import tensorflow as tf
from merlin.models.tf.data_augmentation.negative_sampling import UniformNegativeSampling
import merlin.models.tf as mm
from merlin.io import Dataset
from merlin.datasets.entertainment import get_movielens


train, valid = get_movielens(variant="ml-100k")
schema = train.schema
schema = schema.without(["rating"])

# keep only positives
train_df = train.to_ddf().compute()
train = Dataset(train_df[train_df["rating_binary"] == 1])
train.schema = schema

model = mm.Model(
    mm.InputBlock(schema),
    UniformNegativeSampling(schema, 5, seed=42),
    mm.MLPBlock([64]),
    mm.BinaryClassificationTask("rating_binary"),
)

model.compile("adam", metrics=[tf.keras.metrics.AUC()], run_eagerly=True)
model.fit(train, batch_size=2048, epochs=1)
# => 25/25 [======] - 5s 187ms/step - loss: 0.4996 - auc_1: 0.5002

oliverholworthy avatar Aug 18 '22 11:08 oliverholworthy

@rnyak For now, since this doesn't work correctly in the model context (without eager mode), we should probably consider it a non-supported feature and only recommend using it in the dataloader in examples/documentation.

oliverholworthy avatar Aug 26 '22 10:08 oliverholworthy
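For reference, here is a minimal sketch of what moving the sampling into the dataloader (as suggested in the comment above) might look like. The mm.Loader wrapper, its map method, and the import path are assumptions based on the earlier example in this thread and may differ across Merlin versions; this is illustrative rather than a confirmed API.

import tensorflow as tf
import merlin.models.tf as mm
from merlin.models.tf.data_augmentation.negative_sampling import UniformNegativeSampling

# Hypothetical: attach the negative-sampling transform to the dataloader
# (assuming mm.Loader exposes a `map` method) instead of adding it to the model.
loader = mm.Loader(train, batch_size=2048, shuffle=True).map(
    UniformNegativeSampling(schema, 5, seed=42)
)

# The model itself no longer contains the sampling layer.
model = mm.Model(
    mm.InputBlock(schema),
    mm.MLPBlock([64]),
    mm.BinaryClassificationTask("rating_binary"),
)

model.compile("adam", metrics=[tf.keras.metrics.AUC()])
model.fit(loader, epochs=1)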

@oliverholworthy this looks like it's working in graph mode now.

rnyak avatar Oct 05 '22 15:10 rnyak

I still get zero AUC with run_eagerly=False (the default) when using this sampling layer (now called InBatchNegatives) in the model and passing only positive examples.

import pyarrow
import tensorflow as tf
from merlin.models.tf.transforms.negative_sampling import InBatchNegatives
import merlin.models.tf as mm
from merlin.io import Dataset
from merlin.datasets.entertainment import get_movielens


train, valid = get_movielens(variant="ml-100k")
schema = train.schema
schema = schema.without(["rating"])

# keep only positives
train_df = train.to_ddf().compute()
train = Dataset(train_df[train_df["rating_binary"] == 1])
train.schema = schema

model = mm.Model(
    mm.InputBlockV2(schema, aggregation=None),
    InBatchNegatives(schema, 5, seed=42),
    mm.MLPBlock([64]),
    mm.BinaryClassificationTask("rating_binary"),
)

model.compile("adam", metrics=[tf.keras.metrics.AUC()])
model.fit(train, batch_size=2048, epochs=1)
# 25/25 - 1s 13ms/step - loss: 0.2768 - auc_2: 0.0000e+00 - regularization_loss: 0.0000e+00

oliverholworthy avatar Oct 05 '22 17:10 oliverholworthy