
Sample weights for both Retrieval and Ranking

Open hkristof03 opened this issue 2 years ago • 3 comments

I'd like to ask about using the 'sample_weight' argument of the call methods of both the Retrieval and the Ranking tasks.

Retrieval docs · Ranking docs

  1. For both models, does it make sense to apply higher weights to more recent samples and lower weights to older samples, with the weights computed individually for each user?

  2. For the Ranking model we have an imbalanced dataset (clicks on the webpage, where we control the imbalance ratio), and we have already tried the 'sample_weight' parameter. We were surprised that not only the loss is weighted, but also the metrics. Based on this tutorial, we assumed that 'sample_weight' is used here to tackle class imbalance, similar to the 'class_weight' parameter of the Keras model.fit() method, by weighting the loss function but not the metrics (source code). Could you elaborate on why the metrics are weighted? In my experience, weighting the loss function should only have an indirect effect on the metrics.

  3. Should we use 'sample_weight's for both the training and the validation dataset? If we use them for only one of the two, there will be a huge gap between the resulting metrics because of the metric weighting. If we use them for both, the weights are applied to the validation samples as well, which I don't think is correct.

  4. Retrieval task: could you give examples and use cases for these parameters?
     4.1 batch_metrics: Optional[List[tf.keras.metrics.Metric]] = None
     4.2 loss_metrics: Optional[List[tf.keras.metrics.Metric]] = None

  5. Ranking task: could you give examples and use cases for these parameters?
     5.1 prediction_metrics: Optional[List[tf.keras.metrics.Metric]] = None
     5.2 label_metrics: Optional[List[tf.keras.metrics.Metric]] = None
     5.3 loss_metrics: Optional[List[tf.keras.metrics.Metric]] = None

I would greatly appreciate help from those in the community who are more experienced with the above questions.

hkristof03 avatar Nov 16 '22 12:11 hkristof03

Hi,

  1. I think it does. But how do you calculate it per user?

Ullar-Kask avatar Nov 24 '22 16:11 Ullar-Kask

@patrickorlando do you have any thoughts about the above?

hkristof03 avatar Nov 29 '22 12:11 hkristof03

Hi @hkristof03,

  1. As with many things in recsys, it depends. For example, if seasonality or trends affect the relevance of items, your model may perform better with a consistent time-based decay for the sample weight; one way to compute such a decay per user is sketched after this list.

  2. Sample weight here is not so much for correcting class imbalance as for weighting the importance of a given sample. For example, we may use clicks as the label but know that the time spent on the page is what really matters to us. In that case we can weight each sample by the time spent, tuning the model further towards that behaviour. It then makes sense that our TopKAccuracy is weighted in the same way, since we are not correcting class imbalance but expressing how much we care about a given example. So in short, sample_weight != class_weight. The second sketch after this list shows how the weight is threaded through a model.

  3. In the case I described above, yes, you would use the sample weights in both training and validation.

  4. The docstring explains these two parameters: https://github.com/tensorflow/recommenders/blob/2fdc65d37173d19cac6f5017bfaf9a44790d71fc/tensorflow_recommenders/tasks/retrieval.py#L59-L69 (see the last sketch after this list for example usage).

  5. Similarly for the ranking task: https://github.com/tensorflow/recommenders/blob/2fdc65d37173d19cac6f5017bfaf9a44790d71fc/tensorflow_recommenders/tasks/ranking.py#L53-L56. For example, you could track the average prediction value, the average label value, or any other metric you care about; the last sketch covers these too.
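To make point 1 concrete (and to answer @Ullar-Kask's "how do you calculate it per user?"), here is a minimal sketch of per-user recency weighting with an exponential decay. The pandas DataFrame, its `user_id`/`timestamp` columns, and the 30-day half-life are illustrative assumptions, not anything prescribed by TFRS:

```python
import numpy as np
import pandas as pd

HALF_LIFE_DAYS = 30.0  # assumed half-life; tune for your domain

def add_recency_weights(events: pd.DataFrame) -> pd.DataFrame:
    """Adds a 'sample_weight' column that decays with age, per user."""
    events = events.copy()
    # Age of each interaction relative to that user's most recent one,
    # so every user's newest event gets weight 1.0.
    newest = events.groupby("user_id")["timestamp"].transform("max")
    age_days = (newest - events["timestamp"]).dt.total_seconds() / 86400.0
    events["sample_weight"] = np.exp(-np.log(2.0) * age_days / HALF_LIFE_DAYS)
    return events
```

Computing the decay relative to each user's own latest interaction, rather than a single global "now", keeps the weights comparable between very active and long-dormant users.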
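For points 2 and 3, a sketch of how the weight is threaded through a TFRS model. The tower models, feature keys, and the presence of a 'sample_weight' feature in the dataset are assumptions; the point being illustrated is that tfrs.tasks.Ranking forwards sample_weight to both the loss and the metrics:

```python
import tensorflow as tf
import tensorflow_recommenders as tfrs

class ClickModel(tfrs.Model):
    def __init__(self, user_model, item_model):
        super().__init__()
        self.user_model = user_model  # hypothetical embedding towers
        self.item_model = item_model
        self.click_head = tf.keras.layers.Dense(1, activation="sigmoid")
        self.task = tfrs.tasks.Ranking(
            loss=tf.keras.losses.BinaryCrossentropy(),
            metrics=[tf.keras.metrics.AUC(name="auc")],
        )

    def compute_loss(self, features, training=False):
        predictions = self.click_head(
            tf.concat(
                [self.user_model(features["user_id"]),
                 self.item_model(features["item_id"])],
                axis=1,
            )
        )
        # The task applies sample_weight to the loss *and* the metrics,
        # which is why weighted and unweighted runs report metrics that
        # are not directly comparable.
        return self.task(
            labels=features["clicked"],
            predictions=predictions,
            sample_weight=features["sample_weight"],
        )
```

If you weight the training data, run the validation data through the same weight computation so both pipelines carry a 'sample_weight' feature; otherwise the gap you observed between train and validation metrics is a scaling artifact, not a modelling signal.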
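And for points 4 and 5, a sketch of the extra metric parameters on both tasks. The particular metric choices are illustrative assumptions; any tf.keras.metrics.Metric works:

```python
import tensorflow as tf
import tensorflow_recommenders as tfrs

# Retrieval: batch_metrics score the true candidate against the other
# candidates in the same batch; loss_metrics receive the loss value.
retrieval_task = tfrs.tasks.Retrieval(
    batch_metrics=[
        tf.keras.metrics.TopKCategoricalAccuracy(k=10, name="batch_top_10"),
    ],
    loss_metrics=[tf.keras.metrics.Mean(name="retrieval_loss")],
)

# Ranking: prediction_metrics receive the raw predictions,
# label_metrics the labels, loss_metrics the loss value.
ranking_task = tfrs.tasks.Ranking(
    loss=tf.keras.losses.BinaryCrossentropy(),
    metrics=[tf.keras.metrics.AUC(name="auc")],
    prediction_metrics=[tf.keras.metrics.Mean(name="mean_prediction")],
    label_metrics=[tf.keras.metrics.Mean(name="mean_label")],
    loss_metrics=[tf.keras.metrics.Mean(name="mean_loss")],
)
```

The prediction/label means are handy for spotting drift or calibration problems, e.g. comparing the mean predicted click probability against the mean observed label over time.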

patrickorlando avatar Nov 30 '22 22:11 patrickorlando