
Fix `ranking_metrics_at_k()`

Open · ita9naiwa opened this pull request 3 years ago · 4 comments

This PR resolves a few issues:

  • #412: "precision" in ranking_metrics_at_k is actually "recall".
    I think it's fine to change precision and recall now, since this library just took a major breaking update (0.5.0).

  • #545: ranking_metrics_at_k raises a ValueError if K > num_items.

This PR also adds MRR and precision as new metrics.
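
For reference, here is a minimal sketch of the per-user definitions the fix aims at (illustrative only, not the library's implementation; the helper name and arguments are hypothetical):

# Illustrative sketch, not implicit's actual code: per-user precision@K and
# recall@K with the denominators argued for in #412, plus a guard for the
# K > num_items case from #545.
def precision_recall_at_k(recommended, liked, K, num_items):
    k = min(K, num_items)        # guard: never request more items than exist
    hits = len(set(recommended[:k]) & set(liked))
    precision = hits / k         # denominator: number of recommended items
    recall = hits / len(liked)   # denominator: the user's liked items
    return precision, recall

With K=100 and usually far fewer liked items per user, hits / len(liked) is much larger than hits / k, which is why the value previously reported as "precision" reappears as "recall" in the numbers below.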

ita9naiwa · Jul 11 '22

Hi @benfred, can you check and review this PR? It fixes the inaccurate NDCG and MRR values from the ranking_metrics_at_k function.

ita9naiwa · Aug 10 '22

Changes:

  • The precision and recall metrics have been swapped.
  • The MAP metric has been fixed accordingly (its definition now follows precision).
  • Added MRR (sketched below this comment), since it is one of the most widely used metrics in the RS community, e.g. in the RecSys Challenge 2022.

# `ratings` is a user-item sparse matrix from the surrounding context.
from implicit.als import AlternatingLeastSquares
from implicit.evaluation import ranking_metrics_at_k, train_test_split

tr, te = train_test_split(ratings, random_state=1541)
model = AlternatingLeastSquares(random_state=1541, factors=30, iterations=10)
model.fit(tr)
ranking_metrics_at_k(model, tr, te, K=100)

as-is:

{'precision': 0.3349958296821056,
 'map': 0.12534890653797998,
 'ndcg': 0.2686550155007732,
 'auc': 0.6093577862786992}

to be:

{'precision': 0.07930221607727832,
 'recall': 0.3349958296820349,
 'map': 0.06293165699220135,
 'ndcg': 0.2686550155007732,
 'auc': 0.6093577862785867,
 'mrr': 0.5348017396151994}

I guess the definition of MAP should follow precision.

ita9naiwa · Aug 11 '22
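
For context, the MRR@K being added typically follows the standard definition: the reciprocal rank of the first relevant item in the top-K list, averaged over users. A minimal sketch (hypothetical names, not the PR's actual code):

# Sketch of MRR@K: reciprocal rank of the first relevant item in the
# top-K recommendations, averaged over all users.
def mrr_at_k(recs_per_user, liked_per_user, K):
    total = 0.0
    for recs, liked in zip(recs_per_user, liked_per_user):
        relevant = set(liked)
        total += next((1.0 / rank for rank, item in enumerate(recs[:K], 1)
                       if item in relevant), 0.0)
    return total / len(recs_per_user)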

@benfred any ETA on getting this in and released? I was debugging a model yesterday that had weird evaluation results and came to the same conclusion as @ita9naiwa.

thomasjungblut · Aug 23 '22

Hi @ita9naiwa. I was checking the code of your fix to ranking_metrics_at_k, and I'm not sure about the way you define the denominator of precision. You're using the size of the user's liked items in the test set, but shouldn't it be K, the number of recommended items? K would include true positives + false positives, which is what I have normally seen in the definitions of precision I have read. Correct me if I'm wrong; I'd appreciate your opinion on the issue. Thanks!

malonsocortes · Aug 26 '22

And the divisor for recall is also wrong. It should always be likes.size(), not k when k is smaller; dividing by k would only inflate the score rather than return the true recall value. Or am I wrong?

Blo0dR0gue · Jul 06 '23
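
A toy example of the divisor question raised here, with hypothetical numbers:

# A user liked 10 items; K = 5; 3 of the top-5 recommendations are hits.
hits, k, num_liked = 3, 5, 10
true_recall = hits / num_liked       # 0.3: divide by all of the user's liked items
inflated = hits / min(k, num_liked)  # 0.6: dividing by k inflates the score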