How to calculate nDCG in this paper?

Open swyo opened this issue 3 years ago • 1 comments
I'm wondering how nDCG is calculated in this paper. :confused:

Question: Calculating nDCG requires link predictions for all unobserved items, but that does not seem tractable (because the interaction data is so sparse). What nDCG protocol did you use to produce the performance table in this paper, shown below? [image]

If anyone knows, please explain how to reproduce the nDCG. :pray:

Your model predicts link preferences between user and item pairs.

nDCG needs item ranking lists for each user.

Your model predicts only the test links.

To calculate nDCG, preferences for all unobserved items have to be predicted. However, predicting all the items takes too much time.

I am confused after reviewing the NGCF paper.
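To make the "predict all unobserved items" step concrete, here is a minimal sketch of full-ranking evaluation. It assumes the model scores a user-item pair by the inner product of learned embeddings, so all items can be scored with one matrix product per batch of users. The function name `full_ranking_topk`, the dense boolean `train_mask`, and the batch size are illustrative assumptions, not something taken from the NGCF repository.

```python
import numpy as np

def full_ranking_topk(user_emb, item_emb, train_mask, k=20, batch_size=1024):
    """Return the top-k unseen item ids per user, scoring all items batch by batch.

    user_emb:   (n_users, d) float array (hypothetical learned embeddings)
    item_emb:   (n_items, d) float array
    train_mask: (n_users, n_items) bool array, True where the pair is in the training set
    """
    n_users = user_emb.shape[0]
    topk = np.empty((n_users, k), dtype=np.int64)
    for start in range(0, n_users, batch_size):
        end = min(start + batch_size, n_users)
        # One matrix product scores every item for every user in the batch.
        scores = user_emb[start:end] @ item_emb.T
        # Training items must not be ranked, so push them to the bottom.
        scores[train_mask[start:end]] = -np.inf
        # argpartition selects the k best without sorting all items.
        part = np.argpartition(-scores, k - 1, axis=1)[:, :k]
        # Sort only those k candidates by descending score.
        order = np.argsort(-np.take_along_axis(scores, part, axis=1), axis=1)
        topk[start:end] = np.take_along_axis(part, order, axis=1)
    return topk
```

The resulting top-k lists can then be compared with each user's test items to compute recall@K or nDCG@K. For large catalogues a dense mask would likely be replaced by a sparse structure; it is dense here only to keep the sketch short.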


Alternatively, the Neural Collaborative Filtering model uses the leave-one-out protocol, as follows.

[14] Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017. Neural Collaborative Filtering. In WWW. 173–182.

Leave-one-out protocol in [14]:

  1. For each user, hold out the item with the latest timestamp as the test item and sample K negative items.
  2. Predict ratings for the K negative samples and the held-out test item (only one per user), and then rank them.
  3. Calculate a top-K ranking metric (such as Precision, Recall, F1, nDCG@K), i.e., check whether the test item appears in the top-K ranking.
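The three steps above can be sketched roughly as follows. This is only an illustration of the leave-one-out idea: `sample_negatives`, `leave_one_out_metrics`, and the `score_fn(user, items)` callback are hypothetical names, and the number of negatives and the cutoff are kept as separate parameters here for clarity.

```python
import numpy as np

def sample_negatives(rng, n_items, seen_items, n_neg):
    """Step 1: draw n_neg item ids the user has never interacted with."""
    seen = np.fromiter(seen_items, dtype=int)
    candidates = np.setdiff1d(np.arange(n_items), seen)
    return rng.choice(candidates, size=n_neg, replace=False)

def leave_one_out_metrics(score_fn, user, test_item, negatives, k=10):
    """Steps 2 and 3: rank the held-out item against the negatives.

    Returns (hit@k, nDCG@k) for a single user.
    """
    # Held-out positive first, sampled negatives after it.
    items = np.concatenate(([test_item], negatives))
    scores = np.asarray(score_fn(user, items), dtype=float)
    # 0-based rank of the positive = number of negatives scoring strictly higher.
    rank = int(np.sum(scores[1:] > scores[0]))
    if rank >= k:
        return 0.0, 0.0
    # With a single relevant item the ideal DCG is 1, so nDCG = 1 / log2(rank + 2).
    return 1.0, 1.0 / np.log2(rank + 2)
```

Averaging the two returned values over all users gives HR@K and nDCG@K under this protocol. `seen_items` is assumed to contain the held-out test item as well, so that it is never drawn as a negative.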

However, the NGCF paper states the following. [image]

I tried to follow [31], but there the train/test split ratio was 4:1.

So I think the leave-one-out protocol is not used. I am very curious about how to reproduce the nDCG in this paper.

swyo avatar Jan 01 '22 06:01 swyo

Hi, just to further clarify the issue, at this line:

https://github.com/xiangwang1223/neural_graph_collaborative_filtering/blob/a718a4f2df7c3942ca0df6759926975762c61eed/NGCF/utility/metrics.py#L53

The way the NDCG is computed there doesn't look correct. The index starts at 2, but as you can see on Wikipedia it is 2+i, so it should start at 3 and end at +2: https://en.wikipedia.org/wiki/Discounted_cumulative_gain

return r[0] + np.sum(r[1:] / np.log2(np.arange(3, r.size + 2)))
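As a quick sanity check of the indexing, here is a small sketch comparing the expression above with the first DCG form on the Wikipedia page, rel_i / log2(i + 1) for 1-based positions i (equivalently log2(j + 2) for a 0-based index j). The function names are made up for this illustration and are not from the repository.

```python
import numpy as np

def dcg_standard(r):
    """DCG = sum over positions i = 1..n of rel_i / log2(i + 1)."""
    r = np.asarray(r, dtype=float)
    if r.size == 0:
        return 0.0
    # arange(2, n + 2) yields i + 1 for i = 1..n.
    return float(np.sum(r / np.log2(np.arange(2, r.size + 2))))

def dcg_first_item_split(r):
    """The expression quoted above: first item undiscounted, the rest over log2(3), log2(4), ..."""
    r = np.asarray(r, dtype=float)
    if r.size == 0:
        return 0.0
    return float(r[0] + np.sum(r[1:] / np.log2(np.arange(3, r.size + 2))))

r = [1, 0, 1, 1]
# Both forms agree because log2(2) == 1, so the first item is effectively undiscounted.
print(np.isclose(dcg_standard(r), dcg_first_item_split(r)))
```

If the discount for the second item were log2(2) instead of log2(3), the first two positions would get the same weight, which is the difference being pointed out here.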

See also the same problem here: https://github.com/MI-911/warm-start-framework/issues/9

kuzeko avatar Apr 12 '22 11:04 kuzeko