tensorrec Calculate Normalised Discounted Cumulative Gain Error

I get the following error when running fit_and_eval from tensorrec.eval:

  File "/usr/local/lib/python3.5/dist-packages/tensorrec/eval.py", line 81, in _dcg
    numer = (2**np.multiply(relevance.data, k_mask)) - 1
ValueError: operands could not be broadcast together with shapes (5347038,) (52804,)

https://github.com/jfkirk/tensorrec/blob/65fefe4437c8974b39cc9ab56b9769ed9eb70ffa/tensorrec/eval.py#L81

Looking at the source code and the definition of discounted cumulative gain, I think the calculation of k_mask is incorrect for this application, because it is computed from the data array of a different sparse matrix:

https://github.com/jfkirk/tensorrec/blob/65fefe4437c8974b39cc9ab56b9769ed9eb70ffa/tensorrec/eval.py#L68
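To illustrate the kind of mismatch I mean, here is a minimal sketch. The toy matrices and the name `other` are made up for illustration and are not taken from tensorrec; the only point is that a mask built from one sparse matrix's `.data` cannot be multiplied with another matrix's `.data` when the two store a different number of entries.

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical toy data: `relevance` stores 3 entries, `other` stores only 2.
relevance = sp.csr_matrix(np.array([[1.0, 0.0, 2.0],
                                    [0.0, 3.0, 0.0]]))
other = sp.csr_matrix(np.array([[1.0, 0.0, 0.0],
                                [0.0, 2.0, 0.0]]))

k = 10
# The mask has one entry per stored value of `other`, not of `relevance`.
k_mask = other.data < k + 1

try:
    np.multiply(relevance.data, k_mask)
    error_message = None
except ValueError as exc:
    # Same class of error as in the traceback: the 1-D shapes differ.
    error_message = str(exc)

print(relevance.data.shape, k_mask.shape)
print(error_message)
```

As long as both matrices store exactly the same entries, the lengths agree and no error is raised, which may be why this only shows up with some datasets.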

Instead, in my opinion, it should use the entire ror matrix, something like:

k_mask = ror < k+1

However, the < operator is quite inefficient for sparse matrices, but I hope it is clear what I mean :)
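A rough sketch of one way to get a mask that always lines up with relevance.data without applying < to a sparse matrix. This assumes the predicted ranks are available as a dense array; `aligned_k_mask` is a hypothetical helper I made up for this example, not something that exists in tensorrec.

```python
import numpy as np
import scipy.sparse as sp


def aligned_k_mask(relevance, ranks, k):
    """Return (relevance as COO, boolean mask with one entry per stored value)."""
    rel = sp.coo_matrix(relevance)
    ranks = np.asarray(ranks)
    # Look up the predicted rank at each stored (row, col) position of
    # `relevance`, so the mask has exactly len(rel.data) entries.
    ranks_at_relevant = ranks[rel.row, rel.col]
    return rel, ranks_at_relevant <= k  # same as `< k + 1` for integer ranks


# Hypothetical toy data: 2 users, 3 items.
relevance = sp.csr_matrix(np.array([[1.0, 0.0, 2.0],
                                    [0.0, 3.0, 0.0]]))
ranks = np.array([[1, 2, 3],
                  [3, 1, 2]])

rel, k_mask = aligned_k_mask(relevance, ranks, k=2)

# The numerator from the failing line now broadcasts, because both
# operands are indexed by the same stored entries.
numer = 2 ** np.multiply(rel.data, k_mask) - 1
print(k_mask, numer)
```

Because the mask is indexed by the stored entries of the relevance matrix itself, its length cannot drift away from relevance.data, whatever the other matrices contain.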

Nov 28 '18 23:11 gallmerci

Hey @gallmerci ! Thanks for reporting.

What is the shape of your user_features, item_features, and interactions?

Dec 06 '18 14:12 jfkirk

Hey @jfkirk,

I have the following shapes:

Shape of item features data: (120, 33)
Shape of training user data: (3154, 12)
Shape of interaction data: (3154, 120)

Dec 06 '18 15:12 gallmerci

Also having this issue, pointing to an issue in the NDCG method:

/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/scipy/sparse/compressed.py:202: RuntimeWarning: invalid value encountered in greater
  res = self._with_data(op(self.data, other), copy=True)
Traceback (most recent call last):
  File "TF_fit_eval.py", line 111, in <module>
    fit_kwargs=fit_kwargs)
  File "~/tensorrec/tensorrec/eval.py", line 160, in fit_and_eval
    n_at_k = ndcg_at_k(predicted_ranks, test_interactions, k=ndcg_k)
  File "~/tensorrec/tensorrec/eval.py", line 108, in ndcg_at_k
    dcg = np.asarray(_dcg(relevance, k_mask, ror_at_k, ranks_of_relevant))[0]
  File "~/tensorrec/tensorrec/eval.py", line 81, in _dcg
    numer = (2**np.multiply(relevance.data, k_mask)) - 1
ValueError: operands could not be broadcast together with shapes (1167,) (95,)

Dec 11 '18 17:12 kevglynn

@kevglynn Can you provide an example dataset where this happens? I can't reproduce this error at the moment... I understand under what circumstances this error occurs, but I don't understand how these circumstances are possible at all :)

Dec 11 '18 20:12 gallmerci

@gallmerci sorry for the delay. It's only happening for me with a specific dataset, which I can't share (proprietary). I will try to come up with an example that reproduces...

Dec 16 '18 04:12 kevglynn

Hey all -- any luck with reproducible examples? I'd love to get to the bottom of this, but I've been poking at it and haven't been able to reproduce.

Jan 06 '19 17:01 jfkirk
