tensorrec
Calculate Normalised Discounted Cumulative Gain Error
I get the following error when running fit_and_eval from tensorrec.eval:
File "/usr/local/lib/python3.5/dist-packages/tensorrec/eval.py", line 81, in _dcg
numer = (2**np.multiply(relevance.data, k_mask)) - 1
ValueError: operands could not be broadcast together with shapes (5347038,) (52804,)
https://github.com/jfkirk/tensorrec/blob/65fefe4437c8974b39cc9ab56b9769ed9eb70ffa/tensorrec/eval.py#L81
Looking at the source code and the definition of the discounted cumulative gain, I think that the calculation of k_mask is not correct for the application here, because it is calculated from the data array of a different sparse matrix:
https://github.com/jfkirk/tensorrec/blob/65fefe4437c8974b39cc9ab56b9769ed9eb70ffa/tensorrec/eval.py#L68
Instead it should, in my opinion, use the entire ror matrix, something like
k_mask = ror < k+1
However, the < operator is quite inefficient for sparse matrices, but I hope it is clear what I mean :)
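To make the point concrete, here is a minimal sketch of the mismatch and of one way to keep the mask aligned. It is not the tensorrec implementation: the toy matrices, other_ranks, dense_ranks, and ror_data are hypothetical names and values chosen for illustration, and the sketch assumes a dense array of 1-based ranks is available for lookup.

```python
import numpy as np
from scipy import sparse

k = 1

# Toy relevance matrix (2 users x 3 items) with 4 stored entries.
relevance = sparse.csr_matrix(np.array([[1, 0, 2],
                                        [0, 3, 1]])).tocoo()

# A second sparse matrix with a different sparsity pattern (2 stored entries).
other_ranks = sparse.csr_matrix(np.array([[1, 0, 0],
                                          [0, 1, 0]]))

# A mask built from the other matrix's data has shape (2,), not (4,),
# so the element-wise product cannot be broadcast.
misaligned_mask = other_ranks.data < k + 1
try:
    np.multiply(relevance.data, misaligned_mask)
    broadcast_error = None
except ValueError as err:
    broadcast_error = str(err)

# Hypothetical dense 1-based ranks for every (user, item) pair.
dense_ranks = np.array([[1, 3, 2],
                        [3, 1, 2]])

# Look up the rank of each stored relevance entry, so the mask has
# exactly one element per element of relevance.data.
ror_data = dense_ranks[relevance.row, relevance.col]
k_mask = ror_data < k + 1
numer = 2 ** np.multiply(relevance.data, k_mask) - 1
```

Indexing by the COO row and col arrays only compares the stored entries, which avoids applying < to a whole sparse matrix.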
Hey @gallmerci! Thanks for reporting.
What is the shape of your user_features, item_features, and interactions?
Hey @jfkirk,
I have the following shapes:
Shape of item features data: (120, 33)
Shape of training user data: (3154, 12)
Shape of interaction data: (3154, 120)
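A small check along these lines might help narrow things down. A (3154, 120) matrix has at most 3154 * 120 = 378480 cells, while the first array in the traceback above has 5347038 elements, so comparing each matrix's stored-entry count with its capacity could show where the sizes diverge. This is only a sketch: describe_sparse is a hypothetical helper, and the random matrix stands in for the real interactions, which are assumed here to be a SciPy sparse matrix.

```python
from scipy import sparse


def describe_sparse(matrix):
    """Return the shape, stored-entry count, and cell capacity of a sparse matrix."""
    rows, cols = matrix.shape
    return {"shape": matrix.shape, "nnz": matrix.nnz, "capacity": rows * cols}


# Stand-in for the real interactions: random sparse data with the reported shape.
interactions = sparse.random(3154, 120, density=0.01, format="coo", random_state=0)
info = describe_sparse(interactions)
print(info)
```

If nnz exceeds capacity for a COO matrix, the matrix holds duplicate coordinates. That is one possible explanation, not a confirmed cause.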
Also having this issue, pointing to an issue in the NDCG method:
/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/scipy/sparse/compressed.py:202: RuntimeWarning: invalid value encountered in greater
res = self._with_data(op(self.data, other), copy=True)
Traceback (most recent call last):
File "TF_fit_eval.py", line 111, in <module>
fit_kwargs=fit_kwargs)
File "~/tensorrec/tensorrec/eval.py", line 160, in fit_and_eval
n_at_k = ndcg_at_k(predicted_ranks, test_interactions, k=ndcg_k)
File "~/tensorrec/tensorrec/eval.py", line 108, in ndcg_at_k
dcg = np.asarray(_dcg(relevance, k_mask, ror_at_k, ranks_of_relevant))[0]
File "~/tensorrec/tensorrec/eval.py", line 81, in _dcg
numer = (2**np.multiply(relevance.data, k_mask)) - 1
ValueError: operands could not be broadcast together with shapes (1167,) (95,)
@kevglynn Can you provide an example dataset where this happens? I can't reproduce this error at the moment... I understand under what circumstances this error occurs, but I don't understand how these circumstances are possible at all :)
@gallmerci sorry for the delay. It's only happening for me with a specific dataset, which I can't share (proprietary). I will try to come up with an example that reproduces...
Hey all -- any luck with reproducible examples? I'd love to get to the bottom of this, but I've been poking at it and haven't been able to reproduce.