Surprise
Full-length vector norm option for cosine similarity calculation
Hi,
I am aware that this library focuses on explicit ratings, but I have come across a use case where one option related to implicit ratings would be hugely useful. This paper, for example, gives at the bottom of page 6 the formula for item-item similarities in the baseline neighborhood model. There the similarity is computed with the full-length vector norm rather than only over the intersection of the two vectors, as is the case in this library.
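To make the difference concrete, here are the two variants written out. This is my own reconstruction from the description in this issue, not a copy of the paper's formula. The notation is assumed: $r_{ui}$ is the rating (or implicit feedback value) of user $u$ for item $i$, $U_i$ is the set of users who rated item $i$, and $U_{ij} = U_i \cap U_j$.

```latex
% Intersection norm: every sum runs over the common users U_{ij}
\mathrm{sim}_{\cap}(i, j) =
  \frac{\sum_{u \in U_{ij}} r_{ui}\, r_{uj}}
       {\sqrt{\sum_{u \in U_{ij}} r_{ui}^{2}}\;\sqrt{\sum_{u \in U_{ij}} r_{uj}^{2}}}

% Full-length norm: only the numerator is restricted to U_{ij}
\mathrm{sim}_{\mathrm{full}}(i, j) =
  \frac{\sum_{u \in U_{ij}} r_{ui}\, r_{uj}}
       {\sqrt{\sum_{u \in U_{i}} r_{ui}^{2}}\;\sqrt{\sum_{u \in U_{j}} r_{uj}^{2}}}
```

The numerators are identical, because a user outside $U_{ij}$ contributes zero to the product. Only the denominators differ.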
As your library is optimized in C and performs very well when calculating item similarities on large datasets, could you please add an option to calculate the cosine similarity with full vector norms as well? That is, the sums in the denominator would run over $u \in U_i$ and $u \in U_j$ rather than over $u \in U_{ij}$, and analogously for user-based similarity. This would be a nice extension covering both options: similarity calculated only on the intersection of the vectors, and on the full vectors.
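To illustrate what I mean, here is a minimal pure-Python sketch of both variants for item-item similarity. It does not use the library's API. The function name `item_cosine` and the `full_norm` flag are hypothetical names I made up for this example, and item ids are assumed to be mutually comparable (for example, all strings).

```python
from collections import defaultdict
from math import sqrt


def item_cosine(ratings, full_norm=False):
    """Item-item cosine similarity from (user, item, rating) triples.

    Hypothetical helper for illustration only, not part of any library.
    full_norm=False: norms are summed over common users only (U_ij).
    full_norm=True:  norms are summed over all raters of each item (U_i, U_j).
    Returns {(i, j): similarity} for item pairs i < j with a common user.
    """
    by_user = defaultdict(dict)
    for user, item, rating in ratings:
        by_user[user][item] = rating

    dot = defaultdict(float)      # sum over U_ij of r_ui * r_uj
    sq_i = defaultdict(float)     # sum over U_ij of r_ui^2
    sq_j = defaultdict(float)     # sum over U_ij of r_uj^2
    full_sq = defaultdict(float)  # sum over U_i of r_ui^2

    for items in by_user.values():
        for i, r_i in items.items():
            full_sq[i] += r_i * r_i
            for j, r_j in items.items():
                if i < j:  # visit each unordered pair once
                    dot[i, j] += r_i * r_j
                    sq_i[i, j] += r_i * r_i
                    sq_j[i, j] += r_j * r_j

    sims = {}
    for (i, j), d in dot.items():
        if full_norm:
            denom = sqrt(full_sq[i] * full_sq[j])
        else:
            denom = sqrt(sq_i[i, j] * sq_j[i, j])
        sims[i, j] = d / denom if denom else 0.0
    return sims


# Implicit-style data: item "a" has 2 users, item "b" has 3, and 1 is shared.
ratings = [
    ("u1", "a", 1), ("u1", "b", 1),
    ("u2", "a", 1),
    ("u3", "b", 1), ("u4", "b", 1),
]
inter = item_cosine(ratings)                  # norms over the intersection
full = item_cosine(ratings, full_norm=True)   # norms over the full vectors
print(inter["a", "b"], full["a", "b"])
```

With binary implicit feedback the intersection-only variant gives 1.0 for any pair of items that share at least one user, so it cannot rank neighbors. The full-norm variant gives 1/sqrt(2 * 3), roughly 0.408, for this pair, which is why it matters for the implicit use case.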
Thanks!