
Method 'recommend' is broken

Open IvanPetrovMck opened this issue 3 years ago • 4 comments

Hi,

I see a problem with the current implementation of the "recommend" function.

==== Example of wrong behaviour ====

For Last fm: current behaviour with default parameters: image

most probably expected behaviour (however still not sure if output is correct): image

==== Problem ====

There is a problem in the logic of the recommend function for class MatrixFactorizationBase. I might be wrong, but you can't have the check user_items.shape[0] != user_count and at the same time keep the current version of the call user = self._user_factor(....). I think the correct check should be user_items.shape[0] > user_count, where user_count = max(user_id)?

Consider that by default a user will most probably make one of these two calls:

model.recommend(11, user_plays) - this won't work in the current version, but worked in version 0.4.0

model.recommend(11, user_plays[11:12]) - this will work in 0.6.1, but then it behaves strangely after calling self._user_factor(userid, user_items, recalculate_user).

==== Code snippet with current implementation: ====

def recommend(
    self,
    userid,
    user_items,
    N=10,
    filter_already_liked_items=True,
    filter_items=None,
    recalculate_user=False,
    items=None,
):
    if filter_already_liked_items or recalculate_user:
        if not isinstance(user_items, csr_matrix):
            raise ValueError("user_items needs to be a CSR sparse matrix")
        user_count = 1 if np.isscalar(userid) else len(userid)
        if user_items.shape[0] != user_count:
            raise ValueError("user_items must contain 1 row for every user in userids")

    user = self._user_factor(userid, user_items, recalculate_user)

IvanPetrovMck avatar Nov 14 '22 12:11 IvanPetrovMck

I don't think the recommend call is broken for MF models. We've changed the API in https://github.com/benfred/implicit/issues/481 - and it seems like the change that is tripping you up is https://github.com/benfred/implicit/pull/526.

The right method of calling the recommend functions is model.recommend(11, user_plays[11]) now. The _user_factor function isn't part of the public API, but should also be called in a similar way model._user_factor(11, user_plays[11], recalculate_user).
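
To make the calling convention concrete, here is a minimal sketch using only NumPy and SciPy. The toy matrix and the helper `check_user_items` are hypothetical: the helper just mirrors the shape check quoted in the snippet above and is not the library code itself. It shows that slicing out one row per user passes the check, while passing the full matrix for a single user does not.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical toy data: 20 users x 5 items, standing in for user_plays.
rng = np.random.default_rng(0)
user_plays = csr_matrix(rng.integers(0, 2, size=(20, 5)).astype(np.float32))


def check_user_items(userid, user_items):
    """Mirror of the shape check quoted in the issue (illustration only)."""
    if not isinstance(user_items, csr_matrix):
        raise ValueError("user_items needs to be a CSR sparse matrix")
    user_count = 1 if np.isscalar(userid) else len(userid)
    if user_items.shape[0] != user_count:
        raise ValueError("user_items must contain 1 row for every user in userids")
    return user_count


# Single user: slice out exactly one row, giving a (1, n_items) CSR matrix.
single_row = user_plays[11]
check_user_items(11, single_row)

# Batch of users: one row per user id, in the same order as the ids.
batch_ids = np.array([3, 7, 11])
batch_rows = user_plays[batch_ids]
check_user_items(batch_ids, batch_rows)

# Passing the full matrix for a single user id fails the check.
try:
    check_user_items(11, user_plays)
    full_matrix_ok = True
except ValueError:
    full_matrix_ok = False
```

With a fitted model, the equivalent call would be model.recommend(11, user_plays[11]), and for a batch the rows passed in would be user_plays[batch_ids].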

The first example you posted with the filter_already_liked_items is working as expected - since we support batches of users, the number of items returned for each user must match the 'N' parameter. If you're asking for the entire dataset to be returned, and also asking to filter out items that the user has liked - we will just set the score for the filtered items to be -inf to indicate that these items would have been filtered out if possible.
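
The -inf behaviour can be illustrated with a small NumPy sketch. This is not the library's implementation: `top_n_with_filter` is a hypothetical helper and the scores are made up. It only shows why asking for every item together with filtering still yields N results, and how a caller could drop the filtered entries afterwards.

```python
import numpy as np


def top_n_with_filter(scores, liked, N):
    """Return exactly N (ids, scores); liked items are given a score of -inf."""
    masked = scores.astype(np.float32).copy()
    masked[liked] = -np.inf
    N = min(N, masked.shape[0])
    # Sort by descending score; the -inf entries end up at the tail.
    ids = np.argsort(-masked, kind="stable")[:N]
    return ids, masked[ids]


# Made-up scores for 4 items; the user already liked items 0 and 3.
scores = np.array([0.9, 0.1, 0.5, 0.7])
liked = np.array([0, 3])

# Asking for all 4 items still returns 4 results, two of them scored -inf.
ids, top_scores = top_n_with_filter(scores, liked, N=4)

# A caller who wants liked items removed entirely can drop the -inf entries.
keep = np.isfinite(top_scores)
ids_clean, scores_clean = ids[keep], top_scores[keep]
```

Keeping the output length fixed at N is what allows results for a batch of users to be stacked into one rectangular array, at the cost of the caller having to check for -inf scores.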

benfred avatar Nov 21 '22 19:11 benfred

Hmm... thanks for explaining. The result makes more sense now. For me it was unexpected, and such behaviour should definitely be documented in the docstring. I thought that items the person liked wouldn't be returned at all.

IvanPetrovMck avatar Nov 24 '22 15:11 IvanPetrovMck

Btw, I wanted to say that I very much appreciate your work, especially the math documentation/articles.

IvanPetrovMck avatar Nov 25 '22 13:11 IvanPetrovMck

Hi, I tried to use the newest version, but it seems it is not working on my side as usual. It told me index 945 is out of bounds for axis 1 with size 240.

tangbufanwei avatar Jan 18 '24 05:01 tangbufanwei