
Predict recommendations without retraining model for new users

Open Angriff opened this issue 5 years ago • 7 comments

Hello!

I'm looking for a way to make effective item recommendations for new users. I've read all the existing issues about LightFM (the most relevant is https://github.com/lyst/lightfm/issues/347), but have not found a full answer to my question.

My situation is: I have historical data where users interact with a subset of items from a finite set. Users have feature representations (gender, age, etc.) and items have features (tags, types, etc.), so in my opinion a LightFM hybrid model is very suitable for recommending items to users. In production the item set will be the same, BUT all users will be new. I will know the user features, and I need to take new user-item interactions in online mode and produce new recommendations. Is it possible to take new interactions into account in real time? How? Is it possible to get these recommendations without changing the model (I want to "freeze" the model)?

Thank you for any help!

Angriff avatar May 14 '19 16:05 Angriff

I think this was already answered in #210.

Fleetrun10 avatar May 22 '19 21:05 Fleetrun10

@raphgt2, thank you for your help. The issue you mention discusses cold-start predictions. In my case I have new users (which is cold start), but I also have known interactions for these users. So how can I use the information about new users' interactions (while avoiding changes to the fitted model parameters)?

Angriff avatar May 23 '19 14:05 Angriff

I think you'd have to encode the new user's known interactions as a feature somehow, and then do the original training on the known user-item interactions with the new feature(s) in addition to the gender/age/etc. features already on the user.

Fleetrun10 avatar May 23 '19 18:05 Fleetrun10
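A minimal sketch of that idea with made-up data (illustrative only, not LightFM's actual API): demographic features and the binary interaction row can be stacked into a single user-feature matrix, and the same layout can then be assembled for an unseen user at prediction time.

```python
import numpy as np
from scipy import sparse

num_users, num_items = 3, 5

# demographic features (e.g. one-hot gender/age buckets), shape (users, 2)
demographics = sparse.csr_matrix(np.array([[1, 0],
                                           [0, 1],
                                           [1, 0]]))

# binary user-item interaction matrix, shape (users, items)
interactions = sparse.csr_matrix(np.array([[1, 0, 1, 0, 0],
                                           [0, 1, 0, 0, 1],
                                           [0, 0, 0, 1, 0]]))

# user features = demographics plus one indicator feature per interacted item
user_features = sparse.hstack([demographics, interactions]).tocsr()
print(user_features.shape)  # (3, 7)
```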

Hi,

I also have the same query. Having trained the model on a set of users, is it possible to use the existing model to recommend for new users with interactions (and the same features)? I checked #210, but this goes beyond a cold-start problem. Is it possible to manually calculate scores using the item and user embeddings plus the bias values? Or should the model be retrained with those new users, as mentioned in this comment?

Thank you

1adarshg avatar Jul 12 '19 19:07 1adarshg
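For reference, scoring manually from the representations is straightforward. The sketch below uses random stand-ins for the arrays returned by `get_user_representations()` / `get_item_representations()` (all names and numbers are illustrative); the score is a dot product of the factors plus both biases.

```python
import numpy as np

rng = np.random.default_rng(0)
no_components, num_items = 4, 6

# stand-ins for one user's representation and all item representations
user_bias = 0.1
user_factors = rng.normal(size=no_components)
item_bias = rng.normal(size=num_items)
item_factors = rng.normal(size=(num_items, no_components))

# score(u, i) = <user_factors, item_factors[i]> + user_bias + item_bias[i]
scores = item_factors @ user_factors + item_bias + user_bias
top = np.argsort(-scores)[:3]  # indices of the 3 highest-scoring items
print(top)
```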

@Angriff, @1adarshg, did you get any ideas for solving this problem? I'm in the same situation.

alexmilano avatar Sep 24 '20 16:09 alexmilano

Did anyone find a solution to this issue yet? I'm facing the same problem: I used historical interactions + user features + item features to train a hybrid model, but when I try to deploy it in production, a common use case is new users with user features + interactions. This seems to go beyond cold start, where you would only need to predict from user features, because here there are interactions too.

zhenliu2012 avatar Sep 23 '21 15:09 zhenliu2012

One possible approach is to generate a new user factor by solving the linear least squares problem given the reconstructed item factors.

Given some initial set-up along these lines:

user_to_item = sparse row binary matrix (all ratings of 1)
item_to_features = sparse row binary matrix (one-of-n encoded)
lfm = LightFM(no_components=k)  # with k << num_items
lfm.fit(user_to_item, item_features=item_to_features)

(NB: no user features are present here, and we're using LightFM's defaults.)

You could do something along the lines of precomputing:

import numpy as np

# if there are item features, the item representations are mapped to these, so we
# just have to multiply with our original item-feature sparse matrix to obtain
# item factors (which are composed as sums over their features)
item_feature_bias, item_feature_factors = lfm.get_item_representations()
item_factors = item_to_features @ item_feature_factors
item_bias = item_to_features @ item_feature_bias
item_factors_inv = np.linalg.pinv(item_factors)

Then, to take the top 20 predictions for a new user, you could do something like:

# executed at prediction time; this should closely approximate the ranking
# produced internally by LightFM
new_user_to_item = sparse row binary matrix (all ratings of 1)
user_factors = (item_factors_inv @ new_user_to_item.T).T
predictions = user_factors @ item_factors.T + item_bias
(-predictions).argsort()[0, :20]  # top-20 item indices for the first new user
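As a sanity check on the least-squares framing (a toy numpy example, not taken from LightFM): the pseudo-inverse applied to a user's rating vector is exactly the minimiser of ||item_factors @ u - ratings||, so it agrees with `np.linalg.lstsq`.

```python
import numpy as np

rng = np.random.default_rng(1)
num_items, no_components = 8, 3

item_factors = rng.normal(size=(num_items, no_components))
ratings = rng.integers(0, 2, size=num_items).astype(float)  # binary ratings

# solve the same least-squares problem two ways
u_pinv = np.linalg.pinv(item_factors) @ ratings
u_lstsq, *_ = np.linalg.lstsq(item_factors, ratings, rcond=None)
print(np.allclose(u_pinv, u_lstsq))  # True: both are the LS minimiser
```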

If there are no item features, you can just compute the pseudo-inverse of the second output of get_item_representations() directly, and it should approximate the ranking of the built-in predict function.
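To illustrate why (a toy sketch, not LightFM internals verbatim): with no item features, LightFM effectively falls back to an identity feature matrix, so the feature-to-factor mapping is a no-op and the embeddings themselves are the item factors.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(2)
num_items, no_components = 5, 2

# stand-in for the second output of get_item_representations()
item_embeddings = rng.normal(size=(num_items, no_components))

# with no item features, each item has exactly one identity feature, so the
# item-feature matrix is the identity and the mapping changes nothing
identity_features = sparse.identity(num_items, format="csr")
item_factors = identity_features @ item_embeddings
print(np.allclose(item_factors, item_embeddings))  # True
```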

With MovieLens genres as item features, the predictions end up with a fairly low MSE compared to the built-in ones; that won't always be true, but the rank orders tend to line up.

[image: comparison of the approximated predictions with LightFM's built-in predictions]

Ganners avatar May 13 '22 11:05 Ganners