
How to train LightFM in production?

Open hammadkhann opened this issue 5 years ago • 48 comments

Hello, I want to use my recommendation model in production, but I don't know how to train it in a production scenario (online/offline). Can you guide me on how to use it efficiently? I have a lot of data to cater for.

hammadkhann avatar Jul 24 '18 06:07 hammadkhann

Do I have to retrain my LightFM model on the full feed data every time? I have around 70 GB of data right now, which I gather from Elasticsearch and then transform for my model, and that takes a long time. Is there any solution for this problem? I don't know how to use it in production.

hammadkhann avatar Jul 24 '18 07:07 hammadkhann

From the docs it looks like you want LightFM.fit_partial. http://lyst.github.io/lightfm/docs/lightfm.html#lightfm.LightFM.fit_partial

RothNRK avatar Jul 24 '18 08:07 RothNRK

@RothNRK No, I know about that. I was asking how to reduce training time, or whether I can train my model only on the new data rather than the full feed. Does that work?

hammadkhann avatar Jul 24 '18 09:07 hammadkhann

You could train your model offline using your full set of historical data, and afterwards update it in production with LightFM.fit_partial on the new data.

In this case your model will continue to grow if you add new users/items that were not included in the initial LightFM.fit.

Does this example help?

import numpy as np

from lightfm import LightFM
from lightfm.data import Dataset

# Make some fake data.
def fake_data(n):
    users = np.random.choice([0., 1., 2.], (n, 1))
    items = np.random.choice([0., 1., 2.], (n, 1))
    weight = np.random.rand(n,1)
    return np.concatenate((users, items, weight), axis=1)

train_data = fake_data(10)
new_data = fake_data(3)

# Use Dataset to prep your interactions and weights.
dataset = Dataset()

dataset.fit(users=np.unique(train_data[:, 0]), items=np.unique(train_data[:, 1]))
train_interactions, train_weights = dataset.build_interactions((i[0], i[1], i[2]) for i in train_data)

dataset.fit_partial(users=np.unique(new_data[:, 0]), items=np.unique(new_data[:, 1]))
new_interactions, new_weights = dataset.build_interactions((i[0], i[1], i[2]) for i in new_data)

# Fit the model using your full set of historic data.
solver = LightFM()
solver.fit(interactions=train_interactions, sample_weight=train_weights)

# In production update your old model with new data.
solver.fit_partial(interactions=new_interactions, sample_weight=new_weights)

RothNRK avatar Jul 24 '18 12:07 RothNRK

Great, so it means I have to train my model once on the full data offline, and then use fit_partial on the new production data?

hammadkhann avatar Jul 24 '18 12:07 hammadkhann

Do I have to load user_features and item_features to pass them to the predict method when getting predictions on the production server? I think my server will run out of memory because of this. How do you manage this problem? @maciejkula @RothNRK

hammadkhann avatar Aug 01 '18 10:08 hammadkhann

It's more common to have two machines: one worker suited for training, which has more memory, and another one just to serve the API. They can communicate through a database.

nocedan avatar Aug 01 '18 11:08 nocedan

@nocedan Right now I filter my item_ids before prediction based on the user's country, city, and the category they are currently in, and then pass those item ids to the predict function. This filtering takes time when done at runtime, and I also have to load the items data into memory before applying the filters. Do you have any solution for that?

hammadkhann avatar Aug 01 '18 12:08 hammadkhann

I think it depends on your API architecture. I have implemented this kind of filtering by keeping the filter parameters in memory, keyed by user. The latency was not a problem. If it is in your case, I think you should add some kind of cache for recommendations. Have you measured the filtering time?

nocedan avatar Aug 01 '18 13:08 nocedan

@nocedan I have not deployed it yet, but on my local machine the call takes around 580 ms.

hammadkhann avatar Aug 01 '18 13:08 hammadkhann

That seems fine in my opinion. What about the filtering time alone?

nocedan avatar Aug 01 '18 13:08 nocedan

@nocedan Another filtering step I do, which I missed before, is based on item tag similarity: I compare the list of tags of each item_id against the tags of the item_id passed in a POST call (meaning the user is on that item right now), compute their intersection, and if they are 80% similar I pass those item ids to the predict function. Can I skip that and just pass the item id to predict? Say my user is viewing the details of McDonald's; will my model recommend similar items like Burger King or KFC?

hammadkhann avatar Aug 01 '18 13:08 hammadkhann

@nocedan I have optimised it, and right now it works fast. I will tell you the exact time after measuring, but as the data grows it will increase, so I am worried about scalability issues in the future.

hammadkhann avatar Aug 01 '18 13:08 hammadkhann

Well, if you're worried you can run tests with a simulated scenario. About the predictions, I don't know; have you tried the documentation?

nocedan avatar Aug 01 '18 13:08 nocedan

@nocedan Yes, I have seen the paper and documentation. Anyway, can you guide me on online training of my LightFM model in production? How can I do it? Do I have to use fit_partial, or what?

hammadkhann avatar Aug 01 '18 13:08 hammadkhann

@RothNRK In the example you provided above, an error is thrown: ValueError: The item feature matrix specifies more features than there are estimated feature embeddings: 2 vs 3. I stumbled upon this thread after getting the same error in my own project. Any idea how to solve this? The error is raised in the fit_partial call.

fischjer4 avatar Aug 01 '18 23:08 fischjer4

@fischjer4 I'm sorry, but I just ran my example without any error (lightfm==1.15). If you run my example and still get an error, can you try increasing the amount of training data to 100, i.e. train_data = fake_data(100)? Let me know how that works for you.

RothNRK avatar Aug 06 '18 13:08 RothNRK

@RothNRK Whoops, a mistake on my end. Sorry for the inconvenience.

As a cautionary note to new users: when using model.fit_partial, the users/items/features in the supplied matrices must have been present during the initial training. Meaning, if you want to add new users/items/features, you must retrain the model. Check here

fischjer4 avatar Aug 06 '18 16:08 fischjer4

I'm happy you sorted out the issue.

RothNRK avatar Aug 07 '18 07:08 RothNRK

@fischjer4, I'm glad as well that you brought this to light. In such cases it's better to treat new users as a cold-start problem, as in the examples in the documentation. But unfortunately, after a while, when the new user has recorded behavior, it seems that the SGD work you have already done is "lost".

nocedan avatar Aug 07 '18 17:08 nocedan

What is the difference between fit and fit_partial if the dimensions of the interaction matrix need to be the same? Using it for parameter tuning?

dwy904 avatar Oct 19 '18 20:10 dwy904

The only difference between fit and fit_partial is that fit calls _reset_state before training. fit_partial is useful to collect model quality metrics during training.

dbalabka avatar Feb 20 '20 14:02 dbalabka

Do I have to retrain my LightFM model on the full feed data every time? I have around 70 GB of data right now, which I gather from Elasticsearch and then transform for my model, and that takes a long time. Is there any solution for this problem? I don't know how to use it in production.

Also, how did you predict for 70 GB of data? I am trying to do something similar, and the predict function fails after fitting the model. I also tried using get_user_representations and get_item_representations, but the dot product is slow as well. Can you suggest a solution?

item_biases, item_embeddings = model.get_item_representations(item_feature)
user_biases, user_embeddings = model.get_user_representations(user_feature)
other_pred = (
    user_embeddings.dot(item_embeddings.T)
    + item_biases.reshape(1, -1)
    + user_biases.reshape(-1, 1)
)
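One way to keep that dot product from exhausting memory is to score users in batches and retain only the top-k items per user, so the full users x items score matrix is never materialised at once. A sketch (top_k_batched is a hypothetical helper, and the inputs are assumed to come from get_user_representations / get_item_representations):

```python
import numpy as np

def top_k_batched(user_embeddings, user_biases,
                  item_embeddings, item_biases,
                  k=10, batch_size=1024):
    """Return the top-k item indices per user, scoring one batch of
    users at a time so only a batch-sized score matrix exists."""
    results = []
    for start in range(0, user_embeddings.shape[0], batch_size):
        u_emb = user_embeddings[start:start + batch_size]
        u_bias = user_biases[start:start + batch_size]
        # Same score formula as above, restricted to this user batch.
        scores = (u_emb @ item_embeddings.T
                  + item_biases.reshape(1, -1)
                  + u_bias.reshape(-1, 1))
        # argpartition finds the k best items per user without a full sort...
        top = np.argpartition(-scores, k - 1, axis=1)[:, :k]
        # ...then only those k candidates are sorted by score, descending.
        order = np.argsort(-np.take_along_axis(scores, top, axis=1), axis=1)
        results.append(np.take_along_axis(top, order, axis=1))
    return np.vstack(results)
```

Tuning batch_size trades peak memory against throughput; the per-user results can then be cached rather than recomputed at request time.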

meghanad1 avatar Feb 28 '20 07:02 meghanad1

Hi everybody, I need help regarding the partial-fit model update period and how to refresh recommendations, as we use a different URL endpoint for the two processes. How do I manage other events, like add-to-cart and customer view behavior?

surajPowerWeave avatar Mar 04 '20 09:03 surajPowerWeave

@RothNRK What about if you have a new user with new interactions? Your code works because you're adding new interactions for existing users ([0, 1, 2]), but what happens if, for example, you do something like this:

def fake_data(n, possible_users):
    users = np.random.choice(possible_users, (n, 1))
    items = np.random.choice(possible_users, (n, 1))
    weight = np.random.rand(n,1)
    return np.concatenate((users, items, weight), axis=1)

train_data = fake_data(10, [0., 1., 2.])
new_data = fake_data(3, [3., 4., 5.])

Now you will get the error ValueError: The item feature matrix specifies more features than there are estimated feature embeddings: x vs y.

So the question is: does anyone know how to partially train, or simply get recommendations for a new user, based not on the user's properties but on their existing interactions?

An example: you train on all your data, and then you run a survey or something for a user x who likes some items. How can I recommend items to this user without retraining the model? Is that impossible with LightFM?

alexmilano avatar Sep 24 '20 15:09 alexmilano

It's not impossible, but it's not supported officially. I have solved this by initialising a new LightFM instance with the new number of users/items/features, and then overwriting the top left block of all the internal matrices with the previously trained values. That way you maintain your weights, gradients, and biases for the existing users/items, and any new ones have the values from initialisation. Then you can continue with fit_partial to update all of them together.

Edit: Example of how to do this here

JohnPaton avatar Sep 24 '20 16:09 JohnPaton

I followed the example given by @RothNRK, and I got this error.

ValueError: The user feature matrix specifies more features than there are estimated feature embeddings: 14180912 vs 14207474.

I checked the data shapes:

len(model.user_embeddings) = 14180912
new_lightfm_guest_features.shape =  (7686831, 14207474)
lightfm_guest_features.shape = (7660276, 14180912)
len(dataset._user_feature_mapping) = 14207474

I load the dataset and then, use

dataset.fit_partial(
        users=new_booking_features['booking_id'].unique(),
        items=new_deal_features['hotel_deal_id'].unique(),
        item_features=generate_feature_list(new_deal_features, new_deal_feature_list),
        user_features=generate_feature_list(new_booking_features, new_booking_features_list))

to add new item/users and I also load the model and use

model.fit_partial(new_interactions, item_features=new_lightfm_deal_features,
                      user_features=new_lightfm_guest_features, verbose=True,
                      num_threads=4, epochs=100)

to re-train the model.

ynorouzz avatar Oct 01 '20 12:10 ynorouzz

@ynorouzz Please check out the exchange between @alexmilano and @JohnPaton. You're getting this error because you've added new feature columns. fit_partial will update existing columns but not add new ones.

RothNRK avatar Oct 12 '20 07:10 RothNRK

@RothNRK Thank you for your response. Do you mean that in model.fit_partial() I should not use the item_features and user_features parameters? But then I get this error: ValueError: Incorrect number of features in item_features

ynorouzz avatar Oct 13 '20 12:10 ynorouzz

@RothNRK Thank you for your response. Do you mean that in model.fit_partial() I should not use the item_features and user_features parameters? But then I get this error: ValueError: Incorrect number of features in item_features

No @ynorouzz, you can use item_features or user_features with fit_partial. The thing is that fit_partial only works for existing users: you can't add interactions for new users, only new interactions for existing users.

alexmilano avatar Oct 13 '20 13:10 alexmilano