Plans for running in production / near real-time

Open robertddewey opened this issue 5 years ago • 9 comments

So I've read through the many great issues here, and the comments have been very helpful.

I'm interested in running LightFM in production / near real-time. From what I've gathered, the best way to do this is:

  • Prep/fit my initial data using Dataset()
  • Add subsequent data with Dataset.fit_partial()
  • Fit the initial model using LightFM.fit()
  • Fit subsequent data with LightFM.fit_partial()
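
In code, I'm imagining that flow roughly like this (a sketch; the id and event variables are placeholders for my own data):

```python
from lightfm import LightFM
from lightfm.data import Dataset

# initial_user_ids / initial_item_ids: iterables of ids;
# initial_events / new_events: iterables of (user_id, item_id, weight) tuples.
dataset = Dataset()
dataset.fit(users=initial_user_ids, items=initial_item_ids)
interactions, weights = dataset.build_interactions(initial_events)

model = LightFM(loss="warp")
model.fit(interactions, sample_weight=weights, epochs=10)

# Later, when new data arrives:
dataset.fit_partial(users=new_user_ids, items=new_item_ids)
new_interactions, new_weights = dataset.build_interactions(new_events)
# Note: LightFM.fit_partial expects the same matrix dimensions as the
# original fit, so genuinely new users/items are a complication (discussed
# further down this thread).
model.fit_partial(new_interactions, sample_weight=new_weights, epochs=10)
```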

What I'm planning is a Python script that runs continuously, watches a queue (probably Redis) for new data, fits that data, and produces/caches recommendations. From what I gather from the comments here, I can save the model with pickle so that I don't have to retrain from scratch if my script fails; I could just load the pickle file and resume.
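
For the save/resume part, I'm thinking of something like this (the filename is arbitrary; checkpointing the Dataset too keeps the id mappings consistent):

```python
import pickle

# Periodically checkpoint both the model and the Dataset (the Dataset
# holds the user/item id mappings needed to build future matrices).
with open("lightfm_checkpoint.pkl", "wb") as f:
    pickle.dump({"model": model, "dataset": dataset}, f)

# On restart:
with open("lightfm_checkpoint.pkl", "rb") as f:
    state = pickle.load(f)
model, dataset = state["model"], state["dataset"]
```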

My questions:

  1. When I resume training using the pickled LightFM model: should I recreate the entire Dataset(), or just pick up where I left off with the data that wasn't yet processed and call fit_partial?

  2. Should I pickle the LightFM model every so often as a "save state"? For example, every hour (an arbitrary interval).

  3. I'm using user/item features. Should I call build_user_features()/build_item_features() after dataset.fit_partial()? If yes, should they contain all user/item features, or only those for the new user(s)/item(s)?

  4. When using user/item features for prediction, do I pass ALL user/item features (entire dataset) or only the features for the user(s)/item(s) I am predicting for?

  5. Are user/item interactions cumulative, or do they overwrite each other? For example, if a user interacts with an item multiple times with differing weights, is this handled in the model?

Thanks so much, everyone!

robertddewey avatar Feb 28 '19 05:02 robertddewey

  1. That depends. Training on old data as well as new data will give you a model that is relatively backward-looking, as it pays more attention to old data. Training only on new data will make the model adapt faster, but potentially at some cost to stability. Once you have your system up and running, I would recommend A/B testing this.
  2. When training? Sure, that sounds very sensible. That way you lose less progress if your machine goes down etc.
  3. Yes! One thing that LightFM does not support, though, is adding new features. There are ways around this, but there is nothing out-of-the box.
  4. Up to you; it may be helpful to understand how things work under the hood before deciding.
  5. The general assumption is that there is only one interaction per user/item pair. You can customize this to your liking by manually constructing your interactions matrix.
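
On point 5, a rough sketch of what manual construction could look like, with repeated events accumulating (assuming `num_users` and `num_items` come from `dataset.interactions_shape()`):

```python
import numpy as np
from scipy.sparse import coo_matrix

# (user_index, item_index, weight) events, using the Dataset's id mappings.
events = [(0, 3, 1.0), (0, 3, 2.0), (1, 5, 1.0)]
rows, cols, vals = zip(*events)

weights = coo_matrix(
    (np.array(vals, dtype=np.float32), (rows, cols)),
    shape=(num_users, num_items),
)
weights.sum_duplicates()  # duplicate (user, item) entries add up: (0, 3) -> 3.0

interactions = weights.copy()
interactions.data = np.ones_like(interactions.data)  # binary indicators

# Then model.fit(interactions, sample_weight=weights, ...) as usual.
```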

maciejkula avatar Mar 03 '19 22:03 maciejkula

One sketch of what you might want to do is this:

  1. Consume new interactions from your queue.
  2. Once you have a reasonably chunky batch of new data, construct a dataset containing only the new data and call fit_partial on the model.
  3. Re-generate your recommendations.
  4. Run a separate job that trains a new model from scratch in a batch fashion. You can use this model to add new features/users/items.
  5. Periodically swap in the new batch model to be updated from your queue.
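
In code, steps 1-3 of that loop might look roughly like this (`parse_event` and `regenerate_recommendations` are hypothetical helpers, and `dataset`/`model` come from the initial fit):

```python
import redis

r = redis.Redis()
BATCH_SIZE = 10_000  # arbitrary; tune to your traffic

buffer = []
while True:
    _, payload = r.blpop("interaction_queue")  # 1. consume a new interaction
    buffer.append(parse_event(payload))        # your own deserializer

    if len(buffer) >= BATCH_SIZE:
        # 2. build matrices from the new data only and update the model
        interactions, weights = dataset.build_interactions(buffer)
        model.fit_partial(interactions, sample_weight=weights, epochs=5)
        buffer.clear()
        regenerate_recommendations(model)      # 3. refresh cached recommendations

# Steps 4/5 run in a separate batch job that retrains from scratch and
# periodically swaps its model into this loop (e.g. via a fresh pickle).
```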

maciejkula avatar Mar 03 '19 22:03 maciejkula

Thank you so much for the feedback! Very helpful!

robertddewey avatar Mar 03 '19 23:03 robertddewey

One sketch of what you might want to do is this:

  1. Consume new interactions from your queue.
  2. Once you have a reasonably chunky batch of new data, construct a dataset containing only the new data and call fit_partial on the model.
  3. Re-generate your recommendations.
  4. Run a separate job that trains a new model from scratch in a batch fashion. You can use this model to add new features/users/items.
  5. Periodically swap in the new batch model to be updated from your queue.

@maciejkula Very helpful feedback. May I ask you to confirm that there is no way to add a new user/item to an already-trained model? Do I have to create a new one from scratch?

Unimax avatar Jul 01 '19 09:07 Unimax

@Unimax It's been a few years, so you've probably found a workaround, but for anyone reading this now, check out this post for adding new users/items/features: https://github.com/lyst/lightfm/issues/347#issuecomment-716036635

lkurlandski avatar Oct 25 '20 13:10 lkurlandski

One sketch of what you might want to do is this:

  1. Consume new interactions from your queue.
  2. Once you have a reasonably chunky batch of new data, construct a dataset containing only the new data and call fit_partial on the model.

When calling fit_partial with new data, how many epochs should be used? Should this also be decided using cross-validation? Are there other methods for choosing the number of epochs?
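
One option I'm considering is monitoring a held-out validation metric after each epoch and stopping once it plateaus, roughly:

```python
from lightfm.evaluation import precision_at_k

# val_interactions: a held-out interactions matrix; 20 is an arbitrary cap.
best = 0.0
for epoch in range(20):
    model.fit_partial(new_interactions, epochs=1)
    score = precision_at_k(model, val_interactions, k=10).mean()
    if score <= best:
        break  # validation metric stopped improving
    best = score
```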

gergelyBognar avatar Feb 11 '21 18:02 gergelyBognar

I am also working on productionizing FM models and look forward to hearing updates in this thread. I wonder whether LightFM has a built-in approach to handling new users that weren't seen at training time, for example via the hashing trick.

azarezade avatar Nov 19 '23 13:11 azarezade

Hello, did you succeed in pushing LightFM to production? What service did you use? I've been trying AWS Lambda but haven't gotten far because of its layer (package) size limits, so I'm wondering what approach you chose. I assume the use case is as follows: passing user ratings/user features through an API and getting predictions back. @azarezade @robertddewey

johnnypetr93 avatar Apr 08 '24 20:04 johnnypetr93

Hello, did you succeed in pushing LightFM to production? What service did you use? I've been trying AWS Lambda but haven't gotten far because of its layer (package) size limits, so I'm wondering what approach you chose. I assume the use case is as follows: passing user ratings/user features through an API and getting predictions back. @azarezade @robertddewey

We ended up using the Spark FM Regressor/Classifier. To productionize it, we saved the embedding matrices and compute the matrix product with torch; we also use some layers for id mapping and implemented the hashing trick for new ids.
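
Roughly, the serving side looks like this (a simplified sketch; the bucket count and file names are made up):

```python
import zlib
import torch

NUM_BUCKETS = 1_000_000  # hashing-trick vocabulary size

def hash_id(raw_id: str) -> int:
    # Stable hash, so unseen ids always land in the same bucket.
    return zlib.crc32(raw_id.encode()) % NUM_BUCKETS

# Embedding matrices exported from the trained model.
user_emb = torch.load("user_embeddings.pt")  # (NUM_BUCKETS, dim)
item_emb = torch.load("item_embeddings.pt")  # (num_items, dim)

def score_all_items(raw_user_id: str) -> torch.Tensor:
    u = user_emb[hash_id(raw_user_id)]  # (dim,)
    return item_emb @ u                 # (num_items,) dot-product scores
```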

azarezade avatar Apr 09 '24 08:04 azarezade