
Continuous learning from a stream of data

Open dminovski0 opened this issue 4 years ago • 2 comments

I use the model to generate recommendations, but whenever new data arrives the entire model has to be retrained from scratch. Is there an option for continuous learning, so that only the new data is used to update the model?

dminovski0 avatar May 31 '21 18:05 dminovski0

You can't do that with implicit out of the box: you need a way to construct a user_items matrix for the new data, and a way to feed the previously trained factors back into training. You also might not want to build the user_items matrix from the new streaming data alone, since the model could then overfit to that new data.

There are ways to train ALS on streaming data:

  • make and update counters like in the paper
  • build a neural net to mimic MF and update its weights on new data with SGD
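
To make the second option concrete, here is a minimal sketch of the simplest model of that kind: plain matrix factorization trained with SGD, where each new batch of events updates only the factors it touches. This is not implicit's ALS and not a full neural net. The class name `StreamingMF`, its parameters, and the `(user, item, target)` event format are all made up for illustration, and for implicit feedback you would also need to sample negatives (target 0.0) yourself.

```python
import random


class StreamingMF:
    """Plain SGD matrix factorization (hypothetical helper, not part of implicit)."""

    def __init__(self, n_factors=8, lr=0.05, reg=0.01, seed=0):
        self.n_factors = n_factors
        self.lr = lr
        self.reg = reg
        self.rng = random.Random(seed)
        self.user_factors = {}  # user id -> list of floats
        self.item_factors = {}  # item id -> list of floats

    def _vec(self, table, key):
        # Unseen users/items get a small random vector the first time they appear.
        if key not in table:
            table[key] = [self.rng.uniform(-0.1, 0.1) for _ in range(self.n_factors)]
        return table[key]

    def predict(self, user, item):
        p = self._vec(self.user_factors, user)
        q = self._vec(self.item_factors, item)
        return sum(a * b for a, b in zip(p, q))

    def partial_fit(self, events, epochs=1):
        """Run SGD over `events` only; factors not mentioned in them stay untouched."""
        for _ in range(epochs):
            for user, item, target in events:
                p = self._vec(self.user_factors, user)
                q = self._vec(self.item_factors, item)
                err = target - sum(a * b for a, b in zip(p, q))
                for f in range(self.n_factors):
                    pf, qf = p[f], q[f]
                    # Gradient step on squared error with L2 regularization.
                    p[f] += self.lr * (err * qf - self.reg * pf)
                    q[f] += self.lr * (err * pf - self.reg * qf)


model = StreamingMF()
# Initial batch.
model.partial_fit([("u1", "a", 1.0), ("u1", "b", 0.0), ("u2", "b", 1.0)], epochs=50)
# Later: only the newly arrived events are passed in.
model.partial_fit([("u3", "a", 1.0)], epochs=50)
```

Because only the new events are replayed, the overfitting concern above still applies; mixing in a sample of older events with each batch is one common way to soften it.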

gazon1 avatar Jun 09 '21 11:06 gazon1

I've added support for incremental retraining in this PR https://github.com/benfred/implicit/pull/527 .

Having said that, you'll still need to maintain the list of items for each user you want to retrain, as @gazon1 mentioned, so there are some complexities here that you'd have to implement yourself (for instance, only retraining users/items after the number of liked items changes by some percentage, and then having some code to construct the user_items matrix for those users).
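
A rough sketch of that bookkeeping, using only the standard library. The `RetrainTracker` class, its method names, and the 20% default threshold are assumptions for illustration, not anything implicit provides: it keeps the liked items per user, reports users whose item count grew by at least the threshold since they were last trained, and emits CSR-style arrays for just those users.

```python
from collections import defaultdict


class RetrainTracker:
    """Hypothetical helper: tracks liked items and decides who is due for retraining."""

    def __init__(self, threshold=0.2):
        self.threshold = threshold        # relative growth that triggers a retrain
        self.liked = defaultdict(set)     # user id -> set of liked item ids
        self.count_at_last_fit = {}       # user id -> item count when last trained

    def add(self, user, item):
        self.liked[user].add(item)

    def users_to_retrain(self):
        due = []
        for user, items in self.liked.items():
            last = self.count_at_last_fit.get(user, 0)
            # Never-trained users are always due; others once growth passes the threshold.
            if last == 0 or (len(items) - last) / last >= self.threshold:
                due.append(user)
        return sorted(due)

    def rows_for(self, users):
        """Return (data, indices, indptr) with one CSR row per user, in the given order."""
        data, indices, indptr = [], [], [0]
        for user in users:
            items = sorted(self.liked[user])
            indices.extend(items)
            data.extend([1.0] * len(items))
            indptr.append(len(indices))
        return data, indices, indptr

    def mark_retrained(self, users):
        for user in users:
            self.count_at_last_fit[user] = len(self.liked[user])


tracker = RetrainTracker(threshold=0.5)
for user, item in [(1, 10), (1, 11), (2, 10)]:
    tracker.add(user, item)

due = tracker.users_to_retrain()
data, indices, indptr = tracker.rows_for(due)
tracker.mark_retrained(due)
```

The three arrays can be handed to `scipy.sparse.csr_matrix((data, indices, indptr), shape=(len(due), n_items))`, where `n_items` is your total item count, to get the user_items matrix for just those users before running the incremental retraining from the PR.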

benfred avatar Jan 25 '22 18:01 benfred