
User/Item feature selection

Open: sacrosanct007 opened this issue 3 years ago • 1 comment

Hi,

I'm learning more about recommender models and LightFM, and I have a question. Is there a recommended approach or set of guidelines for selecting user/item features for LightFM, or is it purely trial and error?

Does it make sense to keep the feature set diverse by using correlation, i.e. when a pair of features is highly correlated, dropping the lower-ranked one? My question relates to some of the other issues where adding uninformative features reduced the LightFM model's performance. However, I'm not sure how to identify uninformative features in this case. See Maciej's response to related issue 551 below:

"The implementation isn't broken.

It is, however, very simple: the model simply averages the embeddings of all the features it is given. Because of the averaging, the model is incapable of figuring out which features are uninformative and ignoring them.

Consequently, if you add lots of uninformative features they will degrade your model by diluting the information provided by your good features. To prevent this, you may have to adopt more sophisticated models whose implementations are not offered by LightFM.

Note also that metadata features are likely to improve performance only on very sparse datasets, or sparse (long tail, cold-start) subsets of your data."
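
To make the averaging argument in the quoted reply concrete, here is a minimal pure-Python sketch. It is not LightFM code: the embedding values are made up by hand, and the helpers `average_embedding` and `dot` are hypothetical names used only for illustration. It assumes, as the reply describes, that an item's representation is the plain average of its feature embeddings.

```python
def average_embedding(feature_embeddings):
    """Return the element-wise mean of a list of equal-length vectors."""
    n = len(feature_embeddings)
    dim = len(feature_embeddings[0])
    return [sum(vec[d] for vec in feature_embeddings) / n for d in range(dim)]


def dot(a, b):
    """Dot product of two equal-length vectors (the user-item score here)."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical 2-d embeddings, chosen by hand for illustration only.
user = [1.0, 0.0]
informative = [4.0, 0.0]    # a feature aligned with this user's taste
uninformative = [0.0, 0.0]  # a feature that carries no signal

# Item described by the informative feature alone.
item_clean = average_embedding([informative])

# Same item, but with three uninformative features added.
item_diluted = average_embedding([informative] + [uninformative] * 3)

score_clean = dot(user, item_clean)
score_diluted = dot(user, item_diluted)
print(score_clean, score_diluted)  # 4.0 1.0
```

With one informative feature the score is 4.0; after adding three features that carry no signal, the same item scores 1.0, because the informative embedding now contributes only a quarter of the average. In a trained model the uninformative embeddings would not be exactly zero, so treat this as an illustration of the dilution effect rather than a prediction of how much performance drops.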

sacrosanct007, Mar 08 '21 18:03

I am also interested in this.

chiarapalma, Sep 27 '23 14:09