
Using feature vectors of just 1 feature

Open mcarretta opened this issue 3 years ago • 0 comments

Hello everyone! I'm implementing a LightFM model on a TV series dataset. I want to include a user feature for binge-watcher users and an item feature for binge-worthy series: basically I have two matrices, [number of users x 1] and [number of items x 1], with 1 if a user is a binge-watcher and 0 otherwise (the same holds for binge-worthy series).

If I convert these arrays into CSR matrices and feed them into the model with

self.lightFM_model = self.lightFM_model.fit(self.URM_train, epochs=self.epochs, num_threads=self.num_threads, user_features=self.users_features, item_features=self.items_features)

I get "ValueError: Not all estimated parameters are finite, your model may have diverged. Try decreasing the learning rate or normalising feature values and sample weights".

I read in a Stack Overflow question that the problem might be related to users or items having no feature set to 1. In my case, only about 3K of the 42K total users are binge-watchers, so most of them (39K) have a feature value of 0. So, before converting those two arrays to CSR, I stacked each of them with an identity matrix so that no row is all zeros. For example, if I had just 3 users (and only the second one is a binge-watcher), I'd get a feature matrix like this one before converting it into CSR:

[[0], [1], [0]] => [[0 1 0 0], [1 0 1 0], [0 0 0 1]]

The problem is that with this approach my recommender's performance worsens by a factor of about a thousand with respect to the same LightFM model without the binge-watching / binge-worthy feature vectors, becoming worse than a random recommender. I'm assuming there's something wrong with stacking that identity matrix onto my feature vector, making it basically garbage.

Can I train a LightFM model with just 1 feature for users and 1 feature for items, or is it impossible? Thanks in advance for your help!
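For reference, the stacking described above can be sketched roughly as follows. This is only a minimal illustration using NumPy and SciPy, not the code from my project: the helper name `build_feature_matrix` and the `normalize` flag are hypothetical, and the row normalisation is just an assumption prompted by the "normalising feature values" hint in the error message, not a confirmed fix.

```python
import numpy as np
import scipy.sparse as sp


def build_feature_matrix(flags, normalize=False):
    """Stack a binary indicator column next to a per-row identity block.

    `flags` holds one 0/1 value per user (or item). The result has shape
    [n, 1 + n]: column 0 is the binge flag, the remaining n columns are
    the identity, so no row is ever all zeros.
    """
    flags = np.asarray(flags, dtype=np.float32).reshape(-1, 1)
    n = flags.shape[0]
    features = sp.hstack(
        [sp.csr_matrix(flags), sp.identity(n, dtype=np.float32, format="csr")],
        format="csr",
    )
    if normalize:
        # Assumption: scale each row to sum to 1, as hinted by the error text.
        # Row sums are at least 1 because of the identity block.
        row_sums = np.asarray(features.sum(axis=1)).ravel()
        features = (sp.diags(1.0 / row_sums) @ features).tocsr()
    return features


# Toy case from the post: 3 users, only the second is a binge-watcher.
user_features = build_feature_matrix([0, 1, 0])
print(user_features.toarray())
```

The resulting matrix would then be passed as `user_features` (and an analogous one as `item_features`) in the `fit` call shown above.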

mcarretta • Feb 22 '21 11:02