
model.predict() and embedding multiplication gave different results

Open iDataist opened this issue 2 years ago • 3 comments

I have trained a hybrid model. What is the best way to generate predictions for every user-item pair and rank them? I know that model.predict() can be used. I also tried multiplying the user embeddings by the item embeddings, but got a different result (x != y below). Did I make a mistake in the embedding multiplication? Do I need to add the bias terms?


import numpy as np
from lightfm import LightFM

model = LightFM(loss='warp',
                learning_schedule='adagrad',
                no_components=NO_COMPONENTS,
                learning_rate=LEARNING_RATE,
                user_alpha=USER_ALPHA,
                item_alpha=ITEM_ALPHA,
                max_sampled=MAX_SAMPLED,
                random_state=np.random.RandomState(SEEDNO))

model.fit(interactions=train_interactions,
          user_features=user_features,
          epochs=NO_EPOCHS)

# predictions from the model for user 0 over all items
x = model.predict(0, np.arange(n_items), user_features=user_features)

# manual dot product of the latent factors
y = np.dot(model.user_embeddings[0, :], model.item_embeddings.T)

iDataist avatar Oct 13 '21 13:10 iDataist

Duplicate of #474

Rajon010 avatar Oct 16 '21 17:10 Rajon010

Yes, you need to add both the user_biases and the item_biases. The best way to do it in bulk is to fold them into the dot product. Here's how to do it:

  • In user_embeddings, append a user_biases column followed by an np.ones column.
  • In item_embeddings, append an np.ones column followed by an item_biases column.

This way the dot product multiplies each bias by 1 and adds both of them to the final prediction score.

If it is still not clear, here is an example taken from this comment.

import numpy as np

# load latent representations
# (if the model was trained with feature matrices, pass them here,
# e.g. model.get_user_representations(features=user_features))
item_biases, item_factors = model.get_item_representations()
user_biases, user_factors = model.get_user_representations()

# item factors: append a ones column, then the item_biases column
item_factors = np.concatenate((item_factors, np.ones((item_biases.shape[0], 1))), axis=1)
item_factors = np.concatenate((item_factors, item_biases.reshape(-1, 1)), axis=1)

# user factors: append the user_biases column, then a ones column
user_factors = np.concatenate((user_factors, user_biases.reshape(-1, 1)), axis=1)
user_factors = np.concatenate((user_factors, np.ones((user_biases.shape[0], 1))), axis=1)

scores = user_factors.dot(item_factors.T)
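
To answer the ranking half of the original question: once scores is computed, ranking every item for every user is just an argsort along the item axis. A minimal sketch, assuming the scores matrix from above (top_n is an illustrative cutoff, not part of the original code):

# rank item indices per user, highest score first
top_n = 10
ranked = np.argsort(-scores, axis=1)[:, :top_n]
# ranked[u] holds the top_n item indices for user u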

Meghpal avatar Oct 26 '21 05:10 Meghpal

> Yes, you need to add both the user_biases and the item_biases. The best way to do it in bulk is to fold them into the dot product. […]

Hey, thanks for this explanation. However, I was wondering why one would use this approach when one could simply add user_biases to each row and item_biases to each column of the score matrix that results from multiplying the user and item embeddings. Is it somehow simpler, computationally speaking?

e.g.

        |  ui(1,1), ui(1,2) |
UxI =   |  ui(2,1), ui(2,2) |
        |  ui(3,1), ui(3,2) |


                            Bi_1     Bi_2
                              +        +
                 Bu_1 + |  ui(1,1), ui(1,2) | 
UxI + Biases =   Bu_2 + |  ui(2,1), ui(2,2) |
                 Bu_3 + |  ui(3,1), ui(3,2) |

wtfzambo avatar Mar 22 '22 18:03 wtfzambo