model.predict() and embedding multiplication gave different results
I have trained a hybrid model. What is the best way to generate predictions for every user-item pair and rank them? I know that `model.predict()` can be used. I also tried multiplying the user embeddings with the item embeddings, but got a different result (`x != y` below). Did I make a mistake in the embedding multiplication? Do I need to add the bias terms?
```python
import numpy as np
from lightfm import LightFM

model = LightFM(loss='warp',
                learning_schedule='adagrad',
                no_components=NO_COMPONENTS,
                learning_rate=LEARNING_RATE,
                user_alpha=USER_ALPHA,
                item_alpha=ITEM_ALPHA,
                max_sampled=MAX_SAMPLED,
                random_state=np.random.RandomState(SEEDNO))
model.fit(interactions=train_interactions,
          user_features=user_features,
          epochs=NO_EPOCHS)

x = model.predict(0, np.arange(n_items), user_features=user_features)
y = np.dot(model.user_embeddings[0, :], model.item_embeddings.T)
```
Duplicate of #474
Yes, you need to add both the `user_biases` and `item_biases`. The best way to do it in bulk is to fold them into the dot product. Here's how:

- In **user_embeddings**, add a `user_biases` column followed by a `np.ones` column.
- In **item_embeddings**, add a `np.ones` column followed by an `item_biases` column.

This way each bias is multiplied by 1 for every user-item pair and added to the final prediction score. If it is still not clear, here is an example taken from this comment.
```python
import numpy as np

# load latent representations; if the model was trained with user_features,
# pass the same matrix here: model.get_user_representations(features=user_features)
item_biases, item_factors = model.get_item_representations()
user_biases, user_factors = model.get_user_representations()

# combine item_factors with biases for the dot product: [factors | ones | item_biases]
item_factors = np.concatenate((item_factors, np.ones((item_biases.shape[0], 1))), axis=1)
item_factors = np.concatenate((item_factors, item_biases.reshape(-1, 1)), axis=1)

# combine user_factors with biases for the dot product: [factors | user_biases | ones]
user_factors = np.concatenate((user_factors, user_biases.reshape(-1, 1)), axis=1)
user_factors = np.concatenate((user_factors, np.ones((user_biases.shape[0], 1))), axis=1)

# every user-item score in one matrix multiply
scores = user_factors.dot(item_factors.T)
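```

As a quick sanity check (a sketch, not part of the original comment), the bias-augmented dot product above should reproduce `model.predict` for the question's model, provided the same `user_features` matrix used in `fit` is passed when extracting the user representations:

```python
import numpy as np

# Sketch: verify the column trick against model.predict for user 0.
# Assumes `model`, `user_features`, and `n_items` from the question above.
item_biases, item_factors = model.get_item_representations()
user_biases, user_factors = model.get_user_representations(features=user_features)

item_factors = np.concatenate(
    (item_factors, np.ones((item_biases.shape[0], 1)), item_biases.reshape(-1, 1)), axis=1)
user_factors = np.concatenate(
    (user_factors, user_biases.reshape(-1, 1), np.ones((user_biases.shape[0], 1))), axis=1)
scores = user_factors.dot(item_factors.T)

x = model.predict(0, np.arange(n_items), user_features=user_features)
assert np.allclose(x, scores[0], atol=1e-5)
```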
Hey, thanks for this explanation! I was wondering, though, why use this approach when one could simply add `user_biases` to each row and `item_biases` to each column of the interaction matrix resulting from the multiplication of the user and item embeddings (a sketch of this variant follows the diagram below). Is the column trick somehow simpler, computationally speaking?
e.g.

```
      | ui(1,1), ui(1,2) |
UxI = | ui(2,1), ui(2,2) |
      | ui(3,1), ui(3,2) |

                          Bi_1     Bi_2
                           +        +
               Bu_1 + | ui(1,1), ui(1,2) |
UxI + Biases = Bu_2 + | ui(2,1), ui(2,2) |
               Bu_3 + | ui(3,1), ui(3,2) |
```
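Both variants give identical scores; here is a minimal sketch of the broadcasting approach (reusing `model` from above, and ignoring user features for brevity):

```python
import numpy as np

# Broadcasting alternative (sketch): add the biases to the plain embedding
# dot product instead of padding the factor matrices with extra columns.
item_biases, item_factors = model.get_item_representations()
user_biases, user_factors = model.get_user_representations()

scores = user_factors.dot(item_factors.T)  # shape (n_users, n_items)
scores += user_biases[:, np.newaxis]       # one bias added to each row (user)
scores += item_biases[np.newaxis, :]       # one bias added to each column (item)
```

Computationally the two are equivalent up to the cost of the concatenations; the column trick just keeps everything inside a single matrix multiply.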