TwoTower unable to use user-feats in .predict()
Looking through source code, TwoTower.predict() seems to reference predict_from_embedding and not predict_tf_feat, despite the fact that TwoTower.recommend_user can use feats for prediction.
When trying to manually call predict_tf_feat, I receive this error:
`---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_4360/2307691744.py in
/opt/conda/lib/python3.7/site-packages/libreco/recommendation/recommend.py in recommend_tf_feat(model, user_ids, n_rec, user_feats, seq, filter_consumed, random_rec, inner_id)
     89 inner_id=False,
     90 ):
---> 91 feed_dict = process_tf_feat(model, user_ids, user_feats, seq, inner_id)
     92 if model.model_name == "SIM":
     93 preds = model.sess.run(model.inference_output, feed_dict)
/opt/conda/lib/python3.7/site-packages/libreco/recommendation/preprocess.py in process_tf_feat(model, user_ids, user_feats, seq, inner_id)
    131 assert isinstance(user_feats, dict), "user_feats must be dict."
    132 sparse_indices, dense_values = set_temp_feats(
--> 133 model.data_info, sparse_indices, dense_values, user_feats
    134 )
    135
/opt/conda/lib/python3.7/site-packages/libreco/prediction/preprocess.py in set_temp_feats(data_info, sparse_indices, dense_values, feat_dict)
     69 data_info.sparse_idx_mapping,
     70 data_info.sparse_offset,
---> 71 feat_dict,
     72 )
     73 _set_dense_values(dense_values_copy, data_info.col_name_mapping, feat_dict)
/opt/conda/lib/python3.7/site-packages/libreco/prediction/preprocess.py in _set_sparse_indices(sparse_indices, col_mapping, sparse_idx_mapping, offsets, feat_dict)
     93 feat_idx = idx_mapping[val]
     94 offset = offsets[field_idx]
---> 95 sparse_indices[:, field_idx] = feat_idx + offset
     96
     97
TypeError: 'NoneType' object does not support item assignment`
Hopefully this can be implemented: being able to get prediction scores for new users from their feats would be incredibly useful. Excluding feats from TwoTower.predict() seems like an oversight.
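For reference, the last frame of the traceback suggests that `sparse_indices` was `None` by the time `_set_sparse_indices` tried to write into it. The snippet below is only a minimal sketch that reproduces the same message without libreco; the variable is a stand-in and says nothing about why the library left it unset.

```python
# Stand-in for the array that _set_sparse_indices expected to receive.
# Assumption: in the failing call it was None instead of an array.
sparse_indices = None

try:
    # Same kind of column assignment as line 95 of the traceback.
    sparse_indices[:, 0] = 5
except TypeError as exc:
    message = str(exc)

print(message)
```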
TwoTower inherits from the DynEmbedBase class, so it can't use the predict_tf_feat function. Since it is an embedding model, you can use Dynamic Embedding Generation to generate the user embedding, then take its dot product with an item embedding to get the prediction score.
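original_user, original_item = "Not A Real ID", "ITEM"
model = TwoTower(...)
model.fit(...)
# row_dict: dict of user feature names to values for the new user
user_embed = model.dyn_user_embedding(user=original_user, user_feats=row_dict)
# map the original item ID to its inner index, then look up its embedding
item_id = model.data_info.item2id[original_item]
item_embed = model.item_embeds_np[item_id]
# the dot product of the two embeddings is the prediction score
pred = user_embed @ item_embed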
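To score every item at once rather than a single one, the same dot product can be applied to the whole item embedding matrix. The following is a rough sketch in plain NumPy: the arrays are made-up stand-ins for the output of `dyn_user_embedding` and for `item_embeds_np`, and `score_items` is a hypothetical helper, not part of LibRecommender.

```python
import numpy as np


def score_items(user_embed, item_embeds):
    """Return one dot-product score per item for a single user embedding."""
    user_embed = np.asarray(user_embed, dtype=np.float64)
    item_embeds = np.asarray(item_embeds, dtype=np.float64)
    if user_embed.ndim != 1 or item_embeds.ndim != 2:
        raise ValueError("expected a 1-D user embedding and a 2-D item matrix")
    if item_embeds.shape[1] != user_embed.shape[0]:
        raise ValueError("embedding sizes do not match")
    # (n_items, embed_size) @ (embed_size,) -> (n_items,)
    return item_embeds @ user_embed


# Made-up data: 3 items with embedding size 2.
user_embed = np.array([1.0, 2.0])
item_embeds = np.array([[0.5, 0.0], [0.0, 0.5], [1.0, 1.0]])
scores = score_items(user_embed, item_embeds)
```

Here `scores[i]` is the raw score for the item with inner index `i`; any further transformation of that score depends on the model's task and is left out of this sketch.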
Thank you for the response. I actually found a solution fitting my use case, and would like to include a request:
What I did was manually copy the recommend_tf_feats function and enable the "return_scores" option.
import numpy as np
from libreco.recommendation import check_dynamic_rec_feats, rank_recommendations

def score_all_items(
    model,
    user,
    n_rec,
    user_feats=None,
    seq=None,
    cold_start="average",
    inner_id=False,
    filter_consumed=True,
    random_rec=False,
    return_scores=False,
):
    if user_feats is None and seq is None:
        # super() is not available outside a class, so fall back to the
        # model's own recommend_user, passing the options by keyword
        return model.recommend_user(
            user,
            n_rec,
            cold_start=cold_start,
            inner_id=inner_id,
            filter_consumed=filter_consumed,
            random_rec=random_rec,
        )

    check_dynamic_rec_feats(model.model_name, user, user_feats, seq)
    user_embed = model.dyn_user_embedding(
        user, user_feats=user_feats, seq=seq, include_bias=True, inner_id=inner_id
    )
    if user_embed.ndim == 1:
        user_embed = np.expand_dims(user_embed, axis=0)
    item_embeds = model.item_embeds_np[: model.n_items]
    # one row of scores per user, one column per item
    preds = user_embed @ item_embeds.T
    computed_recs, computed_scores = rank_recommendations(
        model.task,
        model.convert_array_id(user, inner_id),
        preds,
        n_rec,
        model.n_items,
        model.user_consumed,
        filter_consumed,
        random_rec,
        return_scores,
    )
    rec_items = (
        computed_recs[0]
        if inner_id
        else np.array([model.data_info.id2item[i] for i in computed_recs[0]])
    )
    rec_scores = computed_scores[0]
    # only one user is allowed in dynamic situation
    return {user: rec_items}, {user: rec_scores}
score_all_items(loaded_model, user='user', user_feats=user_feats, n_rec=X, filter_consumed=False, return_scores=True)
This returned all items and their scores for me. For many use cases it would be helpful to get the numerical scores alongside the top recommendations for a user. My request is to make return_scores available as an option for the function across all use cases.
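To illustrate what returning scores alongside recommendations amounts to, here is a small NumPy sketch of a top-n selection that keeps the scores. It is not how `rank_recommendations` is implemented; `top_n_with_scores` is a hypothetical helper and the scores are made up.

```python
import numpy as np


def top_n_with_scores(preds, n_rec):
    """Return (indices, scores) of the n_rec highest scores, best first."""
    preds = np.asarray(preds, dtype=np.float64)
    if preds.ndim != 1:
        raise ValueError("preds must be a 1-D array of per-item scores")
    if not 0 < n_rec <= preds.shape[0]:
        raise ValueError("n_rec must be between 1 and the number of items")
    # argpartition places the n_rec largest entries last, in no particular order
    top = np.argpartition(preds, -n_rec)[-n_rec:]
    # order only those entries by descending score
    top = top[np.argsort(-preds[top])]
    return top, preds[top]


# Made-up scores for 4 items.
preds = np.array([0.1, 0.9, 0.4, 0.7])
ids, scores = top_n_with_scores(preds, n_rec=2)
```

Setting `n_rec` to the number of items gives a full ranking with scores, which matches the "all items + scores" use described above.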
OK, I'll consider adding the return_scores option.