TwoTower unable to use user-feats in .predict()

Open Knarz-AP opened this issue 1 year ago • 3 comments

Looking through the source code, TwoTower.predict() appears to call predict_from_embedding rather than predict_tf_feat, even though TwoTower.recommend_user can use user feats for prediction.

When trying to manually call predict_tf_feat, I receive this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_4360/2307691744.py in <module>
      1 from libreco.recommendation import recommend_tf_feat
----> 2 recommend_tf_feat(model, user_ids='Not A Real ID', user_feats=row_dict, n_rec=10, filter_consumed=False, seq=None, random_rec=False)

/opt/conda/lib/python3.7/site-packages/libreco/recommendation/recommend.py in recommend_tf_feat(model, user_ids, n_rec, user_feats, seq, filter_consumed, random_rec, inner_id)
     89     inner_id=False,
     90 ):
---> 91     feed_dict = process_tf_feat(model, user_ids, user_feats, seq, inner_id)
     92     if model.model_name == "SIM":
     93         preds = model.sess.run(model.inference_output, feed_dict)

/opt/conda/lib/python3.7/site-packages/libreco/recommendation/preprocess.py in process_tf_feat(model, user_ids, user_feats, seq, inner_id)
    131         assert isinstance(user_feats, dict), "user_feats must be dict."
    132         sparse_indices, dense_values = set_temp_feats(
--> 133             model.data_info, sparse_indices, dense_values, user_feats
    134         )
    135

/opt/conda/lib/python3.7/site-packages/libreco/prediction/preprocess.py in set_temp_feats(data_info, sparse_indices, dense_values, feat_dict)
     69         data_info.sparse_idx_mapping,
     70         data_info.sparse_offset,
---> 71         feat_dict,
     72     )
     73     _set_dense_values(dense_values_copy, data_info.col_name_mapping, feat_dict)

/opt/conda/lib/python3.7/site-packages/libreco/prediction/preprocess.py in _set_sparse_indices(sparse_indices, col_mapping, sparse_idx_mapping, offsets, feat_dict)
     93             feat_idx = idx_mapping[val]
     94             offset = offsets[field_idx]
---> 95             sparse_indices[:, field_idx] = feat_idx + offset
     96
     97

TypeError: 'NoneType' object does not support item assignment
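
For anyone reading the traceback: the last frame shows that `sparse_indices` is `None` at the moment `_set_sparse_indices` assigns into it. The sketch below reproduces the same error type with plain NumPy. The helper name and the values are made up for illustration, and nothing here explains why the array is `None` for TwoTower, since the traceback does not show that.

```python
import numpy as np


def set_sparse_index(sparse_indices, field_idx, feat_idx, offset):
    """Hypothetical stand-in for the failing assignment in the traceback."""
    sparse_indices[:, field_idx] = feat_idx + offset
    return sparse_indices


# With a real 2-D array, the column assignment succeeds.
ok = set_sparse_index(
    np.zeros((1, 3), dtype=np.int64), field_idx=1, feat_idx=4, offset=10
)

# With None in place of the array, the same line raises a TypeError.
try:
    set_sparse_index(None, field_idx=1, feat_idx=4, offset=10)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(ok)
print(error_message)
```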

Hopefully this is something that can be implemented. Being able to get prediction scores for new users from their feats would be incredibly useful, and it seems that excluding feats from TwoTower.predict() is an oversight.

Knarz-AP • Jan 31 '25 22:01

TwoTower inherits from the DynEmbedBase class, so it can't use the predict_tf_feat function. Since it is an embedding model, you can use Dynamic Embedding Generation to generate the user embedding, then take its dot product with an item embedding to get the prediction score.

original_user, original_item = "Not A Real ID", "ITEM"
model = TwoTower(...)
model.fit(...)
user_embed = model.dyn_user_embedding(user=original_user, user_feats=row_dict)
item_id = model.data_info.item2id[original_item]
item_embed = model.item_embeds_np[item_id]
pred = user_embed @ item_embed
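
To make the shapes concrete, here is a minimal NumPy-only sketch of the same dot-product scoring. The random arrays are stand-ins for the outputs of `model.dyn_user_embedding(...)` and `model.item_embeds_np`, and the sizes are arbitrary assumptions; no LibRecommender call is made.

```python
import numpy as np

rng = np.random.default_rng(0)
embed_size, n_items = 4, 5  # arbitrary sizes for illustration

# Stand-ins for the user embedding and the item embedding matrix.
user_embed = rng.normal(size=embed_size)              # shape (embed_size,)
item_embeds = rng.normal(size=(n_items, embed_size))  # shape (n_items, embed_size)

# Score for a single item: dot product of the two vectors.
item_id = 2
single_score = user_embed @ item_embeds[item_id]

# Scores for every item at once: one matrix-vector product.
all_scores = item_embeds @ user_embed                 # shape (n_items,)

print(single_score, all_scores)
```

The single-item score is the corresponding entry of the all-items vector, so scoring every item does not need a Python loop.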

massquantity • Feb 02 '25 03:02

Thank you for the response. I actually found a solution fitting my use case, and would like to include a request:

What I did was manually copy the recommend_tf_feat function and enable its "return_scores" option.

import numpy as np
from libreco.recommendation import check_dynamic_rec_feats, rank_recommendations


def score_all_items(
    model,
    user,
    n_rec,
    user_feats=None,
    seq=None,
    cold_start="average",
    inner_id=False,
    filter_consumed=True,
    random_rec=False,
    return_scores=False,
):
    # Without dynamic features, fall back to the model's regular recommendation.
    # This is a standalone function, so it calls the model's own method rather
    # than super(). Note that this branch returns recommendations only.
    if user_feats is None and seq is None:
        return model.recommend_user(
            user,
            n_rec,
            cold_start=cold_start,
            inner_id=inner_id,
            filter_consumed=filter_consumed,
            random_rec=random_rec,
        )

    check_dynamic_rec_feats(model.model_name, user, user_feats, seq)
    # Build the user embedding on the fly from the supplied features.
    user_embed = model.dyn_user_embedding(
        user, user_feats=user_feats, seq=seq, include_bias=True, inner_id=inner_id
    )
    if user_embed.ndim == 1:
        user_embed = np.expand_dims(user_embed, axis=0)
    # Score every item with a dot product against the item embeddings.
    item_embeds = model.item_embeds_np[: model.n_items]
    preds = user_embed @ item_embeds.T

    # The two-value unpacking assumes the call is made with return_scores=True,
    # as in the usage line below.
    computed_recs, computed_scores = rank_recommendations(
        model.task,
        model.convert_array_id(user, inner_id),
        preds,
        n_rec,
        model.n_items,
        model.user_consumed,
        filter_consumed,
        random_rec,
        return_scores,
    )
    rec_items = (
        computed_recs[0]
        if inner_id
        else np.array([model.data_info.id2item[i] for i in computed_recs[0]])
    )
    rec_scores = computed_scores[0]
    # only one user is allowed in dynamic situation
    return {user: rec_items}, {user: rec_scores}


score_all_items(loaded_model, user='user', user_feats=user_feats, n_rec=X, filter_consumed=False, return_scores=True)

This returned all items plus their scores for me. For many use cases it would be helpful to get both the top recommendations for a user and the numerical scores behind them. My request is to expose the return_scores option on the recommendation functions across all use cases.
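
To illustrate the kind of output I mean, here is a rough NumPy sketch of returning top-n item ids together with their scores. `top_n_with_scores` is a hypothetical helper written for this comment, not LibRecommender's `rank_recommendations`, and it assumes `n_rec >= 1`.

```python
import numpy as np


def top_n_with_scores(scores, n_rec, consumed=()):
    """Return (item_ids, item_scores) for the n_rec highest-scoring items.

    Hypothetical helper: `consumed` holds item ids to exclude, and n_rec is
    assumed to be at least 1.
    """
    scores = np.asarray(scores, dtype=float).copy()
    if len(consumed) > 0:
        scores[list(consumed)] = -np.inf  # push consumed items to the bottom
    n_rec = min(n_rec, scores.size)
    # argpartition finds the top n_rec in linear time; sort only that slice.
    top = np.argpartition(-scores, n_rec - 1)[:n_rec]
    top = top[np.argsort(-scores[top])]
    return top, scores[top]


ids, vals = top_n_with_scores([0.1, 0.9, 0.5, 0.7], n_rec=2)
ids_f, vals_f = top_n_with_scores([0.1, 0.9, 0.5, 0.7], n_rec=2, consumed=[1])
print(ids, vals)
print(ids_f, vals_f)
```

Returning the ids and scores as parallel arrays keeps the ranking order explicit, which is what a return_scores option would need to preserve.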

Knarz-AP • Feb 04 '25 00:02

OK, I'll consider adding the return_scores option.

massquantity • Feb 05 '25 12:02