
[Task] Simplify Top-k recommender API

Open · sararb opened this issue 3 years ago

Problem

  • Generating top-k predictions from a trained retrieval model currently requires the user to follow several steps, which is not straightforward. In addition, the top-k recommender is a ModelBlock that cannot be saved and re-loaded.
  • This feature should also fix issue #499

Goals

  • [ ] Decouple the top-k local prediction and top-k evaluation from the retrieval contrastive-learning task.
  • [ ] Convert retrieval models (Matrix Factorization, Two-Tower, and YoutubeDNN) to a top-k recommender model.
  • [ ] Follow the Keras analogy for the top-k recommender, so the user can call .predict, .evaluate, .save, and load the model.
  • [ ] ItemRecommender should support different top-k strategies: brute-force, streaming, ANN, or any user-specific top-k strategy.
  • [ ] Ensure retrieval experiments and CI performance tests return the same level of performance with the new Retrieval API.
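The Keras-style lifecycle described in the goals (.predict, .evaluate, .save, load) can be sketched with a plain NumPy stand-in. Everything here is illustrative, not the Merlin API: the class name `BruteForceTopK`, the `np.savez` persistence, and recall@k as the example ranking metric are all assumptions for the sketch.

```python
import os
import tempfile
import numpy as np


class BruteForceTopK:
    """Illustrative stand-in for a top-k recommender: candidate item
    embeddings plus a brute-force dot-product top-k search."""

    def __init__(self, item_embeddings, k=10):
        self.items = np.asarray(item_embeddings, dtype=np.float32)  # (n_items, dim)
        self.k = k

    def predict(self, queries):
        # Score every candidate with a dot product, then keep the k best per query.
        scores = np.asarray(queries, dtype=np.float32) @ self.items.T
        top_ids = np.argsort(-scores, axis=1)[:, : self.k]
        top_scores = np.take_along_axis(scores, top_ids, axis=1)
        return top_scores, top_ids

    def evaluate(self, queries, true_item_ids):
        # Example ranking metric: recall@k, the fraction of queries whose
        # ground-truth item appears in the returned top-k list.
        _, top_ids = self.predict(queries)
        hits = [t in row for t, row in zip(true_item_ids, top_ids)]
        return float(np.mean(hits))

    def save(self, path):
        np.savez(path, items=self.items, k=self.k)

    @classmethod
    def load(cls, path):
        data = np.load(path)
        return cls(data["items"], int(data["k"]))


# Usage: with identity item embeddings, each query's own row is its best match.
model = BruteForceTopK(np.eye(5, dtype=np.float32), k=2)
scores, ids = model.predict(np.eye(5, dtype=np.float32)[[0, 3]])

# Save/reload round-trip, mirroring the .save / load goal above.
path = os.path.join(tempfile.mkdtemp(), "topk.npz")
model.save(path)
reloaded = BruteForceTopK.load(path)
```

The point of the sketch is the surface area, not the implementation: once prediction, evaluation, and serialization live on the recommender itself, swapping brute-force for streaming or ANN search only changes the internals of `predict`.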

Starting Point:

  • Definition: the top-k recommender is a model consisting of a query encoder plus a top-k layer.

  • Prerequisites of the top-k recommender:

    • Predict method: returns the top-k items (scores and ids) for a given query (user).
    • Evaluate method: computes ranking metrics for a dataset of users/queries.
    • batch_predict: returns a dataset with the top-k items for a dataset of users/queries.
    • save: the top-k model is the 'useful' part of the retrieval pipeline, as it is the one that generates predictions for the external endpoint. The user needs to save this model and reload it later for evaluation or local prediction.
    • Support for different top-k strategies.
  • Arguments of the top-k layer:

    • A cut-off k
    • The dataset of candidates: pre-trained item embeddings
    • The method index_from_dataset: sets the index for the top-k search.
    • The method score: the distance metric used to compute the score between the query and the item embeddings (default: dot product).
    • Call method:
      • Takes the query embeddings as input.
      • Defines the logic of how to retrieve the top-k items. The scope of this first work is brute force.
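The call-method logic for the brute-force scope can be sketched as a standalone function. The function name `brute_force_top_k` and the pluggable `score_fn` hook are hypothetical, chosen to mirror the score-method argument described above; the default scoring is the dot product, as in the issue.

```python
import numpy as np


def brute_force_top_k(query_embeddings, item_embeddings, k, score_fn=None):
    """Brute-force top-k: score every candidate item against every query,
    then return the k best (scores, ids) per query."""
    q = np.asarray(query_embeddings, dtype=np.float32)       # (n_queries, dim)
    items = np.asarray(item_embeddings, dtype=np.float32)    # (n_items, dim)
    # Default distance metric: dot product between query and item embeddings.
    score_fn = score_fn or (lambda q, it: q @ it.T)
    scores = score_fn(q, items)                              # (n_queries, n_items)
    # argpartition avoids fully sorting all items; only the top k are then ordered.
    part = np.argpartition(-scores, kth=k - 1, axis=1)[:, :k]
    part_scores = np.take_along_axis(scores, part, axis=1)
    order = np.argsort(-part_scores, axis=1)
    top_ids = np.take_along_axis(part, order, axis=1)
    top_scores = np.take_along_axis(part_scores, order, axis=1)
    return top_scores, top_ids
```

Because the candidate index (`item_embeddings`) and the scoring function are both arguments, the same call shape could later back a streaming or ANN strategy by swapping the body without changing the layer's interface.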

Open questions:

  • Do we need to re-train the new recommender model with the pre-trained item embeddings (e.g., convert a two-tower model to a YoutubeDNN-like model)? This can be done outside the top-k recommender class; we should keep the top-k recommender simple so it can support different top-k strategies.

  • Should we define the Top-k layer as a sub-class of the CategoricalPrediction block?

  • Implementation starting points:
    • https://github.com/NVIDIA-Merlin/models/compare/main...tf/retrieval-models
    • https://github.com/NVIDIA-Merlin/models/pull/663/files

sararb avatar Aug 04 '22 17:08 sararb

When this refactor is done, we should retest #339 to check whether the slowness in building the top-k index persists.

gabrielspmoreira avatar Aug 18 '22 17:08 gabrielspmoreira

Closing this issue as it is a duplicate of a new task tracked in the session-based roadmap ticket.

sararb avatar Sep 06 '22 13:09 sararb