[TASK] Support shared embeddings across the towers

Open rnyak opened this issue 3 years ago • 2 comments

🚀 Feature request

We should be able to jointly encode multiple columns (e.g., purchased_id, item_id) and let them share the same embedding tables during model training, both for ranking and retrieval models.

For a two-tower model (or similar), the following functionality would be very useful:

  • [x] support shared embeddings within each tower (encoder)
  • [ ] support shared embeddings across the towers

In some cases, customers want to build a model that represents each user as a list of the items they have consumed. In that case, an input feature in the user tower needs to look up its embeddings from the table used by the other tower, so that the different features share the same int_domain.
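To make the request concrete, here is a minimal sketch of the idea in plain NumPy. It is not the Merlin Models API: `shared_item_table`, `user_tower`, `item_tower`, and the sizes are hypothetical names chosen for illustration. It assumes the user's history feature (`purchased_ids`) and the item tower's `item_ids` index the same id domain, and that the user is represented by mean-pooling the embeddings of consumed items.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_ITEMS = 100  # hypothetical cardinality of the shared item-id domain
EMB_DIM = 8      # hypothetical embedding size

# A single table that both towers reference, instead of one table per tower.
shared_item_table = rng.normal(size=(NUM_ITEMS, EMB_DIM))


def item_tower(item_ids, table):
    """Item tower: look up each candidate item in the shared table."""
    return table[item_ids]  # shape: (batch, EMB_DIM)


def user_tower(purchased_ids, table):
    """User tower: mean-pool the shared embeddings of the consumed items."""
    # (batch, history_len, EMB_DIM) -> (batch, EMB_DIM)
    return table[purchased_ids].mean(axis=1)


# Two users with three consumed items each, and one candidate item per user.
purchased_ids = np.array([[3, 7, 42], [5, 5, 9]])
item_ids = np.array([7, 9])

user_vecs = user_tower(purchased_ids, shared_item_table)
item_vecs = item_tower(item_ids, shared_item_table)

# Dot-product score between each user and its candidate item.
scores = (user_vecs * item_vecs).sum(axis=1)
```

Because both towers read from the same array, an update to the row for one item id changes the output of both towers. That is the behavior this request asks for in a trainable model.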

Motivation

Customers are interested in this feature and asked about it.

rnyak avatar Aug 25 '22 11:08 rnyak

check with @rnyak and split the ticket

viswa-nvidia avatar Sep 12 '22 16:09 viswa-nvidia

We need an example to showcase the functionality.

rnyak avatar Oct 05 '22 16:10 rnyak

Closing, since the functionality should be complete, although we do not yet have an example notebook.

rnyak avatar Nov 16 '22 15:11 rnyak

We have a test now that can serve as reference for this functionality here (added in #841)

https://github.com/NVIDIA-Merlin/models/blob/release-22.11/tests/unit/tf/models/test_retrieval.py#L25-L76

oliverholworthy avatar Nov 16 '22 15:11 oliverholworthy