Transformers4Rec
[FEA] Support feeding pre-trained embeddings to TF4Rec model with high-level api
🚀 Feature request
Currently we do not have out-of-the-box support for feeding pre-trained embeddings into the embedding layer, freezing them, and training a TF4Rec model. We have `embedding_initializer`, but we never tested whether it works accurately and as expected. Maybe we can create a PyTorch class like `TensorInitializer` (TF), as we did in Merlin Models, and expose the embedding initializer and `trainable` args to the user.
We need to:
- [x] Expose the definition of the embeddings module in the input blocks: `TabularFeatures` and `TabularSequenceFeatures`
- [x] Support feeding pre-trained embeddings to a TF4Rec model with the high-level API (users should be able to add them to the embedding layer and freeze them, i.e., set `trainable=False` (TF API) or `requires_grad=False` (PyTorch API))
- [ ] Create an example notebook showcasing this functionality
Motivation
This is a feature request (FEA) coming from our customers and users.
Is this related to or part of https://github.com/NVIDIA-Merlin/Merlin/issues/211?
@karlhigley more related to https://github.com/NVIDIA-Merlin/Merlin/issues/471. Not sure about the link to 211.
When the embedding tables are not huge and fit in GPU memory, the new `PretrainedEmbeddingsInitializer` (https://github.com/NVIDIA-Merlin/Transformers4Rec/pull/572) can be used to initialize the embedding matrix with pre-trained embeddings and to set them as trainable or frozen.
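A minimal sketch of what that can look like, based on the pattern in the unit tests from that PR. The feature name `"item_id"`, the cardinality and dimension values, and the `embeddings_initializers` keyword follow the test code, so treat them as assumptions rather than documented API:

```python
import torch

import transformers4rec.torch as tr
from transformers4rec.torch.features.embedding import PretrainedEmbeddingsInitializer

# Hypothetical pre-trained weights: one row per item id.
item_cardinality, item_dim = 1000, 64
pretrained_weights = torch.rand(item_cardinality, item_dim)

# trainable=False freezes the table (requires_grad=False on the weights).
embeddings_initializers = {
    "item_id": PretrainedEmbeddingsInitializer(pretrained_weights, trainable=False)
}

# `schema` is the merlin Schema describing the categorical features,
# loaded elsewhere (e.g. from the output of an NVTabular workflow).
embedding_module = tr.EmbeddingFeatures.from_schema(
    schema,
    embeddings_initializers=embeddings_initializers,
)
```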
Is there an example notebook showing usage of `PretrainedEmbeddingsInitializer` to initialize the embedding matrix?
We don't have an example for this feature, but you can refer to the unit test and try to implement it.
> Is there an example notebook showing usage of `PretrainedEmbeddingsInitializer` to initialize the embedding matrix?
>
> We don't have an example for this feature, but you can refer to the unit test and try to implement it.
Thanks. I guess what I am looking for is how to use this along with an input block defined by a model schema: `TabularSequenceFeatures` (with a series of categorical and continuous features), `tr.NextItemPredictionTask`, and an Electra config. Here's my pseudocode without using the embeddings:
```python
import transformers4rec.torch as tr

# Input block: builds embedding + continuous layers from the schema,
# concatenates them, and applies masked language modeling (mlm) masking.
input_module = tr.TabularSequenceFeatures.from_schema(
    schema,
    max_sequence_length=max_sequence_length,
    aggregation="concat",
    d_output=d_model,
    masking="mlm",
    embedding_dim_default=embedding_dim_default,
)

# Ranking metrics evaluated at several cutoffs.
metrics = [
    tr.ranking_metric.NDCGAt(top_ks=[10, 20, 50, 100, 150, 200], labels_onehot=True),
    tr.ranking_metric.AvgPrecisionAt(
        top_ks=[10, 20, 50, 100, 150, 200], labels_onehot=True
    ),
    tr.ranking_metric.RecallAt(top_ks=[10, 20, 50, 100, 150, 200], labels_onehot=True),
]

# Next-item prediction head with weight tying between the item embedding
# table and the output layer.
prediction_task = tr.NextItemPredictionTask(weight_tying=True, metrics=metrics)

# Electra transformer body (tr.ElectraConfig.build, not tr.Electra.build).
transformer_config = tr.ElectraConfig.build(
    d_model=d_model,
    n_head=n_head,
    n_layer=n_layer,
    total_seq_length=max_sequence_length,
    pad_token=PAD_TOKEN,
)

model = transformer_config.to_torch_model(input_module, prediction_task)
```
following up on this ^
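Not an official example, but one possible way to wire this into the pseudocode above is to pass an `embeddings_initializers` dict through `TabularSequenceFeatures.from_schema`, so the input block builds its item embedding table from the pre-trained weights. Whether the kwarg is forwarded this way, and the feature name `"item_id"`, are assumptions based on the unit tests:

```python
from transformers4rec.torch.features.embedding import PretrainedEmbeddingsInitializer

# pretrained_item_weights: torch.Tensor of shape (item_cardinality, item_dim),
# loaded from your own pre-trained model (placeholder name for illustration).
input_module = tr.TabularSequenceFeatures.from_schema(
    schema,
    max_sequence_length=max_sequence_length,
    aggregation="concat",
    d_output=d_model,
    masking="mlm",
    embedding_dim_default=embedding_dim_default,
    # Assumed pass-through kwarg mapping a categorical feature name to its
    # initializer; trainable=False keeps the weights frozen during training.
    embeddings_initializers={
        "item_id": PretrainedEmbeddingsInitializer(
            pretrained_item_weights, trainable=False
        )
    },
)
```

One thing to keep in mind: with `weight_tying=True` in `NextItemPredictionTask`, the output layer shares the item embedding weights, so freezing the table also freezes the tied output projection; set `trainable=True` if that is not what you want.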