Issue: BERT encoder appears unused despite text_encoder_type='bert'
Hi, and thanks for the great work on this project!
I'm currently working with the training code and noticed something potentially inconsistent. While the documentation and flags suggest support for --text_encoder_type bert, it looks like the dataset is still loading GloVe embeddings via this line:
self.w_vectorizer = WordVectorizer(pjoin(opt.cache_dir, 'glove'), 'our_vab')
This occurs in:
data_loaders/humanml/data/dataset.py
This raises a few questions:
Is BERT actually used anywhere in the dataset loading or preprocessing pipeline?
If BERT is supported, where is it being applied?
Is there a separate dataset class or flow for BERT-based encoding?
I'd love clarification so I can ensure the correct embeddings are used during training.
The w_vectorizer is used only for evaluation; the BERT/CLIP embeddings are computed on the fly during training and are not cached in the dataset.
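For intuition, here is a minimal sketch of what "computed on the fly" can look like, assuming a Hugging Face transformers BERT encoder; the model name, the encode_text helper, and the choice of the [CLS] feature are illustrative assumptions, not the repo's actual implementation.

import torch
from transformers import AutoTokenizer, AutoModel

# Hypothetical setup: the project's actual encoder module and checkpoint may differ.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
text_encoder = AutoModel.from_pretrained("bert-base-uncased").eval()

@torch.no_grad()
def encode_text(captions):
    """Encode a batch of caption strings into sentence features at training time."""
    tokens = tokenizer(captions, padding=True, truncation=True, return_tensors="pt")
    outputs = text_encoder(**tokens)
    # Take the [CLS] token embedding as a sentence-level feature.
    return outputs.last_hidden_state[:, 0]

In a flow like this, the dataset only needs to return the raw caption strings, and the text encoder selected by --text_encoder_type embeds them inside the training loop, while the GloVe-based WordVectorizer is touched only by the evaluation code.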