FT-w2v2-ser
Questions about batch size and clustering model
- What's the rationale behind making the default batch size 64 for the pre-training, continued pre-training, and fine-tuning loops? Others have mentioned that they had to reduce the batch size to get the code running on their systems, especially since the original code uses a single GPU. Is 64 the batch size that produced the best results in your experiments? (A rough gradient-accumulation workaround I've been using is sketched after this list.)
- I noticed that `cluster.py` accepts either wav2vec or wav2vec2 as the `model_type`. Why did you move forward with wav2vec2 as the default model? Could you have used HuBERT or another transformer-based variant instead? (A sketch of what I mean by swapping the backbone is also included after this list.)
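For context on the first question, this is a minimal sketch of the gradient-accumulation pattern I've been using to keep an effective batch size of 64 while fitting a smaller per-step batch in GPU memory. It is not taken from your training loop; the model, data, and hyperparameters are placeholders:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Dummy stand-ins for the real SER model and data; only the
# accumulation pattern itself is the point here.
model = nn.Linear(40, 4)                       # placeholder classifier
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
dataset = TensorDataset(torch.randn(256, 40), torch.randint(0, 4, (256,)))

effective_batch_size = 64                      # the repo's default
per_step_batch_size = 16                       # what fits on a smaller GPU
accum_steps = effective_batch_size // per_step_batch_size

loader = DataLoader(dataset, batch_size=per_step_batch_size, shuffle=True)

optimizer.zero_grad()
for step, (x, y) in enumerate(loader):
    # Scale the loss so accumulated gradients average over the effective batch.
    loss = criterion(model(x), y) / accum_steps
    loss.backward()
    if (step + 1) % accum_steps == 0:
        optimizer.step()                       # one update per 64 effective samples
        optimizer.zero_grad()
```

I'd still like to know whether 64 was chosen for memory reasons or because it gave the best accuracy, since that affects whether this kind of workaround changes the results.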
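To make the second question concrete, here is a rough sketch of what I mean by swapping the backbone, using the HuggingFace `transformers` API rather than your `cluster.py` code; the checkpoint names are just examples. Both models expose the same hidden-state interface, so the clustering step could in principle consume either one's frame-level features:

```python
import torch
from transformers import Wav2Vec2Model, HubertModel, Wav2Vec2FeatureExtractor

model_type = "hubert"  # or "wav2vec2"

if model_type == "wav2vec2":
    name = "facebook/wav2vec2-base"
    model = Wav2Vec2Model.from_pretrained(name)
else:
    name = "facebook/hubert-base-ls960"
    model = HubertModel.from_pretrained(name)

extractor = Wav2Vec2FeatureExtractor.from_pretrained(name)
model.eval()

waveform = torch.zeros(16000)                  # 1 s of dummy 16 kHz audio
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    # (1, frames, 768) features that could then be clustered
    hidden = model(**inputs).last_hidden_state
```

Was wav2vec2 chosen over HuBERT for a specific reason (e.g., better cluster quality on emotion labels), or simply because it was the backbone used elsewhere in the pipeline?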