best-rq-pytorch
best-rq-pytorch copied to clipboard
Missing Convolution Subsampling?
Hi Lucas, I'm looking over the code and I believe you have missed the two convolution subsampling layers in conformer.py,
4.1.1. NON-STREAMING MODELS The model has two convolution layers at the bottom which provide 4 times temporal-dimension reduction for the input sequences. The rest of the layers are a stack of Conformer models. We explore 0.6B model size which is extensively studied in the previous works. The model contains 24 layers of Conformer models.
If you'd like I can create a pull request and implement this for you now. Thanks - If I've misunderstood the paper, please call me out! 😅