audio
audio copied to clipboard
Train Emformer RNN-T using provided recipe cannot converge
🐛 Describe the bug
Thank you for the amazing implementation of Emformer and the training recipe of it. But when I train from scratch using this recipe, the model cannot converge and stop improving after only about 6-7 epochs (tensorboard image below).

I trained on 100-hour training set of Librispeech. I also have re-created global_stats.json file to fit to only 100-hour data using this script (I have commented out other LibriSpeech sets of course)
Is there something I miss or something? Can you please check it out for me? Thank you very much!
Versions
PyTorch version: 1.12.1
Is debug build: False
CUDA used to build PyTorch: 11.6
ROCM used to build PyTorch: N/A
OS: Ubuntu 18.04.6 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: version 3.10.2
Libc version: glibc-2.27
Python version: 3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.4.0-122-generic-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: 9.1.85
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3060
Nvidia driver version: 510.85.02
cuDNN version: Probably one of the following:
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn.so.8
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_adv_train.so.8
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_ops_train.so.8
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] numpy==1.23.1
[pip3] pytorch-lightning==1.7.0
[pip3] torch==1.12.1
[pip3] torch-tb-profiler==0.4.0
[pip3] torchaudio==0.12.1
[pip3] torchmetrics==0.9.3
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.6.0 hecad31d_10 conda-forge
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py38h7f8727e_0
[conda] mkl_fft 1.3.1 py38hd3c417c_0
[conda] mkl_random 1.2.2 py38h51133e4_0
[conda] numpy 1.23.1 py38h6c91a56_0
[conda] numpy-base 1.23.1 py38ha15fc14_0
[conda] pytorch 1.12.1 py3.8_cuda11.6_cudnn8.3.2_0 pytorch
[conda] pytorch-lightning 1.7.0 pypi_0 pypi
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torch 1.12.1 pypi_0 pypi
[conda] torch-tb-profiler 0.4.0 pypi_0 pypi
[conda] torchaudio 0.12.1 pypi_0 pypi
[conda] torchmetrics 0.9.3 pypi_0 pypi