blockformer
This PR implements Blockformer in WeNet. (Original paper: https://arxiv.org/abs/2207.11697)
- Implementation Details
  - add an SE layer to ensemble the conformer encoder block outputs
  - add an SE layer to ensemble the transformer decoder block outputs
  - use relative positional encoding in the decoder
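To make the SE-ensemble idea concrete, here is a minimal PyTorch sketch of fusing per-block outputs with squeeze-and-excitation style weights. The class name `SELayerEnsemble` and the pooling/reduction choices are assumptions for illustration, not the exact code in this PR:

```python
import torch
import torch.nn as nn


class SELayerEnsemble(nn.Module):
    """Hypothetical sketch: fuse the outputs of N encoder/decoder blocks
    with squeeze-and-excitation style weights (not the exact PR code)."""

    def __init__(self, num_blocks: int, reduction: int = 2):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(num_blocks, num_blocks // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(num_blocks // reduction, num_blocks),
            nn.Sigmoid(),
        )

    def forward(self, block_outputs):
        # block_outputs: list of (batch, time, dim) tensors, one per block
        x = torch.stack(block_outputs, dim=1)  # (batch, N, time, dim)
        # squeeze: global average pool over time and feature dimensions
        s = x.mean(dim=(2, 3))                 # (batch, N)
        # excitation: one weight per block
        w = self.fc(s)                         # (batch, N)
        # weight each block's output and sum across blocks
        return (x * w[:, :, None, None]).sum(dim=1)  # (batch, time, dim)
```

Under this sketch, the encoder (or decoder) keeps the outputs of all its blocks and feeds the list through one such layer instead of using only the last block's output.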
On the main branch, extracting features with torchaudio gives slightly worse results than the paper. I will push a branch using Kaldi features for the AIShell recipe, which can reproduce the results in the paper.
I think it would be better to add experiment results on AIShell-1 and LibriSpeech, to show that the model gives a consistent and solid gain.
@robin1001 results on AIShell have been added
Hi, I ran Blockformer on a 3080 and it only used 30-40% of the GPU. I increased the batch size and num_workers, but it didn't help. What should I do to make better use of the GPU?