wenet blockformer

blockformer

Open LeonWlw opened this issue 2 years ago • 3 comments

This PR implements Blockformer in WeNet. (Original paper: https://arxiv.org/abs/2207.11697)

Implementation Details
- Add an SE layer that ensembles the Conformer encoder outputs
- Add an SE layer that ensembles the Transformer decoder outputs
- Use relative positional encoding in the decoder
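
The SE-based ensembling in the first two bullets can be sketched roughly as follows. This is a minimal, framework-agnostic NumPy illustration, not the code in this PR: it assumes that "SE" means squeeze-and-excitation, that the squeeze step averages each block's output over the time and feature axes, and that the excitation step is a two-layer bottleneck with a sigmoid gate. The function name `se_weighted_sum`, the shapes, and the random parameters are all hypothetical.

```python
import numpy as np


def se_weighted_sum(block_outputs, w1, b1, w2, b2):
    """Fuse per-block outputs with squeeze-and-excitation style weights.

    block_outputs: array of shape (num_blocks, num_frames, dim)
    w1, b1: first FC layer, shapes (num_blocks, hidden) and (hidden,)
    w2, b2: second FC layer, shapes (hidden, num_blocks) and (num_blocks,)
    Returns (fused, weights) with shapes (num_frames, dim) and (num_blocks,).
    """
    # Squeeze (assumed form): one scalar per block, the mean over frames and features.
    squeezed = block_outputs.mean(axis=(1, 2))
    # Excitation (assumed form): bottleneck layer with ReLU, then a sigmoid gate.
    hidden = np.maximum(squeezed @ w1 + b1, 0.0)
    weights = 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))
    # Weighted sum over the block axis.
    fused = np.tensordot(weights, block_outputs, axes=(0, 0))
    return fused, weights


# Illustrative sizes and random parameters only.
rng = np.random.default_rng(0)
num_blocks, num_frames, dim, hidden_dim = 12, 50, 256, 4
block_outputs = rng.standard_normal((num_blocks, num_frames, dim))
w1 = rng.standard_normal((num_blocks, hidden_dim)) * 0.1
b1 = np.zeros(hidden_dim)
w2 = rng.standard_normal((hidden_dim, num_blocks)) * 0.1
b2 = np.zeros(num_blocks)

fused, weights = se_weighted_sum(block_outputs, w1, b1, w2, b2)
print(fused.shape, weights.shape)
```

The fused tensor has the same shape as a single block output, so under these assumptions it could stand in for the last block's output wherever the rest of the model consumes it.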

On the main branch, features extracted with torchaudio give slightly worse results than those reported in the paper. I will push a branch that uses Kaldi features for the AIShell recipe, which can reproduce the paper's results.

Oct 19 '22 02:10 LeonWlw

I think it's better if we add the experiment results on AIShell-1 and LibriSpeech, to show that we can get consistent and solid gain by using the model.

Oct 19 '22 02:10 robin1001

> I think it's better if we add the experiment results on AIShell-1 and LibriSpeech, to show that we can get consistent and solid gain by using the model.

@robin1001 Results on AIShell have been added.

Oct 19 '22 12:10 LeonWlw

Hi, I ran Blockformer on a 3080 and it only used 30%-40% of the GPU. I increased the batch size and the number of workers, but it didn't help. What should I do to make better use of the GPU?

Dec 11 '22 09:12 903859154
