
Reformer Model for Speech Recognition

Open stefan-falk opened this issue 4 years ago • 5 comments

Coming from tensor2tensor, I was wondering whether the Reformer model would also be a candidate for speech recognition. Looking at the examples, there is none for ASR.

Would it be possible to train an ASR model with the Reformer as it is, or would code changes be necessary? If changes are needed, can we estimate how much of the model implementation would have to change?

Thank you for any insight into this!

stefan-falk avatar Apr 09 '20 09:04 stefan-falk

I believe Reformer (especially with SRU as the feed-forward layer, which is already an hparam) should make a nice ASR model. I haven't had time to work on it yet, but it'd be great to try!

lukaszkaiser avatar Apr 27 '20 17:04 lukaszkaiser

I don't think many changes are needed in terms of the model, but the input pipeline may need some thought. I believe that just feeding bytes could work, but it needs experimentation to confirm.

lukaszkaiser avatar Apr 27 '20 17:04 lukaszkaiser
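
For illustration only, the "just feeding bytes" idea from the comment above could look roughly like the sketch below. It is not taken from this thread or from any library: the function names, the two reserved ids (0 for padding, 1 for end-of-sequence), and the 16-bit PCM input format are all assumptions. It only shows how raw audio bytes and transcript bytes could be turned into integer token sequences that a sequence model might consume.

```python
import math
import struct
from typing import List

# Assumption: ids 0 and 1 are reserved (0 = padding, 1 = end-of-sequence),
# so every byte value is shifted up by 2. This is a hypothetical convention.
NUM_RESERVED_IDS = 2
EOS_ID = 1


def audio_bytes_to_tokens(pcm_bytes: bytes) -> List[int]:
    """Map each raw audio byte to one token id and append EOS."""
    return [b + NUM_RESERVED_IDS for b in pcm_bytes] + [EOS_ID]


def text_to_tokens(text: str) -> List[int]:
    """Map each UTF-8 byte of the transcript to one token id and append EOS."""
    return [b + NUM_RESERVED_IDS for b in text.encode("utf-8")] + [EOS_ID]


def make_sine_pcm16(num_samples: int, freq_hz: float = 440.0,
                    sample_rate: int = 16000) -> bytes:
    """Synthesize a short sine wave as little-endian 16-bit PCM (stand-in for real audio)."""
    samples = [
        int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(num_samples)
    ]
    return struct.pack("<%dh" % num_samples, *samples)


# 160 samples = 10 ms at 16 kHz, i.e. 320 bytes of 16-bit audio.
pcm = make_sine_pcm16(160)
inputs = audio_bytes_to_tokens(pcm)    # one token per audio byte, plus EOS
targets = text_to_tokens("hello")      # one token per transcript byte, plus EOS
```

Note that at 16 kHz and 2 bytes per sample, one second of audio becomes 32,000 input tokens, so whether a byte-level input is practical depends on how well the model copes with very long sequences. That is the part that would need experimentation.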

It would be interesting to see. I guess translation (or text-to-text problems in general) could be tested as well?

stefan-falk avatar Apr 28 '20 06:04 stefan-falk

@stefan-falk Have you managed to try the ASR problem?

RegaliaXYZ avatar Jul 29 '20 07:07 RegaliaXYZ

@RegaliaXYZ I haven't tried it yet, sorry.

stefan-falk avatar Aug 12 '20 06:08 stefan-falk