openspeech
Maybe there is a logic bug in the transformer-transducer decoding part
❓ Questions & Help
Thanks for providing such an excellent codebase. I have a question about the transformer-transducer decoding part. I am not sure whether there is an implicit bug or whether I don't fully understand the transducer framework.
Details
In the `greedy_decode` method in the `transformer_transducer/model.py` module, a `decoder_output` and an `encoder_output` are concatenated at each time step, and the fused vector is then fed to the joint network.
To the best of my knowledge, when the joint network emits the blank symbol, the transducer should only advance to the next acoustic feature embedding while keeping `decoder_output` fixed; the prediction network is updated only when a non-blank label is emitted. Why is `decoder_output` also updated on blank here?
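For reference, the standard transducer greedy decoding loop I have in mind looks roughly like the sketch below. This is not the openspeech API — `decoder_step` and `joint` are hypothetical stand-ins for the prediction network and joint network, and the loop is simplified to emit at most one label per encoder frame:

```python
BLANK = 0  # index of the blank symbol (assumption for this sketch)

def greedy_decode(encoder_outputs, decoder_step, joint):
    """Simplified transducer greedy decoding.

    encoder_outputs: iterable of per-frame encoder representations.
    decoder_step(label): hypothetical prediction-network step; returns
        the decoder output after consuming `label` (None = start token).
    joint(enc, dec): hypothetical joint network; returns the argmax label.
    """
    hypothesis = []
    decoder_output = decoder_step(None)   # start-of-sequence state
    for enc_t in encoder_outputs:         # always advance one frame per step
        label = joint(enc_t, decoder_output)
        if label != BLANK:
            hypothesis.append(label)
            # update the prediction network ONLY on a non-blank emission
            decoder_output = decoder_step(label)
        # on blank: move to the next frame, decoder_output stays fixed
    return hypothesis
```

The point of contention is the `if label != BLANK` guard: in this formulation the decoder state is refreshed only inside that branch, never on the blank path.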
Thanks for taking the time to answer my question.
@hasangchun
@YuXI-Chn I think I missed it. Thanks for letting me know. I'll fix it as soon as possible.
happy to help!