
Maybe there is a logic bug in the transformer-transducer decoding part

Open YuXI-Chn opened this issue 3 years ago • 3 comments

❓ Questions & Help

Thanks for providing such an excellent codebase. I have a question about the transformer-transducer decoding part. I am not sure whether there is a subtle bug or whether I don't fully understand the transformer-transducer framework.

Details

In the "greedy_decode" method in the transformer_transducer/model.py module, a "decoder_output" and an "encoder_output" are concatenated at each time step, and the fused vector is then fed to the joint network.

As far as I understand, when the output is the "blank" symbol, the transducer should only move on to the next acoustic feature embedding and keep the decoder_output fixed. Why is the decoder_output also updated?
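To make the behaviour I expected concrete, here is a minimal sketch of greedy transducer decoding. It is not the openspeech code: `predict`, `joint`, `BLANK`, `sos_id`, and `max_symbols_per_frame` are hypothetical names I made up for illustration, and the toy lookup table stands in for the real networks. The point is only that the decoder output is recomputed after a non-blank token, while a blank just advances to the next encoder frame.

```python
from typing import Any, Callable, List, Sequence, Tuple

BLANK = 0  # assumed id of the blank symbol (hypothetical)


def greedy_decode(
    encoder_outputs: Sequence[Any],
    predict: Callable[[int, Any], Tuple[Any, Any]],
    joint: Callable[[Any, Any], int],
    sos_id: int = 1,
    max_symbols_per_frame: int = 10,
) -> List[int]:
    """Greedy transducer decoding with stand-in callables.

    predict(token, state) -> (decoder_output, new_state)
    joint(encoder_output, decoder_output) -> most likely token id
    """
    tokens: List[int] = []
    # The decoder output is computed once from the start-of-sequence token.
    decoder_output, state = predict(sos_id, None)

    for encoder_output in encoder_outputs:
        # Cap emissions per frame so a model that never emits blank cannot loop forever.
        for _ in range(max_symbols_per_frame):
            token = joint(encoder_output, decoder_output)
            if token == BLANK:
                # Blank: move to the next frame, decoder_output stays as it is.
                break
            tokens.append(token)
            # Non-blank: only now is the decoder output updated.
            decoder_output, state = predict(token, state)
    return tokens


# Toy stand-ins: the "decoder output" is just the last token, and the joint
# network is a lookup table keyed by (frame, last token), defaulting to blank.
toy_table = {(0, 1): 2, (0, 2): 3, (2, 3): 4}
predict_calls: List[int] = []


def toy_predict(token: int, state: Any) -> Tuple[int, Any]:
    predict_calls.append(token)
    return token, state


def toy_joint(frame: int, last_token: int) -> int:
    return toy_table.get((frame, last_token), BLANK)


hypothesis = greedy_decode([0, 1, 2], toy_predict, toy_joint)
print(hypothesis)          # [2, 3, 4]
print(len(predict_calls))  # 4: once for sos, once per emitted token
```

In this toy run, frame 1 produces only a blank, so `toy_predict` is not called for it and the decoder output from frame 0 is reused at frame 2.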

Thanks for taking the time to answer my question.

YuXI-Chn avatar Nov 04 '21 12:11 YuXI-Chn

@hasangchun

sooftware avatar Nov 04 '21 13:11 sooftware

@YuXI-Chn I think I missed it. Thanks for letting me know. I'll fix it as soon as possible.

upskyy avatar Nov 07 '21 03:11 upskyy

Happy to help!

YuXI-Chn avatar Nov 10 '21 16:11 YuXI-Chn