espnetUser

14 comments by espnetUser

@D-Keqi: Thanks for your helpful reply! I just had time to come back to this issue.

> Can you show me your configuration of streaming decoding?

```
$ cat conf/tuning/decode_asr_streaming.yaml...
```

@D-Keqi and @sw005320: Could you please clarify the following questions regarding streaming:

**1. Impact of repetition detection**

In https://arxiv.org/abs/2006.14941 repetition detection was found to be an important part of the...

Thank you @eml914 for your helpful response, and my apologies for the late reply!

> The current streaming decoding cannot afford very long audio because the decoder still uses entire...

One last question on which I would really appreciate your comment. With respect to the `contextual_block_conformer`, could you please clarify which of these parameters in the streaming config affects the "built-in"...

@eml914: Thank you very much for the clarification!

@eml914: Sorry to bother you again. Could you please help me understand the relationship between the following streaming config parameters of the block conformer encoder specified in the espnet configs...

Ah, thanks! I somehow missed that `hop_size = N_l`. My current config is this:

```
block_size: 40 # streaming configuration
hop_size: 16 # streaming configuration
look_ahead: 8 # streaming configuration
...
```

Thanks @eml914 for your clarification and recommendations on `hop_size=N_c`, which of course makes sense. I am running some training runs with different configs which essentially decrease look-ahead and center context...
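For anyone following the parameter discussion, here is a minimal sketch of how the three config values might map onto the left/center/right parts of a block. It assumes `block_size = N_l + N_c + N_r`, `hop_size = N_c` and `look_ahead = N_r`, which is my reading of this thread and not something checked against the espnet source. The names `BlockLayout` and `split_block` are made up for illustration and are not part of espnet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockLayout:
    """Frame counts per block part (hypothetical helper, not an espnet class)."""

    left: int    # N_l: past context frames
    center: int  # N_c: frames newly emitted per block
    right: int   # N_r: look-ahead frames


def split_block(block_size: int, hop_size: int, look_ahead: int) -> BlockLayout:
    """Derive N_l, N_c, N_r under the assumed relation
    block_size = N_l + N_c + N_r, hop_size = N_c, look_ahead = N_r."""
    if hop_size <= 0 or look_ahead < 0:
        raise ValueError("hop_size must be positive and look_ahead non-negative")
    left = block_size - hop_size - look_ahead
    if left < 0:
        raise ValueError("hop_size + look_ahead must not exceed block_size")
    return BlockLayout(left=left, center=hop_size, right=look_ahead)


if __name__ == "__main__":
    # Values taken from the config quoted in this thread.
    print(split_block(block_size=40, hop_size=16, look_ahead=8))
```

Under these assumptions the config with `block_size: 40`, `hop_size: 16`, `look_ahead: 8` gives 16 left, 16 center and 8 look-ahead frames, counted in encoder frames after subsampling. Converting that to milliseconds depends on the frontend frame shift and subsampling factor, which are not shown here.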

Hi @DinnoKoluh, the larger latency you see for the first chunk might come from the block processing done by the streaming Conformer encoder as it needs to fill the entire...

> My guess for the increase in latency is that for each update of the transcription I get the whole transcription back instead of just the update. I am not...