espnetUser comments

Results 14 comments of espnetUser

Issue with long audio files and streaming models

@D-Keqi: Thanks for your helpful reply! I just had time to come back to this issue. > Can you show me your configuration of streaming decoding? ``` $ cat conf/tuning/decode_asr_streaming.yaml...

Issue with long audio files and streaming models

@D-Keqi and @sw005320: Could you please clarify the following questions regarding streaming: **1. Impact of repetition detection** In https://arxiv.org/abs/2006.14941 repetition detection was found to be an important part of the...

Issue with long audio files and streaming models

Thank you @eml914 for your helpful response and my apologies for the late reply! > The current streaming decoding cannot afford very long audio because the decoder still uses entire...

Issue with long audio files and streaming models

One last question where I would really appreciate your comment. With respect to the contextual_block_conformer could you please clarify which of these parameters in the streaming config affects the "built-in"...

Issue with long audio files and streaming models

@eml914: Thank you very much for the clarification!

Issue with long audio files and streaming models

@eml914: Sorry to bother you again. Could you please help me understand the relationship between the following streaming config parameters of the block conformer encoder specified in the espnet configs...

Issue with long audio files and streaming models

Ah, thanks! I somehow missed that ```hop_size = N_l```. My current config is this: ``` block_size: 40 # streaming configuration hop_size: 16 # streaming configuration look_ahead: 8 # streaming configuration...
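To make the numbers in this config concrete, here is a minimal sketch of how the three values might divide one encoder block. It is not taken from the ESPnet code. It assumes the split `block_size = N_l + N_c + N_r` with `hop_size = N_c` and `look_ahead = N_r` (the reading used later in this thread), and a hypothetical 40 ms per encoder frame (10 ms frame shift with 4x subsampling). The helper `block_layout` and its `frame_ms` parameter are illustrative names only.

```python
def block_layout(block_size, hop_size, look_ahead, frame_ms=40.0):
    """Split one encoder block into past / center / future frames.

    Assumption (not confirmed by the source): block_size = N_l + N_c + N_r,
    with hop_size = N_c and look_ahead = N_r.
    frame_ms is a hypothetical duration of one encoder frame.
    """
    if hop_size <= 0 or look_ahead < 0:
        raise ValueError("hop_size must be positive and look_ahead non-negative")
    past = block_size - hop_size - look_ahead  # N_l under the assumed split
    if past < 0:
        raise ValueError("hop_size + look_ahead exceeds block_size")
    return {
        "past_frames": past,           # N_l
        "center_frames": hop_size,     # N_c
        "future_frames": look_ahead,   # N_r
        "hop_ms": hop_size * frame_ms,
        "look_ahead_ms": look_ahead * frame_ms,
        "block_ms": block_size * frame_ms,
    }


# Values from the config quoted above.
layout = block_layout(block_size=40, hop_size=16, look_ahead=8)
for key, value in layout.items():
    print(f"{key}: {value}")
```

Under these assumptions, this config has `N_l` and `N_c` both equal to 16, so either reading of `hop_size` gives the same value here. The millisecond figures depend entirely on the assumed frame duration and should be recomputed for the actual frontend and subsampling settings.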

Issue with long audio files and streaming models

Thanks @eml914 for your clarification and recommendations on `hop_size=N_c`, which of course makes sense. I am running several training runs with different configs which essentially decrease look-ahead and center context...

Streaming ASR model latency issue

Hi @DinnoKoluh, the larger latency you see for the first chunk might come from the block processing done by the streaming Conformer encoder as it needs to fill the entire...

Streaming ASR model latency issue

> My guess for the increase in latency is that for each update of the transcription I get the whole transcription back instead of just the update. I am not...