Shinji Watanabe comments

Results: 318 comments of Shinji Watanabe

Inference with Ngram

Thanks a lot. I recommend building an n-gram language model that uses the same vocabulary as the ASR model. You can process the LibriSpeech text with the ASR BPE...
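The comment above is cut off, but the idea it points at can be sketched roughly as follows: tokenize every line of the LM training text with the same tokenizer the ASR model uses, so that the n-gram model is trained over the same units. In this sketch `tokenize_corpus` and `toy_encode` are hypothetical names, and `toy_encode` is only a character-level stand-in for the real BPE model; in practice one would presumably pass the ASR model's own BPE encoder as `encode`.

```python
from typing import Callable, Iterable, Iterator, List


def tokenize_corpus(
    lines: Iterable[str],
    encode: Callable[[str], List[str]],
) -> Iterator[str]:
    """Yield one space-separated token string per non-empty input line.

    `encode` is assumed to be the ASR model's tokenizer, so the resulting
    text shares the ASR vocabulary and can be fed to an n-gram toolkit.
    """
    for line in lines:
        text = line.strip()
        if text:  # skip blank lines so the n-gram corpus has no empty sentences
            yield " ".join(encode(text))


def toy_encode(text: str) -> List[str]:
    """Stand-in tokenizer (NOT real BPE): characters, with a word-start mark."""
    pieces: List[str] = []
    for word in text.split():
        pieces.append("\u2581" + word[0])  # "▁" marks the start of a word
        pieces.extend(word[1:])
    return pieces


if __name__ == "__main__":
    for out in tokenize_corpus(["HELLO WORLD", ""], toy_encode):
        print(out)
```

Keeping the tokenizer as a parameter means the same loop works unchanged once the toy encoder is swapped for the real one.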

Inference with Ngram

I think so! Glad to hear that you successfully made it work.

Slurp 2-pass model

This requires a high-level task/template design consideration. It also changes some basic ASR tasks and templates. Can you first summarize what this new function could do and what kind of...

Development plan for ESPnet2 speech enhancement

@LiChenda, please reflect https://github.com/espnet/espnet/pull/2212 in `enh.sh`

Development plan for ESPnet2 speech enhancement

@kamo-naoyuki, thanks for taking care of it!

Development plan for ESPnet2 speech enhancement

Sure, we're definitely interested in it. Thanks for reaching out! Can you contact me by email ([email protected])?

Development plan for ESPnet2 speech enhancement

> I'll be happy to integrate v2 version of our paper: https://arxiv.org/abs/2006.07637

Sounds good. We're now further refactoring our code and adding several examples. So, you can wait for our...

Development plan for ESPnet2 speech enhancement

Thanks! We (me, @LiChenda, @Emrys365, and @shincling) will discuss how we can integrate your efforts.

Add installer for ParallelWaveGAN

FYI, @kan-bayashi, it is still >2.0s: https://github.com/espnet/espnet/runs/5158915826?check_suite_focus=true#step:8:334
If it happens often, we may need to reduce the time further.

Very poor performance when I used Speech2Text

I suggest you cut the audio into shorter segments using VAD (voice activity detection). We do not expect such long audio inputs.
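As a rough illustration of that suggestion, here is a minimal energy-threshold segmenter. It is a deliberately simple stand-in, not the VAD used by ESPnet or any particular toolkit, and the function name `energy_vad_segments`, the 20 ms frame size, the RMS threshold, and the 200 ms minimum silence are all assumptions chosen for the example.

```python
from typing import List, Sequence, Tuple


def energy_vad_segments(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: int = 20,
    threshold: float = 0.01,
    min_silence_ms: int = 200,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of regions judged to be speech.

    A frame counts as speech when its RMS energy is >= `threshold`.
    A segment is closed once `min_silence_ms` of consecutive silence follows it.
    """
    frame_len = max(1, sample_rate * frame_ms // 1000)
    min_silence_frames = max(1, min_silence_ms // frame_ms)
    n_frames = (len(samples) + frame_len - 1) // frame_len

    segments: List[Tuple[int, int]] = []
    start = None      # sample index where the current segment began
    silence_run = 0   # consecutive silent frames seen inside a segment

    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        rms = (sum(x * x for x in frame) / len(frame)) ** 0.5
        if rms >= threshold:
            if start is None:
                start = i * frame_len
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run >= min_silence_frames:
                # End the segment at the first frame of the silent run.
                segments.append((start, (i - silence_run + 1) * frame_len))
                start = None
                silence_run = 0

    if start is not None:
        # Audio ended mid-segment: drop any trailing silent frames.
        end = min(len(samples), (n_frames - silence_run) * frame_len)
        segments.append((start, end))
    return segments
```

Each returned `(start, end)` pair can then be sliced out of the waveform and decoded separately, which keeps every input to the recognizer short.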