Yifan Peng

29 comments by Yifan Peng

Hi @lazykyama Thanks for the investigation. In ESPnet2, did you increase the `batch_bins` with more GPUs? The config is different from that in ESPnet1, as described below: https://espnet.github.io/espnet/espnet2_training_option.html#the-relation-between-mini-batch-size-and-number-of-gpus In ESPnet2,...

Thanks @lazykyama for your new investigation. The batch size issue seems quite common. @sw005320 I have updated the three docs and made a PR here: https://github.com/espnet/espnet/pull/4436

Thanks for the great PR! I didn't look into the algorithm itself, but I made a few comments about the `doc` and `init` just now. I think it is already...

I have got two questions. 1. Does it support GPU inference? 2. Does it support automatic mixed precision training with `use_amp: true`? For LibriSpeech, I'm increasing the nonstreaming model size...

> > For LibriSpeech, I'm increasing the nonstreaming model size to 120M and extending the number of epochs to 60. > > Does the model need to be so large...

FYI, if we upgrade to a newer version, this warning will be gone, as it has been fixed in the whisper package.

I see. I do not know if there are any other conflicts.

I didn't check it carefully. Why does `tiktoken` affect our code? Do we use it?

Hi, thanks for the question! For LibriSpeech, I do not use the standard segmented version. Instead, I used the "original-mp3". I believe this is released along with the segmented version....

I have a high-level question about the design: Should we add more components into the `asr` task? Recently I've been feeling that the ASR task is becoming more complicated, but some...