Dinghao Zhou

Results 114 comments of Dinghao Zhou

For binding, there are 3 questions: 1. Get the model by language type, so can we supply a small and a big model for each language? (the small model could be trained with kd...

More data augmentation like RIR: https://github.com/pytorch/audio/issues/2624. torchaudio will add multi-channel RIR based on pyroomacoustics.
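As a rough illustration of what RIR augmentation does, here is a minimal sketch that convolves a dry signal with a room impulse response. It is plain Python for self-containment and does not use the torchaudio or pyroomacoustics APIs. The function names `synthetic_rir` and `apply_rir`, the decay constant, and the toy impulse response are all assumptions for illustration; a real pipeline would use measured or simulated RIRs.

```python
import random


def synthetic_rir(length, decay=0.8, seed=0):
    """Toy impulse response (hypothetical): a unit direct path followed by
    exponentially decaying random echoes. Not a physical room model."""
    rng = random.Random(seed)
    rir = [1.0]
    for n in range(1, length):
        rir.append(0.5 * rng.uniform(-1.0, 1.0) * (decay ** n))
    return rir


def apply_rir(signal, rir):
    """Convolve `signal` with `rir`, truncate to the input length, and
    rescale so the output peak matches the input peak."""
    if not signal:
        return []
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue  # zero samples contribute nothing
        for j, h in enumerate(rir):
            if i + j >= len(signal):
                break  # drop the reverb tail beyond the input length
            out[i + j] += s * h
    peak_in = max(abs(x) for x in signal)
    peak_out = max(abs(x) for x in out)
    if peak_out > 0.0:
        out = [x * peak_in / peak_out for x in out]
    return out


# Usage: add reverberation to a short dry signal.
dry = [0.0, 1.0, 0.0, -0.5, 0.0, 0.0, 0.0, 0.0]
wet = apply_rir(dry, synthetic_rir(4))
```

Truncating to the input length keeps labels and frame counts aligned with the clean utterance, which is a design choice for training-time augmentation rather than a requirement.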

- [espnet optimize](https://github.com/espnet/espnet/blob/a672fe65030a7d9424465b2027019c906ae35fe1/espnet2/asr_transducer/beam_search_transducer.py) thanks @[b-flo](https://github.com/b-flo)
- [Sequence Transduction with Recurrent Neural Networks](https://arxiv.org/pdf/1211.3711.pdf)
- [Alignment-Length Synchronous Decoding for RNN Transducer](https://ieeexplore.ieee.org/document/9053040)
- [Accelerating RNN Transducer Inference via One-Step Constrained Beam Search](https://arxiv.org/pdf/2002.03577.pdf)
-...

> Hi @Mddct,
>
> For alignment-length synchronous decoding (ALSD), time-synchronous decoding (TSD) and modified Adaptive Expansion Search (mAES) in ESPnet, please refer to [this](https://github.com/espnet/espnet/blob/a672fe65030a7d9424465b2027019c906ae35fe1/espnet2/asr_transducer/beam_search_transducer.py). The version you linked is...

If we want to use one-step decoding in the inference stage, can we try the implementation of this RNN-T loss later? [paper](https://arxiv.org/abs/1909.12415) [implementation](https://github.com/csukuangfj/optimized_transducer)

You need to make the decoder support copying, like the feature pipeline etc. By the way, the model can be multi-threaded.

You can get the PPG from the encoder layer.
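A minimal sketch of the idea, assuming the encoder output has already been projected to per-frame logits over phonetic or token units (for example by a CTC output layer; that projection is an assumption, not something stated in the comment). A PPG is then just a frame-wise softmax. The function name `frame_posteriors` is hypothetical.

```python
import math


def frame_posteriors(logits):
    """Turn per-frame logits (a list of frames, each a list of floats) into
    per-frame posterior distributions with a numerically stable softmax."""
    ppg = []
    for frame in logits:
        m = max(frame)  # subtract the max to avoid overflow in exp
        exps = [math.exp(x - m) for x in frame]
        total = sum(exps)
        ppg.append([e / total for e in exps])
    return ppg


# Usage: two frames over three hypothetical units.
posteriors = frame_posteriors([[2.0, 0.5, -1.0], [0.0, 0.0, 3.0]])
```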

> > @yuekaizhang is it possible?
>
> Yes, it's possible to add timestamps. Currently GPU inference uses this [ctc_decoder](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/third_party/ctc_decoders), which needs to be modified to add timestamps. Or we could...

I tried some environment variables, but none of them worked. Maybe we need to recompile libtorch without MKL.

@lvzhiqiang Sorry, I've been busy recently and haven't tried to solve it yet.