wenet

Can I get timestamp info by GPU inference?

Open JaheimLee opened this issue 3 years ago • 5 comments

I found timestamp info here, but it's only for CPU. Is it possible to get timestamps from GPU inference, for example by using your Docker server?

JaheimLee avatar Jul 05 '22 03:07 JaheimLee

@yuekaizhang is it possible?

robin1001 avatar Jul 05 '22 14:07 robin1001

Yes, it's possible to add timestamps. GPU inference currently uses this ctc_decoder, which would need to be modified to add timestamps. Alternatively, we could replace it with https://pytorch.org/audio/main/models.decoder.html, which already supports timestamps.
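
To illustrate the idea, here is a minimal sketch of how timestamps can be read off CTC output. It uses greedy (best-path) decoding, not the beam search of the ctc_decoder or torchaudio decoders mentioned above. The function name `ctc_greedy_timestamps`, the blank id of 0, and the 40 ms per output frame are all assumptions for illustration and are not taken from the wenet code.

```python
from typing import List, Tuple


def ctc_greedy_timestamps(
    frame_scores: List[List[float]],
    blank_id: int = 0,            # assumption: index 0 is the CTC blank
    frame_shift_ms: float = 40.0,  # assumption: duration of one output frame
) -> List[Tuple[int, float]]:
    """Return (token_id, start_time_ms) pairs from frame-level CTC scores."""
    results: List[Tuple[int, float]] = []
    prev = blank_id
    for t, frame in enumerate(frame_scores):
        # Pick the highest-scoring label for this frame (best path).
        best = max(range(len(frame)), key=frame.__getitem__)
        # CTC collapse rule: emit a token only when it is not blank and
        # differs from the label chosen for the previous frame.
        if best != blank_id and best != prev:
            results.append((best, t * frame_shift_ms))
        prev = best
    return results
```

Each emitted token is stamped with the first frame on which it wins, so a beam search decoder would need to carry the same per-token frame index along each hypothesis to report timestamps.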

I could implement it next week.

@Mddct will the wenet Python bindings provide a Python API for ctc_prefix_beam_search?

yuekaizhang avatar Jul 05 '22 15:07 yuekaizhang

@yuekaizhang The Python binding is for the convenience of using wenet. Could we bind wenet's CTC prefix search in another repo to replace ctcdecoder? I've implemented a stateless binding before, which may be helpful: https://github.com/Mddct/losses/blob/main/src/ctc_decoder.h

But introducing this API would complicate the binding.

@robin1001

Mddct avatar Jul 05 '22 15:07 Mddct

The binding is just for CPU, and it is too heavy for this task. I prefer to use torchaudio's built-in decoder.

robin1001 avatar Jul 06 '22 01:07 robin1001

@yuekaizhang Please note this issue: https://github.com/parlance/ctcdecode/issues/148

pengzhendong avatar Jul 07 '22 07:07 pengzhendong

This seems solved, so I'm closing this issue.

xingchensong avatar Feb 26 '23 03:02 xingchensong