tedlium char based training scripts

Open xfwu opened this issue 8 years ago • 3 comments

Hi

Thank you very much for the great work!

I've tried the tedlium phone-based scripts and they work fine. Now I am trying the char-based scripts, and from what I saw I think they might not have been tested, right?

local/tedlium_prepare_char_dict.sh produces lexicon1.txt, which breaks <UNK> into < U N K >. As a result, units-nosil.txt ends up with A B C E G.. U [ ] < > as tokens, whereas I assume a special word like <UNK> should be kept together as a single unit.
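
A minimal Python sketch of the behaviour I would expect is below. It is not the actual Eesen script: the names `SPECIAL`, `char_units` and `char_lexicon` are hypothetical, `[NOISE]` is only an illustrative bracketed word, and the rule "anything wrapped in `<...>` or `[...]` is one unit" is an assumption about which words count as special.

```python
import re

# Assumption: a special word is a whole token wrapped in <...> or [...].
SPECIAL = re.compile(r"^(<[^<>\s]+>|\[[^\[\]\s]+\])$")


def char_units(word):
    """Return the unit sequence for one lexicon entry."""
    if SPECIAL.match(word):
        # Keep special words such as <UNK> as a single unit.
        return [word]
    # Ordinary words are spelled out character by character.
    return list(word)


def char_lexicon(words):
    """Map each word to its list of units."""
    return {word: char_units(word) for word in words}


if __name__ == "__main__":
    for word, units in char_lexicon(["<UNK>", "[NOISE]", "CAT"]).items():
        # One lexicon line per word: the word followed by its units.
        print(word, " ".join(units))
```

With a rule like this, the unit inventory would contain `<UNK>` itself rather than the separate symbols `<` and `>`.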

I noticed that wsj/run_ctc_char.sh is a bit different from /tedlium/v1/run_ctc_char.sh. Have the tedlium scripts been tested, and can I fall back on the wsj script to train a proper char-based system?

Thanks a million!

Best

xfwu avatar Apr 27 '16 08:04 xfwu

The previous tedlium char recipe does not work. I just made changes. It should work now. I am validating it on my side.

yajiemiao avatar Apr 27 '16 11:04 yajiemiao

Thank you very much for the quick response. I'll also test it today or tomorrow. Please let me know the results!

Best

xfwu avatar Apr 28 '16 07:04 xfwu

Any results from your end @xfwu ? (Yajie's results and response will not be available)

riebling avatar Dec 08 '16 18:12 riebling