
How to filter annotations (transcripts) and choose a suitable corpus for ASR

Open AI-X-King opened this issue 2 years ago • 0 comments

I plan to train an ASR model using my own data with wenetspeech in egs. I want to know how annotation (transcript) quality and the type of corpus, such as conference conversations versus read articles, affect the model. With that, I can keep the corpora that help and delete the badly annotated data that seriously harms the model. Other factors that influence the result are welcome too. Thanks!

AI-X-King • Apr 14 '23 07:04