fstalign
Filler words
Hi
Do you plan to add a flag to disable filler words (like um, uh)?
We may add that flag eventually, but it is not in our immediate plans. For now, we just remove any unwanted tokens from the transcripts themselves.
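A minimal sketch of that kind of preprocessing, assuming plain-text transcripts with whitespace-separated tokens. The filler list and the `strip_fillers` helper are hypothetical and not part of fstalign; for .nlp files the same filter would have to be applied to the token column instead.

```python
import re

# Hypothetical filler list (an assumption); extend it to match your data.
FILLERS = {"um", "uh"}

def strip_fillers(text, fillers=FILLERS):
    """Drop filler tokens from a whitespace-tokenized transcript."""
    kept = []
    for token in text.split():
        # Compare on a lowercased copy with surrounding punctuation removed,
        # so that "Um," and "uh." are also recognized as fillers.
        core = re.sub(r"^\W+|\W+$", "", token).lower()
        if core not in fillers:
            kept.append(token)
    return " ".join(kept)

print(strip_fillers("Um, so we uh grew revenue"))  # so we grew revenue
```

Punctuation attached to a removed filler is dropped along with it. Apply the same filter to both the reference and the hypothesis so the comparison stays consistent.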
@qmac In the paper (https://arxiv.org/pdf/2104.11348v3.pdf), the reported WER is 11.3. Does this include filler words? Is there a script I can use to reproduce the paper's result from the Rev .nlp output files (https://github.com/revdotcom/speech-datasets/tree/main/earnings21/output/rev)?
@naymaraq Yes, it does include filler words. Let me see if we can find that script.