speech_recognition
Can speech_recognition do forced alignment?
I'd like to get timestamps for each word in my transcript. Is it possible with speech_recognition?
I'm wondering about this too.
Same here. I'm not sure whether it's doable.