my-voice-analysis

Pronunciation Scoring

Open dpny518 opened this issue 4 years ago • 1 comments

How is the pronunciation scored without text and alignment?

Witt S.M. and Young S.J. [2000]; “Phone-level pronunciation scoring and assessment for interactive language learning”; Speech Communication, 30 (2000) 95-108.

requires the constrained phone loop

dpny518 · Apr 10 '20 05:04

@yondu22 My-Voice-Analysis and MYprosody are two encapsulated libraries taken from one of our main projects on speech scoring. An early version of the main project employed ASR and used the Hidden Markov Model framework to train simple Gaussian acoustic models for each phoneme and each speaker in the available audio datasets, then calculated the symmetric K-L divergence for every pair of models for each speaker. What you see in these repos is only an approximation of those models: it focuses on fluency rather than on the accuracy of each phoneme.

For the project's machine learning model we considered audio files of speakers who possessed an appropriate degree of pronunciation, either in general or for a specific utterance, word, or phoneme (in effect, they had been rated by expert human graders). The figure below illustrates some of the factors that the expert human graders considered when assigning an overall score.

image

S. M. Witt, 2012, “Automatic error detection in pronunciation training: Where we are and where we need to go.”
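
As a rough illustration of the symmetric K-L divergence step described above, here is a minimal Python sketch. It is not code from My-Voice-Analysis, MYprosody, or the main project: it assumes each phoneme model is a single Gaussian with diagonal covariance, and the phoneme labels, parameter values, and function names (`kl_diag_gaussian`, `symmetric_kl`, `pairwise_symmetric_kl`) are made up for the example.

```python
import math
from itertools import combinations
from typing import Dict, Sequence, Tuple

# A model is (means, variances), one entry per feature dimension.
Gaussian = Tuple[Sequence[float], Sequence[float]]


def kl_diag_gaussian(p: Gaussian, q: Gaussian) -> float:
    """KL(p || q) for two Gaussians with diagonal covariance."""
    mean_p, var_p = p
    mean_q, var_q = q
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        # Closed-form KL for one dimension, summed over dimensions.
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def symmetric_kl(p: Gaussian, q: Gaussian) -> float:
    """Symmetric K-L divergence: KL(p || q) + KL(q || p)."""
    return kl_diag_gaussian(p, q) + kl_diag_gaussian(q, p)


def pairwise_symmetric_kl(
    models: Dict[str, Gaussian],
) -> Dict[Tuple[str, str], float]:
    """Symmetric K-L divergence for every unordered pair of phoneme models."""
    return {
        (a, b): symmetric_kl(models[a], models[b])
        for a, b in combinations(sorted(models), 2)
    }


# Hypothetical per-phoneme models for one speaker (values are invented).
speaker_models: Dict[str, Gaussian] = {
    "AA": ([0.0, 1.0], [1.0, 2.0]),
    "IY": ([1.0, -1.0], [1.5, 0.5]),
    "S": ([3.0, 0.0], [0.8, 1.2]),
}

if __name__ == "__main__":
    for pair, value in pairwise_symmetric_kl(speaker_models).items():
        print(pair, round(value, 3))
```

Under these assumptions, each speaker ends up with one distance per phoneme pair, and those distance tables can then be compared between a learner and well-rated reference speakers. Models trained on real acoustic features would have many more dimensions and possibly full covariance matrices.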

Shahabks · Nov 06 '20 01:11