speechocean762

Low Correlation in Completeness Calculation

Open teinhonglo opened this issue 5 months ago • 0 comments

Hello,

I am currently working on calculating completeness as shown in the documentation (score range: 0.0 – 1.0, representing the percentage of words with good pronunciation). My current approach is as follows:

From score.json, I calculate a word-level accuracy for each utterance: the fraction of its words whose score is greater than a threshold for "good pronunciation" (I tried 7, 8, and 9 points). Then, I compute the Pearson Correlation Coefficient (PCC) between this word-level accuracy and the annotated completeness. Unfortunately, I am getting a very low correlation (around 0.2) with these thresholds, which makes me question whether my validation approach is correct.
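
For concreteness, here is a minimal sketch of the computation described above. It is not my exact script. It assumes each utterance entry has a `completeness` field and a `words` list, and that the per-word score is stored under `accuracy`; these field names, the helper function names, and the toy values are illustrative assumptions, not real data from score.json.

```python
import math


def good_word_ratio(words, threshold, score_key="accuracy"):
    """Fraction of words whose score is strictly greater than `threshold`."""
    if not words:
        return 0.0
    good = sum(1 for w in words if w[score_key] > threshold)
    return good / len(words)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined when one variable is constant.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def completeness_correlation(scores, threshold, score_key="accuracy"):
    """PCC between the per-utterance good-word ratio and completeness."""
    ratios, completeness = [], []
    for utt in scores.values():
        ratios.append(good_word_ratio(utt["words"], threshold, score_key))
        completeness.append(utt["completeness"])
    return pearson(ratios, completeness)


# Toy stand-in for the parsed score.json (made-up values, assumed layout).
toy_scores = {
    "utt1": {"completeness": 1.0, "words": [{"accuracy": 10}, {"accuracy": 9}]},
    "utt2": {"completeness": 0.5, "words": [{"accuracy": 8}, {"accuracy": 4}]},
    "utt3": {"completeness": 0.0, "words": [{"accuracy": 3}, {"accuracy": 2}]},
}

for threshold in (7, 8, 9):
    print(threshold, completeness_correlation(toy_scores, threshold))
```

With the real file, `toy_scores` would be replaced by the dictionary loaded from score.json via `json.load`. The `nan` guard matters in practice: if almost every utterance has the same completeness value, the variance is close to zero and the PCC becomes unstable or undefined.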

My questions are:

1. Is my validation method for calculating word-level accuracy against completeness appropriate?
2. Is there a more natural or standard way to calculate completeness that might yield better results?

Any feedback or guidance on this would be highly appreciated!

Thank you!

teinhonglo, Sep 04 '24 11:09