verifyml
upload speech to word example notebook set
This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.
🔍 Inspect: https://vercel.com/cylynx/verifyml/2FokXGiD7caN9Yh14f75XHhx7mrJ
✅ Preview: https://verifyml-git-speech-example-nb-cylynx.vercel.app
The test results of your model card are automatically generated with VerifyML! 🎉
📜 Test Result Summary
| Type of Tests | Pass | Fail |
|---|---|---|
| Explainability Analysis | 0 | 0 |
| Quantitative Analysis | 0 | 0 |
| Fairness Analysis | 3 | 0 |
🔍 Inspect: Breast Cancer Wisconsin (Diagnostic) Dataset
🚨 A public repository is required to use the Model Card Viewer.
Some questions about this data point from here:
- How was the match count of 5 derived? It looks like there are 7 matches (everything except the 'two', which was classified as 'to').
- The output split column seems to have an additional empty string; would that affect the match count?
- Even though 'three' appears twice, it won't be double-counted, right?
On a slightly related note, single digits were converted from numbers to words in your notebook. Were there any larger numbers involved? E.g., did any participant read something like 'one hundred' while Google's model returned '100' instead?
I did an intersection count between the two sets of words (e.g. len(setA.intersection(setB))). So in the example, 'three' is spoken twice and the model got it right both times, but I will just count it as 1 match. The assumption here is that the model will always correctly transcribe a 'three', which holds in this example but may not be true for all cases. Also, spotted a mistake: the truth count in the above example should be 6, not 8; the match count will still stand as 5. It's supposed to be a unique count, will change it.
Also, there is no ordering in my counting logic. Say the truth is 'i went to sleep' and the prediction is 'sleep to went i': the match count will be 4 out of 4, but such cases are close to impossible in practice. If the model transcribes it as 'i went two o sleep', it will be 3 out of 4.
The empty string only exists in the prediction set and does not add to the match count.
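Roughly what I mean in code (a minimal sketch with illustrative variable names, not the notebook's actual code):

```python
truth = "i went to sleep"
prediction = "i went two o sleep"

truth_set = set(truth.split())      # unique ground-truth words
pred_set = set(prediction.split())  # unique predicted words

# set() deduplicates, so a word spoken twice (like 'three') counts once,
# and an empty string in the prediction never matches a truth word.
# Sets are also unordered, so word order doesn't affect the count.
match_count = len(truth_set.intersection(pred_set))

print(f"{match_count} / {len(truth_set)}")  # -> 3 / 4 for this pair
```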
And yeah, there are a few participants who say it in 'hundreds' or 'millions', but my digit converter converts word for word, so that's another nuance... Most are given long chunks of digits to recite, though.
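E.g. something like this hypothetical digit-by-digit converter (not the notebook's actual one), which has no place-value handling:

```python
DIGIT_TO_WORD = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}

def digits_to_words(text: str) -> str:
    out = []
    for token in text.split():
        if token.isdigit():
            # spell each numeral separately: '100' -> 'one zero zero'
            out.append(" ".join(DIGIT_TO_WORD[ch] for ch in token))
        else:
            out.append(token)
    return " ".join(out)

print(digits_to_words("model returned 100"))
# -> 'model returned one zero zero', which won't match a spoken 'one hundred'
```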
Ok, was looking for the set intersection bit, LGTM! The MinMaxMetricThreshold might not be very applicable in this case since the threshold is selected arbitrarily (I think)? But if it's just for the purposes of an example, it should be ok.
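For reference, roughly how I'd expect that test to be set up; the argument names here are my assumption from the VerifyML examples, so double-check against the actual API:

```python
from verifyml.model_tests.FEAT import MinMaxMetricThreshold

# Assumed usage sketch: bound a metric (e.g. false positive rate) per
# subgroup of a protected attribute. The 0.025 threshold is an arbitrary
# placeholder, which is exactly the concern above.
fpr_test = MinMaxMetricThreshold(
    attr="gender",    # hypothetical protected-attribute column
    metric="fpr",     # metric to check against the threshold
    threshold=0.025,  # arbitrary cut-off for this example
)
# fpr_test.run(df_test_with_output)  # df with truth/prediction columns (not shown)
```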
As discussed, let's modify the overview section to mention that we are evaluating Google's speech-to-text model. Thanks.