bozden

Results: 108 comments by bozden

Good idea @raivisdejus. I think you are trying to correct this:

> The "?" inside words were caused by an encoding issue during import from old sentence collector, unicode characters...

Hey @JJun-Guo, recordings in Common Voice are currently limited to 10 seconds. Here is a related recent discussion on allowing more: https://discourse.mozilla.org/t/discussion-relaxation-of-the-10-sec-recording-limitation/114142

I need to check this in the code, but off the top of my head, it was 1 sec and then dropped to 0.5... Actually, as it also includes silences, short utterances can easily...

I was wrong. It is 1 sec. 0.5 sec is for the benchmark sentences (numbers, etc.). https://github.com/common-voice/common-voice/blob/3bccdf446f6acd8a9afda1db7a9a1664457e611d/web/src/components/pages/contribution/speak/speak.tsx#L42 But as I stated in the link given in the previous post, state-of-the...
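To make the limits mentioned above concrete, here is a minimal sketch of a duration check. It only uses the numbers quoted in these comments (1 sec minimum, 0.5 sec for benchmark sentences, 10 sec maximum). The constant names, the function `check_duration`, and the inclusive boundaries are assumptions for illustration, not the actual code in `speak.tsx`.

```python
# Limits as quoted in the comments above; names are hypothetical.
MIN_RECORDING_S = 1.0            # regular sentences
MIN_BENCHMARK_RECORDING_S = 0.5  # benchmark sentences (numbers, etc.)
MAX_RECORDING_S = 10.0           # current upper limit


def check_duration(duration_s: float, is_benchmark: bool = False) -> str:
    """Classify a recording duration as 'too_short', 'too_long' or 'ok'.

    Boundary values are treated as valid here, which is an assumption.
    """
    minimum = MIN_BENCHMARK_RECORDING_S if is_benchmark else MIN_RECORDING_S
    if duration_s < minimum:
        return "too_short"
    if duration_s > MAX_RECORDING_S:
        return "too_long"
    return "ok"
```

For example, a 0.7 sec clip would be rejected for a regular sentence but accepted for a benchmark sentence.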

AFAIK, a rule of thumb is to train a model on the kind of data it will see in the wild. For a general-purpose ASR model where the model is subjected to everyday...

If you are working on the cv-sentence-extractor rules (first run): Getting longer sentences is better, I think. It is easier to get shorter sentences from other sources. Once it gets...

@MichaelKohler, can this be made adaptive? I mean, not setting an absolute minimum, but a "requested_minimum"; if the 3 sentences are not found, fill up with shorter ones...
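A minimal sketch of the adaptive "requested_minimum" idea above, under stated assumptions: length is measured in words, and the shorter fill-ins are taken longest first. `pick_sentences` and its parameters are hypothetical names for illustration, not part of cv-sentence-extractor.

```python
from typing import List


def pick_sentences(
    candidates: List[str],
    requested_minimum: int,
    max_per_article: int = 3,
) -> List[str]:
    """Prefer sentences with at least `requested_minimum` words.

    If fewer than `max_per_article` such sentences exist, fill the
    remaining slots with shorter ones, longest first (an assumption).
    """

    def n_words(sentence: str) -> int:
        return len(sentence.split())

    # Sentences meeting the requested minimum, in their original order.
    preferred = [s for s in candidates if n_words(s) >= requested_minimum]
    picked = preferred[:max_per_article]

    # Not enough long sentences: fall back to the shorter ones.
    if len(picked) < max_per_article:
        shorter = sorted(
            (s for s in candidates if n_words(s) < requested_minimum),
            key=n_words,
            reverse=True,
        )
        picked.extend(shorter[: max_per_article - len(picked)])

    return picked
```

With this approach the minimum acts as a preference instead of a hard filter, so an article with only short sentences still contributes up to 3 of them.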

As you know, working on this was on my to-do list, if only I could get really good results... I'll look into this. E.g., sorting sentences by length can help...

Very good point... But this is how it works now, isn't it? So, as of now, if an article has 3 sentences, they are taken if the rules match. One...

As I mentioned above, with the state-of-the-art models and HW advancements, it is better to get longer audio, thus longer texts. A change in this repo towards this goal would...