
Add other languages

Open ujagaga opened this issue 3 years ago • 2 comments

Hi! Would you consider adding the Serbian language to the dataset? I am interested in contributing my own voice and as many other voices as I can gather. I suppose this would also be simpler to accomplish if we could collect audio online using an automated website.

ujagaga, Aug 11 '21 09:08

Why do you want to use the Serbian language?

Jakobovski, Aug 11 '21 09:08

Why do you want to use the Serbian language?

Because it is my native language and my older relatives do not speak English well. I intend to collect my own samples, so I just deployed a website to collect samples in Serbian. So far I have shared it with a specific group of Facebook friends, but soon I will ask others to join, so I hope to gather a decent number of samples.

https://audiosampler.herokuapp.com/

I adjusted the website code so it can be used for any language and uploaded it to GitHub:

https://github.com/ujagaga/audioSampler

So if you reference it here, perhaps the audio repository can grow to cover other languages too. My goal is to train a personal assistant for offline speech-to-text and custom command execution in Serbian.

ujagaga, Aug 12 '21 15:08