non english speech transformed to weird language

Open BahzBeih opened this issue 1 year ago • 11 comments

Peace. Non-English speech gets transformed into a weird language; I think it only works with English speech right now.

BahzBeih avatar Dec 15 '23 23:12 BahzBeih

Experienced the same with Spanish audio. Sounds kinda German after denoising it.

karen-pal avatar Dec 16 '23 04:12 karen-pal

> Experienced the same with Spanish audio. Sounds kinda German after denoising it.

I faced the problem with Arabic, and I have the same problem with Adobe's online audio enhance tool.

BahzBeih avatar Dec 16 '23 05:12 BahzBeih

The current model is mainly trained on English datasets and may not work as well with other languages. We hope to expand its language support in the future, and contributions are always welcome.

enhuiz avatar Dec 18 '23 06:12 enhuiz

@enhuiz Are those English datasets available anywhere?

peili avatar Dec 18 '23 08:12 peili

@enhuiz I'd like to help and contribute with other language models as well. Can you provide datasets as a reference?

wolfgang-wp avatar Dec 18 '23 22:12 wolfgang-wp

Hello @enhuiz and @ZohaibAhmed,

I've been following the discussion on the challenges faced with non-English audio processing using the resemble-enhance tool. Like others here, I attempted to train a model using German language samples. However, without adequate reference datasets or examples, the training process did not yield a usable model checkpoint (.pt).

The model's performance with German language samples was suboptimal, leading to outcomes that were not practically usable. This experience aligns with what others have reported regarding Spanish and Arabic audio processing. It seems evident that the current model's training and optimization are heavily skewed towards English datasets.

I am keen on contributing to the enhancement of the tool for better performance with non-English languages, particularly German. Any guidance on accessing suitable datasets or reference models that have been effectively trained on non-English languages would be highly beneficial. The availability of such resources would greatly aid in developing more robust and language-inclusive models.

Thank you for your efforts in creating this tool, and I look forward to any possibility of collaboration or contribution towards its improvement in handling diverse languages.

anrice avatar Dec 19 '23 19:12 anrice

Hello! Same problem here with French. Are you familiar with Mozilla's Common Voice initiative? You could use it to train the model on other languages :)

xylphe avatar Dec 27 '23 16:12 xylphe

> Hello! Same problem here with French. Are you familiar with Mozilla's Common Voice initiative? You could use it to train the model on other languages :)

Nice solution, it could do the trick! However, Common Voice is poorly supervised, and using degraded samples to train the enhance stage might be a problem. Does anyone know whether high audio quality is essential for training the enhance system?

4lvrz avatar Feb 23 '24 11:02 4lvrz
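On the audio-quality question above: enhancement models are commonly trained on pairs where a clean recording is the target and a degraded copy is synthesized from it. Under that assumption (not confirmed here as this repo's exact recipe), the quality of the clean side matters, because any noise already present in the "clean" clip becomes part of the target. A minimal sketch of the pairing step, using NumPy only; `mix_at_snr` and `measure_snr_db` are hypothetical helper names, and the sine wave and Gaussian noise stand in for real speech and noise clips:

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so that the mixture has the requested SNR (dB)."""
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Repeat or trim the noise so it covers the whole clean clip.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[: len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    if clean_power == 0 or noise_power == 0:
        raise ValueError("clean and noise must both be non-silent")
    # Pick the gain so that clean_power / (gain**2 * noise_power) == 10**(snr_db / 10).
    gain = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + gain * noise


def measure_snr_db(clean, noisy):
    """SNR of `noisy` relative to the known clean reference, in dB."""
    clean = np.asarray(clean, dtype=np.float64)
    residual = np.asarray(noisy, dtype=np.float64) - clean
    return 10 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))


# Stand-in signals (assumptions): a sine for "speech", white noise for "background".
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)
background = rng.standard_normal(4000)
noisy = mix_at_snr(speech, background, snr_db=5.0)
print(round(measure_snr_db(speech, noisy), 3))  # 5.0 by construction
```

The measured SNR only matches the requested one because the reference is truly clean. With crowd-sourced clips that already contain noise or clipping, the real SNR of each pair is unknown, which is the concern raised above.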

I really like this tool for denoising, but enhancement doesn't really work on most of the samples. I found that enhancement and denoising are done better in another open source project, https://github.com/ruizhecao96/CMGAN, which also works very well on non-English languages.

skirdey avatar Mar 15 '24 20:03 skirdey

> I really like this tool for denoising, but enhancement doesn't really work on most of the samples. I found that enhancement and denoising are done better in another open source project, ruizhecao96/CMGAN, which also works very well on non-English languages.

The demos sound great in the repo. But do you know if there's an easier way to use this? For example, a CLI tool where I can just input an MP3 and it outputs an enhanced MP3?

rbozan avatar Apr 26 '24 11:04 rbozan
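For the MP3-in, MP3-out workflow asked about above, one possible approach is to wrap whatever model you run with ffmpeg for decoding and encoding. This is only a sketch under assumptions: ffmpeg is installed and on the PATH, and `enhance_wav` is a hypothetical function you supply that reads one WAV file and writes an enhanced WAV file. It is not an API of CMGAN or resemble-enhance, and the 16 kHz mono default is likewise an assumption to adjust for your model.

```python
import subprocess
import tempfile
from pathlib import Path


def decode_cmd(src, wav, sample_rate):
    """ffmpeg command that decodes `src` to a mono WAV at `sample_rate` Hz."""
    return ["ffmpeg", "-y", "-i", str(src), "-ac", "1", "-ar", str(sample_rate), str(wav)]


def encode_cmd(wav, dst):
    """ffmpeg command that encodes a WAV file to MP3 with LAME VBR quality 2."""
    return ["ffmpeg", "-y", "-i", str(wav), "-codec:a", "libmp3lame", "-q:a", "2", str(dst)]


def enhance_mp3(src, dst, enhance_wav, sample_rate=16000, run=subprocess.run):
    """Decode src, call the user-supplied enhance_wav(in_wav, out_wav), encode to dst."""
    with tempfile.TemporaryDirectory() as tmp:
        noisy = Path(tmp) / "noisy.wav"
        enhanced = Path(tmp) / "enhanced.wav"
        run(decode_cmd(src, noisy, sample_rate), check=True)
        # Hypothetical hook: plug in the model of your choice here.
        enhance_wav(noisy, enhanced)
        run(encode_cmd(enhanced, dst), check=True)
```

Usage would look like `enhance_mp3("talk.mp3", "talk_enhanced.mp3", my_model_fn)`, where `my_model_fn` is your own wrapper around the model's inference code. The `run` parameter exists only so the ffmpeg calls can be replaced when testing.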

I was hoping to use this for Japanese, but seems like I'll need to hold out.

kanjieater avatar Jun 28 '24 04:06 kanjieater