
Fix Loudness Issue

Open alastorid opened this issue 11 months ago • 3 comments

Tested with Chinese and found no clipping.

Waveform comparison for this PR (before and after images not preserved).

Text used:

我要給阿Q做正傳,已經不止一兩年了。但一面要做,一面又往回想,這足見我不是一個「立言」的人,因為從來不朽之筆,須傳不朽之人,於是人以文傳,文以人傳——究竟誰靠誰傳,漸漸的不甚瞭然起來,而終於歸接到傳阿Q,彷彿思想裡有鬼似的。

Edit: This fix is based on PR https://github.com/myshell-ai/MeloTTS/pull/221 and normalizes the audio stream without affecting audio quality. However, the implementation is now different.
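For readers unfamiliar with the idea, here is a minimal sketch of loudness normalization that avoids clipping. It is not the code from this PR or from PR 221. The function name `normalize_loudness`, the `target_rms` and `peak_limit` parameters, and their default values are all assumptions made for illustration. It uses plain Python lists of float samples in the range [-1, 1] to stay dependency-free, whereas real audio would more likely be a NumPy array.

```python
import math


def normalize_loudness(samples, target_rms=0.1, peak_limit=0.99):
    """Scale float samples toward target_rms while keeping peaks within peak_limit.

    Hypothetical helper for illustration only; not the MeloTTS implementation.
    """
    if not samples:
        return []

    # Root-mean-square level is used here as a simple stand-in for loudness.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0.0:
        return list(samples)  # pure silence: nothing to scale

    peak = max(abs(s) for s in samples)

    # Gain that would bring the signal to the target RMS level.
    gain = target_rms / rms
    # Cap the gain so the loudest sample never exceeds peak_limit (no clipping).
    gain = min(gain, peak_limit / peak)

    return [s * gain for s in samples]


# Usage: one second of a quiet 440 Hz tone at a 16 kHz sampling rate.
quiet = [0.01 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
louder = normalize_loudness(quiet)
```

Because a single gain factor is applied to the whole signal, the waveform shape is unchanged and only its level moves, which is one way a normalization step can leave audio quality intact.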

alastorid avatar Jan 19 '25 10:01 alastorid

Hello, @alastorid

I have been working with the MeloTTS source recently and am trying to deploy this model on an edge device. After applying your PR, I ran into the problem below. Could you take some time to help? Thank you! In melo/app.py, the line

  • audio_list.append(utils.fix_loudness(audio,self.hps.data.sampling_rate))

causes almost all of the audio content to be lost; only roughly the last 3 seconds are saved. My environment is aarch64 / conda / Python 3.9 / Ubuntu 22.04, with all dependencies installed according to the requirements.

HarryBXie avatar Jan 21 '25 11:01 HarryBXie

Hello @HarryBXie, thanks for reporting the issue. I've reworked the solution, and unlike before, it now works without additional dependencies. Could you please pull the latest version of this PR and try again?

alastorid avatar Jan 21 '25 14:01 alastorid

@alastorid, thank you for the quick response. The new PR works well. I tried it on two development platforms, and it is effective on both.

HarryBXie avatar Jan 22 '25 02:01 HarryBXie