
Inference result has noise

Open ChengsongLu opened this issue 2 months ago • 0 comments

Self Checks

  • [X] I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find any relevant information that meets my needs.
  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
  • [X] Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell us your story.

I followed the steps for inference with a specific speaker (text.wav), but the result (output.wav) does not sound good: it contains audible noise.

inference.zip

And here are the steps I have done:

python tools/vqgan/inference.py -i "text.wav" --checkpoint-path "checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth"

python tools/llama/generate.py --text "你们这个是什么群啊,你们这是害人不浅啊你们这个群!谁是群主,出来!真的太过分了。" --prompt-text "人间灯火倒映湖中,她的渴望让静水泛起涟漪。若代价只是孤独,那就让这份愿望肆意流淌。流入她所注视的世间,也流入她如湖水般澄澈的目光。" --prompt-tokens "fake.npy" --checkpoint-path "checkpoints/fish-speech-1.5" --num-samples 2

python tools/vqgan/inference.py -i "codes_0.npy" --checkpoint-path "checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth"
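
For anyone trying to reproduce this, the three commands above could be chained in a small script. This is only a minimal sketch: it assumes it is run from the repository root, reuses the script paths, checkpoint paths, and file names (fake.npy, codes_0.npy) exactly as they appear in the commands above, and the helper names `build_commands` and `run_pipeline` are hypothetical, not part of fish-speech.

```python
import subprocess
import sys

# Paths copied from the commands in this issue.
CHECKPOINT_DIR = "checkpoints/fish-speech-1.5"
GENERATOR = CHECKPOINT_DIR + "/firefly-gan-vq-fsq-8x1024-21hz-generator.pth"


def build_commands(reference_wav, prompt_text, text, num_samples=2):
    """Return the three steps from this issue as argument lists (hypothetical helper)."""
    python = sys.executable
    return [
        # Step 1: run the VQGAN script on the reference audio
        # (the issue then passes fake.npy as the prompt tokens).
        [python, "tools/vqgan/inference.py", "-i", reference_wav,
         "--checkpoint-path", GENERATOR],
        # Step 2: generate semantic tokens for the target text.
        [python, "tools/llama/generate.py", "--text", text,
         "--prompt-text", prompt_text, "--prompt-tokens", "fake.npy",
         "--checkpoint-path", CHECKPOINT_DIR,
         "--num-samples", str(num_samples)],
        # Step 3: run the VQGAN script on the first generated code file.
        [python, "tools/vqgan/inference.py", "-i", "codes_0.npy",
         "--checkpoint-path", GENERATOR],
    ]


def run_pipeline(commands):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline(build_commands("text.wav", prompt_text, text))` with the two strings from step 2 would run the same sequence; `check=True` makes a failing step raise instead of silently continuing to the next one.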

2. What is your suggested solution?

Any suggestions on how to improve the sound quality?

3. Additional context or comments

No response

4. Can you help us with this feature?

  • [X] I am interested in contributing to this feature.

ChengsongLu, Dec 10 '24 03:12