The audio output from Text To Speech in Workflow is broken in v1.0.0
Self Checks
- [x] This is only for bug reports. If you would like to ask a question, please head to Discussions.
- [x] I have searched for existing issues, including closed ones.
- [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [x] [FOR CHINESE USERS] Please be sure to submit the Issue in English, otherwise it will be closed. Thank you! :)
- [x] Please do not modify this template :) and fill in all the required fields.
Dify version
v1.0.0
Cloud or Self Hosted
Cloud
Steps to reproduce
It can be reproduced with a simple Workflow that runs Text To Speech, configured as follows:
Model
- tts-1-hd
- tts-1
Input
- Hello. I'm dify AI speaker. this is a test to check my voice.
✔️ Expected Behavior
Noise-free audio is created and can be played correctly.
FYI: a self-hosted v0.15.3 instance creates audio without problems.
❌ Actual Behavior
The generated audio file contains a lot of noise when played in a browser. Additionally, when I download the file and try to play it with QuickTime Player on a Mac, it does not play.
Hey @eucyt! I'm here to help you with bug fixes, answer questions, and support you in becoming a contributor. Let's work together to resolve the issue you're experiencing.
The issue you're experiencing with the Text To Speech feature in Dify v1.0.0, where the audio output is noisy and unplayable in QuickTime Player, might be related to changes in the audio configuration or encoding process. In the v1.0.0 release, there is a configuration file tts.yaml that specifies the audio_type as 'mp3' [1]. This could potentially affect compatibility or quality.
To troubleshoot, you might want to check if there are any updates or changes in the audio processing pipeline or dependencies that could have impacted the output. Additionally, comparing the configuration settings between v0.15.3 and v1.0.0 might help identify any differences that could be causing the issue.
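One quick way to narrow this down is to check whether the downloaded file actually starts with a recognizable audio header, since a file labeled as mp3 but carrying other data would explain both the noise and the QuickTime failure. The following is a minimal sketch and not part of Dify: `sniff_audio_format` is a hypothetical helper, the file path is supplied by you on the command line, and magic-byte checks like these are only a heuristic.

```python
import sys


def sniff_audio_format(data: bytes) -> str:
    """Guess the audio format from the leading bytes of a file (heuristic only)."""
    # MP3 files often begin with an ID3v2 metadata tag.
    if data[:3] == b"ID3":
        return "mp3 (ID3 tag)"
    # Otherwise an MPEG audio stream starts with a frame sync: 11 set bits.
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "mp3 (frame sync)"
    # WAV is a RIFF container whose form type is "WAVE".
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if data[:4] == b"OggS":
        return "ogg"
    if data[:4] == b"fLaC":
        return "flac"
    return "unknown"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python sniff.py <path-to-downloaded-audio-file>
    with open(sys.argv[1], "rb") as f:
        print(sniff_audio_format(f.read(16)))
```

If the result is "unknown" or does not match the `audio_type` declared in the configuration, that would suggest the bytes are being mislabeled or corrupted somewhere between the TTS model and the Workflow output, rather than a playback problem on the client side.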