Firefox browser cannot play TTS audio.

Open gjfmlj opened this issue 1 year ago • 2 comments

Self Checks

  • [X] This is only for bug report, if you would like to ask a question, please head to Discussions.
  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • [X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.7.1

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

When the webapp URL is opened in Firefox, the AI's response is not played automatically as voice, and the response content cannot be played manually either.

✔️ Expected Behavior

I hope to play TTS audio normally in the Firefox browser.

❌ Actual Behavior

The error message in the Firefox browser console is as follows:

Uncaught DOMException: MediaSource.addSourceBuffer: Type not supported in MediaSource
    NextJS 37
6343-c531cfd2e942ddac.js:1
Cannot play media. No decoders for requested formats: audio/mpeg

gjfmlj • Aug 21 '24 07:08

Hello @gjfmlj! I'm a bot that can help you solve bugs, answer questions, and become a contributor. I'm here to assist you while waiting for a human maintainer.

The error "Uncaught DOMException: MediaSource.addSourceBuffer: Type not supported in MediaSource" in Firefox is likely due to the browser not supporting the media type of the audio being used. This can happen if the audio format generated by the TTS service is not compatible with Firefox's supported media formats.

To resolve this issue, ensure that the TTS audio is in a format supported by Firefox. Commonly supported audio formats in Firefox include MP3, Ogg Vorbis, and WAV. You may need to configure the TTS service or use a tool like FFmpeg to convert the audio to a compatible format.

Here is an example of how you can use FFmpeg to convert an audio file to MP3 format:

ffmpeg -i input_audio_file -codec:a libmp3lame output_audio_file.mp3

Replace input_audio_file with the path to your original audio file and output_audio_file.mp3 with the desired output file name. This should help ensure that the audio format is compatible with Firefox [1].

Additionally, the configuration file tts-1-hd.yaml specifies the default audio format used by the TTS service in Dify, which is mp3 [2][3].

dosubot[bot] • Aug 21 '24 07:08

This issue has nothing to do with Dify itself, but I'll try to answer it.

  1. The Dify TTS API returns audio in audio/mpeg (MP3) format. For example, the OpenAI provider requests response_format="mp3":
  if len(content_text) > word_limit:
      # Long text: split into sentences and synthesize up to three of them in parallel.
      sentences = self._split_text_into_sentences(content_text, max_length=word_limit)
      executor = concurrent.futures.ThreadPoolExecutor(max_workers=min(3, len(sentences)))
      futures = [executor.submit(client.audio.speech.with_streaming_response.create, model=model,
                                 response_format="mp3",
                                 input=sentences[i], voice=voice) for i in range(len(sentences))]
      # Yield the MP3 bytes in 1024-byte chunks, in sentence order.
      for index, future in enumerate(futures):
          yield from future.result().__enter__().iter_bytes(1024)

  else:
      response = client.audio.speech.with_streaming_response.create(model=model, voice=voice,
                                                                    response_format="mp3",
                                                                    input=content_text.strip())
  2. On the Dify page, audio playback uses MediaSource:
private listenMediaSource(contentType: string) {
  this.mediaSource?.addEventListener('sourceopen', () => {
    // Create the SourceBuffer only once, when the MediaSource opens.
    if (this.sourceBuffer)
      return

    // addSourceBuffer throws if the browser cannot handle contentType.
    this.sourceBuffer = this.mediaSource?.addSourceBuffer(contentType)
  })
}
  3. Firefox's MediaSource does not support audio/mpeg.

Reference: All MIME types supported by MediaRecorder in Firefox and Chrome?
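
The incompatibility described above can be detected before calling addSourceBuffer, because MediaSource.isTypeSupported() reports whether the browser accepts a given MIME type. The following is only a minimal sketch: the helper name canStreamWithMediaSource and the injectable probe parameter are hypothetical and do not come from Dify's code.

```typescript
// Hypothetical helper, not part of Dify: checks whether a MIME type can be
// streamed through Media Source Extensions in the current environment.
type MimeProbe = (mimeType: string) => boolean

// Default probe: ask the browser's MediaSource, if one exists.
// `globalThis as any` avoids depending on DOM typings outside a browser.
const browserProbe: MimeProbe = (mimeType) => {
  const ms = (globalThis as any).MediaSource
  return typeof ms?.isTypeSupported === 'function' && ms.isTypeSupported(mimeType)
}

function canStreamWithMediaSource(
  mimeType: string,
  probe: MimeProbe = browserProbe,
): boolean {
  try {
    return probe(mimeType)
  } catch {
    // Treat a failing probe as "not supported" instead of crashing playback.
    return false
  }
}
```

Calling canStreamWithMediaSource('audio/mpeg') before creating the source buffer would let the player avoid the DOMException from the report instead of throwing it.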

Possible solution

  1. Return the audio in another format, e.g. "opus", "aac", "flac", or "pcm" (see OpenAI Format).
  2. Switch to another way of playing audio.
  3. Use Chrome. 🤣
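
Option 2 could be sketched as follows: stream through MediaSource when the content type is supported, and otherwise collect the chunks and play them as a single blob. This is an illustration under assumptions, not Dify's implementation. The names PlaybackMode, choosePlaybackMode, and concatChunks are hypothetical, and the browser-only part is left in comments.

```typescript
// Hypothetical names throughout; this is not taken from Dify's code.
type PlaybackMode = 'media-source' | 'buffered'

// Decide how to play audio of the given content type.
// `isTypeSupported` would normally wrap MediaSource.isTypeSupported.
function choosePlaybackMode(
  contentType: string,
  isTypeSupported: (mimeType: string) => boolean,
): PlaybackMode {
  return isTypeSupported(contentType) ? 'media-source' : 'buffered'
}

// Merge the received audio chunks into one contiguous byte array.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0)
  const merged = new Uint8Array(total)
  let offset = 0
  for (const chunk of chunks) {
    merged.set(chunk, offset)
    offset += chunk.byteLength
  }
  return merged
}

// In a browser, the buffered path could then look like this (not executed here):
//   const blob = new Blob([concatChunks(chunks)], { type: 'audio/mpeg' })
//   const audio = new Audio(URL.createObjectURL(blob))
//   void audio.play()
```

The trade-off of the buffered path is that playback can start only after the whole response has arrived, so it gives up the low latency that streaming through MediaSource provides.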

wellCh4n • Aug 24 '24 14:08

Timbre (voice) selection and TTS are both turned on, but the audio does not play automatically and the stream is empty. The timbre switch also has no effect: all the voices are female.

zscxc • Sep 23 '24 02:09