fish-speech

how to change the return audio length in streaming mode

Open wangdabee opened this issue 4 weeks ago • 1 comment

Self Checks

  • [x] This template is only for bug reports. For questions, please visit Discussions.
  • [x] I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [x] [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • [x] Please do not modify this template and fill in all required fields.

Cloud or Self Hosted

Self Hosted (Source)

Environment Details

Python 3.12

Steps to Reproduce

python

✔️ Expected Behavior

No response

❌ Actual Behavior

When I use the vqgan model, adjusting the chunk_length in ServeTTSRequest streams audio in different lengths. However, when I use the dac model for streaming output, no matter how I change chunk_length, it always returns the entire audio at once. How can I adjust dac so that I can freely set the length of each part of the streamed audio?

wangdabee, Nov 25 '25 10:11

Hi @wangdabee! 👋

I've been working on Fish-Speech-Go, a Go-based API layer for Fish-Speech that provides better control over streaming behavior.

Current Architecture:

The Go server provides an OpenAI-compatible API layer that can be extended to support configurable chunk sizes:

// Potential API extension for chunk control
type TTSRequest struct {
    Text           string `json:"text"`
    Voice          string `json:"voice"`
    ChunkLength    int    `json:"chunk_length,omitempty"` // Configurable!
    ResponseFormat string `json:"response_format"`
}
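
To make the idea concrete, here is a minimal sketch of how a server might honor such a field. It is not taken from Fish-Speech-Go: `parseRequest`, `splitAudio`, and `defaultChunkLength` are hypothetical names, and treating `chunk_length` as a byte count per streamed piece is an assumption made only for illustration (the struct above does not define its unit).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TTSRequest mirrors the struct proposed in the comment above.
type TTSRequest struct {
	Text           string `json:"text"`
	Voice          string `json:"voice"`
	ChunkLength    int    `json:"chunk_length,omitempty"`
	ResponseFormat string `json:"response_format"`
}

// defaultChunkLength is a hypothetical fallback used when the client
// omits chunk_length; the value is an assumption, not a project default.
const defaultChunkLength = 4096

// parseRequest decodes a JSON body and fills in the fallback chunk length.
func parseRequest(body []byte) (TTSRequest, error) {
	var req TTSRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return req, err
	}
	if req.ChunkLength <= 0 {
		req.ChunkLength = defaultChunkLength
	}
	return req, nil
}

// splitAudio cuts audio into pieces of at most size bytes.
// The slices share memory with the input; nothing is copied.
func splitAudio(audio []byte, size int) [][]byte {
	if size <= 0 {
		return [][]byte{audio}
	}
	var chunks [][]byte
	for start := 0; start < len(audio); start += size {
		end := start + size
		if end > len(audio) {
			end = len(audio)
		}
		chunks = append(chunks, audio[start:end])
	}
	return chunks
}

func main() {
	body := []byte(`{"text":"hello","voice":"default","chunk_length":4,"response_format":"wav"}`)
	req, err := parseRequest(body)
	if err != nil {
		panic(err)
	}
	audio := make([]byte, 10) // stand-in for synthesized audio bytes
	for i, c := range splitAudio(audio, req.ChunkLength) {
		fmt.Printf("chunk %d: %d bytes\n", i, len(c))
	}
}
```

Note that this only re-chunks bytes that already exist. If the backend hands over the whole utterance at once, as described for the dac model in the issue, splitting it afterwards changes the size of each piece but not when the first audio becomes available.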

What Fish-Speech-Go offers today:

  • OpenAI-compatible API (/v1/audio/speech)
  • Production-ready Docker deployment
  • High-performance Go HTTP server
  • Easy to extend with new parameters

Quick Start:

git clone https://github.com/itsDarianNgo/fish-speech-go
cd fish-speech-go/docker
cp .env.example .env  # Add your HF_TOKEN
docker compose up -d

If configurable chunk length is important to you, I'd welcome a contribution or feature request on the repo! The Go codebase is clean and easy to extend.

itsDarianNgo, Nov 27 '25 08:11