
Streaming input to streaming TTS

Open santhosh-sp opened this issue 1 year ago • 8 comments

Hello Team,

Is it possible to run TTS streaming with streaming input text and the same output file name?

Example:

import openai

def llm_write(prompt: str):
    # Stream completion tokens from the OpenAI chat API as they arrive.
    for chunk in openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        stream=True
    ):
        if (text_chunk := chunk["choices"][0]["delta"].get("content")) is not None:
            yield text_chunk

text_stream = llm_write("Hello, what is LLM?")

audio = stream_ffplay(
    tts(
        args.text,
        speaker,
        args.language,
        args.server_url,
        args.stream_chunk_size
    ), 
    args.output_file,
    save=bool(args.output_file)
)

The goal is to send a minimal number of words at a time to the TTS API.

Thanks, Santhosh

santhosh-sp avatar Dec 07 '23 09:12 santhosh-sp

Is this possible?

mercuryyy avatar Dec 27 '23 18:12 mercuryyy

Yes, it's possible.

santhosh-sp avatar Dec 28 '23 02:12 santhosh-sp

Is it built into the xtts-streaming-server repo, or does it have to be tweaked?

I was getting ready to test it out this weekend before installing it.

mercuryyy avatar Dec 30 '23 15:12 mercuryyy

Any chance you can post some working examples? I was able to get the Docker container working, but I don't see any logic for providing the yielded chunks as the text to the API.

mercuryyy avatar Dec 30 '23 17:12 mercuryyy

Is splitting at the end of a sentence (., ?, !) the best option here?

nurgel avatar Jan 02 '24 16:01 nurgel

def llm_write(prompt: str):
    buffer = ""
    for chunk in openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        stream=True
    ):
        if (text_chunk := chunk["choices"][0]["delta"].get("content")) is not None:
            buffer += text_chunk
            if should_send_to_tts(buffer):  # Define this function to decide when to send
                yield buffer
                buffer = ""  # Reset buffer after sending

text_stream = llm_write("Hello, what is LLM?")

for text in text_stream:
    audio = stream_ffplay(
        tts(
            text,
            speaker,
            language,
            server_url,
            stream_chunk_size
        ),
        output_file,
        save=bool(output_file)
    )

Fusion9334 avatar Jan 24 '24 22:01 Fusion9334
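For reference, here is a minimal sketch of what the should_send_to_tts helper left undefined above could look like, assuming you flush on sentence-final punctuation as nurgel suggested; the three-word minimum is an arbitrary guard, not something from the repo:

def should_send_to_tts(buffer: str, min_words: int = 3) -> bool:
    # Flush once the buffer ends a sentence and carries a little context.
    # min_words is an arbitrary threshold, not part of xtts-streaming-server.
    stripped = buffer.strip()
    if not stripped:
        return False
    return stripped.endswith((".", "!", "?")) and len(stripped.split()) >= min_words
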

I believe the input needs to be at least a full sentence, as speech synthesis relies heavily on the context provided by subsequent words.

AI-General avatar Feb 13 '24 07:02 AI-General

(quoting Fusion9334's example above)

Does this work?

oscody avatar Jun 14 '24 12:06 oscody