agents [Question] Streaming chunks of text to TTS

Open BrianMwas opened this issue 4 months ago • 1 comments

I would like to know how to handle chunks of text while still preserving punctuation. I tried this and am getting errors. It has also increased latency from ~1s to close to ~4s, which is quite long considering this is a conversation. Thanks. This is how I am working with it when adding context:

```python
async def enrich_with_rag(agent: VoicePipelineAgent, chat_ctx: llm.ChatContext):
    global pdf_index

    if pdf_index is None:
        logger.warning("RAG system not initialized. Skipping enrichment.")
        return

    user_msg = chat_ctx.messages[-1]
    try:
        # Use LlamaIndex to query the data
        query_engine = pdf_index.as_query_engine(
            llm=groqLLM,  # Adjust model as needed
            streaming=True,
            similarity_top_k=3,
        )
        response = await asyncio.to_thread(query_engine.query, user_msg.content)

        if response:
            # Collect the entire streamed response
            relevant_text = ""
            for text in response:
                relevant_text += text

            logger.info(f"Enriching with RAG: {relevant_text[:100]}...")  # Log first 100 chars

            # Insert RAG context before the user's message
            chat_ctx.messages.insert(-1, llm.ChatMessage.create(
                text=f"Context:\n{relevant_text}",
                role="assistant",
            ))

        # Create a streaming LLM response
        llm_stream = agent.llm.chat(chat_ctx)
        return llm_stream
    except Exception as e:
        logger.error(f"Error during RAG enrichment: {str(e)}")
```

Oct 17 '24 06:10 BrianMwas
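
For reference, one common way to keep punctuation intact when feeding streamed text to a TTS engine is to buffer incoming chunks and only release text at sentence boundaries. The sketch below is a minimal illustration of that idea under stated assumptions, not the LiveKit or LlamaIndex API: `sentence_chunks` and `fake_llm_stream` are hypothetical names, the input is assumed to be any async iterator of strings, and the boundary rule (`.`, `!`, or `?` followed by whitespace) is deliberately naive, so abbreviations such as "Dr. Smith" would be split.

```python
import asyncio
import re
from typing import AsyncIterator

# Naive sentence boundary (an assumption for this sketch): one or more of . ! ?
# optionally followed by closing quotes/brackets, then whitespace.
_SENTENCE_END = re.compile(r'[.!?]+["\')\]]*\s+')


async def sentence_chunks(tokens: AsyncIterator[str]) -> AsyncIterator[str]:
    """Buffer streamed text and yield it one complete sentence at a time."""
    buffer = ""
    async for token in tokens:
        buffer += token
        # Emit every complete sentence currently sitting in the buffer.
        while (match := _SENTENCE_END.search(buffer)) is not None:
            yield buffer[:match.end()].strip()
            buffer = buffer[match.end():]
    # Flush whatever is left once the stream ends.
    if buffer.strip():
        yield buffer.strip()


async def fake_llm_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a streaming LLM response."""
    for piece in ["Hello", " there, ", "how are", " you? I am", " fine. Tha", "nks"]:
        yield piece


async def main() -> None:
    async for sentence in sentence_chunks(fake_llm_stream()):
        # In a real pipeline, each sentence would be handed to the TTS here.
        print(sentence)


if __name__ == "__main__":
    asyncio.run(main())
```

Because a sentence is released as soon as its terminator arrives, the TTS can start speaking before the full response has been generated, which is usually what keeps conversational latency down compared with collecting the entire response first.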
