aisearch-openai-rag-audio icon indicating copy to clipboard operation
aisearch-openai-rag-audio copied to clipboard

Right Audio Capture Issue

Open madhubandru opened this issue 1 year ago • 4 comments
trafficstars

The application does not capture exactly what a speaker is saying.

Example: I asked "What are retention limits?"

AI is searching for: "definition of vulnerability"

image

This is happening most of the time. Is someone facing the same issue? Do you have any solution to fix this?

System Info: Running application in local VS Code

Thank you in Advance!

madhubandru avatar Oct 28 '24 20:10 madhubandru

Hi @pamelafox @pablocastro, checking with you both for some suggestions on this issue.

madhubandru avatar Oct 29 '24 12:10 madhubandru

The backend asks the model (using "tools") to decide on a search query based off what it hears, so either the model heard correctly but then translated it into a different search query, or it did not hear correctly, and that's why it suggests that search query.

It may be a limitation of the model, I've heard reports of mixed success with different accents. Have you tried the playground in Azure OpenAI to see if it understands the phrase? You could try saying "Repeat after me: ____" and seeing what it says.

pamelafox avatar Oct 29 '24 16:10 pamelafox

I tried in playground now, it not catching what speaker saying in most instances like 7/10.

In play ground my system instructions are as below

You are a helpful assistant. Repeat what user says.

madhubandru avatar Oct 29 '24 18:10 madhubandru

I've heard another developer say that they got better performance with different temperatures when the model didn't parse their voice, that might be worth trying?

pamelafox avatar Nov 01 '24 23:11 pamelafox