chat-with-your-data-solution-accelerator
chat-with-your-data-solution-accelerator copied to clipboard
Make sure the streamed response isn't repeating tools unnecessarily
I believe the app.py in this repo is based off the sample-app app.py. In that case, I believe it has a performance bug due to this line:
https://github.com/Azure-Samples/azure-search-openai-solution-accelerator/blob/db254e0316d0c9031158bb70a0531159b4d111af/app.py#L139
If you check in the browser when it's streaming, you'll see that the "tools" gets repeated in every chunk streamed, which requires a lot of bandwidth and is unnecessary. You should only need to stream the tools once, and the frontend can process it when it sees it.
Thank you so much, @pamelafox we'll plan for the correction accordingly.
Hello @pamelafox , it seems this branch is having very old code. If you see in the main branch of this repo which contains the latest code, over there we do not have such file called app.py. Instead we are using create_app.py which do not have any logic/condition for checking tools in every chunk streamed. Hence "tools" are not getting repeated in every chunk streamed in existing functionality. So can you please check once and let us know whether we can close this issue.
Okay, feel free to close if the chunks are good now, I don't have the time to inspect the output myself. Thanks!
As discussed, I am closing this issue.