chat-with-your-data-solution-accelerator

Not able to implement hybrid search(semantic and vector).

Open pranav-saji opened this issue 11 months ago • 10 comments

Describe the bug

Not able to implement hybrid search (semantic and vector). Only vector search is present. I believe that because of this lack of hybrid search the chat is not fully functional: the chatbot is not giving precise answers.

Screenshots

[Screenshot: Screen Shot 2024-03-22 at 12 15 14 AM]

pranav-saji avatar Mar 22 '24 05:03 pranav-saji

Hi @PFA23SCM89S , thank you for raising this bug. Please could you provide us with the steps to reproduce this bug and what we should expect to see vs what actually happens? Thank you.

cecheta avatar Mar 25 '24 11:03 cecheta

The question-answer tool is not dynamic with respect to the search type and top k.

[image]

cherifbenham avatar Mar 25 '24 15:03 cherifbenham

@cecheta To reproduce this error, upload any document on the admin web page and ask specific questions about the document, preferably with a longer prompt. I'm not sure how to fix this. @cherifbenham Any idea what to change in the code?

pranav-saji avatar Mar 26 '24 07:03 pranav-saji

@PFA23SCM89S In your case, if you deployed from the devcontainer without changing the code, then your search should already be hybrid, meaning text + vector, and it returns the 4 closest docs (or fewer).

The bug I am trying to fix relates to the implementation of semantic + hybrid search. I found a way to do it using a semantic config:

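from azure.search.documents.indexes.models import SemanticConfiguration, SemanticField, SemanticPrioritizedFields
semantic_config = SemanticConfiguration(name="semantic_config", prioritized_fields=SemanticPrioritizedFields(title_field=SemanticField(field_name="title"), content_fields=[SemanticField(field_name="content")], keywords_fields=[SemanticField(field_name="title")]))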

and then

sources = self.vector_store.similarity_search(query=question, k=10, search_type = "semantic_hybrid", semanticConfiguration="semantic_config")

I am trying to redeploy like this and will let you know how it goes.

cherifbenham avatar Mar 26 '24 12:03 cherifbenham
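
For readers who want to see what a semantic + hybrid query like the one described above looks like at the request level, here is a minimal sketch that only builds the JSON body for an Azure AI Search query. It assumes the request shape of the 2023-11-01 REST API version. The vector field name `content_vector`, the example question, and the helper name `build_semantic_hybrid_query` are hypothetical, and this is not the accelerator's own code.

```python
import json


def build_semantic_hybrid_query(question, query_vector,
                                semantic_config="semantic_config",
                                vector_field="content_vector",  # hypothetical field name
                                k=10):
    """Build a request body combining keyword, vector and semantic ranking."""
    return {
        "search": question,                        # keyword part of the hybrid query
        "queryType": "semantic",                   # ask for semantic re-ranking
        "semanticConfiguration": semantic_config,  # name of the semantic config on the index
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": list(query_vector),      # embedding of the question
                "fields": vector_field,
                "k": k,
            }
        ],
        "top": k,                                  # number of documents to return
    }


# A toy 3-dimensional vector stands in for a real embedding.
body = build_semantic_hybrid_query("What is the refund policy?", [0.1, 0.2, 0.3])
print(json.dumps(body, indent=2))
```

Dropping the `queryType` and `semanticConfiguration` keys from this body would leave a plain text + vector hybrid query, which is the difference between the two modes discussed in this thread.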

@cherifbenham Thanks for the update. I tried to update my OpenAI model from gpt-35-turbo to gpt-4. The responses I get are still very poor. I'm not sure whether the model was actually updated. Can you help me with how to update the model properly?

pranav-saji avatar Mar 26 '24 18:03 pranav-saji

You can try to update the model version from gpt-35-turbo to gpt-4 in the environment variables of the function apps and in the environment variables of the web app. In your case, you'll get more grounded and more developed responses.

You can also modify the system prompt in the admin configuration tab so that it develops its answers more. You can also increase the max output tokens from 1000 to 2000. And finally, you can increase your chunk size to get more context from the relevant docs. I believe you should also investigate the quality of the chunks in your index via the REST API or the Python SDK.

Hope this helps


cherifbenham avatar Mar 26 '24 18:03 cherifbenham
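
To illustrate the chunk size suggestion above, here is a small, self-contained sketch of word-based chunking with overlap. It is not the accelerator's chunking code: the function name `chunk_text` and the default sizes are assumptions chosen for illustration. It only shows how a larger `chunk_size` gives each retrieved chunk more surrounding context.

```python
def chunk_text(text, chunk_size=500, overlap=100):
    """Split text into chunks of at most chunk_size words, with overlapping edges."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(str(i) for i in range(10))
small_chunks = chunk_text(sample, chunk_size=4, overlap=1)
large_chunks = chunk_text(sample, chunk_size=8, overlap=1)
print(len(small_chunks), "small chunks;", len(large_chunks), "large chunks")
```

Larger chunks mean fewer, longer passages per document, so the top k results carry more context into the prompt, at the cost of more tokens per request.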

@cherifbenham That is actually how I had updated the model version. Maybe the issue is related to semantic + hybrid search. Any update on your deployment?

pranav-saji avatar Mar 27 '24 19:03 pranav-saji

@cherifbenham Can you please mention exactly which file you updated? Can you provide a screenshot of this if possible?

pranav-saji avatar Mar 27 '24 19:03 pranav-saji

Hello @PFA23SCM89S , for the /api/conversation/custom endpoint, the application currently supports either vectorSimpleHybrid or vectorSemanticHybrid search, depending on whether a semantic configuration has been supplied or not. For the /api/conversation/azure_byod endpoint, the application currently supports either simple or semantic.

May I know which endpoint you are currently using? If you are using the custom endpoint, you should be able to use semantic + vector search by configuring the AZURE_SEARCH_USE_SEMANTIC_SEARCH and AZURE_SEARCH_SEMANTIC_SEARCH_CONFIG environment variables. If you are using the azure_byod endpoint, we plan to add more search configurations in https://github.com/Azure-Samples/chat-with-your-data-solution-accelerator/issues/295

cecheta avatar Apr 12 '24 09:04 cecheta
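
The behaviour described in the comment above can be sketched as a small decision function. This is an illustration only, not the accelerator's actual implementation: the function name `select_query_type`, the parsing of the flag as the string "true", and the requirement that both variables be set are assumptions. Only the two environment variable names, the endpoint names and the four search modes come from the comment.

```python
import os


def select_query_type(endpoint, env=None):
    """Return the search mode for an endpoint, given environment-style settings."""
    env = os.environ if env is None else env

    # Assumption: the flag is the string "true" (case-insensitive) when enabled.
    use_semantic = env.get("AZURE_SEARCH_USE_SEMANTIC_SEARCH", "false").strip().lower() == "true"
    config_name = env.get("AZURE_SEARCH_SEMANTIC_SEARCH_CONFIG", "").strip()
    # Assumption: semantic ranking needs both the flag and a configuration name.
    semantic_enabled = use_semantic and bool(config_name)

    if endpoint == "custom":
        return "vectorSemanticHybrid" if semantic_enabled else "vectorSimpleHybrid"
    if endpoint == "azure_byod":
        return "semantic" if semantic_enabled else "simple"
    raise ValueError(f"unknown endpoint: {endpoint!r}")


settings = {
    "AZURE_SEARCH_USE_SEMANTIC_SEARCH": "true",
    "AZURE_SEARCH_SEMANTIC_SEARCH_CONFIG": "semantic_config",
}
print(select_query_type("custom", settings))
print(select_query_type("custom", {}))
```

Passing a plain dictionary instead of `os.environ` makes it easy to check how a given combination of settings would be interpreted before changing the deployed app configuration.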

@PFA23SCM89S - Do you have any response to the comment above? We will have to close this bug soon.

ross-p-smith avatar Apr 29 '24 12:04 ross-p-smith

@gaurarpit and I investigated this and verified that hybrid search is being used for the /api/conversation/custom endpoint. As there is no further response from the user, we will be closing this issue.

superhindupur avatar May 13 '24 10:05 superhindupur