dspy
Use retrieval reference as part of answer
I am building a Q&A system. I need each generated answer to include a reference to the retrieved passage it came from, so that the answer can be verified.
Any suggestions on how to modify the retrieval step to support showing the reference?
I am using the AzureCognitiveSearch retriever, which I previously contributed to this repo in a PR. My reference field is called source and contains the title of the retrieved chunk. If the answer comes from a given chunk, I need to show that title as part of the answer.
Could you please help with any suggestions on how to achieve this? Thanks in advance.
How many retrieved sources are you showing to the LLM at once?
I use topk = 3. This is multi-hop Q&A, and the final predicted answer needs to show the reference.
There are a few ways to do this. Trying one or two of them should be pretty quick in DSPy.
The simplest reliable way to get attribution in this case would be to add a new step in the pipeline that selects one or two passages (per hop) and only these passages get used for generating the answer. Then these passages are cited. This guarantees that no other passages were used.
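A rough sketch of that selection step, in case it helps. The passage shape (a dict with "source" and "text" keys), the helper names, and the selected_passage_ids field are all assumptions for illustration, not something the AzureCognitiveSearch retriever or DSPy is known to provide in this form.

```python
from typing import Dict, List

# Assumed passage shape (hypothetical): "source" holds the chunk title and
# "text" holds the chunk body. Adapt this to what your retriever returns.
Passage = Dict[str, str]


def number_passages(passages: List[Passage]) -> str:
    """Render passages as '[1] title: text' lines so the LM can pick them by ID."""
    return "\n".join(
        f"[{i}] {p['source']}: {p['text']}" for i, p in enumerate(passages, start=1)
    )


def build_selector():
    """Build the LM selection step (needs DSPy installed and an LM configured)."""
    import dspy  # imported lazily so the helpers in this sketch work without DSPy

    # The output field name is an illustrative choice, not a DSPy convention.
    return dspy.ChainOfThought("passages, question -> selected_passage_ids")


def keep_selected(passages: List[Passage], selected_ids: List[int]) -> List[Passage]:
    """Keep only the passages whose 1-based ID was selected; skip invalid IDs."""
    return [passages[i - 1] for i in selected_ids if 1 <= i <= len(passages)]


def cited_titles(passages: List[Passage]) -> List[str]:
    """Unique source titles, in order of first appearance."""
    titles: List[str] = []
    for p in passages:
        if p["source"] not in titles:
            titles.append(p["source"])
    return titles
```

Per hop, you would call the selector on number_passages(retrieved), parse the IDs it returns, and pass only keep_selected(...) to answer generation. The titles from cited_titles(...) can then be appended to the final answer.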
Alternatively, you can generate the answer as usual, then add a final attribution step. The attribution step can ask the LM (with ChainOfThought) to determine which passage IDs in the sources need to be cited.
The second way is probably easier and cheaper, so I'd start there.
You can try a module like this: dspy.ChainOfThought("sources, question, predicted_answer -> sources_ids_to_cite").
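Here is a minimal sketch of how that attribution step might be wired in after answer generation. It assumes each passage is a dict with "source" (the chunk title) and "text" keys, which is a hypothetical shape. The helper names are made up for illustration.

```python
from typing import Callable, Dict, List

Passage = Dict[str, str]  # assumed shape: {"source": <title>, "text": <body>}


def format_sources(passages: List[Passage]) -> str:
    """Give every passage a 1-based ID so the LM has something concrete to cite."""
    return "\n".join(
        f"[{i}] {p['source']}: {p['text']}" for i, p in enumerate(passages, start=1)
    )


def build_attributor():
    """Build the attribution module (needs DSPy installed and an LM configured)."""
    import dspy  # imported lazily so format_sources/attribute work without DSPy

    return dspy.ChainOfThought(
        "sources, question, predicted_answer -> sources_ids_to_cite"
    )


def attribute(
    attributor: Callable,
    passages: List[Passage],
    question: str,
    predicted_answer: str,
) -> str:
    """Run the attribution step and return the LM's raw ID string."""
    prediction = attributor(
        sources=format_sources(passages),
        question=question,
        predicted_answer=predicted_answer,
    )
    return prediction.sources_ids_to_cite
```

With topk = 3 and multiple hops, you would collect the passages from all hops into one list before calling attribute(build_attributor(), passages, question, answer). The returned string is free-form at this point, so it still needs parsing before you can map IDs back to titles.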
But you may also get better results if you implement a full dspy.Signature class down the line since you need to tell the LM how you expect the source IDs to be formatted, so that you can parse them.
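As a sketch of what that could look like: the signature below spells out an ID format in the field descriptions, and a small parser reads it back. The class name, the description wording, and the "comma-separated numbers" format are assumptions chosen for illustration. The parser is deliberately lenient because the LM may not follow the format exactly.

```python
import re
from typing import List


def build_attribution_signature():
    """Define the signature class (needs DSPy installed)."""
    import dspy  # imported lazily so the parser below can be used without DSPy

    class CiteSources(dspy.Signature):
        """Identify which numbered sources support the predicted answer."""

        sources = dspy.InputField(desc="numbered passages, e.g. '[1] title: text'")
        question = dspy.InputField()
        predicted_answer = dspy.InputField()
        sources_ids_to_cite = dspy.OutputField(
            desc="comma-separated source numbers, e.g. '1, 3'"
        )

    return CiteSources


def parse_source_ids(raw: str, num_sources: int) -> List[int]:
    """Extract valid, de-duplicated 1-based source IDs from the LM output."""
    ids: List[int] = []
    for match in re.findall(r"\d+", raw):
        i = int(match)
        if 1 <= i <= num_sources and i not in ids:
            ids.append(i)
    return ids


def titles_for(ids: List[int], titles: List[str]) -> List[str]:
    """Map parsed 1-based IDs back to the source titles to show with the answer."""
    return [titles[i - 1] for i in ids]
```

You would then use dspy.ChainOfThought(build_attribution_signature()) in place of the string signature, and run parse_source_ids on the sources_ids_to_cite output. Dropping out-of-range IDs means a hallucinated ID never turns into a wrong citation.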
Thanks for the instructions. I will try what you suggested and update you.
Yes, happy to help more after you try this!