azure-search-openai-demo
Question: How does the app find the citation to the paragraph in a document that was used for answering the question?
Please provide us with the following information:
This issue is for a: (mark with an x)
- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)
Minimal steps to reproduce
The code tries to retrieve the document from blob storage for the citation. However, I have a SharePoint library as the source of my index, so I would like to know how a citation is found/determined after a document has been retrieved.
Any log messages given by the failure
Expected/desired behavior
OS and Version?
Windows 11
azd version?
azd version 1.0.2 (commit 145e046b1ea9394bd4e1b1d539eb32e860d692fb)
Versions
Mention any other details that might be useful
I would like to know how a citation is found/determined after a document has been retrieved. Great job with this so far!
Thanks! We'll be in touch soon.
Hi, the citations are really good, thank you for the good work. However, sometimes the citation fails to highlight the file name as a clickable link that shows the PDF in the right pane. Could you please let us know how we can ensure the citations are consistent?
I'm not one of the devs, but the citations are not consistent because the code just prepends the title of each source doc retrieved from the Cognitive Search index to the start of that doc's content, separated by a colon, and tells OpenAI in the prompt that the titles are at the start of each source and to cite them in square brackets within each response.
    results = [doc[self.sourcepage_field] + ": " + nonewlines(doc[self.content_field]) for doc in r]
    content = "\n".join(results)
    ...
    prompt = prompt_override.format(sources=content, chat_history=self.get_chat_history_as_text(history), follow_up_questions_prompt=follow_up_questions_prompt)
    ...

    prompt_prefix = """Each source has a name followed by colon and the actual information, always include the source name for each fact you use in the response. Use square brackets to reference the source, e.g. [info1.txt]. Don't combine sources, list each source separately, e.g. [info1.txt][info2.pdf].
    {follow_up_questions_prompt}
    {injected_prompt}
    Sources:
    {sources}"""
The titles are then rendered as links on the frontend, pointing to blob storage file paths for the citation panel. That's the basic gist anyway. So if your issue is that the links are not rendering consistently, play around with the OpenAI prompts in the backend/approaches files and/or modify the format in which the titles are pasted into the prompt.
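To make the flow above concrete, here is a rough Python sketch of the three steps: prefix each chunk with its title, pull the bracketed names back out of the model's answer, and turn a name into a link. This is not the repo's code. The function names, the default field names, and the `base_url` parameter are all made up for illustration, and the real frontend does the link rendering in its own code. For a SharePoint-backed index you would presumably swap the base URL for wherever your documents actually live.

```python
import re
from urllib.parse import quote

# Matches the text inside one pair of square brackets, e.g. [info1.txt].
CITATION_RE = re.compile(r"\[([^\[\]]+)\]")


def nonewlines(text):
    """Flatten a chunk onto one line so each source stays on its own line."""
    return text.replace("\n", " ").replace("\r", " ")


def format_sources(docs, title_field="sourcepage", content_field="content"):
    """Build the 'title: content' block that gets pasted into the prompt.

    The field names are assumptions; use whatever your index defines.
    """
    return "\n".join(
        doc[title_field] + ": " + nonewlines(doc[content_field]) for doc in docs
    )


def extract_citations(answer, known_titles):
    """Return cited titles in order of first appearance.

    Only names that exactly match a title sent in the prompt are kept, so a
    misspelled or invented name from the model yields no citation.
    """
    cited = []
    for name in CITATION_RE.findall(answer):
        name = name.strip()
        if name in known_titles and name not in cited:
            cited.append(name)
    return cited


def citation_url(title, base_url):
    """Map a cited title to a link (hypothetical URL layout)."""
    return base_url.rstrip("/") + "/" + quote(title)


docs = [
    {"sourcepage": "info1.txt", "content": "Plan A covers\neye exams."},
    {"sourcepage": "info2.pdf", "content": "Plan B covers dental."},
]
answer = "Plan A covers eye exams [info1.txt]. Dental is in Plan B [info2.pdf][info1.txt]. See [other.doc]."
titles = {doc["sourcepage"] for doc in docs}
cited = extract_citations(answer, titles)
```

Because the match in this sketch is exact, any answer where the model drops the brackets or alters the file name produces no link, which is one plausible way the inconsistency described above can show up.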
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this issue will be closed.