
feature: automatic number of documents as an expert setting

Open cforce opened this issue 1 year ago • 1 comment

Instead of setting a fixed number of documents to be injected into the prompt, dynamically calculate this based on the user's configuration of "Max Length of a System Response" in the expert settings. Allow users to set the document count to "auto" and prompt them to configure the "Max Length of a System Response," with a default value provided.

The number of documents that can be injected into the prompt follows from the token budget:

#Max Total Tokens = #Prompt Tokens + #User Message Tokens + #Injected Document Tokens + #Response Message Tokens

Given (fixed per request):

  • #Max Total Tokens
  • #Prompt Tokens
  • #Response Message Tokens

Variable:

  • #User Message Tokens

The process should iterate over the ranked, ordered document list, adding complete documents (or pages) one by one to the prompt until the remaining token budget (#Max Total Tokens minus all of the other terms) is exhausted.
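A minimal sketch of that greedy packing loop might look like the following. All names here (`select_documents`, `response_reserve`, the whitespace-split `count_tokens` placeholder) are illustrative assumptions, not the repo's actual API; a real implementation would count tokens with the model's tokenizer (e.g. tiktoken) rather than by splitting on spaces.

```python
def select_documents(docs, max_total_tokens, prompt_tokens, user_tokens,
                     response_reserve, count_tokens=lambda t: len(t.split())):
    """Greedily add ranked documents to the prompt until the token budget runs out.

    docs is assumed to be pre-ranked, best match first. response_reserve is the
    number of tokens held back for the model's answer ("Max Length of a System
    Response" in the expert settings).
    """
    # Budget left for documents after the fixed parts of the request.
    budget = max_total_tokens - prompt_tokens - user_tokens - response_reserve
    selected = []
    for doc in docs:
        cost = count_tokens(doc)
        if cost > budget:
            break  # the next whole document no longer fits; stop here
        selected.append(doc)
        budget -= cost
    return selected
```

Because whole documents (or pages) are added, the loop stops at the first document that does not fit rather than truncating it.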

cforce · Oct 22 '24 09:10

I've run evaluations on pulling in more documents for the RAG flow, and the results are often not better, due to the increase in irrelevant documents.

Therefore, I think a setting like this should only be used in conjunction with a minimum semantic ranker score threshold, as otherwise you can easily end up sending too many irrelevant documents to the LLM.

Given that, I do think an option like this makes sense, especially given the increasing size of context windows and people's desire to ask questions across many documents.

pamelafox · Nov 02 '24 00:11