julep
Optimize docs search hyperparameters
Let's run an automated evaluation on the RAG dataset (using a local model or something similar) and then tune the doc search hyperparameters based on the results. The parameters are:
- num docs: `k_docs`
- confidence: `docs_confidence`
https://github.com/julep-ai/julep/blob/dev/agents-api/agents_api/models/entry/proc_mem_context.py#L13
rag dataset: rag-12000
contains three columns: context, question, answer
evaluation recipe:
- create an agent
- add all the documents from the `context` column as agent docs
- for every row in the dataset (use the train split only)
  - create a session with the agent
  - ask the question from the `question` column (you can set max_tokens to 1 since we don't care about the returned answer)
  - note the document-ids returned from session.chat
  - get all documents using the document ids
  - check if `context` (value of that row) is in the fetched documents
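
A rough sketch of the scoring part of the recipe above, in case it helps. The `retrieve` callable is a hypothetical stand-in for the "create session, chat with max_tokens=1, collect document ids, fetch documents" steps; the actual julep client calls are not shown because their exact signatures are not given here. `retrieval_hit_rate` is also just an illustrative name.

```python
from typing import Callable, Iterable, Mapping, Sequence

# Hypothetical signature: given a question, return the text of every
# document the doc search surfaced for it (via whatever client is used).
Retriever = Callable[[str], Sequence[str]]


def retrieval_hit_rate(
    rows: Iterable[Mapping[str, str]],
    retrieve: Retriever,
) -> float:
    """Fraction of rows whose `context` is among the fetched documents."""
    hits = 0
    total = 0
    for row in rows:
        fetched = retrieve(row["question"])
        # Exact-match check, as in the recipe: is this row's context
        # one of the documents that came back?
        if row["context"] in fetched:
            hits += 1
        total += 1
    # Avoid dividing by zero on an empty dataset.
    return hits / total if total else 0.0
```

For a quick sanity check this can be run against a toy in-memory retriever before wiring it to real sessions; the returned fraction is then the number to maximize when tuning `k_docs` and `docs_confidence`.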
cool thing: optuna: https://optuna.org/
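
A minimal sketch of how optuna could drive the search, assuming an `evaluate(k_docs, docs_confidence)` function that runs the evaluation and returns a score to maximize (hypothetical, not something that exists in the repo). The search ranges below are placeholders, not values taken from the codebase.

```python
from typing import Callable


def make_objective(evaluate: Callable[[int, float], float]):
    """Wrap an evaluation function as an optuna objective."""

    def objective(trial) -> float:
        # Placeholder ranges (assumptions); adjust to what the doc search accepts.
        k_docs = trial.suggest_int("k_docs", 1, 10)
        docs_confidence = trial.suggest_float("docs_confidence", 0.0, 1.0)
        return evaluate(k_docs, docs_confidence)

    return objective


def tune(evaluate: Callable[[int, float], float], n_trials: int = 50) -> dict:
    """Run the study and return the best parameters found."""
    import optuna  # third-party: pip install optuna

    # Higher score (e.g. retrieval hit rate) is better.
    study = optuna.create_study(direction="maximize")
    study.optimize(make_objective(evaluate), n_trials=n_trials)
    return study.best_params
```

Calling `tune(evaluate)` would return a dict with the best `k_docs` and `docs_confidence` seen across the trials. Each trial reruns the whole evaluation, so a subset of the train split may be enough while iterating.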