Different optimization results between 2.5.16 -> 2.5.20
Hi!
I created a simple module and a set of 10 questions and answers to evaluate a single PDF loaded into ChromaDB. When I evaluate it with DSPy version 2.5.16 like this:
evaluate = dspy.Evaluate(
    devset=data, metric=metric, num_threads=24, display_progress=True, display_table=3
)
evaluate(rag)
I get a semantic F1 score of 69. When I then run the optimization (which takes about 15 minutes) and evaluate the result, I get a score of about 79.
tp = dspy.MIPROv2(
    metric=metric, auto="medium", num_threads=24
)  # use fewer threads if your rate limit is small
optimized_rag = tp.compile(
    RAG(),
    trainset=data[:7],
    valset=data[7:],
    max_bootstrapped_demos=2,
    max_labeled_demos=2,
    requires_permission_to_run=False,
    seed=0
)
evaluate(optimized_rag)
However, when I run the same code with version 2.5.20, I first get a score of 61, and after optimization I get a score of 69. Both scores are significantly lower than with 2.5.16. Everything is the same except that I upgraded the DSPy library. Interestingly, the optimization now finishes in about 2 minutes, which is significantly faster. Any thoughts on these differences?
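To put the numbers side by side, here is a small sketch in plain Python (no DSPy calls). The `summarize` helper and the `reported` dictionary are hypothetical names used only for illustration, and the post-optimization score for 2.5.16 is approximate. The per-example figure assumes the reported score is the mean of per-example scores in [0, 1] scaled to 100, which I have not verified against the library.

```python
# Scores reported in this post; "optimized" for 2.5.16 is approximate ("about 79").
reported = {
    "2.5.16": {"baseline": 69, "optimized": 79},
    "2.5.20": {"baseline": 61, "optimized": 69},
}


def summarize(scores, old="2.5.16", new="2.5.20", n_examples=10):
    """Return per-version gains, old-minus-new gaps, and the largest shift
    a single example can cause in an averaged 0-100 score (assumption:
    the score is a plain mean of per-example values in [0, 1])."""
    # Improvement from optimization within each version.
    gains = {v: s["optimized"] - s["baseline"] for v, s in scores.items()}
    # How much lower the newer version scores at each stage.
    gaps = {k: scores[old][k] - scores[new][k] for k in ("baseline", "optimized")}
    # With n examples, one example moving from 1.0 to 0.0 shifts the mean by 100/n.
    max_shift_per_example = 100 / n_examples
    return gains, gaps, max_shift_per_example


gains, gaps, step = summarize(reported)
print("gain from optimization:", gains)
print("drop from 2.5.16 to 2.5.20:", gaps)
print("max shift from one example (points):", step)
```

Under that assumption, the gaps between versions are about the size of the largest shift one of the 10 examples could cause on its own, so it may be worth comparing the per-example scores from both versions as well as the averages.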