
Different optimization results between 2.5.16 -> 2.5.20

Open nielsgl opened this issue 3 months ago • 2 comments

Hi!

I created a simple module and a set of 10 question-answer pairs to evaluate a single PDF loaded into ChromaDB. When I evaluate using DSPy version 2.5.16 like this:

evaluate = dspy.Evaluate(
    devset=data, metric=metric, num_threads=24, display_progress=True, display_table=3
)
evaluate(rag)

I get a semantic F1 score of 69. When I then run the optimization (which takes about 15 minutes) and evaluate the result, I get a score of about 79.

tp = dspy.MIPROv2(
    metric=metric, auto="medium", num_threads=24
)  # use fewer threads if your rate limit is small

optimized_rag = tp.compile(
    RAG(),
    trainset=data[:7],
    valset=data[7:],
    max_bootstrapped_demos=2,
    max_labeled_demos=2,
    requires_permission_to_run=False,
    seed=0
)

evaluate(optimized_rag)

However, when I run this with version 2.5.20, I first get a score of 61, and after optimization a score of 69. Both scores are significantly lower than with 2.5.16. Everything is the same except that I upgraded the DSPy library. Interestingly, the optimization now finishes in about 2 minutes, which is much faster. Any thoughts on these differences?
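To make the two runs easier to compare, each score can be stored together with the library version that produced it. The following is a minimal sketch using only the Python standard library. The helper names (`installed_version`, `record_run`, `score_delta`) are hypothetical and not part of DSPy, and the distribution names `dspy` and `dspy-ai` are an assumption about how the package was installed.

```python
from importlib import metadata


def installed_version(candidates=("dspy", "dspy-ai")):
    """Return (distribution_name, version) for the first installed candidate, else None.

    The candidate names are an assumption; adjust them to match your environment.
    """
    for name in candidates:
        try:
            return name, metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


def record_run(label, score, candidates=("dspy", "dspy-ai")):
    """Bundle a score with the package version that produced it."""
    found = installed_version(candidates)
    return {
        "label": label,
        "score": score,
        "package": found[0] if found else None,
        "version": found[1] if found else None,
    }


def score_delta(before, after):
    """Absolute change in score between two evaluations."""
    return after - before
```

With the numbers reported above, `score_delta(69, 79)` gives the gain from optimization on 2.5.16 and `score_delta(61, 69)` the gain on 2.5.20, so the gap between versions shows up both in the baseline and in the optimized score.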

nielsgl, Oct 30 '24 11:10