haystack-core-integrations Allowing other LLMs and custom prompts in evaluation (specifically, deepeval)

Open sanjayc2 opened this issue 6 months ago • 0 comments

Is your feature request related to a problem? Please describe.
I cannot use a (small) local LLM or customized prompts to evaluate the RAG pipeline output. Smaller LLMs (e.g., MiniCheck) have become about as good as GPT models for this kind of evaluation.

Describe the solution you'd like
I would like to use a small local LLM to evaluate the RAG pipeline output. At this time, it seems that only GPT LLMs are supported. Small local LLMs (e.g., MiniCheck) are available via Ollama. There also does not seem to be a way to customize the prompts used in haystack-deepeval.
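To make the request concrete, here is a minimal sketch of the kind of pluggable judge I have in mind. It is not how haystack-deepeval works today. `LocalJudge`, `ollama_generate`, and `DEFAULT_PROMPT` are hypothetical names invented for illustration, and the prompt is not the one deepeval uses. The only external piece assumed is Ollama's `/api/generate` HTTP endpoint on its default local port.

```python
import json
import urllib.request
from typing import Callable

# Hypothetical default prompt (an assumption, not deepeval's own prompt).
DEFAULT_PROMPT = (
    "Context:\n{context}\n\nAnswer:\n{answer}\n\n"
    "Is every claim in the answer supported by the context? "
    "Think step by step, then end with a line 'VERDICT: yes' or 'VERDICT: no'."
)


def ollama_generate(prompt: str, model: str, host: str = "http://localhost:11434") -> str:
    """Send one non-streaming request to a local Ollama server and return its text."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


class LocalJudge:
    """Hypothetical evaluator: any text-in/text-out LLM plus a user-supplied prompt."""

    def __init__(self, generate: Callable[[str], str], prompt_template: str = DEFAULT_PROMPT):
        self.generate = generate
        self.prompt_template = prompt_template

    def score(self, contexts: list[str], answer: str) -> float:
        """Return 1.0 if the judge's last VERDICT line says yes, else 0.0."""
        prompt = self.prompt_template.format(context="\n".join(contexts), answer=answer)
        reply = self.generate(prompt)
        for line in reversed(reply.strip().splitlines()):
            stripped = line.strip().lower()
            if stripped.startswith("verdict:"):
                verdict = stripped.split(":", 1)[1].strip()
                return 1.0 if verdict.startswith("yes") else 0.0
        raise ValueError("judge reply contained no 'VERDICT:' line")
```

With a model already pulled into Ollama, usage would look like `LocalJudge(lambda p: ollama_generate(p, model="<your-model>")).score(contexts, answer)`, and passing a different `prompt_template` swaps in a custom (e.g., CoT) prompt. Because the LLM is just a callable, the same class works with any backend.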

Describe alternatives you've considered
Use deepeval "offline", i.e., save the question, contexts (chunks), and answer, then run deepeval locally. This is not very convenient, since I would like to be able to evaluate the model while fine-tuning it.
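For reference, the offline workaround amounts to something like the sketch below: dump each question, its contexts, and the answer to a JSONL file, then load the file later for a separate evaluation run. The function names and record keys are my own choices for illustration, not anything defined by Haystack or deepeval.

```python
import json
from pathlib import Path


def save_records(path, records):
    """Write one JSON object per line; each record holds question, contexts, answer."""
    with Path(path).open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")


def load_records(path):
    """Read the records back, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

The extra save-and-reload step is what makes this awkward inside a fine-tuning loop.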

Additional context
The ability to use deepeval to evaluate a model during fine-tuning is very useful. It would also be good to be able to customize the prompt, since it looks like CoT (chain-of-thought) or other techniques can improve evaluation outputs.

May 29 '25 02:05 sanjayc2
