Gemini model 2.0-flash structured output
[x] I have checked the documentation and related resources and couldn't resolve my bug.
Describe the bug
Hey! I'm getting many Pydantic validation errors when computing faithfulness and answer_correctness. I guess it depends on the model used and its structured-output capabilities. It's failing very often.
Ragas version: 0.2.15
Python version: 3.11.10
Code to Reproduce
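from langchain_google_genai import ChatGoogleGenerativeAI
from ragas import evaluate
from ragas.llms import LangchainLLMWrapper
from ragas.metrics import answer_relevancy, context_precision, context_recall, faithfulness

# Excerpt from inside a class method: wrap the Gemini chat model for Ragas
self.google_llm = LangchainLLMWrapper(
    ChatGoogleGenerativeAI(
        model="gemini-2.0-flash-001",
        temperature=0.1,
        top_p=0.9,
        max_tokens=2048,
    )
)
self.metrics = [
    context_precision,
    context_recall,
    faithfulness,
    answer_relevancy,
]
result = evaluate(
    dataset=dataset,
    metrics=self.metrics,
    llm=self.evaluator_llm,
    embeddings=self.google_embeddings,
    # run_config=RUN_CONFIG,
    # callbacks=[delay_handler],  # Add delay callback
)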
Error trace
Got: 1 validation error for ClassificationWithReason
FN
  Field required [type=missing, input_value={'TP': [{'statement': 'Es...d partial solutions.'}]}, input_type=dict]
Got: 1 validation error for NLIStatementOutput
statements.30.verdict
  Field required [type=missing, input_value={'statement': 'The conver...ext mentions different'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.11/v/missing
For troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/OUTPUT_PARSING_FAILURE )
Got: 2 validation errors for NLIStatementOutput
statements.31.reason
  Field required [type=missing, input_value={'statement': 'Knowledge-... that leverage domain-'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.11/v/missing
statements.31.verdict
  Field required [type=missing, input_value={'statement': 'Knowledge-... that leverage domain-'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.11/v/missing
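To make the failure shape concrete, here is a minimal standard-library sketch of how the incomplete entries could be spotted in a raw JSON response. It is not Ragas or LangChain code: `find_incomplete_statements` and `REQUIRED_KEYS` are hypothetical names, the key list is assumed from the field names in the trace above, and the sample payload is made up.

```python
import json

# Keys the trace reports on each statement entry.
# Assumed from the error messages above, not taken from the Ragas source.
REQUIRED_KEYS = ("statement", "reason", "verdict")


def find_incomplete_statements(raw_json: str) -> dict[int, list[str]]:
    """Map the index of each incomplete statement to the keys it lacks."""
    payload = json.loads(raw_json)
    missing_by_index = {}
    for index, entry in enumerate(payload.get("statements", [])):
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            missing_by_index[index] = missing
    return missing_by_index


# Made-up payload shaped like the failing output:
# the last entry contains only "statement".
sample = json.dumps({
    "statements": [
        {"statement": "A", "reason": "supported by context", "verdict": 1},
        {"statement": "B"},
    ]
})
print(find_incomplete_statements(sample))  # {1: ['reason', 'verdict']}
```

Running something like this on the raw model responses would show whether the missing fields always sit on the last statements of a long list, as they do for indices 30 and 31 in the trace.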