agent-squad
Feature request: Integration with evaluation frameworks
Use case
Hey Guys,
I would like to evaluate an agent. I couldn't find documentation for this, but I assume you already have some preferred approaches. I would like to align my solution with the approach that AWS recommends and, more importantly, supports.
Solution/User Experience
What approach do you recommend for evaluating agents?
Do you plan to integrate an evaluation framework such as DeepEval or Ragas, or Langfuse (for additional tracing)?
Alternative solutions