langtest
Add tests for RAG
We should add tests and benchmarks for RAG evaluation.
We can start with the ragas evaluation metrics.
Implement the metrics in the library itself, but cite ragas as the reference.
Evaluate LLMs and RAG a practical example using Langchain and Hugging Face
https://www.philschmid.de/evaluate-llm?ref=blog.langchain.dev
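
As a starting point for discussion, here is a minimal sketch of what an in-library RAG metric could look like. It is not the ragas implementation (ragas metrics are LLM-judged); it swaps in a plain token-overlap proxy so it runs without a model. The names `RagSample`, `context_recall_proxy`, and `answer_grounding_proxy` are hypothetical and not part of langtest or ragas.

```python
import re
from dataclasses import dataclass
from typing import List, Set


@dataclass
class RagSample:
    """One RAG evaluation record (hypothetical schema, for illustration only)."""
    question: str
    contexts: List[str]   # passages returned by the retriever
    answer: str           # answer produced by the LLM
    ground_truth: str     # reference answer


def _tokens(text: str) -> Set[str]:
    # Lowercase alphanumeric tokens; a crude stand-in for real text normalization.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def _context_tokens(sample: RagSample) -> Set[str]:
    # Union of tokens over all retrieved passages (empty set if none).
    return set().union(*(_tokens(c) for c in sample.contexts))


def context_recall_proxy(sample: RagSample) -> float:
    """Fraction of ground-truth tokens found in the retrieved contexts."""
    truth = _tokens(sample.ground_truth)
    if not truth:
        return 0.0
    return len(truth & _context_tokens(sample)) / len(truth)


def answer_grounding_proxy(sample: RagSample) -> float:
    """Fraction of answer tokens found in the retrieved contexts."""
    answer = _tokens(sample.answer)
    if not answer:
        return 0.0
    return len(answer & _context_tokens(sample)) / len(answer)


if __name__ == "__main__":
    sample = RagSample(
        question="What is the capital of France?",
        contexts=["The capital of France is Paris."],
        answer="Paris is the capital.",
        ground_truth="Paris is the capital of France",
    )
    print(context_recall_proxy(sample), answer_grounding_proxy(sample))
```

A test in the library could then assert that each score stays above a configurable threshold, with the overlap proxy replaced by an LLM-judged metric once the ragas-style definitions are ported.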