ragas
Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
[ ] I have checked the [documentation](https://docs.ragas.io/en/latest/getstarted/testset_generation.html) and related resources and couldn't resolve my bug. **Bug Description** I wanted to generate a test dataset to evaluate my RAG application. I...
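For context, the test set generation flow being attempted looks roughly like this in ragas 0.1.x — a minimal sketch assuming the documented `TestsetGenerator` API; the corpus path, model choices, and distribution values are illustrative, not taken from the report:

```python
# Minimal test set generation sketch (ragas 0.1.x API; values are illustrative).
from langchain_community.document_loaders import DirectoryLoader
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context

documents = DirectoryLoader("docs/").load()  # hypothetical corpus path

generator = TestsetGenerator.from_langchain(
    generator_llm=ChatOpenAI(model="gpt-3.5-turbo"),
    critic_llm=ChatOpenAI(model="gpt-4"),
    embeddings=OpenAIEmbeddings(),
)
testset = generator.generate_with_langchain_docs(
    documents,
    test_size=10,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)
print(testset.to_pandas())
```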
**Question** Using an Azure OpenAI endpoint, I've been able to define the Azure model as in https://github.com/explodinggradients/ragas/blob/main/docs/howtos/customisations/azure-openai.ipynb Q&A with chat completion works; however, when trying to evaluate faithfulness I get an...
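The linked notebook boils down to passing the Azure models into `evaluate()` directly; a minimal sketch assuming ragas 0.1.x, with the endpoint, deployment names, and sample row as placeholders:

```python
# Faithfulness against Azure OpenAI (endpoint/deployment names are placeholders).
from datasets import Dataset
from langchain_openai import AzureChatOpenAI, AzureOpenAIEmbeddings
from ragas import evaluate
from ragas.metrics import faithfulness

azure_llm = AzureChatOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    azure_deployment="your-chat-deployment",
    openai_api_version="2023-05-15",
)
azure_embeddings = AzureOpenAIEmbeddings(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    azure_deployment="your-embedding-deployment",
    openai_api_version="2023-05-15",
)

dataset = Dataset.from_dict({
    "question": ["What is ragas?"],
    "answer": ["An evaluation framework for RAG pipelines."],
    "contexts": [["ragas evaluates Retrieval Augmented Generation pipelines."]],
})
result = evaluate(dataset, metrics=[faithfulness], llm=azure_llm, embeddings=azure_embeddings)
print(result)
```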
**Describe the bug** When evaluating with a long context, an error like this was raised: ```openai.BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 4097 tokens....```
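A common workaround for the 4,097-token limit is simply to hand `evaluate()` a judge model with a larger context window — a sketch under that assumption, not a fix confirmed in this thread:

```python
# Swap in a larger-context judge model (workaround assumption, not a confirmed fix).
from datasets import Dataset
from langchain_openai import ChatOpenAI
from ragas import evaluate
from ragas.metrics import faithfulness

dataset = Dataset.from_dict({
    "question": ["..."],       # placeholder for the long-context sample
    "answer": ["..."],
    "contexts": [["..."]],
})
# gpt-3.5-turbo-16k (or a gpt-4-turbo class model) accepts far more than 4,097 tokens.
result = evaluate(dataset, metrics=[faithfulness], llm=ChatOpenAI(model="gpt-3.5-turbo-16k"))
```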
I am trying to compute the following metrics using ragas 0.1.0, OpenAI 1.12.0, and llama-index 0.8.51.post1: answer_relevancy and answer_correctness, but I am getting this error: **openai.NotFoundError: Error code:...
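For reference, the two metrics are computed like this in plain ragas 0.1.x, independent of the llama-index wiring — a sketch assuming the column names from the 0.1 docs, where `answer_correctness` additionally needs a `ground_truth` column:

```python
# answer_relevancy and answer_correctness on a one-row sample (ragas 0.1.x column names).
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, answer_correctness

dataset = Dataset.from_dict({
    "question": ["What does ragas do?"],
    "answer": ["It evaluates RAG pipelines."],
    "contexts": [["ragas is an evaluation framework for RAG pipelines."]],
    "ground_truth": ["ragas evaluates Retrieval Augmented Generation pipelines."],
})
print(evaluate(dataset, metrics=[answer_relevancy, answer_correctness]))
```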
Hey, apologies for creating multiple issue threads so rapidly, but I like this library and hope it gains more traction! Some metrics are giving different results each time they are...
**Describe the bug** When attempting to evaluate `answer_similarity` and `answer_correctness` using the Ragas framework, I encounter a timeout error. While I can successfully retrieve metrics for `context_relevancy` and `context_recall`, the...
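One knob worth trying for timeouts — assuming a ragas 0.1.x release that exposes `RunConfig`, which is an assumption rather than a fix confirmed here — is raising the per-call timeout and retry budget:

```python
# Raise the per-call timeout via RunConfig (assumed ragas 0.1.x API; values illustrative).
from datasets import Dataset
from ragas import evaluate
from ragas.run_config import RunConfig
from ragas.metrics import answer_similarity, answer_correctness

dataset = Dataset.from_dict({
    "question": ["What is ragas?"],
    "answer": ["An evaluation framework for RAG pipelines."],
    "ground_truth": ["ragas evaluates RAG pipelines."],
})
result = evaluate(
    dataset,
    metrics=[answer_similarity, answer_correctness],
    run_config=RunConfig(timeout=180, max_retries=10),
)
```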
**Describe the bug** I was trying to follow the steps described here: https://docs.ragas.io/en/v0.0.22/howtos/customisations/llms.html to bring my own LLM (AzureOpenAI) into a ragas evaluation. I was interested in 3 of the following metrics:...
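The v0.0.22 guide linked above assigns the LLM per metric; the analogous 0.1.x pattern looks roughly like this — a sketch assuming the `LangchainLLMWrapper` from the 0.1 codebase, with the Azure details as placeholders:

```python
# Per-metric LLM override (ragas 0.1.x pattern; Azure details are placeholders).
from langchain_openai import AzureChatOpenAI
from ragas.llms import LangchainLLMWrapper
from ragas.metrics import faithfulness, answer_relevancy, context_precision

azure_llm = LangchainLLMWrapper(AzureChatOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    azure_deployment="your-chat-deployment",
    openai_api_version="2023-05-15",
))
for metric in (faithfulness, answer_relevancy, context_precision):
    metric.llm = azure_llm  # each metric uses this model instead of the default
```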
**Describe the bug** When using gpt-4-0125-preview and gpt-4-1106-preview as the LLM, I obtain only NaN values for the context_recall metric. Ragas version: 0.1.5 Python version: 3.11.7 **Code to Reproduce** ```result_ragas...```
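NaN usually means the metric's underlying LLM calls failed silently; a debugging sketch, assuming the `raise_exceptions` flag on ragas 0.1.x's `evaluate()`, to surface the real error:

```python
# Surface the error hidden behind NaN scores (raise_exceptions assumed from ragas 0.1.x).
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import context_recall

dataset = Dataset.from_dict({
    "question": ["What is ragas?"],
    "contexts": [["ragas evaluates RAG pipelines."]],
    "ground_truth": ["ragas evaluates Retrieval Augmented Generation pipelines."],
})
# With raise_exceptions=True a failing judge call raises instead of degrading to NaN.
result = evaluate(dataset, metrics=[context_recall], raise_exceptions=True)
```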
**Describe the Feature** I would like to integrate https://github.com/michaelfeil/infinity for embeddings inference. This would automatically batch concurrent requests, uses FlashAttention-2, and is compatible with CUDA, ROCm, Apple MPS, and CPU. Depending...
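Until there is first-class support, infinity already exposes an OpenAI-compatible embeddings endpoint, so one hedged interim path is to point a LangChain embeddings client at it and hand that to ragas — the port, model name, and ctx-length flag are assumptions about a local infinity deployment:

```python
# Interim path: use infinity's OpenAI-compatible endpoint (port/model are assumptions).
from langchain_openai import OpenAIEmbeddings

infinity_embeddings = OpenAIEmbeddings(
    model="BAAI/bge-small-en-v1.5",           # whichever model infinity is serving
    openai_api_base="http://localhost:7997",  # infinity's default port, per its README
    openai_api_key="sk-unused",               # infinity ignores the key by default
    check_embedding_ctx_length=False,         # skip OpenAI-specific tiktoken length checks
)
print(infinity_embeddings.embed_query("smoke test")[:4])
# ragas accepts LangChain embeddings directly: evaluate(dataset, metrics=[...], embeddings=infinity_embeddings)
```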
**Describe the Feature** With v0.1, support for the LangChain EvalChain is broken; add it back in.
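For reference, the integration being asked for looked like this in the 0.0.x docs — a sketch of the `RagasEvaluatorChain` API as it existed before v0.1, not the current interface:

```python
# The pre-0.1 LangChain integration this request wants restored (0.0.x-era API).
from langchain.schema import Document
from ragas.langchain.evalchain import RagasEvaluatorChain
from ragas.metrics import faithfulness

faithfulness_chain = RagasEvaluatorChain(metric=faithfulness)
score = faithfulness_chain({
    "query": "What is ragas?",
    "result": "An evaluation framework for RAG pipelines.",
    "source_documents": [Document(page_content="ragas evaluates RAG pipelines.")],
})
print(score["faithfulness_score"])
```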