giskard [GSK-3513] Fix RAGAS metric computation

Open pierlj opened this issue 1 year ago • 5 comments

According to this issue, some RAGAS metrics (context recall, context precision, and faithfulness) are not computed correctly.

To fix it:

- added a `retrieved_documents` argument to the `evaluate` method
- allowed the `answer_fn` to return retrieved documents alongside the answer to a question
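
As a rough illustration of the two changes above, here is a minimal, self-contained sketch of an answer function that returns its retrieved documents together with the answer, plus a helper that gathers both so the context-based metrics can see the retrieved context. `AnswerWithDocuments`, `retrieve`, `collect_answers`, and the toy knowledge base are hypothetical names made up for this example; they are not the Giskard or RAGAS API, and the real signatures in this PR may differ.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple, Union


@dataclass
class AnswerWithDocuments:
    """Hypothetical container: an answer plus the documents retrieved for it."""
    message: str
    documents: List[str] = field(default_factory=list)


# Toy knowledge base standing in for a real retriever (illustrative data only).
KNOWLEDGE_BASE = [
    "RAGAS context recall compares retrieved context with a reference answer.",
    "Faithfulness checks that the answer is supported by the retrieved context.",
    "Bananas are rich in potassium.",
]


def retrieve(question: str, top_k: int = 2) -> List[str]:
    """Rank documents by the number of words they share with the question."""
    words = set(question.lower().split())
    scored = [(len(words & set(doc.lower().split())), doc) for doc in KNOWLEDGE_BASE]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Keep only documents that share at least one word with the question.
    return [doc for score, doc in ranked[:top_k] if score > 0]


def answer_fn(question: str) -> AnswerWithDocuments:
    """Return the answer alongside the documents that were retrieved for it."""
    docs = retrieve(question)
    message = docs[0] if docs else "I don't know."
    return AnswerWithDocuments(message=message, documents=docs)


AnswerFn = Callable[[str], Union[str, AnswerWithDocuments]]


def collect_answers(
    fn: AnswerFn, questions: Sequence[str]
) -> Tuple[List[str], List[List[str]]]:
    """Call fn on each question; return parallel lists of answers and documents.

    Plain-string answers are accepted too and get an empty document list.
    """
    answers: List[str] = []
    retrieved_documents: List[List[str]] = []
    for question in questions:
        result = fn(question)
        if isinstance(result, str):
            answers.append(result)
            retrieved_documents.append([])
        else:
            answers.append(result.message)
            retrieved_documents.append(list(result.documents))
    return answers, retrieved_documents
```

The second list returned by `collect_answers` is the kind of value a `retrieved_documents` argument could carry. Without it, metrics such as context recall, context precision, and faithfulness have no retrieved context to score against.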

May 10 '24 10:05 pierlj

May 10 '24 10:05 linear[bot]

Your pull request is modifying functions with the following pre-existing issues:

📄 File: giskard/rag/evaluate.py

| Function | Unhandled Issue | Event Count |
| --- | --- | --- |
| `evaluate` | ValueError: Must provide either the `api_version` argument or the `OPENAI_API_VERSION` environment variable ... | 1 |
| `evaluate` | TypeError: string indices must be integers, not 'str' main in model_pr... | 1 |
| `evaluate` | NameError: name 'doc_chat' is not defined _main... | 1 |
| `evaluate` | AttributeError: 'EvaluateRAG' object has no attribute 'custom_model' src.evalua... | 1 |

May 10 '24 10:05 sentry[bot]

@pierlj looks good, can you add a test on the ragas metrics to make sure they are calculated correctly?

May 10 '24 13:05 mattbit

Good for me, @pierlj do you want to make a last check?

Yep, I will have a look

May 21 '24 09:05 pierlj

Please retry analysis of this Pull-Request directly on SonarCloud

May 28 '24 13:05 sonarqubecloud[bot]