[R-243] docs: Add documentation on the best approach to define custom metrics.

Open ouphi opened this issue 10 months ago • 3 comments

[X] I checked the documentation and related resources and couldn't find an answer to my question.

I was exploring the possibility of creating custom metrics. It seems possible by subclassing Metric, MetricWithLLM, or MetricWithEmbeddings.

For example, I created a dummy metric that computes the answer length (I don't need this specific metric; it only serves as a simple example):

import typing as t
from datasets import Dataset
from ragas import evaluate
from ragas.metrics.base import Metric, EvaluationMode
from langchain_core.callbacks import Callbacks
from ragas.run_config import RunConfig

class AnswerLength(Metric):
    """Simple example of a custom metric that returns the answer length."""

    name: str = "answer_length"
    evaluation_mode: EvaluationMode = EvaluationMode.qa

    async def _ascore(
        self, row: t.Dict, callbacks: Callbacks, is_async: bool
    ) -> float:
        # Score is the number of characters in the answer, cast to match the declared return type.
        return float(len(row["answer"]))

    def init(self, run_config: RunConfig):
        """Nothing to initialize: this metric uses no LLM or embeddings."""

answer_length = AnswerLength()

data_samples = {
    'question': ['When was the first super bowl?', 'Who won the most super bowls?'],
    'answer': ['The first superbowl was held on Jan 15, 1967', 'The most super bowls have been won by The New England Patriots'],
}

dataset = Dataset.from_dict(data_samples)
score = evaluate(dataset, metrics=[answer_length])
score.to_pandas()

My questions are:

  • Do you recommend creating custom metrics with ragas, or is it preferable to rely exclusively on the built-in metrics?
  • If yes, is the approach described above correct?

I am not sure whether creating custom metrics is an intended feature. I want to make sure that my custom metric implementations do not break as ragas evolves, so I would like to know what the plans are for supporting and documenting custom metrics in the future.
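
One way to limit exposure to base-class changes, independent of how ragas answers the question, is to keep the scoring logic in a plain function and let the metric class simply delegate to it. The sketch below illustrates the idea with the standard library only. `answer_length` and `AnswerLengthAdapter` are hypothetical names, and the adapter is a stand-in: in real use it would subclass ragas' `Metric` as in the example above, which this sketch does not do.

```python
import asyncio
import typing as t


def answer_length(row: t.Mapping[str, t.Any]) -> float:
    """Pure scoring logic with no ragas imports, so ragas API changes cannot break it."""
    return float(len(row["answer"]))


class AnswerLengthAdapter:
    """Hypothetical thin wrapper (assumption: real code would subclass ragas' Metric)."""

    name = "answer_length"

    async def _ascore(
        self, row: t.Mapping[str, t.Any], callbacks: t.Any = None, is_async: bool = True
    ) -> float:
        # Only this method would need updating if the base-class signature changed.
        return answer_length(row)


row = {
    "question": "When was the first super bowl?",
    "answer": "The first superbowl was held on Jan 15, 1967",
}
score = asyncio.run(AnswerLengthAdapter()._ascore(row))
print(score)
```

With this split, the scoring function can be unit-tested on its own, and only the thin adapter has to track changes in the library.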


ouphi, Apr 08 '24 08:04