Add a new `ROUGE` metric to Evidently

Open elenasamuylova opened this issue 5 months ago • 2 comments

About Hacktoberfest contributions: https://github.com/evidentlyai/evidently/wiki/Hacktoberfest-2024

Description

The ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metric evaluates the quality of a generated text by comparing it to a reference text (typically a summary). It measures how much of the reference text is covered by the generated summary through n-gram overlap. Several common ROUGE variants exist:

ROUGE-1: Measures unigram (word-level) overlap.
ROUGE-2: Measures bigram (two-word sequence) overlap.
ROUGE-N: The general case. Measures overlap of n-grams of length n between the candidate and reference text; ROUGE-1 and ROUGE-2 are the n = 1 and n = 2 cases.
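
For reference, the recall-oriented form of ROUGE-N is commonly written as below. The symbols are introduced here for illustration and do not come from the issue: $C$ is the candidate text, $R$ is the reference text, and $\mathrm{Count}_X(g)$ is the number of times the n-gram $g$ occurs in $X$. Whether the Evidently metric should report recall only, or also precision and F1, is left open.

$$
\text{ROUGE-N}(C, R) = \frac{\sum_{g \in \text{n-grams}(R)} \min\big(\mathrm{Count}_C(g),\ \mathrm{Count}_R(g)\big)}{\sum_{g \in \text{n-grams}(R)} \mathrm{Count}_R(g)}
$$

The sums run over the distinct n-grams of $R$. The $\min$ clips each match so that an n-gram repeated in the candidate is not credited more often than it appears in the reference.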

We can implement a ROUGE metric that takes the parameter n and computes both the descriptor values (overlap) for each row and a summary ROUGE metric for the dataset.
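
A minimal sketch of the two levels of output, in plain Python and independent of Evidently's classes. The function names (`ngram_counts`, `rouge_n`, `rouge_n_summary`), the lowercase whitespace tokenization, and the use of the mean as the dataset-level summary are all assumptions made for illustration; the issue does not specify any of them.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Recall-oriented ROUGE-N: clipped n-gram matches / reference n-grams.

    Assumption: texts are lowercased and split on whitespace.
    """
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        # Reference is shorter than n tokens, so there is nothing to recall.
        return 0.0
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((cand & ref).values())
    return overlap / total


def rouge_n_summary(candidates, references, n=1):
    """Return (per-row scores, dataset-level score).

    Assumption: the dataset-level score is the mean of the row scores.
    """
    scores = [rouge_n(c, r, n) for c, r in zip(candidates, references)]
    mean = sum(scores) / len(scores) if scores else 0.0
    return scores, mean


if __name__ == "__main__":
    generated = ["the cat sat on the mat", "hello world"]
    references = ["the cat is on the mat", "hello there world"]
    rows, summary = rouge_n_summary(generated, references, n=1)
    print(rows, summary)
```

The per-row list corresponds to the descriptor values and the second return value to the dataset-level metric; an actual implementation would wrap both in Evidently's own descriptor and metric classes.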

Note that this implementation would require creating a new Metric (instead of defaulting to `ColumnSummaryMetric` to aggregate descriptor values) to compute and visualize the summary ROUGE score. You can check other dataset-level metrics (e.g., from classification or ranking) for inspiration.

Sep 23 '24 21:09 elenasamuylova
