Add a New `WordNoMatch` Descriptor to Evidently

Open • elenasamuylova opened this issue 5 months ago • 1 comment

About Hacktoberfest contributions: https://github.com/evidentlyai/evidently/wiki/Hacktoberfest-2024

Description:

Evidently already has an ExcludesWords() descriptor that checks if the text does not contain any specified words from a shared list.

However, sometimes you might need to check that the text does not contain words specific to each row instead of a shared list.

Example:

| Question | Response | Forbidden Words |
|---|---|---|
| "Can I cancel my subscription at any time?" | "You are allowed to cancel at any time, and we guarantee that you will receive a refund." | ["guarantee", "allowed", "refund"] |

What to Implement:

The new WordNoMatch() descriptor should:

  1. Accept a with_column parameter: This column contains a list of forbidden words specific to each row.
  2. Accept a lemmatize parameter (default True): When True, this will consider inflected or variant forms of words. Works the same as in the ExcludesWords() descriptor.
  3. Return True/False:
    • Return True if none of the row's forbidden words appear in the row's text.
    • Return False if at least one forbidden word is present.
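
To make the expected behavior concrete, here is a minimal sketch of the per-row check in plain Python. It is not Evidently code: the function name `word_no_match`, the `_tokenize` helper, and the `lemmatizer` callable are hypothetical and stand in for the descriptor's `lemmatize` flag. The real descriptor would be expected to follow whatever `ExcludesWords()` does for lemmatization and to plug into the two-column descriptor machinery.

```python
import re
from typing import Callable, Iterable, List, Optional


def _tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens (hypothetical helper)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_no_match(
    text: str,
    forbidden_words: Iterable[str],
    lemmatizer: Optional[Callable[[str], str]] = None,
) -> bool:
    """Return True if none of the row's forbidden words occur in the row's text.

    `lemmatizer` is an assumption of this sketch: any callable that maps a
    word to its base form. When it is None, words are compared as lowercase
    tokens only.
    """
    normalize = lemmatizer if lemmatizer is not None else (lambda word: word)
    text_tokens = {normalize(token) for token in _tokenize(text)}
    forbidden = {normalize(word.lower()) for word in forbidden_words}
    # True only when the two sets share no word.
    return text_tokens.isdisjoint(forbidden)


# The row from the example table above.
response = (
    "You are allowed to cancel at any time, and we guarantee "
    "that you will receive a refund."
)
row_result = word_no_match(response, ["guarantee", "allowed", "refund"])


def strip_plural(word: str) -> str:
    """Toy stand-in for a lemmatizer: drops one trailing "s". Not real lemmatization."""
    return word[:-1] if word.endswith("s") else word


plural_without = word_no_match("No refunds are offered.", ["refund"])
plural_with = word_no_match("No refunds are offered.", ["refund"], lemmatizer=strip_plural)
```

For the example row, `row_result` is `False` because all three forbidden words appear in the response. The last two calls show why normalization matters: without it, "refunds" does not match "refund" and the check passes; with the toy normalizer it fails.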

References:

  • Check the ExcludesWords() descriptor as reference.
  • For a two-column descriptor implementation, check the SemanticSimilarity descriptor and the CustomPairColumnEval template.

elenasamuylova • Sep 23 '24 18:09