
Add better_profanity as a validator which takes care of word dividers

Open harsh306 opened this issue 11 months ago • 4 comments

Description: Add better_profanity as a validator which takes care of word dividers.

Why is this needed: These examples are not caught today.

  1. "You p1ec3 of sHit."
  2. 'h@ndj@b'

Implementation details: A better library is available: https://github.com/snguyenthanh/better_profanity

End result: Profanity check on hub.

harsh306, Mar 12 '24 02:03
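
To illustrate why inputs like the two examples in the issue can slip past an exact word-list match, here is a minimal sketch. It is not how better_profanity or alt-profanity-check work internally: the substitution table, the word list, and the function names are all hypothetical, and a mild placeholder word stands in for real profanity. It also does not handle word dividers such as dots or spaces between letters.

```python
import re

# Hypothetical substitution table: look-alike characters mapped back to letters.
LEET_MAP = str.maketrans({"@": "a", "1": "i", "3": "e", "0": "o", "$": "s"})

# Placeholder word list; a real list would be far larger.
BLOCKLIST = {"darn"}


def normalize(text: str) -> str:
    """Lowercase the text and undo simple character substitutions."""
    return text.lower().translate(LEET_MAP)


def naive_match(text: str) -> bool:
    """Exact word match with no normalization at all."""
    return any(word in BLOCKLIST for word in re.findall(r"\w+", text))


def normalized_match(text: str) -> bool:
    """Same word-list lookup, but applied to the normalized text."""
    words = re.findall(r"[a-z]+", normalize(text))
    return any(word in BLOCKLIST for word in words)


if __name__ == "__main__":
    sample = "D@rn it"
    print(naive_match(sample), normalized_match(sample))
```

The only difference between the two functions is the normalization step, which is the gap the issue describes: the obfuscated spelling never equals a list entry until the substitutions are reversed.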

Hello @harsh306, thanks for opening this. We're currently using alt-profanity-check for our ProfanityFree validator. According to its description, alt-profanity-check uses a linear SVM model to detect profane words instead of a static blacklist. Here's their comparison with better-profanity:

[Screenshot, 2024-03-12: alt-profanity-check's comparison with better-profanity]

I really like the examples that include special characters and are still profane; I don't think the SVM model would catch those. It's an age-old question: whether to detect with a static match or an ML model. What I think we can do, though, is combine both approaches with an OR, so that we get the best of both worlds. What do you think, @ShreyaR @zsimjee @CalebCourier?

thekaranacharya, Mar 12 '24 14:03
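
The OR combination proposed in the comment above could look roughly like the sketch below. This is an assumption about shape, not the ProfanityFree implementation: `any_flags`, `static_check`, and `model_check` are hypothetical names, and the two detectors are toy stand-ins (using mild placeholder words) for whatever wrappers around a static matcher and an ML model the validator would actually use.

```python
from typing import Callable, Iterable

# A detector is any callable that takes text and says whether it is profane.
Detector = Callable[[str], bool]


def any_flags(text: str, detectors: Iterable[Detector]) -> bool:
    """Logical OR: the text fails as soon as any one detector flags it."""
    return any(detect(text) for detect in detectors)


# Toy stand-in for a static word-list matcher (hypothetical).
def static_check(text: str) -> bool:
    return "darn" in text.lower().replace("@", "a")


# Toy stand-in for an ML model's prediction (hypothetical).
def model_check(text: str) -> bool:
    return "heck" in text.lower()


if __name__ == "__main__":
    for sample in ("D@rn it", "what the heck", "hello there"):
        print(sample, any_flags(sample, [static_check, model_check]))
```

Because `any` short-circuits, placing the cheaper detector first means the second one only runs on text the first one passes. The trade-off of an OR is that false positives from either detector are inherited by the combined check.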

Great discussion. I think that using both is ideal EXCEPT for the added latency. We should see if we can parallelize the two reqs without hurting perf. If so, we should do that. Otherwise, we should measure the magnitude of the perf hit when we run the two serially. If it is large, it might make sense to parameterize the validator to use one or the other.

zsimjee, Mar 12 '24 17:03
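
The options raised in the comment above (run both checks in parallel, measure the serial cost, or parameterize the validator) can be sketched as follows. Everything here is hypothetical: the `mode` parameter, the `check` and `timed` helpers, and the detector callables are illustrative only and not part of the existing validator.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Tuple

Detector = Callable[[str], bool]


def check(text: str, static: Detector, model: Detector, mode: str = "both") -> bool:
    """Run one detector or both, depending on a hypothetical `mode` setting."""
    if mode == "static":
        return static(text)
    if mode == "model":
        return model(text)
    if mode != "both":
        raise ValueError(f"unknown mode: {mode!r}")
    # Submit both detectors at once; the `with` block waits for both to finish.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(static, text), pool.submit(model, text)]
        return any(future.result() for future in futures)


def timed(fn: Callable[..., bool], *args, **kwargs) -> Tuple[bool, float]:
    """Return the call's result together with its wall-clock duration in seconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    def slow_static(text: str) -> bool:
        time.sleep(0.1)  # simulated latency
        return "darn" in text.lower()

    def slow_model(text: str) -> bool:
        time.sleep(0.1)  # simulated latency
        return False

    print(timed(check, "darn it", slow_static, slow_model, mode="both"))
```

Threads only shorten the total time when the detectors spend their time waiting (I/O) or in code that releases Python's global interpreter lock; for pure-Python CPU-bound checks the parallel path may be no faster than running them serially, which is why measuring with something like `timed` comes before choosing a default.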

Sounds good. Will close this issue once we add this update. TODO: Add update to ProfanityFree validator.

thekaranacharya, Mar 13 '24 15:03

Thanks, looking forward to the PR.

hpathak-godaddy, Mar 13 '24 19:03

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 14 days.

github-actions[bot], Aug 22 '24 03:08

This issue was closed because it has been stalled for 14 days with no activity.

github-actions[bot], Sep 05 '24 03:09