BanSubstrings doesn't redact all matching words if case_sensitive=False
Describe the bug
When running BanSubstrings with case_sensitive=False and redact=True and scanning a prompt, only the occurrences whose casing exactly matches the banned substring are redacted. Occurrences with different casing are left in the text.
To Reproduce
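# Assumes BanSubstrings is the input scanner from the llm-guard package
from llm_guard.input_scanners import BanSubstrings

prompt = "The user can perform arbitrary virus code execution by Virus injecting malicious code."
ban_substrings = BanSubstrings(substrings=["virus", "bug"], case_sensitive=False, redact=True)
sanitized_prompt, results_valid, results_score = ban_substrings.scan(prompt)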
Expected behavior
Actual: The user can perform arbitrary [REDACTED] code execution by Virus injecting malicious code.
Expected: The user can perform arbitrary [REDACTED] code execution by [REDACTED] injecting malicious code.
Possible solution
As str.replace is case sensitive, the issue might be solved by using a regex, e.g. like so:
import re

def _redact_text(text: str, substrings: list[str]) -> str:
    redacted_text = text
    for s in substrings:
        # re.escape treats the substring literally; IGNORECASE matches any casing
        regex_redacted = re.compile(re.escape(s), re.IGNORECASE)
        redacted_text = regex_redacted.sub("[REDACTED]", redacted_text)
    return redacted_text
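For reference, here is a standalone sketch of the same idea that runs without the library and can be checked against the prompt from the reproduction steps. The helper name redact_substrings and its case_sensitive parameter are hypothetical and only illustrate the approach; this is not the library's actual implementation. It combines all substrings into one alternation so the text is scanned once.

```python
import re


def redact_substrings(text: str, substrings: list[str], case_sensitive: bool = False) -> str:
    """Hypothetical helper: replace every occurrence of any substring with [REDACTED]."""
    # Drop empty strings, which would otherwise match at every position.
    candidates = [s for s in substrings if s]
    if not candidates:
        return text
    flags = 0 if case_sensitive else re.IGNORECASE
    # Try longer substrings first so overlapping alternatives prefer the longer match.
    candidates.sort(key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(s) for s in candidates), flags)
    return pattern.sub("[REDACTED]", text)


prompt = "The user can perform arbitrary virus code execution by Virus injecting malicious code."
redacted = redact_substrings(prompt, ["virus", "bug"])
print(redacted)
```

With the prompt above, both "virus" and "Virus" are replaced, which matches the expected output in this report. Passing case_sensitive=True keeps the exact-casing behaviour for callers who want it.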
Hey @aalbersk ! Thanks for reporting this! The issue has been addressed in version 0.3.16, which is now available on PyPI. Please upgrade and let us know if you encounter any further problems.