
BanSubstrings doesn't redact all matching words if case_sensitive=False

Open aalbersk opened this issue 1 year ago • 1 comments

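Describe the bug When scanning a prompt with BanSubstrings configured with case_sensitive=False and redact=True, the scanner redacts only the occurrences whose casing exactly matches the banned substring. Occurrences with different casing are left in the output unredacted.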

To Reproduce

from llm_guard.input_scanners import BanSubstrings

prompt = "The user can perform arbitrary virus code execution by Virus injecting malicious code."
ban_substrings = BanSubstrings(substrings=["virus", "bug"], case_sensitive=False, redact=True)
sanitized_prompt, results_valid, results_score = ban_substrings.scan(prompt)

Expected behavior

Actual: The user can perform arbitrary [REDACTED] code execution by Virus injecting malicious code.

Expected: The user can perform arbitrary [REDACTED] code execution by [REDACTED] injecting malicious code.

Possible solution As str.replace is case sensitive, the issue might be solved by using a regex with re.IGNORECASE, e.g. like so:

import re

def _redact_text(text: str, substrings: list[str]) -> str:
    redacted_text = text
    for s in substrings:
        # re.escape treats the substring literally; re.IGNORECASE matches any casing
        regex_redacted = re.compile(re.escape(s), re.IGNORECASE)
        redacted_text = regex_redacted.sub("[REDACTED]", redacted_text)
    return redacted_text
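
For comparison, here is a standalone sketch that reproduces the difference without llm-guard installed. The function names redact_case_sensitive and redact_ignore_case are hypothetical and are not part of the library; the first only imitates what a str.replace based redaction would do, which is my assumption about the cause. The second variant builds one alternation pattern instead of looping, so a placeholder inserted for one substring is not scanned again for the next one (with a loop, a banned substring such as "red" would match inside "[REDACTED]").

```python
import re

REDACTION = "[REDACTED]"


def redact_case_sensitive(text: str, substrings: list[str]) -> str:
    """Imitates the reported behaviour: str.replace only matches exact casing."""
    for s in substrings:
        text = text.replace(s, REDACTION)
    return text


def redact_ignore_case(text: str, substrings: list[str]) -> str:
    """Redacts every occurrence regardless of casing, in a single pass."""
    if not substrings:
        return text
    # Longest substrings first so a longer entry wins over a shorter prefix of it.
    ordered = sorted(substrings, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(s) for s in ordered), re.IGNORECASE)
    return pattern.sub(REDACTION, text)


prompt = "The user can perform arbitrary virus code execution by Virus injecting malicious code."
banned = ["virus", "bug"]

print(redact_case_sensitive(prompt, banned))  # "Virus" is left in place
print(redact_ignore_case(prompt, banned))     # both occurrences are redacted
```

Whether a single pass or a per-substring loop is preferable depends on how the scanner reports which substrings matched, so treat this only as an illustration of the matching behaviour.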

aalbersk, Jan 17 '25 18:01

Hey @aalbersk ! Thanks for reporting this! The issue has been addressed in version 0.3.16, which is now available on PyPI. Please upgrade and let us know if you encounter any further problems.

asofter, May 19 '25 12:05