
Toxicity Scanner to return the type of content

Open · RQledotai opened this issue 11 months ago · 1 comment

When using the input or output toxicity scanner, it would be preferable to return the type of label (e.g. sexual_explicit) instead of echoing the offensive content itself. That would let applications communicate the reason for blocking to their users.
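For illustration, a rough sketch of the idea (the `scan()` call below follows llm-guard's documented input-scanner usage as I understand it; the label lookup at the end is hypothetical and is what this issue is asking for):

```python
from llm_guard.input_scanners import Toxicity

prompt = "some user input to check"

# Current behavior: scan() returns the (possibly sanitized) prompt,
# a validity flag, and a risk score -- but not which toxicity label fired.
scanner = Toxicity(threshold=0.5)
sanitized_prompt, is_valid, risk_score = scanner.scan(prompt)

# Desired (hypothetical, not part of the current API): expose the detected
# label(s), e.g. "sexual_explicit", so the application can tell the user
# *why* the input was blocked without repeating the offensive content.
# labels = scanner.get_labels()
```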

RQledotai · Mar 18, 2024

Hey @RQledotai , thanks for reaching out. Apologies for the delay.

I agree, and such a refactoring is in the works to return an object with more context about the reason behind blocking. Currently, the only way to monitor this is through the logs.
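Something along these lines, purely as an illustration of a richer result object (the names are placeholders, not the final API):

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical shape of a richer scan result -- illustrative only,
# not the actual object planned for llm-guard.
@dataclass
class ScanResult:
    sanitized_prompt: str
    is_valid: bool
    risk_score: float
    # Context about why the scanner blocked the input,
    # e.g. the toxicity label(s) that exceeded the threshold.
    reasons: List[str] = field(default_factory=list)
```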

asofter · Mar 22, 2024