llm-guard
Toxicity Scanner to return the type of content
When using the input or output Toxicity scanner, it would be preferable to return the type of label that was matched (e.g. `sexual_explicit`) instead of only the offensive content. This would enable applications to communicate the issue to their users.
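For illustration, here is a minimal sketch of what the request amounts to, assuming the current `scanner.scan(prompt)` call that returns a `(sanitized_prompt, is_valid, risk_score)` tuple. The `ScanResult` dataclass and the `scan_with_details` wrapper are hypothetical names, used only to show the idea of surfacing the matched label alongside the score:

```python
# Hypothetical sketch of the requested behaviour; not the current llm-guard API.
from dataclasses import dataclass
from typing import Optional

from llm_guard.input_scanners import Toxicity


@dataclass
class ScanResult:
    sanitized_prompt: str
    is_valid: bool
    risk_score: float
    label: Optional[str] = None  # e.g. "sexual_explicit", "insult", "threat"


def scan_with_details(prompt: str) -> ScanResult:
    """Wrap the current tuple-based API in a richer result object."""
    scanner = Toxicity(threshold=0.5)
    sanitized_prompt, is_valid, risk_score = scanner.scan(prompt)
    # Today the matched toxicity label is only visible in the logs; the
    # request is for the scanner itself to surface it in the return value.
    return ScanResult(sanitized_prompt, is_valid, risk_score, label=None)
```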
Hey @RQledotai, thanks for reaching out. Apologies for the delay.
I agree, and such a refactoring is in the works to return an object with more context about the reason behind the blocking. Currently, the only way to monitor this is through the logs.
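Until that refactoring lands, here is a minimal sketch of the log-based workaround from an application's side, assuming llm-guard reports scanner results through Python's standard `logging` module; the `llm_guard` logger name and the contents of the messages are assumptions:

```python
# Capture llm-guard log records so the application can inspect the reason
# behind a block. Logger name "llm_guard" is an assumption.
import logging

captured_records: list[str] = []


class CaptureHandler(logging.Handler):
    def emit(self, record: logging.LogRecord) -> None:
        # Store each formatted record for later inspection after a scan.
        captured_records.append(self.format(record))


logger = logging.getLogger("llm_guard")
logger.setLevel(logging.DEBUG)
logger.addHandler(CaptureHandler())
```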