prompttools
prompttools copied to clipboard
Add benchmarks and evals for jailbreaks
🚀 The feature
As we add benchmarks, it would be good to cover common jailbreak scenarios. We should incorporate these benchmarks, and have auto-evals that check responses to see if they are "broken"
Motivation, pitch
https://github.com/llm-attacks/llm-attacks
Alternatives
No response
Additional context
No response