Asaad
This PR (addressing #683) adds a new probe implementing Persuasive Adversarial Prompts (PAP) from the paper ["Persuasive Adversarial Prompts"](https://arxiv.org/abs/2401.06373). This probe tests whether LLMs can resist jailbreak attempts that...
probes