PyRIT
FEAT: Psychosocial Harms Red Teaming Automation
Description
Adds a notebook for red teaming psychosocial harms using a multi-step approach that models user behaviors, contexts, and evaluations.
- Created new conversation scorer to score the entire conversation
- Added a toy dataset with sample multi-turn conversations
- Added a sample attack strategy YAML file modeling a user's escalation toward crisis
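
To illustrate what scoring the entire conversation means, here is a minimal sketch of the idea, not the scorer added in this PR. It joins all turns into one transcript and passes that to a single scoring function, so the score reflects escalation across turns rather than one message in isolation. Every name here (`Turn`, `format_conversation`, `score_conversation`, `keyword_scorer`, `CRISIS_TERMS`) is hypothetical and is not part of the PyRIT API; the keyword scorer is only a stand-in for a real evaluator.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    """One message in a multi-turn conversation (hypothetical type)."""
    role: str  # "user" or "assistant"
    content: str


def format_conversation(turns: List[Turn]) -> str:
    """Join every turn into one transcript so the scorer sees the full context."""
    return "\n".join(f"{t.role}: {t.content}" for t in turns)


def score_conversation(turns: List[Turn], score_text: Callable[[str], float]) -> float:
    """Score the whole conversation in one call instead of turn by turn."""
    return score_text(format_conversation(turns))


# Assumption: a toy keyword list standing in for a real evaluator.
CRISIS_TERMS = ("hopeless", "alone", "can't go on")


def keyword_scorer(text: str) -> float:
    """Return the fraction of placeholder crisis terms found in the text."""
    lowered = text.lower()
    return sum(term in lowered for term in CRISIS_TERMS) / len(CRISIS_TERMS)


# Toy multi-turn conversation in the spirit of the sample dataset.
conversation = [
    Turn("user", "I've been feeling alone lately."),
    Turn("assistant", "I'm sorry to hear that. Do you want to talk about it?"),
    Turn("user", "It all feels hopeless."),
]

conversation_score = score_conversation(conversation, keyword_scorer)
```

Passing the scoring function as an argument keeps the transcript formatting separate from the evaluation, so the placeholder could be swapped for an LLM-based evaluator without changing the rest of the sketch.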
Tests and Documentation
Ran notebook