PyRIT
FEAT: CBT-Bench Dataset
CBT-Bench Dataset Integration for PyRIT
This pull request introduces support for the CBT-Bench dataset in PyRIT. The CBT-Bench dataset is a benchmark designed to evaluate the alignment and therapeutic safety of Large Language Models (LLMs) in the context of Cognitive Behavioral Therapy (CBT). By integrating this dataset, PyRIT can analyze psychotherapy-related prompts and identify potential risks, including cognitive distortions, harmful self-beliefs, and other vulnerabilities.
Key Changes
- Added a new function `fetch_cbt_bench_dataset` to fetch and process the dataset.
- Mapped dataset fields (`situation`, `core_belief_fine_grained`) to PyRIT's `SeedPrompt` model.
- Included default configuration (`core_fine_seed`) with support for other configurations.
- Updated documentation and comments to explain the purpose and usage of the dataset.
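For reviewers, the field mapping described above can be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: `SeedPromptSketch` is a hypothetical stand-in for PyRIT's `SeedPrompt`, `rows_to_seed_prompts` is a hypothetical helper name, and the assumption that `core_belief_fine_grained` may arrive as either a string or a list is mine. The rows are passed in as plain dictionaries, so no download is performed.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Iterable


@dataclass
class SeedPromptSketch:
    """Hypothetical stand-in for PyRIT's SeedPrompt (not the real class)."""

    value: str
    dataset_name: str
    metadata: dict[str, Any] = field(default_factory=dict)


def rows_to_seed_prompts(
    rows: Iterable[dict[str, Any]],
    config_name: str = "core_fine_seed",
) -> list[SeedPromptSketch]:
    """Map CBT-Bench-style rows to prompt objects.

    `situation` becomes the prompt text; `core_belief_fine_grained`
    is kept as metadata. Rows without a situation are skipped.
    """
    prompts: list[SeedPromptSketch] = []
    for row in rows:
        situation = (row.get("situation") or "").strip()
        if not situation:
            continue  # nothing usable as a prompt

        beliefs = row.get("core_belief_fine_grained") or []
        if isinstance(beliefs, str):
            # Assumption: normalize a single label to a one-item list.
            beliefs = [beliefs]

        prompts.append(
            SeedPromptSketch(
                value=situation,
                dataset_name="CBT-Bench",
                metadata={
                    "config": config_name,
                    "core_belief_fine_grained": list(beliefs),
                },
            )
        )
    return prompts


if __name__ == "__main__":
    sample = [
        {
            "situation": "I failed my exam.",
            "core_belief_fine_grained": ["I am incompetent"],
        }
    ]
    for prompt in rows_to_seed_prompts(sample):
        print(prompt.value, prompt.metadata)
```

Keeping the belief labels in metadata rather than in the prompt text leaves the `situation` string unchanged as the prompt value, which is one reasonable reading of the mapping; the actual implementation may place them elsewhere.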
Related Issue
Closes: #865
Tagging @romanlutz for review.
Tests and Documentation
- Tests: Unit tests have not been added. They can be included in a subsequent commit within the same pull request if requested by the reviewer.
- Documentation: Inline documentation has been updated. The README has not been updated yet but will be addressed if deemed necessary.
- JupyText: Not applicable, as this contribution focuses on dataset integration rather than API or documentation changes.
References
Citation
@article{zhang2024cbt,
title={CBT-Bench: Evaluating Large Language Models on Assisting Cognitive Behavior Therapy},
author={Zhang, Mian and Yang, Xianjun and Zhang, Xinlu and Labrum, Travis and Chiu, Jamie C and Eack, Shaun M and Fang, Fei and Wang, William Yang and Chen, Zhiyu Zoey},
journal={arXiv preprint arXiv:2410.13218},
year={2024}
}
@microsoft-github-policy-service agree
@sravan0446 are you still interested in addressing the issues? 🙂