CBT-Bench Dataset Integration for PyRIT

This pull request introduces support for the CBT-Bench dataset in PyRIT. The CBT-Bench dataset is a benchmark designed to evaluate the alignment and therapeutic safety of Large Language Models (LLMs) in the context of Cognitive Behavioral Therapy (CBT). By integrating this dataset, PyRIT can analyze psychotherapy-related prompts and identify potential risks, including cognitive distortions, harmful self-beliefs, and other vulnerabilities.

Key Changes

Added a new function fetch_cbt_bench_dataset to fetch and process the dataset.
Mapped dataset fields (situation, core_belief_fine_grained) to PyRIT's SeedPrompt model.
Set core_fine_seed as the default configuration, with support for selecting other configurations.
Updated documentation and comments to explain the purpose and usage of the dataset.
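
The field mapping described above can be sketched roughly as follows. This is a minimal illustration, not the code in this pull request: SeedPromptSketch is a hypothetical stand-in for PyRIT's SeedPrompt, rows_to_seed_prompts is a made-up helper name, and the choice to use situation as the prompt text and keep core_belief_fine_grained as metadata is an assumption. Rows are passed in as plain dictionaries so the sketch runs without downloading anything.

```python
from dataclasses import dataclass, field
from typing import Any, Iterable, Mapping


@dataclass
class SeedPromptSketch:
    """Hypothetical stand-in for PyRIT's SeedPrompt (not the real class)."""

    value: str
    dataset_name: str
    metadata: dict[str, Any] = field(default_factory=dict)


def rows_to_seed_prompts(
    rows: Iterable[Mapping[str, Any]],
    config_name: str = "core_fine_seed",
) -> list[SeedPromptSketch]:
    """Map dataset rows to prompt objects (assumed mapping, for illustration)."""
    prompts = []
    for row in rows:
        # Assumption: the "situation" field becomes the prompt text.
        situation = (row.get("situation") or "").strip()
        if not situation:
            continue  # skip rows with no usable prompt text

        # Assumption: the fine-grained core beliefs are kept as metadata.
        beliefs = row.get("core_belief_fine_grained") or []
        if isinstance(beliefs, str):
            beliefs = [beliefs]

        prompts.append(
            SeedPromptSketch(
                value=situation,
                dataset_name="CBT-Bench",
                metadata={
                    "config": config_name,
                    "core_belief_fine_grained": list(beliefs),
                },
            )
        )
    return prompts


# Made-up sample rows, only to show the shape of the input.
sample_rows = [
    {"situation": "I failed my exam.", "core_belief_fine_grained": ["I am incompetent"]},
    {"situation": "", "core_belief_fine_grained": []},
]
sample_prompts = rows_to_seed_prompts(sample_rows)
```

Keeping the mapping separate from the download step would make it straightforward to unit test with in-memory rows, which may help with the tests that are still outstanding.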

Related Issue

Closes: #865
Tagging @romanlutz for review.

Tests and Documentation

Tests: Unit tests have not been added. They can be included in a subsequent commit within the same pull request if requested by the reviewer.
Documentation: Inline documentation has been updated. The README has not been updated yet but will be addressed if deemed necessary.
JupyText: Not applicable, as this contribution focuses on dataset integration rather than API or documentation changes.

References

CBT-Bench Dataset on Hugging Face

Citation

@article{zhang2024cbt,
  title={CBT-Bench: Evaluating Large Language Models on Assisting Cognitive Behavior Therapy},
  author={Zhang, Mian and Yang, Xianjun and Zhang, Xinlu and Labrum, Travis and Chiu, Jamie C and Eack, Shaun M and Fang, Fei and Wang, William Yang and Chen, Zhiyu Zoey},
  journal={arXiv preprint arXiv:2410.13218},
  year={2024}
}

Apr 20 '25 19:04 sravan0446

@microsoft-github-policy-service agree

Apr 20 '25 19:04 sravan0446

@sravan0446 are you still interested in addressing the issues? 🙂

May 02 '25 07:05 romanlutz

FEAT: CBT-Bench Dataset
