FEAT: Add TDC23-Red Teaming Dataset
Describe the solution you'd like
This dataset should be available within PyRIT: https://huggingface.co/datasets/walledai/TDC23-RedTeaming
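As a quick sanity check, the raw dataset can be inspected directly with the datasets library. This is only an exploratory sketch; it assumes a 'train' split with a 'prompt' column, matching the pseudocode in the comment below.

```python
# Exploratory sketch: peek at the raw Hugging Face dataset before wiring it into PyRIT.
# Assumption: a "train" split with a "prompt" column, as used in the pseudocode below.
from datasets import load_dataset

data = load_dataset("walledai/TDC23-RedTeaming", "default")
print(data)                         # available splits and columns
print(data["train"][0]["prompt"])   # first red-teaming prompt
```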
Additional context
There are examples of how PyRIT interacts with other datasets here: https://github.com/search?q=repo%3AAzure%2FPyRIT%20%23%20The%20dataset%20sources%20can%20be%20found%20at%3A&type=code
[Content Warning: These prompts are aimed at provoking the model and may contain offensive content.] Additional disclaimer: given the nature of these prompts, you may want to check with your legal department before running them against LLMs.
Can you please add me to this issue?
I started working on this and tried to test it, but ran out of time, so I'm adding my pseudocode here:
```python
from datasets import load_dataset

from pyrit.models import PromptDataset


def fetch_tdc23_redteaming_dataset() -> PromptDataset:
    """
    Fetch TDC23-RedTeaming examples and create a PromptDataset.

    Returns:
        PromptDataset: A PromptDataset containing the examples.
    """
    # Load the TDC23-RedTeaming dataset from Hugging Face
    data = load_dataset("walledai/TDC23-RedTeaming", "default")

    # Extract the red-teaming prompts from the train split
    prompts = [item["prompt"] for item in data["train"]]

    dataset = PromptDataset(
        name="TDC23-RedTeaming",
        description="""Red-teaming prompts from the TDC23-RedTeaming Hugging Face dataset.
        Only the 'prompt' column is extracted.""",
        source="https://huggingface.co/datasets/walledai/TDC23-RedTeaming",
        prompts=prompts,
    )

    return dataset
```
Add this function to __init__.py as well so it can be imported from pyrit.datasets:
from pyrit.datasets import fetch_tdc23_redteaming_dataset
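For reference, registering it would look roughly like this. This is only a sketch: the module name fetch_example_datasets and the presence of an __all__ list are assumptions based on how the other fetch_* helpers appear to be exported.

```python
# pyrit/datasets/__init__.py (sketch)
# Assumption: the fetch helpers live in a module like fetch_example_datasets.py.
from pyrit.datasets.fetch_example_datasets import (
    fetch_pku_safe_rlhf_dataset,
    fetch_tdc23_redteaming_dataset,
)

__all__ = [
    "fetch_pku_safe_rlhf_dataset",
    "fetch_tdc23_redteaming_dataset",
]
```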
Then follow this notebook example to run the dataset end to end: https://github.com/Azure/PyRIT/blob/main/doc/code/orchestrators/pku_safe_rlhf_testing.ipynb
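Translated to code, the notebook flow would look roughly like this. This is a sketch modeled on the PKU-SafeRLHF notebook, not a tested implementation; PromptSendingOrchestrator, TextTarget, and default_values.load_default_env() are used as they appear there, and exact APIs may differ across PyRIT versions.

```python
# Sketch of the notebook flow, modeled on pku_safe_rlhf_testing.ipynb.
# TextTarget just echoes prompts to stdout; swap in a real chat target to test an LLM.
import sys

from pyrit.common import default_values
from pyrit.datasets import fetch_tdc23_redteaming_dataset
from pyrit.orchestrator import PromptSendingOrchestrator
from pyrit.prompt_target import TextTarget

default_values.load_default_env()

prompt_target = TextTarget(text_stream=sys.stdout)
orchestrator = PromptSendingOrchestrator(prompt_target=prompt_target)

# Fetch the dataset and send a small sample of prompts through the orchestrator.
dataset = fetch_tdc23_redteaming_dataset()
prompt_list = dataset.prompts[:8]

await orchestrator.send_prompts_async(prompt_list=prompt_list)  # run inside a notebook/async context
```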