FEAT: dataset improvements

Open rlundeen2 opened this issue 1 month ago • 0 comments

Going through GitHub issues, we've had so many community contributions related to datasets, and we have so many remaining open issues. It's a really easy way to add value to PyRIT. Unfortunately, these often go unused because they are tough to manage. We can't query them in a standard way. We don't necessarily know the harm categories, the sizes of the datasets, etc.

As a solution to this, I created SeedDatasetProvider, which automatically loads SeedDatasets. This is a large usability improvement: out of the box, users can now easily use any of our datasets for attacks.

# This now has all our built-in datasets, including all yaml files and remote datasets like harmbench
builtin_datasets = await SeedDatasetProvider.fetch_all_datasets()
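
To make the "automatically loads" idea concrete, here is a minimal, self-contained sketch of one way a provider can discover its own subclasses and fetch everything in one call. It is not PyRIT's implementation: ToyDataset, ToyProvider, LocalYamlProvider, RemoteProvider, and fetch_all are hypothetical names invented for this illustration, and the self-registration via __init_subclass__ is an assumption about how such a design could work.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ToyDataset:
    """Hypothetical stand-in for a SeedDataset: a name plus its prompts."""

    name: str
    prompts: list[str] = field(default_factory=list)


class ToyProvider:
    """Hypothetical base class; each subclass registers itself when defined."""

    registry: dict[str, type] = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        ToyProvider.registry[cls.__name__] = cls

    async def fetch(self) -> ToyDataset:
        raise NotImplementedError

    @classmethod
    async def fetch_all(cls) -> list[ToyDataset]:
        # Instantiate every registered provider and fetch them concurrently.
        providers = [provider_cls() for provider_cls in cls.registry.values()]
        return list(await asyncio.gather(*(p.fetch() for p in providers)))


class LocalYamlProvider(ToyProvider):
    async def fetch(self) -> ToyDataset:
        # A real provider would parse a YAML file here.
        return ToyDataset(name="local_example", prompts=["prompt a", "prompt b"])


class RemoteProvider(ToyProvider):
    async def fetch(self) -> ToyDataset:
        # A real provider would download and normalize a remote dataset here.
        await asyncio.sleep(0)
        return ToyDataset(name="remote_example", prompts=["prompt c"])


datasets = asyncio.run(ToyProvider.fetch_all())
dataset_names = sorted(d.name for d in datasets)
```

With a pattern like this, adding a new dataset only requires defining a new subclass; callers keep using the single fetch-all entry point.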

This also allows us to normalize how Scenarios select datasets (always from the database, never from YAML). Right now, this only manages what the user starts an attack with, but in the future I would like to talk about extending it to other components. As an example, in the database we could have hundreds of adversarial system prompts and tag which are compatible with which attack. Doing that kind of query against the YAML files is heavy and error-prone.
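
As a rough illustration of the tagging idea, the sketch below stores a few system prompts in an in-memory SQLite database and tags each with the attacks it is compatible with. The table names, column names, prompt names, and attack names are all made up for this example and do not reflect PyRIT's actual memory schema.

```python
import sqlite3

# In-memory database with a hypothetical two-table schema:
# one row per system prompt, one row per (prompt, compatible attack) tag.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE system_prompts (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE prompt_attack_tags (
        prompt_id INTEGER NOT NULL REFERENCES system_prompts(id),
        attack TEXT NOT NULL
    );
    """
)
conn.executemany(
    "INSERT INTO system_prompts (id, name) VALUES (?, ?)",
    [(1, "persona_a"), (2, "persona_b"), (3, "persona_c")],
)
conn.executemany(
    "INSERT INTO prompt_attack_tags (prompt_id, attack) VALUES (?, ?)",
    [(1, "attack_x"), (2, "attack_x"), (2, "attack_y"), (3, "attack_y")],
)


def prompts_for_attack(connection: sqlite3.Connection, attack: str) -> list[str]:
    """Return the names of all system prompts tagged as compatible with `attack`."""
    rows = connection.execute(
        """
        SELECT p.name
        FROM system_prompts AS p
        JOIN prompt_attack_tags AS t ON t.prompt_id = p.id
        WHERE t.attack = ?
        ORDER BY p.name
        """,
        (attack,),
    ).fetchall()
    return [name for (name,) in rows]


compatible = prompts_for_attack(conn, "attack_x")
```

The same question asked of YAML files would mean opening and parsing every file on each lookup, which is the overhead the paragraph above is pointing at.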

One of our big advantages in PyRIT is that we have a database end-to-end. We should use that more :)

As I implemented this, I uncovered many bugs and inconsistencies that I had to fix to make it usable.

Documentation improvement

Overhauled the dataset documentation for both users and contributors.

Dataset structure consistency

Moved score datasets to the dataset folder along with converters and executors so it's consistent.
All paths should now use path variables, so they should be easier to change going forward.
NOTE: I debated moving the YAML files close to their components, but I think we likely want to standardize on the database for datasets and NOT YAML files, and we could load these into the database the same way.

Seed Structure improvements

Added to_attack_parameters method so users can use SeedGroups to make attack parameters (objectives, prepended_conversations, SeedGroup)
Added harm_categories for seed_group
Renaming SeedDataset to contain "seeds" so it can map cleanly to prompt Groups
Added groups property to SeedDataset to easily retrieve those
Adding all filter params to get_seed_groups since that is likely what we want to query
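
The shape of these changes can be sketched with toy classes. Everything below is hypothetical: ToySeed and ToySeedGroup are not PyRIT's classes, and the dictionary returned by to_attack_parameters is a simplified assumption (it only splits objectives from the remaining prompts and does not model prepended conversations).

```python
from dataclasses import dataclass, field


@dataclass
class ToySeed:
    """Hypothetical seed: some text, optionally marked as the objective."""

    value: str
    is_objective: bool = False
    harm_categories: list[str] = field(default_factory=list)


@dataclass
class ToySeedGroup:
    """Hypothetical group of seeds that belong to one attack."""

    seeds: list[ToySeed]

    @property
    def harm_categories(self) -> list[str]:
        # Union of every seed's categories, de-duplicated, in first-seen order.
        seen: dict[str, None] = {}
        for seed in self.seeds:
            for category in seed.harm_categories:
                seen.setdefault(category, None)
        return list(seen)

    def to_attack_parameters(self) -> dict[str, list[str]]:
        # Assumed split: objective seeds vs. everything else in the group.
        return {
            "objectives": [s.value for s in self.seeds if s.is_objective],
            "prompts": [s.value for s in self.seeds if not s.is_objective],
        }


group = ToySeedGroup(
    seeds=[
        ToySeed("Explain the goal", is_objective=True, harm_categories=["category_a"]),
        ToySeed("First turn", harm_categories=["category_a", "category_b"]),
    ]
)
params = group.to_attack_parameters()
group_categories = group.harm_categories
```

Deriving harm categories from the member seeds, rather than storing them separately on the group, keeps the two from drifting apart.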

Bugs Fixed

Consistently raise exceptions in memory. Previously some places raised and others did not, which was hiding bugs.
Bug fix so that get_seed_groups returns all seeds in the group if any of the seeds matches
Bug fix with sample rate
Bug fix with yaml objectives when is_objectives is set
Tons of little fixes to the datasets themselves, e.g. roles not defined.
Fixed a bug in ConsolePrinter where it changed turns even if prompts were sent together (e.g. it relied on user messagepieces to print the turn)
Fixed unknown harm categories and vague warnings when fetching datasets.
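
The get_seed_groups fix in the list above (return every seed in a group when any one seed matches) can be illustrated with a small sketch. GroupedSeed and get_groups_matching are hypothetical names for this example, and filtering on a single harm category is an assumption; the real method takes more filter parameters.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class GroupedSeed:
    """Hypothetical seed that knows which group it belongs to."""

    value: str
    group_id: str
    harm_categories: list[str] = field(default_factory=list)


def get_groups_matching(
    seeds: list[GroupedSeed], harm_category: str
) -> dict[str, list[GroupedSeed]]:
    """Return whole groups in which at least one seed has `harm_category`."""
    groups: dict[str, list[GroupedSeed]] = defaultdict(list)
    for seed in seeds:
        groups[seed.group_id].append(seed)
    # Filter at the group level, not the seed level, so that non-matching
    # seeds in a matching group are kept rather than silently dropped.
    return {
        group_id: members
        for group_id, members in groups.items()
        if any(harm_category in s.harm_categories for s in members)
    }


all_seeds = [
    GroupedSeed("objective text", "g1", ["category_a"]),
    GroupedSeed("follow-up text", "g1"),  # no categories of its own
    GroupedSeed("unrelated text", "g2", ["category_b"]),
]
matched = get_groups_matching(all_seeds, "category_a")
```

Filtering seed by seed would have returned only the first seed of g1; grouping first and testing with any() keeps the group intact.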

Next items:

Create an initializer that uses SeedDatasetProvider to load all SeedDatasets into the database on initialize_pyrit
Normalize scenarios to only pull from the database for default objectives
Unfixed bug: targeted harm categories don't really work. They are only added if a SeedPrompt is passed, not a SeedObjective
(mid-term) Normalize how executors, converters, and scorers use datasets. Likely we will want to auto-load these in the db in a similar way
(mid-term) add expected return value to SeedGroups
(mid-term) organize OSS datasets with harm category, objective information, etc.

Tests:

Unit tests added
All integration tests pass other than AML ones and the previously existing harm_categories one (that test passed before but the bug existed)

Nov 25 '25 16:11 rlundeen2