FEAT add tdc23 red teaming dataset

Open Lakshmiaddepalli opened this issue 1 year ago • 8 comments

Description: The TDC23-RedTeaming dataset from Hugging Face (https://huggingface.co/datasets/walledai/TDC23-RedTeaming) should be available within PyRIT.

Created a function in the fetch_example_datasets.py module within pyrit/datasets that loads https://huggingface.co/datasets/walledai/TDC23-RedTeaming and extracts its "prompts" column. Changed __init__.py within pyrit/datasets so that the new function is exposed by the package.
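For readers who want a picture of what such a fetch function might look like, here is a minimal sketch. It is not the code from this PR: the names `fetch_prompts` and `load_rows_from_hub` are hypothetical, the "train" split is an assumption, and the column name simply follows the description above. The loader is injectable so the extraction logic can be tried without network access.

```python
from typing import Any, Callable, Iterable, List, Mapping

Row = Mapping[str, Any]


def load_rows_from_hub(dataset_name: str, split: str = "train") -> Iterable[Row]:
    """Load rows from the Hugging Face Hub (needs network and the `datasets` package).

    The "train" split is an assumption; check the dataset card for the real split names.
    """
    from datasets import load_dataset  # imported lazily so fetch_prompts works without it

    return load_dataset(dataset_name, split=split)


def fetch_prompts(
    dataset_name: str = "walledai/TDC23-RedTeaming",
    column: str = "prompts",  # column name as given in the description; an assumption
    loader: Callable[[str], Iterable[Row]] = load_rows_from_hub,
) -> List[str]:
    """Return the non-empty string values of one column, stripped of whitespace."""
    prompts: List[str] = []
    for row in loader(dataset_name):
        value = row[column]  # raises KeyError if the column name is wrong
        if isinstance(value, str) and value.strip():
            prompts.append(value.strip())
    return prompts
```

Calling `fetch_prompts(loader=lambda name: [{"prompts": "example"}])` exercises the extraction step with in-memory rows, which is handy for unit tests that should not hit the network.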

Tests and Documentation: Added fetch_tdc23_redteaming_dataset_testing.ipynb and fetch_tdc23_redteaming_dataset_testing.py in main/doc/code/orchestrators.

@nina-msft @jbolor21

Lakshmiaddepalli avatar Oct 07 '24 16:10 Lakshmiaddepalli

@microsoft-github-policy-service agree

Lakshmiaddepalli avatar Oct 07 '24 18:10 Lakshmiaddepalli

@nina-msft @jbolor21 can you please review and provide comments?

Lakshmiaddepalli avatar Oct 07 '24 18:10 Lakshmiaddepalli

Yes - will take a look later today. Thank you for the contribution!

nina-msft avatar Oct 07 '24 18:10 nina-msft

Sure, I should be able to update it today!

Lakshmiaddepalli avatar Oct 08 '24 08:10 Lakshmiaddepalli

flake8...................................................................Failed
- hook id: flake8
- exit code: 1

pyrit/datasets/fetch_example_datasets.py:441:121: E501 line too long (333 > 120 characters)

Check Links in Python and md Files.......................................Passed
pylint...................................................................Passed
mypy.....................................................................Passed
Error: Process completed with exit code 1.

Looks like this build failed. Let me take a look and fix this issue.

Lakshmiaddepalli avatar Oct 09 '24 15:10 Lakshmiaddepalli

@nina-msft @romanlutz can you please review again?

Lakshmiaddepalli avatar Oct 09 '24 15:10 Lakshmiaddepalli

Pipeline failed with

pyrit/datasets/fetch_example_datasets.py:441:121: E501 line too long (122 > 120 characters)
pyrit/datasets/fetch_example_datasets.py:442:121: E501 line too long (130 > 120 characters)
pyrit/datasets/fetch_example_datasets.py:443:121: E501 line too long (127 > 120 characters)
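One common way to clear E501 on a long string is to split it into adjacent literals inside parentheses. The sketch below illustrates that, together with a small helper that reports over-long lines. The string content and the helper name `lines_over_limit` are made up for illustration; only the 120-character limit comes from the log above.

```python
from typing import List

MAX_LINE_LENGTH = 120  # the limit reported by flake8 in the log above

# Adjacent string literals inside parentheses are joined by Python at compile time,
# so the value is one string while every source line stays short.
DESCRIPTION = (
    "A long description can be split into adjacent string literals; "
    "the resulting value is a single string with no added newlines, "
    "and each source line stays under the configured limit."
)


def lines_over_limit(source: str, limit: int = MAX_LINE_LENGTH) -> List[int]:
    """Return the 1-based numbers of lines longer than `limit` characters."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]
```

Running `lines_over_limit(open(path).read())` on a file gives a quick local approximation of the check; flake8 itself remains the authority on what CI will accept.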

romanlutz avatar Oct 09 '24 16:10 romanlutz

@Lakshmiaddepalli - great job!

Are you able to run the pre-commit hooks on your end? That should fix the remaining build failures. If you installed PyRIT following our contributor guide, you should have the pre-commit dependency available in your conda Python environment.

pre-commit run --all-files

nina-msft avatar Oct 09 '24 18:10 nina-msft

Ran pre-commit run --all-files and it passed. @nina-msft

(venv) lakshmiaddepalli@Lakshmis-Air PyRIT % pre-commit run --all-files
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...............................................................Passed
check for added large files..............................................Passed
detect private key.......................................................Passed
black....................................................................Passed
flake8...................................................................Passed
Check Links in Python and md Files.......................................Passed
pylint...................................................................Passed
mypy.....................................................................Passed

Lakshmiaddepalli avatar Oct 14 '24 23:10 Lakshmiaddepalli

It's still failing on mypy in CI for some reason 🤔 (in files that you did not modify...so maybe it is transient).

When you make the last change I recommended above, I'll rerun the build for you and we can investigate further. Thanks for following up on this!

nina-msft avatar Oct 15 '24 17:10 nina-msft

Hey @Lakshmiaddepalli - I pushed a change directly to your branch that handles the merge conflict and some of the last comments I made :-) They are minor changes. Thank you for your work here, I'll make sure it gets merged in!

nina-msft avatar Oct 18 '24 22:10 nina-msft

Thanks, it was great working on this PR with you, @nina-msft @jbolor21 @jsong468 @romanlutz !

Lakshmiaddepalli avatar Oct 19 '24 00:10 Lakshmiaddepalli