FEAT Add image generation example with red teaming orchestrator and unify existing orchestrator definitions

Open romanlutz opened this issue 2 months ago • 0 comments

Description

This is based on #82 by @ysy970923 and related to #74.

The goal of this PR was to create a notebook example for image generation in which a model generates prompts that we pass to the image generation endpoint. We then score the response and pass the scoring result back to the prompt-generating model. If the score indicates that we succeeded, we can stop. If a prompt gets blocked (due to content filters), we pass that information back to the model as well so that it can generate a better prompt.
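For illustration, the loop described above could be sketched roughly as follows. This is a minimal sketch, not PyRIT code: `TargetResponse`, `run_red_teaming_loop`, the three callables, and the feedback strings are all hypothetical stand-ins for the prompt-generating model, the image generation target, and the scorer.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TargetResponse:
    # Hypothetical container, not a PyRIT class.
    image_path: Optional[str]  # None when no image was produced
    blocked: bool = False      # True when a content filter rejected the prompt


def run_red_teaming_loop(
    generate_prompt: Callable[[Optional[str]], str],
    send_to_image_target: Callable[[str], TargetResponse],
    score_image: Callable[[str], bool],
    max_turns: int = 5,
) -> Optional[str]:
    """Return the path of the first image the scorer accepts, or None."""
    feedback: Optional[str] = None  # nothing to report on the first turn
    for _ in range(max_turns):
        prompt = generate_prompt(feedback)
        response = send_to_image_target(prompt)

        if response.blocked or response.image_path is None:
            # Tell the prompt-generating model that the prompt was blocked.
            feedback = "The prompt was blocked by a content filter. Try a different prompt."
            continue

        if score_image(response.image_path):
            return response.image_path  # objective reached, stop early

        # Pass the negative scoring result back to the model.
        feedback = "The image did not meet the objective. Try a different prompt."
    return None
```

The feedback is plain text in both the blocked and the unsuccessful case, so the prompt-generating model only ever needs to consume text.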

While adapting the original PR to support this natively within the red teaming orchestrator, it became evident that the additional complexity we introduced to support EndTokenRedTeamingOrchestrator would make maintenance even harder going forward. Since that orchestrator can be supported using a scoring red teaming orchestrator (with small modifications), a large part of the changes in this PR unify the definitions into a single RedTeamingOrchestrator that uses a scorer and now also supports images. [FYI @rlundeen2 as mentioned a week ago]

In this process, I discovered that some validation code (e.g., in the DALLE Target) may not be working as intended. The target should know what to send. So if it can only accept one prompt request piece, we should send just that piece rather than failing validation. The normalizer is currently unaware of what the target may or may not accept, so it can't adjust the request. FYI @rdheekonda
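One way to picture the proposed behavior is a target-side selection step in place of a hard validation failure. The sketch below is an assumption-laden illustration, not the actual PyRIT validation code: `RequestPiece`, `select_pieces_for_target`, and the choice to keep the first acceptable pieces are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class RequestPiece:
    # Hypothetical stand-in for a prompt request piece.
    data_type: str  # e.g. "text" or "image_path"
    value: str


def select_pieces_for_target(
    pieces: List[RequestPiece],
    accepted_data_types: Set[str],
    max_pieces: int,
) -> List[RequestPiece]:
    """Keep only what the target can accept instead of rejecting the request."""
    accepted = [p for p in pieces if p.data_type in accepted_data_types]
    if not accepted:
        # Nothing usable at all is still an error worth surfacing.
        raise ValueError("Request contains no piece the target can accept.")
    # Assumption: when too many pieces qualify, the earliest ones win.
    return accepted[:max_pieces]
```

For a target that takes a single text prompt, `select_pieces_for_target(pieces, {"text"}, 1)` would forward only the first text piece.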

Apart from that, the DALLE Target needs to handle errors slightly differently. When no image is generated, the response data type has to be text because there is no image_path to store. Otherwise, the memory code raises exceptions. FYI @jbolor21
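The error handling described above amounts to choosing the response data type based on whether an image exists. A minimal sketch, assuming a hypothetical `ResponsePiece` container and `build_response_piece` helper (neither is the real PyRIT class or method, and the default message is made up for the example):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResponsePiece:
    # Hypothetical stand-in for a stored response entry.
    data_type: str  # "image_path" when an image exists, otherwise "text"
    value: str


def build_response_piece(
    image_path: Optional[str],
    error_message: str = "No image was generated.",
) -> ResponsePiece:
    """Fall back to a text response when there is no image to point at."""
    if image_path is not None:
        return ResponsePiece(data_type="image_path", value=image_path)
    # No image was produced, so store the error description as text
    # rather than claiming an image_path that does not exist.
    return ResponsePiece(data_type="text", value=error_message)
```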

Tests and Documentation

Note that tests and other notebooks have yet to be adjusted.

romanlutz • May 04 '24 08:05