PyRIT FEAT add support for reasoning models as scorers

Is your feature request related to a problem? Please describe.

Right now, there is no support for scorer models that output tokens other than the JSON containing the score. This is a problem for reasoning models like deepseek-r1, which emit thinking tokens in addition to the JSON.

Describe the solution you'd like

Proposed in PR #719: instead of using remove_markdown_json, which only removes the "```" tokens, extract the content between the "{" and "}" tokens.

Describe alternatives you've considered, if relevant

Additional context

Mar 10 '25 11:03 joaodunas

Many scorers rely on getting JSON-formatted responses. Removing the curly braces would break that behavior. Isn't the response from a reasoning model also going to be text? In what way is that different? Can you provide an illustrative example?

I'd love to help or at least suggest a way forward, but I suspect I'm missing something critical.

Mar 10 '25 19:03 romanlutz

In the PR I add the curly braces back, so the code is working. I am using that modified version for my thesis, but I know it would require additional testing to get pushed to production. I need some help creating those tests, though (I don't have experience with that).

Mar 10 '25 19:03 joaodunas

This can handle markdown tags around the JSON, thinking tokens, or anything else that might surround the JSON.
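
To illustrate the idea, here is a minimal sketch, not the actual code from the PR. The function name `extract_json_object`, the `<think>` tags, and the `score_value` / `rationale` fields in the sample are assumptions made for the example. It keeps everything from the first "{" to the last "}", so the curly braces stay in the result and it still parses as JSON.

```python
import json


def extract_json_object(response_text: str) -> str:
    """Return the text from the first "{" to the last "}", braces included.

    Hypothetical helper for illustration; not part of PyRIT.
    """
    start = response_text.find("{")
    end = response_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("No JSON object found in response")
    return response_text[start : end + 1]


# Build the markdown fence programmatically to keep this example readable.
fence = "`" * 3

# Assumed shape of a reasoning-model reply: thinking tokens, then fenced JSON.
reasoning_output = (
    "<think>The reply refuses the request, so the score is low.</think>\n"
    f"{fence}json\n"
    '{"score_value": "1", "rationale": "The response is a refusal."}\n'
    f"{fence}"
)

parsed = json.loads(extract_json_object(reasoning_output))
print(parsed["score_value"])
```

One caveat with this approach: if the thinking text itself contains a "{" or "}", the extracted span will include non-JSON text and parsing will fail, so a test for that case would be worth adding.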

Mar 10 '25 20:03 joaodunas