promptfoo
Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command...
## Summary Remove the retry command and `--retry-errors` flag functionality that was added in #5647, as it is redundant with existing filtering capabilities. ## Problem Analysis The retry functionality provided...
## Summary - replace the old eval/history routes with a new `/results` page that exposes tabs for evals and red team reports, keeps the URL query param in sync, and...
# Fix JSON Parsing Bug in Custom Redteam Strategy ## Problem The Custom redteam strategy was failing with "Expected eval grader value to be a boolean" errors due to incorrect...
https://linear.app/promptfooo/issue/ENG-432/results-table-css-animation-can-undulate-infinitely
WIP
## Bug Description The MCP server's `get_evaluation_details` tool cannot fetch details for evaluations listed by `list_evaluations` due to an eval ID format validation mismatch. ## Steps to Reproduce 1. Call...
**Describe the bug** Hello, The `promptfoo:simulated-user` provider does not check for errors after calls to `sendMessageToAgent` in `promptfoo/dist/src/providers/simulatedUser.js`. Notice how `sendMessageToUser` **does** perform the error check. As a result, if...
I have a vision QA use case with multiple prompts, and each prompt has a separate set of images and test cases/assertions. I was looking at the recommended project structure as mentioned...
**Describe the bug** When using a file-based JS rubric (e.g., `value: file://index.js:rubric`) in an `llm-rubric` assertion, the output of the script is logged but not used as the rubric text...
Hi, so I've been trying to integrate n8n prompts by first extracting them and then converting the variables to a promptfoo-compatible format. However, I got stuck with an output...