promptfoo

Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command...

Results 390 promptfoo issues

## Summary Remove the retry command and `--retry-errors` flag functionality that was added in #5647, as it is redundant with existing filtering capabilities. ## Problem Analysis The retry functionality provided...

## Summary - replace the old eval/history routes with a new `/results` page that exposes tabs for evals and red team reports, keeps the URL query param in sync, and...

codex

# Fix JSON Parsing Bug in Custom Redteam Strategy ## Problem The Custom redteam strategy was failing with "Expected eval grader value to be a boolean" errors due to incorrect...

https://linear.app/promptfooo/issue/ENG-432/results-table-css-animation-can-undulate-infinitely

## Bug Description The MCP server's `get_evaluation_details` tool cannot fetch details for evaluations listed by `list_evaluations` due to an eval ID format validation mismatch. ## Steps to Reproduce 1. Call...

**Describe the bug** Hello, The `promptfoo:simulated-user` provider does not check for errors after calls to `sendMessageToAgent` in `promptfoo/dist/src/providers/simulatedUser.js`. Notice how `sendMessageToUser` **does** perform the error check. As a result, if...

I have a vision QA use case with multiple prompts, and each prompt has a separate set of images and test cases/assertions. I was looking at the recommended project structure as mentioned...

**Describe the bug** When using a file-based JS rubric (e.g., `value: file://index.js:rubric`) in a `llm-rubric` assertion, the output of the script is logged but not used as the rubric text...

bug

Hi, so I've been trying to integrate n8n prompts by first extracting them and then converting the variables to a promptfoo-compatible format. However, I got stuck with an output...