promptfoo
Enhance UI to group `--repeat` tests into collapsible sections
I often test with `--repeat 20` to verify that output is reliable across multiple LLM runs. The resulting table in the web viewer is hard to navigate because of all the duplicate rows.
It would be wonderful if I could get an overview of the tests and see at a glance how many times each test PASSED/FAILED out of the 20 runs.
Here is an idea for a simple UI improvement that could work for me:
Description of Test1 - 18 PASS, 2 FAIL
Normal table output goes here for all 20 instances of this test

Description of Test2 - 20 PASS
Normal table output goes here for all 20 instances of this test
One more comment on this, as I've considered parsing the JSON output and building our own UI to show results.
In the JSON output, there is currently nothing you can use to group together the different `--repeat` runs of the same test. So a good first step would be to add an identifier to the JSON output that groups all the repeats of the same test.
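To illustrate what such an identifier would enable, here is a rough sketch of the grouping I have in mind. The field names `testId` and `success` and the sample records are hypothetical, made up for this example; this is not the current promptfoo JSON schema.

```python
from collections import defaultdict


def summarize_repeats(results):
    """Count passes and failures per test, grouped by a hypothetical 'testId' field."""
    counts = defaultdict(lambda: {"pass": 0, "fail": 0})
    for record in results:
        outcome = "pass" if record["success"] else "fail"
        counts[record["testId"]][outcome] += 1
    return dict(counts)


def format_summary(description, tally):
    """Build a header line such as 'Description of Test1 - 18 PASS, 2 FAIL'."""
    parts = []
    if tally["pass"]:
        parts.append(f"{tally['pass']} PASS")
    if tally["fail"]:
        parts.append(f"{tally['fail']} FAIL")
    return f"{description} - {', '.join(parts)}"


# Made-up records standing in for parsed JSON output (assumed shape).
results = (
    [{"testId": "test1", "success": True}] * 18
    + [{"testId": "test1", "success": False}] * 2
    + [{"testId": "test2", "success": True}] * 20
)

summary = summarize_repeats(results)
for test_id, tally in summary.items():
    print(format_summary(f"Description of {test_id}", tally))
```

With an identifier like this in the output, the web viewer could presumably use the same grouping to build the header for each collapsible section.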