Isolated test-prompt combinations per config

Open rgevrek opened this issue 1 year ago • 2 comments

Description: When executing promptfoo eval -o report/report.html -c usecases/*-config.yaml, all prompts and tests from all configuration files are combined. This means that each test is asserted against each prompt, regardless of the configuration file they are defined in.

However, there is currently no option to have isolated test-prompt combinations per configuration file. This could be useful in scenarios where specific tests should only be run against prompts defined in the same configuration file.

Proposed Solution: Introduce a new command-line flag or configuration option that allows users to specify whether test-prompt combinations should be isolated per configuration file or combined across all configurations.

For example, a new flag --isolated could be added to the promptfoo eval command:

promptfoo eval -o report/report.html -c usecases/*-config.yaml --isolated

When the --isolated flag is present, the evaluation would run each test only against the prompts defined in the same configuration file.
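To make the difference concrete, here is a minimal sketch of the two pairing behaviors. It is not promptfoo's implementation: the `configs` dict is a hypothetical in-memory stand-in for two config files, and the function names `combined_pairs` and `isolated_pairs` are made up for illustration.

```python
from itertools import product

# Hypothetical stand-in for two config files; this is NOT promptfoo's real schema.
configs = {
    "usecases/a-config.yaml": {"prompts": ["prompt A"], "tests": ["test A1", "test A2"]},
    "usecases/b-config.yaml": {"prompts": ["prompt B"], "tests": ["test B1"]},
}


def combined_pairs(configs):
    """Behavior described in this issue: every test meets every prompt."""
    prompts = [p for cfg in configs.values() for p in cfg["prompts"]]
    tests = [t for cfg in configs.values() for t in cfg["tests"]]
    return list(product(prompts, tests))


def isolated_pairs(configs):
    """Proposed behavior: tests only meet prompts from the same file."""
    pairs = []
    for cfg in configs.values():
        pairs.extend(product(cfg["prompts"], cfg["tests"]))
    return pairs


print(len(combined_pairs(configs)))  # 2 prompts x 3 tests = 6
print(len(isolated_pairs(configs)))  # (1 x 2) + (1 x 1) = 3
```

With only two small files the combined run already doubles the number of evaluations, and half of them pair a test with a prompt it was never written for.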

Alternatively, analogous to scenarios, which consist of var-test pairs, prompts could have a tests property, or a new property containing both prompts and tests could be introduced.

Benefits:

- Increased flexibility in running tests against prompts
- Better organization and separation of concerns for different use cases or configurations
- Easier debugging and troubleshooting when tests fail, as the scope is limited to a specific configuration file

rgevrek avatar Apr 02 '24 11:04 rgevrek

Thanks for the suggestion! I want to make sure I am understanding correctly. It sounds like essentially you could just run the promptfoo eval command several times. Is the catch that you want to view the results in a single view?

typpo avatar Apr 03 '24 14:04 typpo

Yes, absolutely. You could execute promptfoo eval for each YAML file separately, but then you cannot take advantage of a simple wildcard pattern like usecases/*-config.yaml, you have to maintain an additional script to evaluate all YAML configurations, and the output on the console and in the report is not aggregated in a single view.
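For reference, the extra script mentioned here could look roughly like the sketch below. It only uses the -c and -o options already shown in this issue; the helper name `per_config_commands` and the one-report-per-config naming scheme are assumptions for illustration.

```python
from pathlib import Path


def per_config_commands(config_paths, report_dir="report"):
    """Build one `promptfoo eval` command per config file (hypothetical helper)."""
    commands = []
    for path in sorted(config_paths):
        name = Path(path).stem  # "usecases/a-config.yaml" -> "a-config"
        output = f"{report_dir}/{name}.html"  # assumed naming: one report per config
        commands.append(["promptfoo", "eval", "-c", str(path), "-o", output])
    return commands


# Print the commands instead of running them, so the sketch has no side effects.
for cmd in per_config_commands(["usecases/b-config.yaml", "usecases/a-config.yaml"]):
    print(" ".join(cmd))
```

In a real script the paths would come from `glob.glob("usecases/*-config.yaml")` and each command would be passed to `subprocess.run`. Even then, every run writes its own report, so the results are still not aggregated in a single view.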

rgevrek avatar Apr 03 '24 14:04 rgevrek