agential
[Feature Request]: Evaluation Metrics
Feature Description
Add evaluation metrics such as F1, precision, recall, exact match (EM), pass@k, and possibly fuzzy match, along with any others relevant to our currently supported benchmarks.
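To make the request concrete, here is a minimal sketch of what these metrics could look like. All function names here (`normalize`, `exact_match`, `precision_recall_f1`, `fuzzy_match`, `pass_at_k`) are hypothetical and not part of agential's existing API. The sketch assumes SQuAD-style answer normalization for the QA metrics, uses `difflib` similarity as one possible definition of fuzzy match, and uses the commonly cited unbiased estimator `1 - C(n-c, k) / C(n, k)` for pass@k.

```python
import re
import string
from collections import Counter
from difflib import SequenceMatcher
from math import comb


def normalize(text: str) -> str:
    """Lowercase, strip punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; a benchmark may define its own.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> bool:
    """EM: the normalized strings are identical."""
    return normalize(prediction) == normalize(reference)


def precision_recall_f1(prediction: str, reference: str) -> tuple[float, float, float]:
    """Token-level precision, recall, and F1 over normalized tokens."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def fuzzy_match(prediction: str, reference: str, threshold: float = 0.8) -> bool:
    """Fuzzy match via difflib similarity ratio.

    Assumption: both the similarity measure and the 0.8 threshold are
    placeholders; the issue leaves the definition open.
    """
    ratio = SequenceMatcher(None, normalize(prediction), normalize(reference)).ratio()
    return ratio >= threshold


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples, of which c are correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: every size-k subset has a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

Under these assumptions, `exact_match("The Eiffel Tower!", "eiffel tower")` is true because normalization removes the article and punctuation, and `pass_at_k(n=10, c=3, k=1)` reduces to the plain success rate of 3 in 10. Which normalization and which fuzzy-match definition to adopt would likely need to be decided per benchmark.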
Reason
No response