lighteval
[FT] Add tool usage benchmarks
Issue encountered
Lighteval currently does not support evaluating models on tool usage.
Solution/Feature
Add benchmarks for tool usage
Are there any existing benchmarks for tool usage? Which ones do you suggest?