Koan-Sin Tan
Results: 251 comments of Koan-Sin Tan
* Select the benchmark, not the datasets, from the UI (datasets should not be selectable). E.g., assuming we have both ifeval and tinymmlu as planned, they are not supposed to be selectable...