mobile_app_open
Allow more than 1 LLM benchmark
The app should support several LLM benchmarks at once, covering multiple models such as 1B, 3B, or 8B, and possibly multiple datasets such as MMLU or IFEval.
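One way to picture the request is as a matrix of model sizes and datasets, where each pair becomes its own benchmark entry. The sketch below is only an illustration under that assumption: `LLMBenchmark`, `benchmark_id`, and `enumerate_benchmarks` are hypothetical names and do not reflect the app's actual configuration format or code.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class LLMBenchmark:
    """Hypothetical description of one LLM benchmark (not the app's real schema)."""
    model_size: str  # e.g. "1b", "3b", "8b"
    dataset: str     # e.g. "mmlu", "ifeval"

    @property
    def benchmark_id(self) -> str:
        # Build a unique identifier from the model size and dataset.
        return f"llm_{self.model_size}_{self.dataset}"


def enumerate_benchmarks(model_sizes, datasets):
    """Return one benchmark entry per (model size, dataset) pair."""
    return [LLMBenchmark(m, d) for m, d in product(model_sizes, datasets)]


benchmarks = enumerate_benchmarks(["1b", "3b", "8b"], ["mmlu", "ifeval"])
for b in benchmarks:
    print(b.benchmark_id)
```

With three model sizes and two datasets this yields six entries. Whether every pair should be exposed, or only a selected subset, is left open by the request.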