insight-bench
insight-bench copied to clipboard
Is it reasonable for all tasks to use the same goal?
I noticed that currently all tasks are using the same goal. Is this the intended design?
In my understanding, each task should have its own specific goal derived from its metadata, rather than sharing a common goal across all tasks. This would allow for more task-specific and accurate evaluation.
load agent
agent = agents.Agent( model_name=exp_dict["model_name"], max_questions=exp_dict["max_questions"], branch_depth=exp_dict["branch_depth"], n_retries=2, savedir=savedir, ) This does not fill the goal.