insight-bench

Is it reasonable for all tasks to use the same goal?

Open duoyw opened this issue 2 months ago • 0 comments

I noticed that currently all tasks are using the same goal. Is this the intended design?

In my understanding, each task should have its own specific goal derived from its metadata, rather than sharing a common goal across all tasks. This would allow for more task-specific and accurate evaluation.

    # load agent
    agent = agents.Agent(
        model_name=exp_dict["model_name"],
        max_questions=exp_dict["max_questions"],
        branch_depth=exp_dict["branch_depth"],
        n_retries=2,
        savedir=savedir,
    )

This call does not set the goal.
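A minimal sketch of the per-task behavior I have in mind. Everything in it is an assumption for illustration: the `goal_from_metadata` helper, the `"goal"` metadata key, the `DEFAULT_GOAL` text, and the sample tasks are hypothetical and not taken from insight-bench.

```python
# Hypothetical sketch: the helper name, the "goal" metadata key, and the
# default goal text are assumptions, not part of insight-bench's actual API.
DEFAULT_GOAL = "Find interesting insights in the dataset."  # placeholder shared goal


def goal_from_metadata(metadata: dict, default_goal: str = DEFAULT_GOAL) -> str:
    """Return the task's own goal if its metadata provides one, else the shared default."""
    # Treat a missing, None, or whitespace-only goal as "not provided".
    goal = str(metadata.get("goal") or "").strip()
    return goal or default_goal


# Made-up tasks: one carries its own goal, the other has none.
tasks = [
    {"id": 1, "metadata": {"goal": "Find trends in incident resolution time."}},
    {"id": 2, "metadata": {}},
]

# Resolve one goal per task instead of sharing a single goal across all tasks.
goals = {task["id"]: goal_from_metadata(task["metadata"]) for task in tasks}
```

The resolved goal could then be handed to the agent for each task, assuming the agent can accept a goal at all, which I have not confirmed.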


duoyw, Nov 11 '25 07:11