Available configurations/hyperparameters for data science scenarios

Open ShuxinLin opened this issue 1 month ago • 1 comments

Hi team,

First, thank you for the excellent work on the project. I’ve had success running the RD agent on several mle-bench/kaggle tasks, and now I’m looking to better understand and control the full execution lifecycle of the agent.

Are there any configuration options or hyperparameters exposed for the RD agent that I can start experimenting with?

For example, in other MLE-Bench agents, I’ve seen configurations such as:

• step: maximum number of steps an agent can take
• time_limit: total allowed time for the entire process (including Python execution)
• exec_timeout: maximum time allowed for a single Python execution

…and potentially others.

I assume I can manage the overall time_limit by placing a timeout around the rdagent command itself, but I’d like to know what else is configurable and whether these parameters—or equivalents—are available for the RD agent.
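To make the outer-timeout idea concrete, here is a minimal sketch using Python's standard `subprocess` module. The helper name `run_with_time_limit` is hypothetical, and the command shown in the usage comment is only a placeholder for whatever rdagent invocation is actually used; nothing here comes from the RD-Agent codebase.

```python
import subprocess


def run_with_time_limit(cmd, time_limit_s):
    """Run cmd and return its exit code, or None if it ran past time_limit_s seconds.

    Hypothetical helper for illustration only. subprocess.run kills the direct
    child process when the timeout expires; processes that the child started
    itself (for example containers) may need separate cleanup.
    """
    try:
        completed = subprocess.run(cmd, timeout=time_limit_s)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


# Usage (placeholder command, a 12-hour budget):
# exit_code = run_with_time_limit(["rdagent", "data_science"], 12 * 60 * 60)
# if exit_code is None:
#     print("overall time limit reached")
```

This only bounds the whole run from the outside; it gives no control over the number of steps or over the time allowed for a single code execution.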

It would be extremely helpful if the documentation could provide more detail on this. Thank you!

Nov 20 '25 17:11 ShuxinLin

Hi @ShuxinLin, thanks for your question and for trying out RD-Agent!

You can check the configurable parameters for rdagent data_science in rdagent/app/data_science/loop.py.

For parameters that can be set via the .env file, see rdagent/app/data_science/conf.py.

For example, you mentioned:

step → corresponds to the step_n and loop_n arguments of the main() function;
time_limit → corresponds to the timeout argument of the main() function;
exec_timeout → corresponds to the full_timeout parameter in conf.py.
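As a rough sketch of how these three settings could be passed together, the helper below builds a command line and an environment for `rdagent data_science`. Several details are assumptions rather than facts taken from the repository: the helper name `build_rdagent_invocation`, the `--step_n` / `--loop_n` / `--timeout` flag spellings (assumed to mirror the main() argument names), and the environment variable name `DS_FULL_TIMEOUT` (assumed to be how full_timeout is exposed through .env). Please check loop.py and conf.py for the exact names and accepted value formats.

```python
import os


def build_rdagent_invocation(step_n=None, loop_n=None, timeout=None,
                             full_timeout=None, extra_args=()):
    """Return (argv, env) for launching `rdagent data_science`.

    Hypothetical helper. Flag spellings and the DS_FULL_TIMEOUT variable
    name are assumptions; verify them against loop.py and conf.py.
    """
    argv = ["rdagent", "data_science", *extra_args]
    # Arguments of main(): only add the ones that were explicitly set.
    for flag, value in (("--step_n", step_n),
                        ("--loop_n", loop_n),
                        ("--timeout", timeout)):
        if value is not None:
            argv += [flag, str(value)]

    # Settings from conf.py are read from the environment / .env file.
    env = dict(os.environ)
    if full_timeout is not None:
        env["DS_FULL_TIMEOUT"] = str(full_timeout)  # assumed variable name
    return argv, env


# Usage: argv, env = build_rdagent_invocation(step_n=5, full_timeout=3600)
# then, for example: subprocess.run(argv, env=env)
```

Keeping the main() arguments on the command line and the conf.py setting in the environment follows the split described above between loop.py and conf.py.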

We’ll update the documentation soon to make these options clearer.

Hope this helps you better control the agent’s execution!

Nov 24 '25 11:11 SunsetWolf