[Feature Request] add GAIA2 benchmark
Required prerequisites
- [x] I have searched the Issue Tracker and Discussions that this hasn't already been reported. (+1 or comment there if it has.)
- [x] Consider asking first in a Discussion.
Motivation
https://huggingface.co/blog/gaia2
https://huggingface.co/datasets/meta-agents-research-environments/gaia2
Solution
No response
Alternatives
No response
Additional context
No response
@offthewallace thanks for reporting this!
Maybe I can try to take this on. Ref: https://platform.openai.com/docs/api-reference/chat
@LuoPengcheng12138 thanks, assigned to you.
Hi @fengju0213, I encountered a rate limit issue while running the benchmark. Even when I set max_concurrent_scenarios to 1, the problem persists. Are there any good solutions? Here's the error log:
2025-11-04 02:27:39,337 - camel.models.model_manager - ERROR - Error processing with model: <camel.models.openai_model.OpenAIModel object at 0x123c95100>
2025-11-04 02:27:39,338 - camel.camel.agents.chat_agent - WARNING - Rate limit hit (attempt 1/3). Retrying in 0.5s
Is max_concurrent_scenarios a parameter in ChatAgent?
max_concurrent_scenarios is a parameter in the Gaia2 runtime environment that sets the number of benchmark scenarios running in parallel. But even when it's set to 1 (running sequentially), the rate limit is still hit.
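A minimal sketch of what a knob like max_concurrent_scenarios does, assuming it simply caps scenario-level parallelism (the names below are illustrative, not the actual Gaia2 runtime API):

```python
import threading
import time

# Hypothetical: a semaphore caps how many scenarios run at once,
# mimicking a max_concurrent_scenarios-style setting.
MAX_CONCURRENT_SCENARIOS = 1
_slots = threading.Semaphore(MAX_CONCURRENT_SCENARIOS)
results = []

def run_scenario(idx):
    with _slots:  # blocks until a slot is free
        results.append(idx)
        time.sleep(0.01)  # stand-in for the scenario's model calls

threads = [threading.Thread(target=run_scenario, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))
```

Note that even with the cap at 1, each scenario still issues many model calls in sequence, so a low per-key rate limit can be exhausted regardless of concurrency.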
I see. That's likely because the task is quite long and the rate limit of the key used is relatively low; the error shows up after many calls. There probably isn't a good workaround right now; we'll have to wait until the account's rate limit is increased.
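One thing that can soften (though not fully solve) this is retrying with exponential backoff instead of a fixed 0.5s wait, as the log above shows. A hedged sketch, with a hypothetical helper and a stand-in exception rather than the actual SDK's rate-limit error type:

```python
import random
import time

def with_backoff(fn, max_attempts=3, base_delay=0.5):
    """Retry fn on rate-limit errors with exponential backoff and jitter.

    Hypothetical helper: RuntimeError stands in for whatever rate-limit
    exception the SDK actually raises.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except RuntimeError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            # 0.5s, 1.0s, 2.0s, ... plus a little jitter
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)

# Toy call that fails twice, then succeeds.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limit")
    return "ok"

print(with_backoff(flaky))  # succeeds on the third attempt
```

For a long benchmark run this only spreads requests out; if the key's quota is simply too low for the total number of calls, a higher-tier key is still the real fix.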