BrowserGym
WebArena Shopping evaluator issues
There is an issue with some WebArena shopping tasks:
- Task 275 is a search task where the agent is asked to search for "xbox", so the reference URL is __SHOPPING__/catalogsearch/result/?q=xbox. The agent (GenericAgent) reaches that URL correctly but receives a reward of 0.
- The same thing happens for task 274, and probably for other tasks.
@xhluca do you have other task IDs with the same failure?
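To help narrow this down, here is a minimal sketch for checking whether the agent's final URL and the reference URL are equivalent once the `__SHOPPING__` placeholder is substituted and superficial differences (trailing slash, query parameter order) are ignored. This is not the evaluator's actual code: `normalize_url`, `urls_match`, and the base URL `http://localhost:7770` are hypothetical names and values chosen for illustration only.

```python
from urllib.parse import urlsplit, parse_qs


def normalize_url(url: str) -> tuple:
    """Reduce a URL to a comparable form (hypothetical helper, not evaluator code)."""
    parts = urlsplit(url)
    # Ignore a trailing slash on the path, e.g. /result/ vs /result
    path = parts.path.rstrip("/")
    # Parse the query string so that parameter order does not matter
    query = parse_qs(parts.query, keep_blank_values=True)
    query_items = tuple(sorted((k, tuple(v)) for k, v in query.items()))
    return (parts.scheme, parts.netloc.lower(), path, query_items)


def urls_match(reference: str, actual: str, base_url: str) -> bool:
    """Substitute the __SHOPPING__ placeholder, then compare normalized URLs."""
    reference = reference.replace("__SHOPPING__", base_url.rstrip("/"))
    return normalize_url(reference) == normalize_url(actual)


if __name__ == "__main__":
    # Assumed base URL for the shopping site; replace with your own deployment.
    base = "http://localhost:7770"
    ref = "__SHOPPING__/catalogsearch/result/?q=xbox"
    final = "http://localhost:7770/catalogsearch/result/?q=xbox"
    print(urls_match(ref, final, base))
```

If a check like this reports a match for the URLs from tasks 274 and 275 while the reward is still 0, the problem is more likely in how the evaluator resolves or compares the reference URL than in the agent's navigation.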