BrowserGym
Feature/webcanvas integration
Pull Request: Integrate WebCanvas Key Node Evaluation and Mind2web-live Benchmark into BrowserGym
Description
This PR officially integrates the WebCanvas key node evaluation and the Mind2web-live benchmark into BrowserGym.
Core Features
- Key Node-Based Evaluation:
  - Implements a key node-based evaluation system to provide detailed assessments of web task processes.
- JavaScript Event Evaluation:
  - Utilizes page JavaScript events to deliver accurate evaluations independent of the action space.
- Debug Modules and Logging:
  - Includes various debug modules and logging functionalities to clearly display the evaluation process.
- Mind2web-live Dataset Integration:
  - Integrates the Mind2web-live dataset to support benchmark testing.
- Community Contributions:
  - Encourages further contributions from the community to expand and refine the evaluation framework and benchmarks.
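
To illustrate the idea behind key node-based evaluation, here is a minimal sketch in Python. It is not the code added by this PR: `KeyNode`, `KeyNodeEvaluator`, `url_contains`, `element_value_is`, and the event dictionary layout are all hypothetical names chosen for illustration. The sketch assumes each task defines an ordered list of key nodes, that page events (for example, those captured from JavaScript listeners) arrive as plain dictionaries, and that the score is the fraction of key nodes reached.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Event = Dict[str, str]  # hypothetical shape of a captured page event


@dataclass
class KeyNode:
    """One required checkpoint in a task (hypothetical structure)."""
    name: str
    matches: Callable[[Event], bool]  # predicate over a page event


def url_contains(fragment: str) -> Callable[[Event], bool]:
    """Match any event whose page URL contains the given fragment."""
    return lambda event: fragment in event.get("url", "")


def element_value_is(selector: str, value: str) -> Callable[[Event], bool]:
    """Match an event on a specific element carrying a specific value."""
    return lambda event: (
        event.get("selector") == selector and event.get("value") == value
    )


class KeyNodeEvaluator:
    """Tracks progress through key nodes in order, one node per event at most."""

    def __init__(self, key_nodes: List[KeyNode]) -> None:
        self.key_nodes = list(key_nodes)
        self.next_index = 0  # index of the next key node still to be reached

    def observe(self, event: Event) -> None:
        # Only the next pending key node is checked, so order is enforced.
        if self.next_index < len(self.key_nodes):
            if self.key_nodes[self.next_index].matches(event):
                self.next_index += 1

    def completed(self) -> List[str]:
        return [node.name for node in self.key_nodes[: self.next_index]]

    def score(self) -> float:
        if not self.key_nodes:
            return 0.0
        return self.next_index / len(self.key_nodes)


# Hypothetical task with three key nodes; the agent reaches only the first two.
evaluator = KeyNodeEvaluator([
    KeyNode("open search page", url_contains("/search")),
    KeyNode("enter query", element_value_is("#q", "laptop")),
    KeyNode("open results", url_contains("/results")),
])

events = [
    {"type": "navigate", "url": "https://example.com/search"},
    {"type": "input", "url": "https://example.com/search",
     "selector": "#q", "value": "laptop"},
]
for event in events:
    evaluator.observe(event)

print(evaluator.completed())
print(round(evaluator.score(), 2))
```

Because the predicates inspect page events rather than the agent's actions, the same key nodes apply whether the agent clicks by coordinates, by element id, or through generated code, which is the sense in which such an evaluation is independent of the action space.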