[WIP] Browser backends
- Separate the MiniWob task from the underlying browser backend
- Use MCP-Playwright as the browser backend
Description by Korbit AI
What change is being made?
Introduce a new Browser/MCP-based backend for MiniWob benchmarks, along with environment, task, and benchmark scaffolding, plus a basic MCP Playwright integration and a small test script to exercise the backend.
Why are these changes being made?
Add browser-based task execution support to run MiniWob tasks via MCP Playwright, enabling end-to-end interaction with web tasks through a modular backend and benchmark framework. This lays the groundwork for browser-backed experiments and testing of browser interactions in a structured, extensible way.
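The task/backend separation described above could look roughly like the sketch below. It is illustrative only, not the code in this PR: `BrowserBackend`, `RecordingBackend`, `MiniWobTask`, the method names, and the base URL are all hypothetical. An MCP Playwright backend would be a second implementation of the same interface.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


class BrowserBackend(ABC):
    """Hypothetical interface: the only browser surface a task may use."""

    @abstractmethod
    def goto(self, url: str) -> None: ...

    @abstractmethod
    def click(self, selector: str) -> None: ...

    @abstractmethod
    def page_text(self) -> str: ...


@dataclass
class RecordingBackend(BrowserBackend):
    """In-memory stand-in for a real backend; it only records the calls it receives."""

    actions: list = field(default_factory=list)
    url: str = ""

    def goto(self, url: str) -> None:
        self.url = url
        self.actions.append(("goto", url))

    def click(self, selector: str) -> None:
        self.actions.append(("click", selector))

    def page_text(self) -> str:
        return f"page at {self.url}"


@dataclass
class MiniWobTask:
    """Task logic that depends on the interface, not on a specific browser."""

    task_id: str
    base_url: str  # assumed location of the MiniWob HTML files

    def setup(self, backend: BrowserBackend) -> str:
        # Load the task page through whichever backend was supplied.
        backend.goto(f"{self.base_url}/{self.task_id}.html")
        return backend.page_text()


if __name__ == "__main__":
    backend = RecordingBackend()
    task = MiniWobTask("click-test", "http://localhost:8000/miniwob")
    print(task.setup(backend))
    print(backend.actions)
```

Because the task only sees the abstract interface, swapping the recording backend for a browser-driving one should not require changes to the task code.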