[WIP] Browser backends

Open · ollmer opened this issue 5 months ago • 1 comment

Separate the MiniWob task from the underlying browser backend
Use MCP-Playwright as the browser

Description by Korbit AI

What change is being made?

Introduce a new Browser/MCP-based backend for MiniWob benchmarks, along with environment, task, and benchmark scaffolding, plus a basic MCP Playwright integration and a small test script to exercise the backend.

Why are these changes being made?

Add browser-based task execution support to run MiniWob tasks via MCP Playwright, enabling end-to-end interaction with web tasks through a modular backend and benchmark framework. This lays the groundwork for browser-backed experiments and testing of browser interactions in a structured, extensible way.
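
The separation of task and backend described above might look roughly like the following sketch. It is not taken from the PR: `BrowserBackend`, `FakeBackend`, `ClickTask`, the URL, and the selector are all hypothetical names chosen for illustration, and the in-memory backend stands in for a real MCP Playwright connection.

```python
from dataclasses import dataclass, field
from typing import Protocol


class BrowserBackend(Protocol):
    """Hypothetical minimal interface a task needs from any browser backend."""

    def goto(self, url: str) -> None: ...
    def click(self, selector: str) -> None: ...
    def page_text(self) -> str: ...


@dataclass
class FakeBackend:
    """In-memory stand-in; a real backend would forward these calls to a browser."""

    url: str = ""
    clicked: list[str] = field(default_factory=list)

    def goto(self, url: str) -> None:
        self.url = url

    def click(self, selector: str) -> None:
        self.clicked.append(selector)

    def page_text(self) -> str:
        # Pretend the page reports success once anything has been clicked.
        return "done" if self.clicked else "waiting"


@dataclass
class ClickTask:
    """Hypothetical task: knows its page and success check, not how the browser is driven."""

    url: str
    target: str

    def setup(self, backend: BrowserBackend) -> None:
        backend.goto(self.url)

    def is_done(self, backend: BrowserBackend) -> bool:
        return "done" in backend.page_text()


backend = FakeBackend()
task = ClickTask(url="file:///hypothetical/click-button.html", target="#subbtn")
task.setup(backend)
before = task.is_done(backend)   # checked before any interaction
backend.click(task.target)       # an agent would issue this action
after = task.is_done(backend)    # checked after the click
```

Because the task only depends on the interface, swapping `FakeBackend` for a browser-driven implementation would not require changes to the task code, which is presumably the point of the split.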

Oct 31 '25 17:10 ollmer

Based on your review schedule, I'll hold off on reviewing this PR until it's marked as ready for review. If you'd like me to take a look now, comment /korbit-review.

Oct 31 '25 17:10 korbit-ai[bot]