
feat: Replace Playwright-based browser automation with browser-use

Open 0xallam opened this issue 2 weeks ago • 0 comments

Our current Playwright-based browser layer effectively requires vision/screenshot reasoning. Non-vision LLMs can’t interact with the browser today.

Motivations:

Non-vision model support: Browser Use exposes browser control through higher-level APIs designed for LLM agents, so interaction does not depend on a vision model (our current Playwright usage relies on screenshots). Non-vision models could then drive the browser too.

Stealth/anti-bot advantages: The Browser Use ecosystem includes “stealth browser” features, proxy rotation, CAPTCHA bypass, and session persistence, which can help get past Cloudflare and other anti-bot defenses more reliably than plain Playwright.

Browser profiles: Built-in support for persistent browser profiles keeps sessions, cookies, and stored tokens alive between agent runs, which should improve stability.

Headful login flows: Supports running a real visible browser for initial onboarding flows (SSO, 2FA, magic links, CAPTCHA), then re-using the authenticated profile headlessly afterward.
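As a rough illustration of the headful-then-headless idea, here is a minimal library-agnostic sketch. It does not call Browser Use or Playwright; `LaunchPlan`, `plan_launch`, `mark_authenticated`, and the `.authenticated` marker file are all hypothetical names invented for this example. It only shows the decision: launch visibly until a profile has completed an interactive login, then launch headlessly against the same profile directory.

```python
from dataclasses import dataclass
from pathlib import Path

# Hypothetical marker file; not part of Browser Use or Playwright.
AUTH_MARKER = ".authenticated"


@dataclass(frozen=True)
class LaunchPlan:
    """How the agent should start the browser for a given profile."""
    profile_dir: Path
    headless: bool


def plan_launch(profile_dir: Path) -> LaunchPlan:
    """Launch headful until the profile is marked as authenticated, then headless."""
    profile_dir.mkdir(parents=True, exist_ok=True)
    authenticated = (profile_dir / AUTH_MARKER).exists()
    return LaunchPlan(profile_dir=profile_dir, headless=authenticated)


def mark_authenticated(profile_dir: Path) -> None:
    """Record that the interactive login (SSO, 2FA, CAPTCHA) finished for this profile."""
    (profile_dir / AUTH_MARKER).touch()
```

The resulting `LaunchPlan` would be passed to whatever launcher the browser layer ends up using. A real implementation would probably check for a valid session rather than a marker file, since stored cookies and tokens can expire.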


Browser Use docs: https://docs.browser-use.com/introduction
Browser Use repository: https://github.com/browser-use/browser-use

0xallam, Dec 13 '25 22:12