`when` parameter

Open sameelarif opened this issue 1 year ago • 1 comments

Feature request

Add a method that makes the browser wait for an event to occur, with the event described in natural language.

Ideas for Syntax

As a parameter in existing functions:

page.act({
     action: "click the 'add to cart' button",
     when: "the release timer runs out"
})

As a parameter in a new goto function:

page.goto("https://vercel.com/bfcm", {
     waitFor: "total requests to exceed 86 million"
})

As its own method:

page.waitFor("the release timer to run out")

Baked into existing syntax:

page.act({ action: "wait for the release timer to run out" })
page.act({ action: "click the 'add to cart' button" })

Ideas for Implementation

Natural Language -> Playwright Code

Pass the current page to an LLM (potentially with some of its JavaScript too, since this will deal with reactive pages) and ask it to return Playwright code that waits for the input condition to hold. This is most efficient when the user's request can be fulfilled with a simple page.waitForLoadState("networkidle") or page.waitForSelector("<xpath>"). If the input requires several conditions to hold, we'll ask the LLM to return several of these statements.

The LLM could potentially also return custom JavaScript for more complex tasks, for example checking whether an element has a certain CSS class or property attached to it.
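
A rough sketch of how this could look, not a proposal for the final API. It assumes the LLM is asked for a JSON list of wait steps rather than raw code, so the reply can be validated before anything runs. The WaitStep shape and the parseWaitSteps / runWaitSteps helpers are hypothetical names made up for illustration; only waitForLoadState, waitForSelector and waitForFunction are real Playwright Page methods, declared structurally here so the snippet stands alone.

```typescript
// Hypothetical shape of what the LLM is asked to return: an ordered list of waits.
type LoadState = "load" | "domcontentloaded" | "networkidle";
type WaitStep =
  | { kind: "loadState"; state: LoadState }
  | { kind: "selector"; selector: string }
  | { kind: "function"; expression: string }; // custom JS evaluated in the page

// The three Playwright Page methods used below, declared structurally so the
// sketch type-checks without importing Playwright.
interface WaitablePage {
  waitForLoadState(state?: LoadState): Promise<unknown>;
  waitForSelector(selector: string): Promise<unknown>;
  waitForFunction(expression: string): Promise<unknown>;
}

const LOAD_STATES: readonly LoadState[] = ["load", "domcontentloaded", "networkidle"];

// Validate the model's JSON reply; anything that is not a known step is rejected.
function parseWaitSteps(raw: string): WaitStep[] {
  const data: unknown = JSON.parse(raw);
  if (!Array.isArray(data)) throw new Error("expected a JSON array of wait steps");
  return data.map((item: unknown): WaitStep => {
    if (typeof item !== "object" || item === null) {
      throw new Error("each wait step must be an object");
    }
    const step = item as Record<string, unknown>;
    const kind = step["kind"];
    const selector = step["selector"];
    const expression = step["expression"];
    const state: LoadState | undefined = LOAD_STATES.filter((s) => s === step["state"])[0];
    if (kind === "loadState" && state !== undefined) return { kind: "loadState", state };
    if (kind === "selector" && typeof selector === "string") return { kind: "selector", selector };
    if (kind === "function" && typeof expression === "string") {
      return { kind: "function", expression };
    }
    throw new Error("unsupported wait step: " + JSON.stringify(item));
  });
}

// Run the steps in order; the overall wait resolves once every condition has passed.
async function runWaitSteps(page: WaitablePage, steps: WaitStep[]): Promise<void> {
  for (const step of steps) {
    switch (step.kind) {
      case "loadState":
        await page.waitForLoadState(step.state);
        break;
      case "selector":
        await page.waitForSelector(step.selector);
        break;
      case "function":
        await page.waitForFunction(step.expression);
        break;
    }
  }
}
```

With a reply like [{"kind":"selector","selector":"#timer.done"}] (the selector is made up), runWaitSteps(page, parseWaitSteps(reply)) would block until that element appears. The "function" step covers the custom JavaScript case, e.g. an expression that checks an element's classList.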

sameelarif · Dec 04 '24 19:12

I wonder if this actually provides better performance than just giving the LLM a wait(seconds) tool and telling it to keep calling it until the condition is satisfied before continuing.

I think it's difficult to implement a waitFor(condition) without polling anyway, so we might as well give the LLM control over the polling loop?
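
For comparison, a minimal sketch of that model-driven polling loop. Everything here is hypothetical: Decision, Decide and waitUnderModelControl are illustrative names, and decide stands in for an LLM call that looks at the page and either asks to wait some more or says the condition is met. The time budget is an added assumption so a confused model cannot wait forever.

```typescript
// What the model can choose on each turn: call the wait tool, or move on.
type Decision = { tool: "wait"; seconds: number } | { tool: "continue" };

// Stand-in for the LLM call; it receives how long we have waited so far.
type Decide = (elapsedSeconds: number) => Promise<Decision>;

interface LoopOptions {
  sleep?: (seconds: number) => Promise<void>; // injectable so tests need no real delay
  maxTotalSeconds?: number; // hard cap on total waiting (assumed default: 300)
}

const defaultSleep = (seconds: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, seconds * 1000));

// Keep asking the model until it says "continue"; returns total seconds waited.
async function waitUnderModelControl(decide: Decide, options: LoopOptions = {}): Promise<number> {
  const sleep = options.sleep ?? defaultSleep;
  const maxTotalSeconds = options.maxTotalSeconds ?? 300;
  let elapsed = 0;
  while (true) {
    const decision = await decide(elapsed);
    if (decision.tool === "continue") return elapsed;
    if (!(decision.seconds > 0)) {
      throw new Error("wait(seconds) needs a positive number of seconds");
    }
    if (elapsed + decision.seconds > maxTotalSeconds) {
      throw new Error("condition not met within " + maxTotalSeconds + " seconds");
    }
    await sleep(decision.seconds);
    elapsed += decision.seconds;
  }
}
```

The trade-off versus generated Playwright waits is one model call per poll instead of one call up front, in exchange for not having to trust generated selectors or JavaScript.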

pirate · Nov 21 '25 20:11