stagehand
Multiple scrollable elements
On some pages the main scroll bar belongs to an inner div rather than the main window.
Other pages have several scrollable elements, and the main window may or may not be one of them.
It seems that every scroll the API performs to read more chunks assumes a single scroll bar on the main window (e.g. when counting chunks and in scrollToHeight()).
As a result, any content inside a scrollable area other than the main window is invisible to the API.
Additionally, scrollable elements do not count as "interactive" unless they carry a matching role attribute (not a very common practice, afaik), so they are not included in the DOM elements provided as context to the LLM. The LLM therefore cannot be expected to decide to scroll them on its own, since it is unaware of them.
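To make the point about undetected scrollable elements concrete, here is a minimal sketch of how such elements could be recognised from layout data alone, without relying on a role attribute. The `ScrollMetrics` type and the `isVerticallyScrollable` name are hypothetical and are not part of Stagehand; the check itself only uses the standard `overflow-y`, `scrollHeight`, and `clientHeight` values.

```typescript
// Hypothetical shape holding the layout data needed for the check.
interface ScrollMetrics {
  overflowY: string;    // computed CSS overflow-y value
  scrollHeight: number; // total content height in px
  clientHeight: number; // visible height in px
}

// Treat an element as scrollable when its computed overflow-y permits
// scrolling AND its content is taller than its visible box.
function isVerticallyScrollable(m: ScrollMetrics): boolean {
  const allowsScroll =
    m.overflowY === "auto" || m.overflowY === "scroll" || m.overflowY === "overlay";
  return allowsScroll && m.scrollHeight > m.clientHeight;
}

// Browser-side usage (an assumption, shown as a comment because it needs a DOM):
//   const scrollables = Array.from(document.querySelectorAll<HTMLElement>("*"))
//     .filter((el) => isVerticallyScrollable({
//       overflowY: getComputedStyle(el).overflowY,
//       scrollHeight: el.scrollHeight,
//       clientHeight: el.clientHeight,
//     }));
```

Elements found this way could, in principle, be added to the context given to the LLM; whether and how Stagehand would do that is left open here.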
It would be nice if Stagehand could try emitting scroll wheel events. Stagehand (and the code the LLM outputs) prefers window.scrollTo(), which doesn't work in complex DOMs as mentioned here.
Stagehand could check whether the window height equals the viewport height and, if so, try the scroll wheel instead.
The following does work, but it would be good for Stagehand to reach for this tool as needed instead of having to code it statically:
await stagehand.page.mouse.wheel(0, 200);
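The height check suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: `pickScrollStrategy`, `wheelScroll`, and the `WheelPage` interface are hypothetical names, and `WheelPage` mirrors just the two Playwright page members used here (`viewportSize()` and `mouse.wheel()`).

```typescript
type ScrollStrategy = "window.scrollTo" | "mouse.wheel";

// If the document is no taller than the viewport, the window itself cannot
// scroll, so any scrolling content must live in an inner element.
function pickScrollStrategy(
  documentHeight: number,
  viewportHeight: number,
  tolerancePx = 1, // assumption: small slack for rounding differences
): ScrollStrategy {
  return documentHeight <= viewportHeight + tolerancePx ? "mouse.wheel" : "window.scrollTo";
}

// Hypothetical minimal view of a Playwright-like page.
interface WheelPage {
  viewportSize(): { width: number; height: number } | null;
  mouse: { wheel(deltaX: number, deltaY: number): Promise<void> };
}

// Scroll down by a fraction of the viewport height using wheel events.
// Returns the number of pixels requested.
async function wheelScroll(page: WheelPage, fraction = 0.9): Promise<number> {
  const viewport = page.viewportSize();
  if (!viewport) throw new Error("viewport size unavailable");
  const deltaY = Math.round(viewport.height * fraction);
  await page.mouse.wheel(0, deltaY);
  return deltaY;
}

// The document height would come from the page, for example (assumption):
//   const documentHeight = await page.evaluate(() => document.documentElement.scrollHeight);
```

Wheel events act on whatever sits under the pointer, so when a page has several scrollable areas it may be necessary to move the pointer over the target element (for example with `page.mouse.move()`) before calling `wheelScroll`.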
Patching the handler to add new tools is possible. This patch, for example, enables Stagehand to scroll with the mouse wheel. A few usage notes:
- This patching function must be called after stagehand.init()
- In prompts to the LLM, you must tell it about the new scrollDownALittle tool
/*
  Monkeypatching actHandler._performPlaywrightMethod(), the Stagehand method that handles tool use calls from LLM responses.
  This adds a 'scrollDownALittle' tool that emits mousewheel events and works on dynamic SPAs.
  You need to tell the LLM (prompt) about this new tool or it won't use it.
*/
function patchScrollBehavior(stagehand: any) {
  // Get the act handler instance
  const actHandler = Reflect.get(stagehand, 'actHandler');
  if (!actHandler) {
    throw new Error('Could not access actHandler');
  }

  const proto = Object.getPrototypeOf(actHandler);
  const originalMethod = proto._performPlaywrightMethod;

  // Monkeypatch to add a tool
  proto._performPlaywrightMethod = async function(
    method: string,
    args: unknown[],
    xpath: string,
    domSettleTimeoutMs?: number
  ) {
    if (method === 'scrollDownALittle') {
      const viewport = await this.stagehand.page.viewportSize();
      const scroll_y = viewport.height * 0.9;
      await this.stagehand.page.mouse.wheel(0, scroll_y);
      await this.waitForSettledDom(domSettleTimeoutMs);
      return;
    }

    // Passthrough any other tool calls to the original implementation
    return originalMethod.call(this, method, args, xpath, domSettleTimeoutMs);
  };
}
Usage example:
await stagehand.init();
patchScrollBehavior(stagehand); // <-------------
await stagehand.page.goto("http://localhost/");
await stagehand.act({ action: "Scroll to the bottom of the page. Only use 'scrollDownALittle' for this." });
Hey! Sorry this fell through the cracks. Looking at this now. Totally agree here; do you have any websites you're trying to run Stagehand on that we can use in evals as we develop this feature?
No sorry just a local app.
Hi, a typical site could be https://www.mcmaster.com/products/screws/ (the screws section of McMaster-Carr, a hardware parts supplier). On the left there is a vertically scrollable div, and inside it are many scrollable sub-elements for the different filters (which happen to load content dynamically when scrolled down).
Thanks!
@kamath Here's another example for experimentation (left column list of places)
https://www.google.com/maps/@35.7524409,139.6445643,11z/data=!4m3!11m2!2sPNFEEkYgSfymHKKn8zgJtA!3e3?entry=ttu&g_ep=EgoyMDI1MDEwOC4wIKXMDSoASAFQAw%3D%3D
@javuiz sorry, this got a bit transpiled. Check out the new version with textExtract: true. I just tested it on McMaster-Carr and it worked.