Regression on Stagehand 2.2.1: Text extraction on links sometimes grabs an incorrect number
This code previously worked on stagehand 2.1:
const { count } = await page.extract({
  instruction: `Extract the ${widgetTitle} link's value`,
  schema: z.object({
    count: z.number(),
  }),
});
This consistently produced the expected number. Since upgrading to 2.2.1, we get inconsistent numbers. The workaround we implemented, which works consistently, is:
const [action] = await page.observe({
  instruction: `Extract the ${widgetTitle} link's value`,
});
const { count } = await page.extract({
  selector: action.selector,
  schema: z.object({
    count: z.number(),
  }),
});
Which tells me it's a problem around how we're processing instructions in extract (observe is consistently accurate, extract is not).
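For anyone who wants to reuse the workaround, here is a minimal sketch that wraps the two calls in one helper. `extractViaObserve`, `PageLike`, and `ObserveResultLike` are hypothetical names, not part of Stagehand; the interfaces only mirror the two call shapes shown above so the snippet compiles on its own, and the real Stagehand page type may need a cast to fit them.

```typescript
// Minimal structural types so the sketch compiles without Stagehand installed.
// Assumption: they mirror only the observe/extract call shapes used in this thread.
interface ObserveResultLike {
  selector: string;
}

interface PageLike<S, T> {
  observe(opts: { instruction: string }): Promise<ObserveResultLike[]>;
  extract(opts: { selector: string; schema: S }): Promise<T>;
}

// Hypothetical helper (not a Stagehand API): locate the element with observe,
// then run extract against that element's selector only.
async function extractViaObserve<S, T>(
  page: PageLike<S, T>,
  instruction: string,
  schema: S,
): Promise<T> {
  const [action] = await page.observe({ instruction });
  if (!action) {
    // Fail loudly instead of reading `selector` from undefined.
    throw new Error(`observe found no element for: ${instruction}`);
  }
  return page.extract({ selector: action.selector, schema });
}
```

The guard on an empty observe result is an addition over the workaround as posted, which would throw a less helpful error when destructuring finds nothing.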
@seanmcguire12 this looks related to your changes, since I see your name associated with changes around extract.
Here's the HTML (part of a larger page) that the test interacts with:
<a data-test-selector="pivot-link" class="focusable inline-block hover:text-focus active:text-focus" title="" target="_blank" href=""> 20 </a>
Hey @jds2501-cs! Thanks for trying out the latest version. If you are looking to get the text value of the link here, I would try a prompt like "extract the ${widgetTitle} link's text". The LLM never gets to see the raw html structure, so it's understandable that it might get confused by what you mean by "value".
I don't think this is the LLM getting confused - if that was the case, I would see inconsistent results on observe. Observe consistently works without any issues.
Also, with that prompt, extract is finding a number that doesn't exist on the page.
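To catch that case in a test, one option is a deterministic cross-check of the extracted value against text read directly from the DOM. This is only a sketch: `numberAppearsInText` is a hypothetical helper, and how the text is obtained (for example, the link's text content read by whatever DOM access the test already has) is left as an assumption.

```typescript
// Hypothetical helper: true when `count` appears as a whole numeric token in `text`.
// Limitation: ignores signs and thousands separators ("1,200" is read as 1 and 200).
function numberAppearsInText(count: number, text: string): boolean {
  const tokens = text.match(/\d+(?:\.\d+)?/g) ?? [];
  return tokens.some((token) => Number(token) === count);
}
```

With the link above, the text is ` 20 `, so a returned count of 20 passes the check and any other number fails it.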