StageHand Application Error When Using Local Ollama Model
Issue Description
I created a StageHand application using npx create-browser-app and configured it to use a local Ollama model. When I run npm run start, I encounter an error.
Reproduction Steps
- Created a new application using `npx create-browser-app`
- Configured the application to use a local Ollama model
- Ran `npm run start`
- Encountered the error described below
Configuration Code
stagehand.config.ts:

```ts
llmClient: new CustomOpenAIClient({
  modelName: "llama3.2:3b",
  client: new OpenAI({
    baseURL: "http://localhost:11434/v1",
    apiKey: "ollama",
  }),
}),
```
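For context, here is a fuller sketch of how that snippet sits inside `stagehand.config.ts`. The import path for `CustomOpenAIClient` is an assumption based on the `create-browser-app` template; verify it against your generated files:

```typescript
import OpenAI from "openai";
// Path assumed from the create-browser-app template; adjust to your project.
import { CustomOpenAIClient } from "./llm_clients/customOpenAI_client";

const config = {
  env: "LOCAL",
  llmClient: new CustomOpenAIClient({
    modelName: "llama3.2:3b",
    client: new OpenAI({
      // Ollama's OpenAI-compatible endpoint.
      baseURL: "http://localhost:11434/v1",
      // Ollama ignores the key, but the OpenAI client requires a non-empty value.
      apiKey: "ollama",
    }),
  }),
};

export default config;
```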
Error Message
```
INFO: found elements
  category: "observation"
  elements: []

/Users/xxx/Projects/stagehand/utils.ts:47
  const xpathList = results.map((result) => result.selector);

TypeError: Cannot read properties of undefined (reading 'selector')
    at <anonymous> (/Users/xxx/Projects/stagehand/utils.ts:47:52)
    at Array.map (<anonymous>)
    at drawObserveOverlay (/Users/xxx/Projects/stagehand/utils.ts:47:29)
    at main (/Users/xxx/Projects/stagehand/index.ts:42:9)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at run (/Users/xxx/Projects/stagehand/index.ts:110:3)
```
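The trace shows the crash is inside the `.map` callback, which means `results` contains `undefined` entries when the model fails to return usable elements. A minimal guard sketch (a hypothetical helper of my own naming, not the actual `utils.ts` code) that filters those out before mapping:

```typescript
// Hypothetical defensive helper: drop undefined entries and entries
// without a selector, so an empty or partial model response cannot crash
// drawObserveOverlay.
type ObserveResult = { selector?: string };

function safeXpathList(
  results: Array<ObserveResult | undefined> | undefined,
): string[] {
  return (results ?? []).flatMap((result) =>
    typeof result?.selector === "string" ? [result.selector] : [],
  );
}
```

With this shape, `safeXpathList(undefined)` and `safeXpathList([undefined])` both return `[]` instead of throwing.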
Environment Information
- Operating System: macOS Version 15.3.2
- Node.js Version: v20.11.0
Hey! This is likely because the Ollama models people typically run locally are too small to consistently produce valid Stagehand output. I strongly recommend trying Gemini 2.0 Flash: it's our most accurate model, and it's basically free!
I work on a project that involves testing browser-use tools with as many models as I can reasonably set up and run.
@HeXavi8 if you are just trying to explore / have fun with local models, I'd suggest you try the Qwen family of instruct models. And don't go for the smaller ones.
In my experience with Stagehand and local models on non-data-center GPUs, the speed and inconsistencies require some defensive coding, like increasing timeouts and handling errors gracefully with retries. (Happy to push up a few code changes to the examples if that's something @kamath and team would like 🤷.) To be clear, I'm also not recommending this approach for "production" use cases. And that LLM leaderboard y'all have is an amazing resource! 🤘
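To make the "defensive coding" concrete, here's a minimal retry wrapper with linear backoff (a hypothetical helper with names of my own, not code from the Stagehand examples):

```typescript
// Hypothetical retry helper: runs an async action, retrying on failure
// with a delay that grows linearly after each attempt.
async function withRetries<T>(
  action: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await action();
    } catch (err) {
      lastError = err;
      if (attempt < attempts) {
        // Wait a bit longer after each failed attempt.
        await new Promise((resolve) =>
          setTimeout(resolve, baseDelayMs * attempt),
        );
      }
    }
  }
  throw lastError;
}
```

You'd wrap flaky calls like `withRetries(() => page.observe("find the login button"))`, so one slow or malformed local-model response doesn't kill the whole run.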
P.S. These local models also might perform better in the winter, when ambient temps around your laptop are lower. 🫠 😂
How can we use Google Gemini Flash as recommended here: https://www.browserbase.com/blog/unit-testing-ai-agents-in-the-browser?dub_id=DekAJX1UpdOfFVsi
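A hedged sketch of one way to point Stagehand at Gemini 2.0 Flash; the exact `modelName` string and config keys vary by Stagehand version, so treat this as an assumption and check the current docs:

```typescript
// Sketch only — the model identifier and option names below are assumed;
// verify them against your Stagehand version's documentation.
const config = {
  env: "LOCAL",
  modelName: "google/gemini-2.0-flash",
  modelClientOptions: {
    // Google AI Studio API key, read from the environment.
    apiKey: process.env.GOOGLE_API_KEY,
  },
};

export default config;
```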
This project seems interesting @esthor , browserbase might need it, for testing out different models with them
Hey! Yeah, it's kinda perfect for that use case: agent-tool devs iterating on models/prompts/etc., and agent devs deciding which tool to use in the first place.
tbh, I kinda put it on the back burner because of a million other fires, but if it'd be useful for y'all, I'd love to pick it back up and ship it. What's the best place to carry on this convo?