Add Gemini Computer Use Integration

Open Cheggin opened this issue 1 month ago • 2 comments

Summary

Adds native support for Gemini's Computer Use API by converting Computer Use function calls into Actor API calls and returning the results to the Computer Use model. This currently requires a subclass of Agent() to handle message formatting and the function-calling loop.
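As a rough illustration of the function-calling loop described above, here is a minimal sketch in plain Python. It is not the code from this PR: `run_function_loop`, `next_calls`, and `execute` are hypothetical names, the model is reduced to a callable that returns a list of (name, args) pairs, and screenshot feedback is folded into whatever `execute` returns.

```python
from typing import Callable

# One requested function call: (function_name, arguments). Hypothetical shape.
FunctionCall = tuple[str, dict]


def run_function_loop(
    next_calls: Callable[[list[dict]], list[FunctionCall]],
    execute: Callable[[str, dict], dict],
    max_iterations: int,
) -> list[dict]:
    """Ask the model for function calls, execute them, and feed results back.

    The loop stops when the model requests no further calls or when
    max_iterations turns have been used, whichever comes first.
    """
    history: list[dict] = []
    for _ in range(max_iterations):
        calls = next_calls(history)  # the model sees all previous results
        if not calls:
            break  # no function calls means the model considers the task done
        for name, args in calls:
            result = execute(name, args)  # e.g. run a browser action
            history.append({"name": name, "args": args, "result": result})
    return history
```

The iteration cap plays the same role as `max_function_iterations` in the usage example: it bounds how many model turns can run before the loop gives up.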

Core Files

  • chat.py - ChatGeminiComputerUse LLM wrapper

    • Handles Computer Use tool configuration
    • Serializes messages between Browser Use and Gemini formats
    • Returns raw responses for function call handling
  • agent.py - ComputerUseAgent class

    • Extends base Agent with Computer Use function calling loop
    • Manages multi-turn conversation with screenshot feedback
    • Auto-formats results as JSON for structured_output support
  • bridge.py - ComputerUseBridge

    • Converts Gemini function calls to ActionResult objects
    • Orchestrates execution via ComputerUseActionExecutor
  • executor.py - ComputerUseActionExecutor

    • Executes Computer Use actions via Actor API
    • Handles coordinate denormalization (0-999 → actual pixels)
    • Implements all Computer Use functions (click_at, type_text_at, navigate, etc.)
  • computer_use_system_prompt.md - System prompt template for Computer Use workflow
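The coordinate denormalization step can be sketched as follows. This is an illustrative example rather than the executor.py implementation: `Viewport`, `denormalize`, and `to_pixel_action` are hypothetical names, the argument keys `x` and `y` are assumed, and scaling by 1000 is one common way to map a 0-999 grid onto pixels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Viewport:
    """Browser viewport size in pixels (hypothetical helper type)."""
    width: int
    height: int


def denormalize(x: int, y: int, viewport: Viewport) -> tuple[int, int]:
    """Map coordinates on a normalized 0-999 grid to viewport pixels."""
    if not (0 <= x <= 999 and 0 <= y <= 999):
        raise ValueError(f"normalized coordinates out of range: ({x}, {y})")
    # Dividing by 1000 keeps the result strictly inside the viewport.
    return int(x / 1000 * viewport.width), int(y / 1000 * viewport.height)


def to_pixel_action(name: str, args: dict, viewport: Viewport) -> dict:
    """Copy a function call into an action dict, converting x/y if present."""
    action = {"name": name, **args}
    if "x" in args and "y" in args:
        action["x"], action["y"] = denormalize(args["x"], args["y"], viewport)
    return action
```

With a 1440x900 viewport, a `click_at` call at normalized (500, 500) lands at pixel (720, 450), while calls without coordinates, such as `navigate`, pass through unchanged.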

Usage

import asyncio
import os

from browser_use.llm.gemini_computer_use import ChatGeminiComputerUse, ComputerUseAgent


async def main():
    # LLM wrapper with the Computer Use tool enabled
    llm = ChatGeminiComputerUse(
        model='gemini-2.5-computer-use-preview-10-2025',
        api_key=os.getenv('GOOGLE_API_KEY'),
        enable_computer_use=True,
    )

    agent = ComputerUseAgent(
        task='Find the founders of Browser Use startup',
        llm=llm,
        use_vision=True,
        max_function_iterations=X,  # X is a placeholder; replace it with your iteration limit
    )

    result = await agent.run()
    print(result.structured_output.result)


asyncio.run(main())

🤖 Generated with Claude Code

Cheggin, Oct 11 '25 20:10

CLA assistant check
All committers have signed the CLA.

CLAassistant, Oct 11 '25 20:10

This would be great to have; it's a very fast and accurate model.

Niek, Oct 30 '25 10:10