Clickable (interactive) elements not being detected.
Bug Description
When I run the agent on the URL "https://www.ebay.com/sch/i.html?_nkw=cards+against+humanity+christmas+2024", it detects nothing on the page as interactable after scrolling the page once (no elements get highlighted). I want it to click the next page button, but that button is not detected. LLM used: Gemini 2.0 Flash
Reproduction Steps
Run the agent. (The code sample is a very minimal version of my code; I have also tried creating custom controllers for scrolling and using a longer, more specific prompt.)
Code Sample
from browser_use import Agent, Browser, Controller, ActionResult, BrowserConfig
from browser_use.browser.context import BrowserContextConfig, BrowserContext
from langchain_google_genai import ChatGoogleGenerativeAI
import sys
import asyncio
import os
from dotenv import load_dotenv

load_dotenv()

llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    api_key=os.getenv("GOOGLE_API_KEY"),
    temperature=0,
)

controller = Controller()

config = BrowserConfig(
    headless=False,
)

context_config = BrowserContextConfig(
    wait_for_network_idle_page_load_time=3,
    browser_window_size={"width": 1400, "height": 850},
    highlight_elements=True,
)


@controller.action("Open URL")
async def open_url(url: str, browser: BrowserContext):
    page = await browser.get_current_page()
    page.set_default_navigation_timeout(0)
    await page.goto(url)
    print("Opening URL")
    await page.wait_for_load_state("domcontentloaded")
    await page.wait_for_timeout(3000)
    print("Opened URL")
    return ActionResult(return_message="Opened URL")


async def main(url):
    browser = Browser(config=config)
    context = BrowserContext(browser=browser, config=context_config)
    agent = Agent(
        task="Scroll to the botom of the page and click on the next page button",
        llm=llm,
        controller=controller,
        use_vision=False,
        generate_gif=False,
        browser_context=context,
        initial_actions=[
            {'open_url': {'url': url, 'browser': context}},
        ],
    )
    result = await agent.run(max_steps=50)
    await browser.close()


if __name__ == "__main__":
    sys.stdout.reconfigure(encoding="utf-8")
    sys.stderr.reconfigure(encoding="utf-8")
    asyncio.run(main("https://www.ebay.com/sch/i.html?_nkw=cards+against+humanity+christmas+2024"))
Version
0.1.40
LLM Model
Other (specify in description)
Operating System
Windows 11
Relevant Log Output
It works. Would you test the git version?
INFO [agent] 🚀 Starting task: Scroll to the botom of the page and click on the next page button
Opening URL
Opened URL
INFO [agent] 📍 Step 1
INFO [agent] 🤷 Eval: Unknown - I don't know if the click was successful yet.
INFO [agent] 🧠 Memory: Starting with the new task. I have completed 1/10 steps
INFO [agent] 🎯 Next goal: Scroll to the bottom of the page.
INFO [agent] 🛠️ Action 1/1: {"scroll_down":{}}
INFO [controller] 🔍 Scrolled down the page by one page
INFO [agent] 📍 Step 2
INFO [agent] 🤷 Eval: Unknown - I don't know if the scroll was successful yet.
INFO [agent] 🧠 Memory: Starting with the new task. I have completed 1/10 steps. I have scrolled down once.
INFO [agent] 🎯 Next goal: Scroll to the bottom of the page.
INFO [agent] 🛠️ Action 1/1: {"scroll_down":{}}
INFO [controller] 🔍 Scrolled down the page by one page
INFO [agent] 📍 Step 3
INFO [agent] 🤷 Eval: Unknown - I don't know if the scroll was successful yet.
INFO [agent] 🧠 Memory: Starting with the new task. I have completed 1/10 steps. I have scrolled down twice.
INFO [agent] 🎯 Next goal: Scroll to the bottom of the page.
INFO [agent] 🛠️ Action 1/1: {"scroll_down":{}}
INFO [controller] 🔍 Scrolled down the page by one page
INFO [agent] 📍 Step 4
INFO [agent] 🤷 Eval: Unknown - I don't know if the scroll was successful yet.
INFO [agent] 🧠 Memory: Starting with the new task. I have completed 1/10 steps. I have scrolled down three times.
INFO [agent] 🎯 Next goal: Scroll to the bottom of the page.
INFO [agent] 🛠️ Action 1/1: {"scroll_down":{}}
INFO [controller] 🔍 Scrolled down the page by one page
INFO [agent] 📍 Step 5
INFO [agent] 🤷 Eval: Unknown - I don't know if the scroll was successful yet.
INFO [agent] 🧠 Memory: Starting with the new task. I have completed 1/10 steps. I have scrolled down four times.
INFO [agent] 🎯 Next goal: Scroll to the bottom of the page.
INFO [agent] 🛠️ Action 1/1: {"scroll_down":{}}
INFO [controller] 🔍 Scrolled down the page by one page
INFO [agent] 📍 Step 6
INFO [agent] 🤷 Eval: Unknown - I don't know if the scroll was successful yet.
INFO [agent] 🧠 Memory: Starting with the new task. I have completed 1/10 steps. I have scrolled down four times.
INFO [agent] 🎯 Next goal: Click on the next page button.
INFO [agent] 🛠️ Action 1/1: {"click_element":{"index":284}}
INFO [controller] 🖱️ Clicked button with index 284:
INFO [agent] 📍 Step 7
INFO [agent] 👍 Eval: Success - I clicked on the next page button.
INFO [agent] 🧠 Memory: Starting with the new task. I have completed 1/10 steps. I have scrolled down four times.
INFO [agent] 🎯 Next goal: Complete the task.
INFO [agent] 🛠️ Action 1/1: {"done":{"text":"I have scrolled to the bottom of the page and clicked on the next page button.","success":true}}
INFO [agent] 📄 Result: I have scrolled to the bottom of the page and clicked on the next page button.
INFO [agent] ✅ Task completed
INFO [agent] ✅ Successfully
Where can I find that version?
pip install git+https://github.com/browser-use/browser-use
This command didn't work; it just ended up deleting everything.
Did you get any errors when installing?
https://docs.browser-use.com/development/local-setup
I did not get any errors, but after installing, there was only the buildDomTree.js file in the library, nothing else. Is there a way to install it properly using only pip, without a venv?
Is the library's location the same as the one found in pip show browser-use?
For a local installation, you can skip the venv too.
Yes, I checked only in that location.
This error should be fixed now; there was a brief build issue. Please pull main, run uv sync, and try again.
I cloned the repo into Lib/site-packages and ran pip install . and that worked for me. Thanks for the help.
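For anyone who hits the same symptom (an install that leaves almost nothing in the package folder), the location check discussed above can be scripted. This is only a minimal sketch using the standard library: `package_dir` and `count_files` are hypothetical helper names, and it assumes the import name is `browser_use`, as in the code sample.

```python
import importlib.util
from pathlib import Path


def package_dir(module_name):
    """Return the directory Python would import `module_name` from, or None."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    # Packages (including namespace packages, which is what a folder without
    # __init__.py becomes) expose their folder via submodule_search_locations.
    if spec.submodule_search_locations:
        return Path(list(spec.submodule_search_locations)[0])
    if spec.origin:
        return Path(spec.origin).parent
    return None


def count_files(directory):
    """Count .py files and all other files under `directory`, recursively."""
    counts = {"py": 0, "other": 0}
    for path in Path(directory).rglob("*"):
        if path.is_file():
            counts["py" if path.suffix == ".py" else "other"] += 1
    return counts


if __name__ == "__main__":
    location = package_dir("browser_use")
    if location is None:
        print("browser_use is not importable from this interpreter")
    else:
        print("Imported from:", location)
        print("File counts:", count_files(location))
```

If the printed folder differs from the Location line of pip show browser-use, or the count of .py files is zero, the install is probably incomplete and reinstalling from a fresh clone is worth trying.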