
Cannot locate em element

Open · muww opened this issue 8 months ago · 1 comment

Bug Description

I am using Deepseek-v1 to try to click a button, which is an em element, but browser-use cannot locate it.

Reproduction Steps

Run the Code Sample below.

Here is the web page and the element:

Image

I also read through other issues, and some suggest using a custom function. Here is the code:

@controller.action('Accept w3schools cookies')
async def accept_w3schools_cookies(url: str, browser: Browser):
    page = await browser.get_current_page()
    await page.evaluate('''
        () => {
            const element = document.getElementById('xxxx');
            if (element) {
                element.click();
            }
        }
    ''')
    msg = 'xxx'
    logger.info(f'🤘 - {msg}')
    return ActionResult(extracted_content=msg)

But the get_current_page method doesn't exist on the browser object:

Image

Is there any way to tell it exactly how to locate the element, for example with an XPath?

Code Sample

from browser_use import Agent, Browser, BrowserConfig, Controller, ActionResult
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
import asyncio

load_dotenv()

browser = Browser(
    config=BrowserConfig(
        chrome_instance_path="/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
    )
)

llm = ChatOpenAI(base_url='https://api.deepseek.com/v1', model='deepseek-chat', api_key="sk-xx")

agent = Agent(
    task='''
    1.open url:https://study.163.com/course/courseLearn.htm?courseId=1213776805#/learn/video?lessonId=1285515182&courseId=1213776805
    2.click the em element with title="学过了"
    ''',
    llm=llm,
    browser=browser,
    use_vision=False
)

async def main():
    await agent.run()

if __name__ == '__main__':
    asyncio.run(main())

Version

0.1.37

LLM Model

DeepSeek Coder

Operating System

macOS 13.3

Relevant Log Output

INFO     [agent] 🧠 Memory: Starting the task: 1. Navigate to the specified URL.
INFO     [agent] 🎯 Next goal: Navigate to the URL provided in the task.
INFO     [agent] 🛠️  Action 1/1: {"go_to_url":{"url":"https://study.163.com/course/courseLearn.htm?courseId=1213776805#/learn/video?lessonId=1285515182&courseId=1213776805"}}
INFO     [controller] 🔗  Navigated to https://study.163.com/course/courseLearn.htm?courseId=1213776805#/learn/video?lessonId=1285515182&courseId=1213776805
INFO     [agent] 📍 Step 2
INFO     [agent] 👍 Eval: Success - Successfully navigated to the specified URL.
INFO     [agent] 🧠 Memory: Navigated to the course page. Next, locate and click the element with title '学过了'.
INFO     [agent] 🎯 Next goal: Find and click the element with title '学过了'.
INFO     [agent] 🛠️  Action 1/2: {"scroll_down":{"amount":500}}
INFO     [agent] 🛠️  Action 2/2: {"scroll_down":{"amount":500}}
INFO     [controller] 🔍  Scrolled down the page by 500 pixels
INFO     [controller] 🔍  Scrolled down the page by 500 pixels
INFO     [agent] 📍 Step 3
INFO     [agent] 👍 Eval: Success - Successfully navigated to the specified URL.
INFO     [agent] 🧠 Memory: Navigated to the course page. Scrolled down to locate the element with title '学过了'. The element is not visible in the current list of interactive elements.
INFO     [agent] 🎯 Next goal: Continue scrolling or explore other sections to locate the element with title '学过了'.
INFO     [agent] 🛠️  Action 1/2: {"scroll_down":{"amount":500}}
INFO     [agent] 🛠️  Action 2/2: {"scroll_down":{"amount":500}}
INFO     [controller] 🔍  Scrolled down the page by 500 pixels
INFO     [controller] 🔍  Scrolled down the page by 500 pixels
INFO     [agent] 📍 Step 4
INFO     [agent] 👍 Eval: Success - Successfully navigated to the specified URL and scrolled down to locate the element.
INFO     [agent] 🧠 Memory: Navigated to the course page and scrolled down. The element with title '学过了' is not visible in the current list of interactive elements. Next, explore other sections or scroll further to locate it.
INFO     [agent] 🎯 Next goal: Continue scrolling or explore other sections to locate the element with title '学过了'.
INFO     [agent] 🛠️  Action 1/2: {"scroll_down":{"amount":500}}
INFO     [agent] 🛠️  Action 2/2: {"scroll_down":{"amount":500}}
INFO     [controller] 🔍  Scrolled down the page by 500 pixels
INFO     [controller] 🔍  Scrolled down the page by 500 pixels
INFO     [agent] 📍 Step 5
INFO     [agent] ⚠ Eval: Failed - The element with title '学过了' is not found in the current list of interactive elements despite scrolling.
INFO     [agent] 🧠 Memory: Navigated to the course page and scrolled down multiple times. The element with title '学过了' is not visible in the current list of interactive elements. Next, explore other sections or check if the element is hidden or requires a specific action.
INFO     [agent] 🎯 Next goal: Check if the element '学过了' appears after interacting with other elements or if it is hidden.
INFO     [agent] 🛠️  Action 1/2: {"click_element":{"index":3}}
INFO     [agent] 🛠️  Action 2/2: {"click_element":{"index":4}}
INFO     [controller] 🖱️  Clicked button with index 3: 课时介绍
INFO     [controller] 🖱️  Clicked button with index 4: 分享
INFO     [agent] 📍 Step 6
INFO     [agent] ⚠ Eval: Failed - The element with title '学过了' is not found in the current list of interactive elements despite clicking other elements.
INFO     [agent] 🧠 Memory: Navigated to the course page, scrolled down, and clicked other elements. The element with title '学过了' is not visible in the current list of interactive elements. Next, check if the element appears after completing a specific action or if it is hidden under a different context.
INFO     [agent] 🎯 Next goal: Check if the element '学过了' appears after completing a specific action or if it is hidden under a different context.
INFO     [agent] 🛠️  Action 1/2: {"click_element":{"index":9}}
INFO     [agent] 🛠️  Action 2/2: {"click_element":{"index":10}}
INFO     [controller] 🖱️  Clicked button with index 9: 
INFO     [controller] 🖱️  Clicked button with index 10: 自动打开下一课时

muww, Mar 26 '25 13:03

See my comment here: https://github.com/browser-use/browser-use/issues/871#issuecomment-2755411883

Within an action, the browser parameter should be a BrowserContext object, not a Browser object. (We may merge these classes in the future, as the current split is slightly confusing.)

See examples/use-cases/google_sheets.py for an example.

@controller.registry.action('Some example action')
- async def some_action(browser: Browser, some_arg: str):
+ async def some_action(browser: BrowserContext, some_arg: str):
	page = await browser.get_current_page()
	...

pirate, Mar 26 '25 18:03
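
For readers who land here with the same question, the following is a minimal sketch of how the element could be targeted directly once the action receives a BrowserContext. It assumes that `get_current_page()` returns a Playwright async Page, as the thread implies. The helper names `title_selector` and `click_by_title` are hypothetical and not part of browser-use. The sketch builds a CSS attribute selector from the tag and title, then clicks the first match.

```python
import asyncio


def title_selector(tag: str, title: str) -> str:
    """Build a CSS attribute selector such as em[title="学过了"]."""
    # Escape backslashes and double quotes so the title is a valid CSS string.
    escaped = title.replace("\\", "\\\\").replace('"', '\\"')
    return f'{tag}[title="{escaped}"]'


async def click_by_title(page, tag: str, title: str) -> bool:
    """Click the first element matching tag[title=...].

    `page` is assumed to be a Playwright async Page, for example the object
    returned by `await browser.get_current_page()` inside an action whose
    `browser` parameter is a BrowserContext.
    Returns False when no matching element exists.
    """
    locator = page.locator(title_selector(tag, title))
    if await locator.count() == 0:
        return False
    await locator.first.click()
    return True
```

Inside a custom action this might be called as `await click_by_title(page, 'em', '学过了')` after `page = await browser.get_current_page()`. Playwright locators also accept XPath when the selector is prefixed with `xpath=`, so `page.locator('xpath=//em[@title="学过了"]')` is an alternative if an XPath is preferred.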