I tried to run captcha.py but it is not working as expected

Open avejmaniyar124 opened this issue 9 months ago • 4 comments

Bug Description

I ran captcha.py and other scripts as well, but they all print the message "LangChainBetaWarning: The function load is in beta. It is actively being worked on, so the API may change. value['message'] = load(value['message'])". Could you please check and help me resolve this?

Reproduction Steps

  1. Go to the use-cases directory
  2. Run captcha.py
  3. The script starts running
  4. It then prints this message in the terminal: "LangChainBetaWarning: The function load is in beta. It is actively being worked on, so the API may change. value['message'] = load(value['message'])"

Code Sample

"""
Goal: Automates CAPTCHA solving on a demo website.


Simple try of the agent.
@dev You need to add OPENAI_API_KEY to your environment variables.
NOTE: captchas are hard. For this example it works. But e.g. for iframes it does not.
for this example it helps to zoom in.
"""

import os
import sys

sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

import asyncio
from langchain_openai import ChatOpenAI
from browser_use import Agent
from dotenv import load_dotenv

# Load environment variables
load_dotenv()
if not os.getenv('OPENAI_API_KEY'):
    raise ValueError('OPENAI_API_KEY is not set. Please add it to your environment variables.')

async def main():
    llm = ChatOpenAI(model='gpt-4o')
    agent = Agent(
        task='go to https://captcha.com/demos/features/captcha-demo.aspx and solve the captcha',
        llm=llm,
    )
    await agent.run()
    input('Press Enter to exit')

if __name__ == "__main__":
    asyncio.run(main())

Version

0.1.40

LLM Model

GPT-4o

Operating System

Windows 10

Relevant Log Output

"C:\Program Files\Python312\python.exe" C:\Automation\browser-use-main\browser-use-main\examples\use-cases\captcha.py 
INFO     [browser_use] BrowserUse logging setup complete with level info
INFO     [root] Anonymized telemetry enabled. See https://docs.browser-use.com/development/telemetry for more information.
C:\Users\maniyar\AppData\Roaming\Python\Python312\site-packages\browser_use\agent\message_manager\views.py:59: LangChainBetaWarning: The function `load` is in beta. It is actively being worked on, so the API may change.
  value['message'] = load(value['message'])
INFO     [agent] 🚀 Starting task: go to https://captcha.com/demos/features/captcha-demo.aspx and solve the captcha
INFO     [agent] 📍 Step 1
INFO     [agent] 📍 Step 1
INFO     [agent] 📍 Step 1
INFO     [agent] 📍 Step 1

Process finished with exit code -1

avejmaniyar124 avatar Mar 03 '25 09:03 avejmaniyar124

Is there any update on this? I am getting the same behavior (an empty browser window opens and the log loops through "INFO [agent] 📍 Step 1") for any code I try to run.

atemerev avatar Mar 11 '25 13:03 atemerev

I have the same question. How can I solve this problem?

luminfeiS avatar Mar 13 '25 09:03 luminfeiS

I have the same issue: a blank browser window opens and the log loops on Step 1, but the browser does not show any content or URL.

zhong-denny avatar Mar 18 '25 14:03 zhong-denny

The "LangChainBetaWarning: The function load is in beta" message is a red herring, unrelated to the issue of seeing Step 1 over and over again.
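If the warning is only noise, it can be hidden with the standard library's warnings filter. This is a minimal sketch, not something browser-use provides: the helper name silence_langchain_beta_load_warning is hypothetical, and it matches on the message text taken from the log above rather than on the warning class. It does nothing about the Step 1 loop.

```python
import warnings


def silence_langchain_beta_load_warning() -> None:
    """Ignore only the beta notice about `load`; other warnings still show."""
    warnings.filterwarnings(
        "ignore",
        # Regex matched against the start of the warning message.
        message=r"The function `load` is in beta",
    )
```

The filter has to be installed before the warning is emitted, so calling the helper at the top of the script, ahead of the browser_use import, is the safest place.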

One common cause of it getting stuck on startup is that playwright's await page.title() can hang indefinitely if that page's CDP channel has crashed for some reason. I fixed that particular bug in a commit here (as part of a bigger unrelated PR). Subscribe over there to be notified when it merges.

pirate avatar Mar 23 '25 08:03 pirate

I have an issue related to captcha.py. Is it normal for it to give wrong answers in this use case?

Code:

# Modified from https://github.com/browser-use/browser-use/blob/main/examples/use-cases/captcha.py

"""
Goal: Automates CAPTCHA solving on a demo website.


Simple try of the agent.
NOTE: captchas are hard. For this example it works. But e.g. for iframes it does not.
for this example it helps to zoom in.
"""

import os
import sys

sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

import asyncio

from browser_use import Agent
from dotenv import load_dotenv
from langchain_ollama import ChatOllama

# Load environment variables
load_dotenv()


async def main():
    llm = ChatOllama(model="granite3.2:latest")
    agent = Agent(
        task="go to https://captcha.com/demos/features/captcha-demo.aspx and solve the captcha",
        llm=llm,
    )
    await agent.run()
    input("Press Enter to exit")


if __name__ == "__main__":
    asyncio.run(main())

Log:

INFO     [agent] 🚀 Starting task: go to https://captcha.com/demos/features/captcha-demo.aspx and solve the captcha
INFO     [agent] 📍 Step 1
INFO     [agent] 👍 Eval: Success - Navigated to the captcha page
INFO     [agent] 🧠 Memory: Starting with step 1: Going to the captcha demo page. No previous steps or information to remember.
INFO     [agent] 🎯 Next goal: Solve the captcha on the page
INFO     [agent] 🛠️  Action 1/1: {"go_to_url":{"url":"https://captcha.com/demos/features/captcha-demo.aspx"}}
INFO     [controller] 🔗  Navigated to https://captcha.com/demos/features/captcha-demo.aspx
INFO     [agent] 📍 Step 2
INFO     [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO     [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO     [agent] 🎯 Next goal: Click the validate button
INFO     [agent] 🛠️  Action 1/2: {"input_text":{"index":8,"text":"captchaCode"}}
INFO     [agent] 🛠️  Action 2/2: {"click_element":{"index":10}}
INFO     [agent] 📍 Step 3
INFO     [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO     [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO     [agent] 🎯 Next goal: Click the validate button
INFO     [agent] 🛠️  Action 1/2: {"input_text":{"index":8,"text":"captchaCode"}}
INFO     [agent] 🛠️  Action 2/2: {"click_element":{"index":10}}
INFO     [agent] 📍 Step 4
INFO     [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO     [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO     [agent] 🎯 Next goal: Click the validate button
INFO     [agent] 🛠️  Action 1/1: {"input_text":{"index":8,"text":"captchaCode"}}
INFO     [agent] 📍 Step 5
INFO     [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO     [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO     [agent] 🎯 Next goal: Click the validate button
INFO     [agent] 🛠️  Action 1/1: {"input_text":{"index":8,"text":"captchaCode"}}
INFO     [agent] 📍 Step 6
INFO     [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO     [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO     [agent] 🎯 Next goal: Click the validate button
INFO     [agent] 🛠️  Action 1/1: {"input_text":{"index":8,"text":"captchaCode"}}
INFO     [agent] 📍 Step 7
INFO     [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO     [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO     [agent] 🎯 Next goal: Click the validate button
INFO     [agent] 🛠️  Action 1/1: {"input_text":{"index":8,"text":"captchaCode"}}
INFO     [agent] 📍 Step 8
INFO     [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO     [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO     [agent] 🎯 Next goal: Click the validate button
INFO     [agent] 🛠️  Action 1/1: {"input_text":{"index":8,"text":"captchaCode"}}
INFO     [agent] 📍 Step 9
INFO     [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO     [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO     [agent] 🎯 Next goal: Click the validate button
INFO     [agent] 🛠️  Action 1/1: {"input_text":{"index":8,"text":"captchaCode"}}
INFO     [agent] 📍 Step 10
INFO     [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO     [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO     [agent] 🎯 Next goal: Click the validate button
INFO     [agent] 🛠️  Action 1/1: {"input_text":{"index":8,"text":"captchaCode"}}

DENGARDEN avatar Apr 17 '25 08:04 DENGARDEN

Make sure to try the latest version, 0.1.41. Also, it's not going to work without use_vision=True and gpt-4o. Most other vision models are not nearly as successful at a task as difficult as a captcha.
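Applied to the example script, that advice might look like the sketch below. The agent_options helper is hypothetical and exists only to keep the settings in one place; the imports sit inside main() so the file can be loaded without the packages installed. It assumes OPENAI_API_KEY is set in the environment, as in the original example.

```python
import asyncio

TASK = "go to https://captcha.com/demos/features/captcha-demo.aspx and solve the captcha"


def agent_options(task: str = TASK) -> dict:
    """Keyword arguments for Agent, apart from the llm."""
    return {"task": task, "use_vision": True}


async def main() -> None:
    # Imported lazily; both packages must be installed to actually run this.
    from browser_use import Agent
    from langchain_openai import ChatOpenAI

    agent = Agent(llm=ChatOpenAI(model="gpt-4o"), **agent_options())
    await agent.run()
```

Run it with asyncio.run(main()). Swapping in a local model, as in the ChatOllama variant above, goes against this advice.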

pirate avatar Apr 17 '25 10:04 pirate

@pirate I have a browser-use script that used to log in and solve a captcha, but it no longer works: all captcha attempts are wrong. I assume the OpenAI model changed and got worse at this.

Is there any way to outsource the captcha solving to another 3rd party web service? (i.e. browser-use gets image url, sends elsewhere for solving, inputs result, continues)

richardARPANET avatar Apr 25 '25 19:04 richardARPANET

For better captcha handling you can use our cloud

MagMueller avatar Sep 07 '25 02:09 MagMueller

It does not work with the cloud either.

mark-baumann avatar Oct 02 '25 18:10 mark-baumann