captcha.py does not work as expected
Bug Description
I ran captcha.py and other scripts as well, but each run prints the warning "LangChainBetaWarning: The function `load` is in beta. It is actively being worked on, so the API may change. value['message'] = load(value['message'])". Could you please check and help me resolve this?
Reproduction Steps
- Go to the use-cases directory
- Run captcha.py
- The script starts running
- It then prints this message in the terminal: "LangChainBetaWarning: The function `load` is in beta. It is actively being worked on, so the API may change. value['message'] = load(value['message'])"
Code Sample
"""
Goal: Automates CAPTCHA solving on a demo website.
Simple try of the agent.
@dev You need to add OPENAI_API_KEY to your environment variables.
NOTE: captchas are hard. For this example it works. But e.g. for iframes it does not.
for this example it helps to zoom in.
"""
import os
import sys

sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

import asyncio

from langchain_openai import ChatOpenAI
from browser_use import Agent
from dotenv import load_dotenv

# Load environment variables
load_dotenv()
if not os.getenv('OPENAI_API_KEY'):
    raise ValueError('OPENAI_API_KEY is not set. Please add it to your environment variables.')


async def main():
    llm = ChatOpenAI(model='gpt-4o')
    agent = Agent(
        task='go to https://captcha.com/demos/features/captcha-demo.aspx and solve the captcha',
        llm=llm,
    )
    await agent.run()
    input('Press Enter to exit')


if __name__ == "__main__":
    asyncio.run(main())
Version
0.1.40
LLM Model
GPT-4o
Operating System
Windows 10
Relevant Log Output
"C:\Program Files\Python312\python.exe" C:\Automation\browser-use-main\browser-use-main\examples\use-cases\captcha.py
INFO [browser_use] BrowserUse logging setup complete with level info
INFO [root] Anonymized telemetry enabled. See https://docs.browser-use.com/development/telemetry for more information.
C:\Users\maniyar\AppData\Roaming\Python\Python312\site-packages\browser_use\agent\message_manager\views.py:59: LangChainBetaWarning: The function `load` is in beta. It is actively being worked on, so the API may change.
value['message'] = load(value['message'])
INFO [agent] 🚀 Starting task: go to https://captcha.com/demos/features/captcha-demo.aspx and solve the captcha
INFO [agent] 📍 Step 1
INFO [agent] 📍 Step 1
INFO [agent] 📍 Step 1
INFO [agent] 📍 Step 1
Process finished with exit code -1
Is there any update on this? I am getting the same behavior (an empty browser window opens and the log loops through "INFO [agent] 📍 Step 1") with any code I try to run.
I have the same question. How can I solve this problem?
I have the same question. I get a blank browser window and a loop on Step 1, but the browser does not show any content or URL.
The "LangChainBetaWarning: The function `load` is in beta" message is a red herring. It is unrelated to the issue of seeing Step 1 over and over again.
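If the warning is just noise in your logs, it can be filtered with the standard library. This is a minimal sketch, assuming the message text matches what the log above shows; the helper name `silence_load_beta_warning` is made up for illustration, and hiding the warning does not change the Step 1 behavior.

```python
import warnings


def silence_load_beta_warning() -> None:
    """Hide the LangChain beta warning about `load` (cosmetic only)."""
    # `message` is a regex matched against the start of the warning text.
    # Assumption: the text begins exactly as printed in the log above.
    warnings.filterwarnings(
        "ignore",
        message=r"The function `load` is in beta",
    )


# Call this once, before creating the Agent.
silence_load_beta_warning()
```

Matching on the message text instead of the warning class avoids importing anything from LangChain's private modules.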
One common cause of getting stuck on startup is that Playwright's await page.title() can hang indefinitely if that page's CDP channel has crashed for some reason. I fixed that particular bug in a commit here (as part of a bigger unrelated PR). Subscribe over there to be notified when it merges.
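To illustrate the general idea of guarding a call that may never return, here is a minimal sketch using `asyncio.wait_for`. It is not the code from the commit mentioned above; the helper name `title_with_timeout` and its defaults are assumptions, and `page` is any object with an async `title()` method, such as a Playwright page.

```python
import asyncio


async def title_with_timeout(page, timeout_s: float = 5.0, default: str = "") -> str:
    """Return the page title, or `default` if the call does not finish in time."""
    try:
        # wait_for cancels the pending call and raises once the timeout expires.
        return await asyncio.wait_for(page.title(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # The page did not answer in time, for example because its channel crashed.
        return default
```

A caller can then treat an empty or placeholder title as a sign that the page needs to be reopened instead of waiting forever.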
I have an issue related to captcha.py. Is it normal for it to give wrong answers in this use case?
Code:
# Modified from https://github.com/browser-use/browser-use/blob/main/examples/use-cases/captcha.py
"""
Goal: Automates CAPTCHA solving on a demo website.
Simple try of the agent.
NOTE: captchas are hard. For this example it works. But e.g. for iframes it does not.
for this example it helps to zoom in.
"""
import os
import sys

sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

import asyncio

from browser_use import Agent
from dotenv import load_dotenv
from langchain_ollama import ChatOllama

# Load environment variables
load_dotenv()


async def main():
    llm = ChatOllama(model="granite3.2:latest")
    agent = Agent(
        task="go to https://captcha.com/demos/features/captcha-demo.aspx and solve the captcha",
        llm=llm,
    )
    await agent.run()
    input("Press Enter to exit")


if __name__ == "__main__":
    asyncio.run(main())
Log:
INFO [agent] 🚀 Starting task: go to https://captcha.com/demos/features/captcha-demo.aspx and solve the captcha
INFO [agent] 📍 Step 1
INFO [agent] 👍 Eval: Success - Navigated to the captcha page
INFO [agent] 🧠 Memory: Starting with step 1: Going to the captcha demo page. No previous steps or information to remember.
INFO [agent] 🎯 Next goal: Solve the captcha on the page
INFO [agent] 🛠️ Action 1/1: {"go_to_url":{"url":"https://captcha.com/demos/features/captcha-demo.aspx"}}
INFO [controller] 🔗 Navigated to https://captcha.com/demos/features/captcha-demo.aspx
INFO [agent] 📍 Step 2
INFO [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO [agent] 🎯 Next goal: Click the validate button
INFO [agent] 🛠️ Action 1/2: {"input_text":{"index":8,"text":"captchaCode"}}
INFO [agent] 🛠️ Action 2/2: {"click_element":{"index":10}}
INFO [agent] 📍 Step 3
INFO [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO [agent] 🎯 Next goal: Click the validate button
INFO [agent] 🛠️ Action 1/2: {"input_text":{"index":8,"text":"captchaCode"}}
INFO [agent] 🛠️ Action 2/2: {"click_element":{"index":10}}
INFO [agent] 📍 Step 4
INFO [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO [agent] 🎯 Next goal: Click the validate button
INFO [agent] 🛠️ Action 1/1: {"input_text":{"index":8,"text":"captchaCode"}}
INFO [agent] 📍 Step 5
INFO [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO [agent] 🎯 Next goal: Click the validate button
INFO [agent] 🛠️ Action 1/1: {"input_text":{"index":8,"text":"captchaCode"}}
INFO [agent] 📍 Step 6
INFO [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO [agent] 🎯 Next goal: Click the validate button
INFO [agent] 🛠️ Action 1/1: {"input_text":{"index":8,"text":"captchaCode"}}
INFO [agent] 📍 Step 7
INFO [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO [agent] 🎯 Next goal: Click the validate button
INFO [agent] 🛠️ Action 1/1: {"input_text":{"index":8,"text":"captchaCode"}}
INFO [agent] 📍 Step 8
INFO [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO [agent] 🎯 Next goal: Click the validate button
INFO [agent] 🛠️ Action 1/1: {"input_text":{"index":8,"text":"captchaCode"}}
INFO [agent] 📍 Step 9
INFO [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO [agent] 🎯 Next goal: Click the validate button
INFO [agent] 🛠️ Action 1/1: {"input_text":{"index":8,"text":"captchaCode"}}
INFO [agent] 📍 Step 10
INFO [agent] ⚠ Eval: Failed - The captcha needs to be solved manually
INFO [agent] 🧠 Memory: Completed step 2: Entered the captcha code. Now, I need to click the 'validateCaptchaButton' to submit the form. No previous steps or information to remember.
INFO [agent] 🎯 Next goal: Click the validate button
INFO [agent] 🛠️ Action 1/1: {"input_text":{"index":8,"text":"captchaCode"}}
Make sure to try the latest version, 0.1.41. Also, it is not going to work without use_vision=True and gpt-4o. Most other vision models are not nearly as successful at a task as difficult as a captcha.
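For reference, the suggested setup could look roughly like the sketch below. It reuses the task string and model from the original example and only adds `use_vision=True`; the helper `captcha_agent_kwargs` is a made-up name for illustration, and the imports sit inside `main()` so the file can be loaded without the packages installed.

```python
import asyncio

CAPTCHA_TASK = "go to https://captcha.com/demos/features/captcha-demo.aspx and solve the captcha"


def captcha_agent_kwargs(llm) -> dict:
    """Collect the Agent arguments suggested above (hypothetical helper)."""
    return {"task": CAPTCHA_TASK, "llm": llm, "use_vision": True}


async def main() -> None:
    # Imported here so the module loads even without these packages installed.
    from browser_use import Agent
    from langchain_openai import ChatOpenAI

    # Requires OPENAI_API_KEY in the environment, as in the original example.
    agent = Agent(**captcha_agent_kwargs(ChatOpenAI(model="gpt-4o")))
    await agent.run()


# To run the agent: asyncio.run(main())
```

Even with this setup the answers are not guaranteed to be right. It only gives the model a view of the captcha image.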
@pirate I have a browser-use script that used to log in and solve the captcha, but it no longer works: every captcha attempt is wrong. I assume the OpenAI model changed and got worse.
Is there any way to outsource the captcha solving to a third-party web service? (i.e. browser-use gets the image URL, sends it elsewhere for solving, inputs the result, and continues)
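The flow described in the parentheses could be sketched as below. This is purely illustrative: every name here (`solve_captcha_externally`, `fetch_image`, `solve_image`, `type_answer`) is hypothetical, no real solving service or browser-use API is assumed, and the three steps are passed in as plain callables that you would have to supply yourself.

```python
from typing import Callable


def solve_captcha_externally(
    image_url: str,
    fetch_image: Callable[[str], bytes],
    solve_image: Callable[[bytes], str],
    type_answer: Callable[[str], None],
) -> str:
    """Fetch the captcha image, have an external solver read it, then type the answer."""
    image_bytes = fetch_image(image_url)       # 1. get the image behind the URL
    answer = solve_image(image_bytes).strip()  # 2. send it elsewhere for solving
    if not answer:
        raise ValueError("solver returned an empty answer")
    type_answer(answer)                        # 3. input the result into the page
    return answer                              # 4. the caller continues from here
```

Keeping the three steps as injected callables means the same function works with any HTTP client, any solver, and any way of typing into the page.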
For better captcha handling, you can use our cloud.
It does not work with the cloud either.