AutoGPT
Cookies selected instead of page content when googling
Duplicates
- [X] I have searched the existing issues
Steps to reproduce 🕹
Ask autoGPT to assemble some list (example: investors list, client list, whatever)
Current behavior 😯
When googling for data:
SYSTEM: Command browse_website returned: ("Answer gathered from website: The text does not provide information about the XYZ. It explains how TechCrunch uses cookies for their websites and apps, and how users can manage their privacy settings.
This pops up very often even though the website has the information. The reason is that the content of the cookie popup is selected for analysis instead of the actual content of the page.
Expected behavior 🤔
autoGPT reads the page and gathers data from it
Your prompt 📝
No response
same here, what to do about it
same here - it doesn't seem to work
I added the sentence "Ignore content from cookie consent popups, ads and disclaimers." to line 131 in autogpt/processing/text.py. So far I haven't had any more problems with cookie banners, but I haven't tested much. More often than not I ran into the "too much tokens" problem.
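A minimal sketch of the prompt tweak described above. The function name build_summary_prompt and the surrounding prompt wording are hypothetical; the real prompt in autogpt/processing/text.py may look different depending on the version, so this only illustrates where the extra sentence could go.

```python
# Hypothetical helper: shows how an "ignore banners" instruction can be
# appended to a summarization prompt. This is NOT the actual AutoGPT code.
IGNORE_NOTICE = "Ignore content from cookie consent popups, ads and disclaimers."


def build_summary_prompt(text: str, question: str) -> str:
    """Return a prompt asking the model to answer `question` from `text`."""
    return (
        f'"""{text}"""\n\n'
        f'Using the above text, answer the following question: "{question}". '
        f"{IGNORE_NOTICE} "
        "If the question cannot be answered using the text, summarize the text."
    )


if __name__ == "__main__":
    print(build_summary_prompt("We use cookies. XYZ raised funding.", "Who raised funding?"))
```

This only steers the model. If the scraped text contains nothing but the banner, the page content still has to be fetched some other way.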
An easy fix is to use a VPN and point it at a server outside of the EU (for example one in the US).
I'm having the same issue on nearly all websites. Whenever a cookie wall appears or even a 'subscribe to our newsletter' pop-up appears on the page, AutoGPT only reads inside those containers. How can I make AutoGPT ignore cookie walls and those other pop-ups when scraping a website?
same problem
same problem but only when I use the google api
Any update on this? I have the same issue where AutoGPT just scrapes pop-ups and doesn't read the whole page.
I fixed this by configuring my Selenium class to close the pop-up before it reads the page. I had to know the class of the pop-up's close button to do this. For other pop-ups, try the disable pop-up option in the Selenium class that handles all the commands, or get the pop-up's class and close it that way.
@jeffmercury Thank you! Are you interested in a PR?
Hi @FarzanT yes
Hi @jeffmercury Can you please explain what exactly to do for "fixed this by configuring my Selenium class to close the pop up before it reads"?
Before trying my solution, I would like to state that a better way is to see if you can get the data you want as JSON from an API, instead of scraping the web for your information, which introduces a lot of errors. That's what I'm doing. Otherwise, try my solution below.
This solution is not guaranteed and requires some configuration. It will not work in Docker. There are two ways to deal with pop-ups: adding the --disable-popup-blocking option to Selenium, or having Selenium close the pop-up by clicking its close button.
First, update your .env file with these settings:
HEADLESS_BROWSER=True
USE_WEB_BROWSER=chrome
First Option:
Then, in your web_selenium.py, find the scrape_text_with_selenium function and add this:
options.add_argument("--disable-popup-blocking")
This should disable normal pop-ups. However, if pop-ups still persist, they might be implemented in a way that Selenium does not recognize as a standard JavaScript alert. In this case, you would need to identify the specific HTML element of the pop-up and interact with it (e.g., click a close button).
Second Option:
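Add this to the scrape_text_with_selenium() function in the web_selenium.py file, right after the driver.get(url) call.
# If the pop-up is a dialog box that --disable-popup-blocking cannot suppress,
# find its close button and click it.
# These imports go at the top of web_selenium.py if they are not already there.
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

try:
    wait = WebDriverWait(driver, 2)  # wait up to 2 seconds
    # Raw string so the backslash in "tablet\:flex-row" reaches the CSS engine unchanged.
    close_button = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, r'#sds-dialog-0 > layout-splash-modal > div.grid-row.flex-column.tablet\:flex-row > div:nth-child(2) > div > button')))
    close_button.click()
except Exception as e:
    print(f"Error interacting with the close button: {e}")
Here we are telling Selenium to wait up to 2 seconds for the pop-up's close button to become clickable and then click it. Use the browser developer tools to get the CSS selector of your pop-up's close button and pass it in as I did above.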
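The selector above is specific to one site. A rough sketch of a more general variant: a helper that tries a list of candidate close-button selectors and clicks the first visible match for each. The name dismiss_popups and the selectors in the usage note are hypothetical. It relies only on find_elements, is_displayed and click, which Selenium's driver and element objects provide, and it does not wait for elements to appear the way WebDriverWait does.

```python
from typing import Iterable

# Selenium's By.CSS_SELECTOR is the plain string "css selector"; using the
# string directly keeps this helper importable without Selenium installed.
CSS_SELECTOR = "css selector"


def dismiss_popups(driver, selectors: Iterable[str]) -> int:
    """Click the first visible element matching each selector.

    Returns the number of elements that were clicked.
    """
    clicked = 0
    for selector in selectors:
        try:
            for element in driver.find_elements(CSS_SELECTOR, selector):
                if element.is_displayed():
                    element.click()
                    clicked += 1
                    break  # one click per selector is enough
        except Exception as exc:
            # A stale or covered element should not abort the scrape.
            print(f"Could not dismiss pop-up for {selector!r}: {exc}")
    return clicked
```

It could be called right after driver.get(url), for example dismiss_popups(driver, ["button.cookie-accept", "button.newsletter-close"]), with the placeholder selectors replaced by the ones found in your browser's developer tools.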
Make sure you are using manual mode, and explicitly tell the AI to use the browse_website function, passing the questions about what to read on the page as arguments. If you get stuck, use ChatGPT to help you; that is how I came up with this solution.
@Bubble007
Hi @jeffmercury, thank you very much. I think that helps many. :-)
I can also confirm that this is an issue. AutoGPT cannot google anything since it only sees the cookie banner.
Confirm the issue as well
same here, seems like there is no easy solution available except for moving the server...
This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.
This issue was closed automatically because it has been stale for 10 days with no activity.