
Scraping issue with normal website

Open Anand-sk1324 opened this issue 6 months ago • 4 comments

When I try to scrape https://geniusee.com/single-blog/fintech-regulation-legal-and-regulatory-aspects with the browser decorator, Botasaurus just hangs. On investigation: I have turned off loading of images and CSS, but the target website tries to send the images anyway, and this becomes an infinite loop. Please resolve this. Thank you.

Anand-sk1324 • Jun 20 '25 14:06

I tried this code and everything worked:

from botasaurus.browser import browser, Driver
from botasaurus.lang import Lang


@browser(output=None, headless=False, lang=Lang.Italian, add_arguments=['--disable-notifications'])
def scrape_heading_task(driver: Driver, data):
    # Open the target page
    driver.get("https://geniusee.com/single-blog/fintech-regulation-legal-and-regulatory-aspects", timeout=10)

    driver.short_random_sleep()

    # Save what the browser rendered, then print the final URL and the page text
    driver.save_screenshot("test.png")
    print(driver.current_url)
    print(driver.page_text)


# Initiate the web scraping task
scrape_heading_task()

drego85 • Jun 24 '25 08:06

Did you try it with images and CSS disabled?

Anand-sk1324 • Jun 25 '25 20:06

Okay, I tried this code:

from botasaurus.browser import browser, Driver
from botasaurus.lang import Lang


# Same task as before, now with block_images_and_css=True
@browser(output=None, headless=False, lang=Lang.Italian, add_arguments=['--disable-notifications'], block_images_and_css=True)
def scrape_heading_task(driver: Driver, data):
    # Open the target page
    driver.get("https://geniusee.com/single-blog/fintech-regulation-legal-and-regulatory-aspects")

    driver.short_random_sleep()

    # Save what the browser rendered, then print the final URL and the page text
    driver.save_screenshot("test.png")
    print(driver.current_url)
    print(driver.page_text)


# Initiate the web scraping task
scrape_heading_task()

With this setting the page never finishes loading in the browser: some JavaScript code waits for the images to load.

TimeoutError: Document did not become ready within 60 seconds
Task failed for input: None

Unfortunately I can't help you; I opened issue 258 for a similar case.
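
As a possible stopgap, not verified against this site: if the content you need is already in the DOM when the wait gives up, you could catch the timeout and read the partial page anyway. Below is a minimal sketch. `get_tolerating_timeout` is a hypothetical helper name, not part of Botasaurus, and it assumes the error shown in the log above is Python's builtin `TimeoutError`; if the library raises its own exception class, the `except` clause would need adjusting.

```python
from typing import Callable, Optional


def get_tolerating_timeout(load_page: Callable[[], None]) -> Optional[TimeoutError]:
    """Call load_page() and swallow a timeout instead of failing the task.

    Returns None if the page became ready, otherwise the caught exception,
    so the caller can decide whether the partial page is good enough.
    """
    try:
        load_page()
    except TimeoutError as exc:  # assumption: the builtin TimeoutError is raised
        return exc
    return None


# Hypothetical use inside the @browser task from this thread:
#
#     url = "https://geniusee.com/single-blog/fintech-regulation-legal-and-regulatory-aspects"
#     err = get_tolerating_timeout(lambda: driver.get(url, timeout=10))
#     if err is not None:
#         print("Page not ready, reading what is there:", err)
#     print(driver.page_text)
```

Whether `driver.page_text` returns anything useful after the timeout depends on how far the page got before the wait ended, so treat this as an experiment and not a fix for the underlying hang.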

drego85 • Jun 26 '25 10:06