[Bug]: Unexpected error in _crawl_web at line 744 in _crawl_web (.venv/lib/python3.11/site-packages/crawl4ai/async_crawler_strategy.py):
crawl4ai version
0.6.3
Expected Behavior
The URL should be scraped successfully. Instead, the crawl fails with the error below: Unexpected error in _crawl_web at line 744 in _crawl_web (.venv/lib/python3.11/site-packages/crawl4ai/async_crawler_strategy.py)
Current Behavior
Error: Unexpected error in _crawl_web at line 744 in _crawl_web (.venv/lib/python3.11/site-packages/crawl4ai/async_crawler_strategy.py): Error: Failed on navigating ACS-GOTO: Page.goto: Target page, context or browser has been closed
Call log:
- navigating to "https://www.jonesday.com/en", waiting until "domcontentloaded"
Code context:
739                 response = await page.goto(
740                     url, wait_until=config.wait_until, timeout=config.page_timeout
741                 )
742                 redirected_url = page.url
743             except Error as e:
744 →               raise RuntimeError(f"Failed on navigating ACS-GOTO:\n{str(e)}")
745
746             await self.execute_hook(
747                 "after_goto", page, context=context, url=url, response=response, config=config
748             )
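The underlying Playwright message ("Target page, context or browser has been closed") suggests the browser or context dies before or while page.goto runs. Below is a minimal sketch, outside crawl4ai, that navigates to the same URL with plain Playwright using the same Chromium flags and timeout I pass in my config; the flag list and timeout are copied from my code and are assumptions about what matters, not crawl4ai internals:

import asyncio

from playwright.async_api import async_playwright


async def check_goto(url: str) -> None:
    # Launch Chromium with the same extra flags passed to BrowserConfig below,
    # to see whether the browser/context closes on its own during navigation.
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            headless=True,
            args=[
                "--headless=new",
                "--remote-allow-origins=*",
                "--autoplay-policy=user-gesture-required",
                "--single-process",
            ],
        )
        context = await browser.new_context()
        page = await context.new_page()
        # Same navigation parameters crawl4ai uses at lines 739-741 above.
        response = await page.goto(url, wait_until="domcontentloaded", timeout=180000)
        print("status:", response.status if response else None)
        await browser.close()


asyncio.run(check_goto("https://www.jonesday.com/en"))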
Is this reproducible?
Yes
Inputs Causing the Bug
Steps to Reproduce
Code snippets
import logging

from crawl4ai import (
    AsyncWebCrawler,
    BrowserConfig,
    CacheMode,
    CrawlerRunConfig,
    DefaultMarkdownGenerator,
)

logger = logging.getLogger(__name__)


async def scrape_website(website: str) -> dict:
    """
    Scrape a website and return its content, links, and markdown representation.

    Args:
        website (str): The URL of the website to scrape.

    Returns:
        dict: A dictionary containing the scraped data, including URL, success status,
            error message, HTML content, markdown representation, and links.
    """
    logger.info("Within the scrape website function: %s", website)

    # Browser settings used for the crawl.
    browser_config = BrowserConfig(
        verbose=True,
        browser_type="chromium",
        headless=True,
        user_agent_mode="random",
        light_mode=True,
        use_managed_browser=False,
        extra_args=[
            "--headless=new",
            "--remote-allow-origins=*",
            "--autoplay-policy=user-gesture-required",
            "--single-process",
        ],
    )

    md_generator = DefaultMarkdownGenerator(
        options={
            "content_source": "cleaned_html",
            "ignore_images": True,
            "check_robots_txt": False,
            "wait_for_images": False,
            "scan_full_page": True,
        }
    )

    run_config = CrawlerRunConfig(
        cache_mode=CacheMode.BYPASS,
        markdown_generator=md_generator,
        verbose=False,
        page_timeout=180000,
    )

    async with AsyncWebCrawler(config=browser_config) as crawler:
        result = await crawler.arun(url=website, config=run_config)
        crawl_data = {
            "url": result.url,
            "success": result.success,
            "error_message": result.error_message,
            "html": result.html,
            "markdown": result.markdown,
            "links": result.links,
        }
        logger.info("Scraping Status: %s", crawl_data["success"])
        return crawl_data
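For comparison, a stripped-down sketch of the same crawl with a default BrowserConfig, dropping the extra Chromium flags (in particular --single-process, which Chromium documents as unstable) and light_mode. This is only an assumption about what might be closing the browser, not a confirmed workaround:

import asyncio

from crawl4ai import AsyncWebCrawler, BrowserConfig, CacheMode, CrawlerRunConfig


async def scrape_plain(website: str) -> dict:
    # Same URL and timeout, but no extra_args and no light_mode, to see whether
    # the added flags are what cause "Target page, context or browser has been closed".
    browser_config = BrowserConfig(browser_type="chromium", headless=True)
    run_config = CrawlerRunConfig(cache_mode=CacheMode.BYPASS, page_timeout=180000)
    async with AsyncWebCrawler(config=browser_config) as crawler:
        result = await crawler.arun(url=website, config=run_config)
        return {"success": result.success, "error_message": result.error_message}


print(asyncio.run(scrape_plain("https://www.jonesday.com/en")))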
OS
Windows
Python version
3.10
Browser
No response
Browser version
No response
Error logs & Screenshots (if applicable)
No response