
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN

Results: 541 crawl4ai issues

I've identified an issue in the Advanced Usage example in the README.md file. The current CSS selector used for extracting content from the NBC News business page is...

bug
enhancement
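A minimal sketch for verifying whichever selector the README settles on. The selector string below is a placeholder, not the value from the README, and depending on the installed crawl4ai version the selector may need to go through a run-config object instead of `arun()` directly.

```python
import asyncio
from crawl4ai import AsyncWebCrawler

async def main():
    async with AsyncWebCrawler(verbose=True) as crawler:
        result = await crawler.arun(
            url="https://www.nbcnews.com/business",
            # Placeholder selector: inspect the live page and substitute the
            # selector the README example is supposed to use.
            css_selector="article",
        )
        print(result.markdown[:500] if result.success else result.error_message)

asyncio.run(main())
```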

Hi, I tried to create a Lambda layer for this library, but it's not working. Is there a Lambda layer zip or Docker image available for using the library in Lambda?

Many pages like https://www.wsj.com/world/china/chinas-patriotic-rhetoric-takes-a-violent-turn-6266ca09 are not crawlable. I've tried both sync and async mode; both return a failure: ``` [ERROR] 🚫 Failed to crawl https://www.nbcnews.com/business, error: Failed to crawl https://www.nbcnews.com/business: Timeout...
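A small sketch for reproducing the failure and surfacing the error message rather than just the log line; the URLs are the ones from the report, and it assumes `result.success` / `result.error_message` are populated on failure as in the async API examples.

```python
import asyncio
from crawl4ai import AsyncWebCrawler

URLS = [
    "https://www.wsj.com/world/china/chinas-patriotic-rhetoric-takes-a-violent-turn-6266ca09",
    "https://www.nbcnews.com/business",
]

async def main():
    async with AsyncWebCrawler(verbose=True) as crawler:
        for url in URLS:
            result = await crawler.arun(url=url)
            if result.success:
                print(f"{url}: {len(result.markdown or '')} chars of markdown")
            else:
                # Paywalled or bot-protected sites often surface as navigation timeouts.
                print(f"{url}: failed with {result.error_message}")

asyncio.run(main())
```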

I need to pass an Authorization header field to the LLM service, which runs on my own server with a proxy/authentication in front of it. How is that possible?
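A hedged sketch of one way this could look, assuming the installed LLMExtractionStrategy exposes `base_url`- and `extra_args`-style parameters for pointing at a self-hosted, proxied endpoint; the parameter names and the header plumbing are assumptions to check against the constructor of your version.

```python
import asyncio
from crawl4ai import AsyncWebCrawler
from crawl4ai.extraction_strategy import LLMExtractionStrategy

# Assumptions: `base_url` points the provider at the self-hosted proxy, and `extra_args`
# (if supported by your version) is forwarded to the underlying completion call so the
# Authorization header reaches the proxy. Verify both against your installed signature.
strategy = LLMExtractionStrategy(
    provider="openai/gpt-4o-mini",                   # placeholder model name
    api_token="unused-or-proxy-token",               # placeholder token
    base_url="https://llm.internal.example.com/v1",  # hypothetical self-hosted endpoint
    extra_args={"extra_headers": {"Authorization": "Bearer <proxy-token>"}},
    instruction="Extract the main points of the page.",
)

async def main():
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url="https://example.com", extraction_strategy=strategy)
        print(result.extracted_content)

asyncio.run(main())
```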

I created an AWS Lambda Docker image, and it fails on this line: from crawl4ai import AsyncWebCrawler ```{ "errorMessage": "[Errno 30] Read-only file system: '/home/sbx_user1051'", "errorType": "OSError", "requestId": "", "stackTrace": [...

enhancement
question
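The [Errno 30] points at Lambda's filesystem: everything except /tmp is read-only, and crawl4ai creates files under the home directory at import time. A hedged handler sketch, assuming that redirecting HOME to /tmp before the import is enough for your version (the exact set of directories crawl4ai touches may differ):

```python
# Redirect writable paths to /tmp *before* importing crawl4ai, since Lambda only
# allows writes under /tmp and crawl4ai writes under the user's home directory.
import os
os.environ["HOME"] = "/tmp"                      # so ~/.crawl4ai resolves to /tmp/.crawl4ai
os.environ.setdefault("XDG_CACHE_HOME", "/tmp")  # assumption: cover other cache writers too

import asyncio
from crawl4ai import AsyncWebCrawler

def handler(event, context):
    async def crawl():
        async with AsyncWebCrawler() as crawler:
            result = await crawler.arun(url=event.get("url", "https://example.com"))
            return result.markdown if result.success else result.error_message
    return asyncio.run(crawl())
```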

Hey, thanks for the lib :) I'm playing around with it, trying to crawl `https://mantine.dev/core/button/?t=props`. If you have a quick answer for why it doesn't work, that would be great; otherwise I'll...

bug
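The props view on that page is client-rendered, so the content of interest may not exist when the initial HTML is captured. A hedged sketch, assuming your crawl4ai version accepts `wait_for` (a CSS selector to wait for) and `bypass_cache` on `arun()`; both parameter names are assumptions to verify.

```python
import asyncio
from crawl4ai import AsyncWebCrawler

async def main():
    async with AsyncWebCrawler(verbose=True) as crawler:
        result = await crawler.arun(
            url="https://mantine.dev/core/button/?t=props",
            bypass_cache=True,
            wait_for="table",  # assumption: wait until the client-rendered props table exists
        )
        print(result.markdown[:1000] if result.success else result.error_message)

asyncio.run(main())
```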

Could you please add the possibility to change the timeout? In some places and containers, pages can take more than 60 seconds. crawl4ai/crawl4ai/async_crawler_strategy.py, line 251: response = await page.goto(url, wait_until="domcontentloaded", timeout=60000)
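Until the timeout is configurable upstream, a workaround sketch that drives the same Playwright navigation call directly with a caller-chosen limit (values are illustrative, and this bypasses crawl4ai's post-processing):

```python
import asyncio
from playwright.async_api import async_playwright

async def fetch_html(url: str, timeout_ms: int = 180_000) -> str:
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()
        # The same navigation call crawl4ai makes in async_crawler_strategy.py,
        # but with the timeout supplied by the caller instead of a hard-coded 60 s.
        await page.goto(url, wait_until="domcontentloaded", timeout=timeout_ms)
        html = await page.content()
        await browser.close()
        return html

print(len(asyncio.run(fetch_html("https://example.com"))))
```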

Traceback (most recent call last): File "C:\Users\57682\PycharmProjects\pythonProject\main.py", line 2, in from crawl4ai import AsyncWebCrawler File "C:\Users\57682\PycharmProjects\pythonProject\venv\Lib\site-packages\crawl4ai\__init__.py", line 3, in from .async_webcrawler import AsyncWebCrawler File "C:\Users\57682\PycharmProjects\pythonProject\venv\Lib\site-packages\crawl4ai\async_webcrawler.py", line 9, in from .chunking_strategy...