crawl4ai

🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN

Results 541 crawl4ai issues

### crawl4ai version 0.5.0 ### Expected Behavior All 680 product URLs passed to crawler.arun_many() should produce a corresponding result.extracted_content if the crawling and extraction process succeeds. Each result should be...

🐞 Bug
🩺 Needs Triage

### crawl4ai version 0.7.7 ### Expected Behavior Should parse the webpage correctly. ### Current Behavior When crawling this page: https://www.toshiba-lifestyle.com/th-en/blog/how-to-choose-the-right-laundry-product-for-you I get the following error: ``` [ERROR]... × https://www.toshiba-lif...laundry-product-for-you |...

🐞 Bug
📌 Root caused

### crawl4ai version latest ### Expected Behavior I successfully build a crawler request with BrowserConfig and CrawlerConfig with CSSExtraction, etc. Now I want to build a webhook strategy to not...

🐞 Bug
🩺 Needs Triage

## Summary Ensures `BrowserConfig.to_dict()` emits JSON-safe data by converting nested ProxyConfig objects into dictionaries. Prevents `TypeError: Object of type ProxyConfig is not JSON serializable` in environments (like Docker) that...
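To illustrate the failure mode this summary describes, here is a minimal sketch using stand-in dataclasses. `ProxySettings`, `BrowserSettings`, and `to_dict_naive` are hypothetical names invented for this example; they are not crawl4ai classes and do not reflect how the library implements `to_dict()`. The sketch only shows why a nested object breaks `json.dumps` and how converting it to a dictionary first avoids the error.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ProxySettings:
    """Hypothetical stand-in for a nested proxy configuration object."""
    server: str
    username: Optional[str] = None
    password: Optional[str] = None

    def to_dict(self) -> dict:
        return asdict(self)


@dataclass
class BrowserSettings:
    """Hypothetical stand-in for a browser configuration holding a proxy object."""
    headless: bool = True
    proxy_config: Optional[ProxySettings] = None

    def to_dict_naive(self) -> dict:
        # Leaves the nested object untouched, so json.dumps raises TypeError.
        return {"headless": self.headless, "proxy_config": self.proxy_config}

    def to_dict(self) -> dict:
        # Converts the nested object into a plain dict before returning.
        proxy = self.proxy_config.to_dict() if self.proxy_config is not None else None
        return {"headless": self.headless, "proxy_config": proxy}


settings = BrowserSettings(proxy_config=ProxySettings(server="http://proxy.example:8080"))

try:
    json.dumps(settings.to_dict_naive())
    naive_ok = True
except TypeError:
    # The nested ProxySettings instance is not JSON serializable.
    naive_ok = False

# The converted form serializes cleanly.
payload = json.dumps(settings.to_dict())
```

The same idea applies to any configuration object that has to cross a process or HTTP boundary: everything in the returned dictionary should already be a plain JSON type.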

### crawl4ai version 0.7.7 ### Expected Behavior When using `proxy_config` with `BrowserConfig`, the configuration should be serializable to JSON for use with the Docker API server's crawler pool. The `BrowserConfig.to_dict()`...

🐞 Bug
⚙️ In-progress
📌 Root caused

## Summary When scraping many URLs continuously, browser contexts accumulate in memory and are never cleaned up. The existing cleanup mechanism only runs when browsers go idle, which never happens...
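As a rough illustration of the problem described above, the sketch below shows one possible way to bound memory without waiting for an idle period: cap the number of cached contexts and evict the least recently used one on every acquisition. This is an assumption for illustration only, not the fix proposed in the summary. `ContextPool`, `max_contexts`, and `acquire` are hypothetical names, and the "contexts" are plain dictionaries rather than real browser contexts.

```python
from collections import OrderedDict


class ContextPool:
    """Hypothetical pool that evicts on every acquire instead of only when idle."""

    def __init__(self, max_contexts: int = 3):
        self.max_contexts = max_contexts
        self._contexts = OrderedDict()
        self.closed = []  # keys of evicted contexts, recorded for inspection

    def acquire(self, key: str) -> dict:
        if key in self._contexts:
            # Mark an existing context as most recently used.
            self._contexts.move_to_end(key)
        else:
            # Placeholder object; a real pool would create a browser context here.
            self._contexts[key] = {"key": key}
        # Evict the oldest entries right away, so continuous load cannot
        # grow the pool without bound.
        while len(self._contexts) > self.max_contexts:
            old_key, _ = self._contexts.popitem(last=False)
            self.closed.append(old_key)
        return self._contexts[key]

    def __len__(self) -> int:
        return len(self._contexts)


pool = ContextPool(max_contexts=3)
for i in range(10):
    pool.acquire(f"session-{i}")
```

A real implementation would also need to close the underlying browser resources on eviction and make sure no in-flight request still uses an evicted context.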

### crawl4ai version 0.7.6 ### Expected Behavior When I set mean_delay, there should be a delay between requests. ### Current Behavior The mean_delay config is ignored. ### Is this reproducible? Yes ###...

🐞 Bug
🩺 Needs Triage

## Summary Updates lxml dependency to 6.0. ## List of files changed and why To upgrade the lxml constraint and regenerate the lock file, these files were touched: pyproject.toml uv.lock...

Hello, I'm trying to use the link scoring feature with the following config, but I'm getting the error below. crawl4ai is running in Docker. Any idea what is wrong? ``` [LINK_EXTRACT] ℹ...

🐞 Bug
⚙️ In-progress

### crawl4ai version 0.6.3 ### Expected Behavior my example crawler: ``` llm_strategy = LLMExtractionStrategy( llm_config=self.llm_config, schema=PdfDoc.model_json_schema(), extraction_type="schema", instruction=""" From the crawled content, extract data from html - data in html...

🐞 Bug
⚙ Done
📌 Root caused