crawl4ai
Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
### crawl4ai version 0.6.3 ### Expected Behavior # [Bug]: Proxy Not Working with Oxylabs and Bright Data on v0.6.3 (`net::ERR_NO_SUPPORTED_PROXIES`) ## Description I'm experiencing persistent issues using both Oxylabs and...
## Summary This change enhances support for the `--disable-web-security` Chromium flag in Crawl4AI to properly bypass CORS restrictions during JavaScript execution. Previously, using this flag in `BrowserConfig.extra_args` would fail because...
### crawl4ai version 0.7.4 ### Expected Behavior It was supposed to deep crawl all the URLs provided when using `arun_many`. ### Current Behavior Instead of crawling with deep crawl, it...
### crawl4ai version 0.5.0.post2 ### Expected Behavior When crawling code blocks from the triton tutorial page: https://triton-lang.org/main/getting-started/tutorials/01-vector-add.html#sphx-glr-getting-started-tutorials-01-vector-add-py, space between `import` and the package name is omitted. The webpage contains several...
# Summary This PR fixes a concurrency bug in AsyncWebCrawler.arun_many() when using managed browsers. The issue was that all concurrent crawl tasks were fighting over one shared tab, causing failures....
### crawl4ai version v0.7.6 ### Expected Behavior Hello! I am trying to use crawl4ai for concurrent authenticated crawling. However, I'm running into errors when combining CDP with `arun_many`. In...
## Description The Docker API server lags behind the Python library. This issue tracks adding endpoints/parameters to expose the following library features: ### 1. **Adaptive crawling** - AdaptiveCrawler, AdaptiveConfig, CrawlState,...
I think the title of the issue sums it up: PyPDF2 hasn't been maintained since 2022 and has been abandoned in favor of pypdf (v3 onward, now up to v6). See: https://pypi.org/project/PyPDF2/
Add opt-in telemetry across environments: Python library, Docker/API server, and interactive notebooks. Goal: capture exceptions/crashes to improve stability. Telemetry module must be provider-agnostic (Sentry as first backend, easily swappable). ##...
### crawl4ai version 0.7.1 ### Expected Behavior Crawl all internal links in DFS order up to the specified depth and maximum number of pages, regardless of whether arun() or arun_many()...