crawl4ai
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
Dears, while building a tool with your wonderful library, I realized there was no support for disabling SSL verification for websites that either use HTTP only or use HTTPS with...
When crawling a website, the robots.txt should be respected.
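As a rough illustration of what this request involves, here is a minimal sketch that checks URLs against robots.txt rules using Python's standard-library `urllib.robotparser`. It is not part of Crawl4AI; the helper names `build_parser` and `is_allowed`, the sample rules, and the `example.com` URLs are all assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, url: str, user_agent: str = "*") -> bool:
    """Return True if the rules permit `user_agent` to fetch `url`."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(EXAMPLE_ROBOTS_TXT)

# A crawler would call is_allowed() before requesting each page.
for url in ("https://example.com/blog/post", "https://example.com/private/data"):
    print(url, "allowed" if is_allowed(parser, url) else "blocked")
```

In a real crawl the rules would be downloaded from the site's `/robots.txt` (for example with `RobotFileParser.set_url()` followed by `read()`) rather than hard-coded.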
Hello, can you make a version to easily install it on https://pinokio.computer/?
Thank you for the great work and it is prominent! Previously I used Google Vertex AI (i.e. Gemini) for doing something similar to yours but this repository is way better...
Please add which Python versions are supported. I am on Python 3.8.0: Collecting numpy=1.26.0 (from crawl4ai[torch]) Note: you may need to restart the kernel to use updated packages. ERROR: Could...
Thanks for creating alternatives to [FireCrawl](https://github.com/mendableai/firecrawl) for LLMs! Here is a bit of a question: are there examples or shortcuts for crawling a whole blog (may or may not have...
It would be great if it supported pagination. I wanted to use this solution to scrape the whole, lengthy documentation of some great open source projects for making easy knowledge...
@unclecode I see that setting a hook is not working as expected. I am setting a delay using the code below: ``` def delay(driver): print("Delaying for 5 seconds...") time.sleep(5) print("Resuming...") crawler_strategy...
There's example code for using a proxy, but I don't see a proxy parameter in the AsyncWebCrawler() class.