crawl4ai

🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN

Results 541 crawl4ai issues

Hello, I am trying to use crawl4ai with Ollama as the backend, as listed in config.py and mentioned on the providers page of Ollama. Note that I already...

I found a problem with v0.3.6 where the screenshot would be blank. This problem seems to occur on sites that are reloaded using JavaScript, like "https://0-0.energy". I got around this...

bug

It would be great if crawl4ai could scrape PDF files from websites and return both the PDF and a Markdown (MD) version of the content. Similar to this link https://arxiv.org/pdf/2402.06196...

The result.markdown doesn't include links. I have a use case where I'll be passing the markdown to an LLM to identify the product details. Here I want to get the product details...

enhancement
question

Hi, we are trying to do LLM extraction using the sample code provided [here](https://crawl4ai.com/mkdocs/examples/llm_extraction). This is how we have added the LLM details ``` async with AsyncWebCrawler(verbose=True) as crawler:...

question

```python
import os  # needed for os.getenv below

from crawl4ai import WebCrawler
from crawl4ai.chunking_strategy import SlidingWindowChunking
from crawl4ai.extraction_strategy import LLMExtractionStrategy

crawler = WebCrawler()
crawler.warmup()

strategy = LLMExtractionStrategy(
    provider='openai',
    api_token=os.getenv('OPENAI_API_KEY')
)
loader = crawler.run(url=all_urls[0], extraction_strategy=strategy)
chunker = SlidingWindowChunking(window_size=2000,...
```

question

# fix requests version
## Description
change the `requests>=2.26.0,=2.26.0,

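![微信图片_20241017225821](https://github.com/user-attachments/assets/c6379531-8a77-4a96-a697-1019ad849766) Right after installation, calling this example worked without problems. However, after I added the async agent to my code, this error appeared. Once the error has appeared, restoring the code to the original example no longer works either. Could someone take a look when you have time? Thanks. ![Uploading 微信图片_20241017230103.png…]()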

❓ Question

I am trying to crawl links from websites, but it is either returning empty results or taking too long to retrieve the links. How can I implement a strategy to...