crawl4ai issues

crawling with Ollama uses OpenAI and requires API_token to be set

1

Hello, I am trying to use crawl4ai with Ollama as the backend, as listed in config.py and mentioned on the providers page of Ollama. Note that I already...

Praj-17

Failed to take a screenshot. The screenshot is blank.

2

I found a problem with v0.3.6 where the screenshot would be blank. This problem seems to occur on sites that are reloaded using JavaScript, such as "https://0-0.energy". I got around this...

ns-arakawa

bug

Enable PDF Scraping and Return Both PDF and MD Versions

1

It would be great if crawl4ai could scrape PDF files from websites and return both the PDF and a Markdown (MD) version of the content. Similar to this link https://arxiv.org/pdf/2402.06196...

jmontoyavallejo

Include links in the markdown

6

The result.markdown doesn't include links. I have a use case where I'll be passing the markdown to an LLM to identify the product details. Here I want to get the product details...

ManojBhamsagar-Draup

enhancement

question
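To illustrate what this request is asking for, here is a minimal sketch that keeps hyperlinks when turning HTML into markdown-style text. It uses only Python's standard library and is not crawl4ai's own conversion. The class `LinkPreservingText` and the function `html_to_markdown_with_links` are hypothetical names, and the sketch assumes the page's raw HTML is already available as a string.

```python
from html.parser import HTMLParser


class LinkPreservingText(HTMLParser):
    """Collect page text, rewriting <a href="..."> elements as [text](url)."""

    def __init__(self):
        super().__init__()
        self.parts = []       # finished output fragments
        self.href = None      # href of the <a> currently open, if any
        self.link_text = []   # text seen inside the open <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href")
            self.link_text = []

    def handle_data(self, data):
        # Text inside a link is buffered until the closing </a>.
        if self.href is not None:
            self.link_text.append(data)
        else:
            self.parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            label = "".join(self.link_text).strip()
            self.parts.append(f"[{label}]({self.href})")
            self.href = None


def html_to_markdown_with_links(html: str) -> str:
    """Hypothetical helper: flatten HTML to text while keeping links."""
    parser = LinkPreservingText()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

Passing the resulting text to an LLM would let it see each product link next to its label, which is the use case described in the issue.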

Unable to do LLM extraction with azure openai

6

Hi, we are trying to do the LLM extraction using the sample code provided [here](https://crawl4ai.com/mkdocs/examples/llm_extraction). This is how we have added the LLM details ``` async with AsyncWebCrawler(verbose=True) as crawler:...

MeghanaSrinath

question

how can i extract text from the CrawlResult?

5

```python
import os  # needed for os.getenv below

from crawl4ai import WebCrawler
from crawl4ai.chunking_strategy import SlidingWindowChunking
from crawl4ai.extraction_strategy import LLMExtractionStrategy

crawler = WebCrawler()
crawler.warmup()

strategy = LLMExtractionStrategy(
    provider='openai',
    api_token=os.getenv('OPENAI_API_KEY')
)
loader = crawler.run(url=all_urls[0], extraction_strategy=strategy)
chunker = SlidingWindowChunking(window_size=2000,...
```

deepak-hl

question

sujin502

❓ Question

How can we make crawling faster as it was getting slower for dynamically rendered website

2

I am trying to crawl links from websites, but it is either returning empty results or taking too long to retrieve the links. How can I implement a strategy to...

roshan-sinha-dev
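One general way to approach the slowness described in this question is to crawl several URLs concurrently, with a cap on parallelism and a per-page timeout. The sketch below shows that pattern with plain `asyncio`. It is not a crawl4ai feature: `fetch_links` is a hypothetical stand-in for a real crawl call, and `crawl_many`, `max_concurrency`, and `timeout` are illustrative names.

```python
import asyncio


async def fetch_links(url: str) -> list[str]:
    """Hypothetical stand-in for a real crawl call; returns fake links."""
    await asyncio.sleep(0.01)  # simulates network and rendering latency
    return [f"{url}/a", f"{url}/b"]


async def crawl_many(urls, max_concurrency=5, timeout=10.0):
    """Crawl URLs concurrently, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(url):
        async with semaphore:
            try:
                links = await asyncio.wait_for(fetch_links(url), timeout)
            except asyncio.TimeoutError:
                links = []  # give up on a slow page instead of stalling the batch
            return url, links

    results = await asyncio.gather(*(one(u) for u in urls))
    return dict(results)
```

Calling `asyncio.run(crawl_many(urls))` returns a dict mapping each URL to the links found for it, with an empty list for pages that timed out.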

Scraper

fix requests version

Async error: unable to create child process
