crawl4ai issues

[Bug]: Timeout Issue in Sequential PDF Processing Using AsyncWebCrawler & PDFCrawlerStrategy

### crawl4ai version 0.6.3 ### Expected Behavior - ### Current Behavior **Version:** `Crawl4AI v0.6.3` **Description:** When extracting raw text from a set of PDF URLs using `AsyncWebCrawler` with `PDFCrawlerStrategy` and...

ntohidi

🐞 Bug

⚙️ In-progress

[Bug]: maximum recursion depth exceeded error during crawl

3

### crawl4ai Version 0.6.2 ### Expected Behavior The crawler should successfully traverse and collect all valid pages up to the defined depth and page limit. ### Current Behavior The crawler...

Harinib-Kore

🐞 Bug

🩺 Needs Triage

[Bug]: CrawlResult Links not taking into account base tag

### crawl4ai version 0.6.3 ### Expected Behavior The links array in CrawlResult should be resolved against the page's `<base>` tag when one is present.

poryan

✨ Enhancement

📌 Root caused
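
For context on the base-tag report above, the following is a minimal sketch of how relative links are commonly resolved when a page declares a `<base>` tag. It uses only the Python standard library and is not Crawl4AI's implementation; the names `LinkCollector` and `resolve_links` are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Hypothetical helper: records the first <base href> and every <a href>."""

    def __init__(self):
        super().__init__()
        self.base_href = None
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Only the first <base> element with an href counts, per the HTML spec.
        if tag == "base" and self.base_href is None and attrs.get("href"):
            self.base_href = attrs["href"]
        elif tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])


def resolve_links(html, page_url):
    """Return absolute URLs for all anchors, honouring <base href> if present."""
    collector = LinkCollector()
    collector.feed(html)
    # The base href may itself be relative, so resolve it against the page URL.
    if collector.base_href:
        base = urljoin(page_url, collector.base_href)
    else:
        base = page_url
    return [urljoin(base, href) for href in collector.hrefs]
```

With a page at `https://example.com/page` that declares `<base href="https://cdn.example.com/docs/">`, a link written as `guide.html` would resolve under the CDN path instead of under `example.com`, which is the behavior the issue asks the links array to reflect.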

[Bug]: "Example: Building a Knowledge Graph" produces error contents

### crawl4ai version 0.6.3 ### Expected Behavior I expect that running the script in docs [Example: Building a Knowledge Graph](https://docs.crawl4ai.com/extraction/llm-strategies/#9-example-building-a-knowledge-graph) will produce a `kb_result.json` file with knowledge graph data. ###...

mattrossman

📖 Documentation

[Bug]: Failed to use Dify crawl4ai plugin

### crawl4ai version docker unclecode/crawl4ai:0.6.0-r1 ### Expected Behavior The docs at "https://docs.crawl4ai.com/" should explain how to get the crawl4ai_api_token ### Current Behavior 1. I have pulled the Docker image unclecode/crawl4ai:0.6.0-r1 and run it as follows: # Make...

minglong-huang

🐞 Bug

🩺 Needs Triage

[Bug]: KeyError: 'provider' in LLMExtractionStrategy when using Ollama (llama3.2:3b)

### crawl4ai version Crawl4AI 0.6.3 ### Expected Behavior I expected to be able to use Crawl4AI 0.6.3 with a local Ollama model (llama3.2:3b) to extract structured data (news articles) from...

carla4av

🐞 Bug

🩺 Needs Triage

[Bug]: Cannot handle dynamic content

### crawl4ai version 1.6.3 ### Expected Behavior Return dynamic content ### Current Behavior Get an "Error: list index out of range" when re-using the same crawler session ### Is this...

github-gamma

🐞 Bug

🩺 Needs Triage

Bot issue

5

### crawl4ai version Crawl4AI 0.5.0.post8 ### Expected Behavior Hi, I'm new to Crawl4AI and I'm facing some issues that need clarification. I'm trying to scrape data from sites like PitchBook...

rashidwiizb

❓ Question

Feature/scraping strategy - refactor: Remove WebScrapingStrategy and fix metadata extraction (#995)

1

## Summary This PR refactors the content scraping strategy by removing the BeautifulSoup-based `WebScrapingStrategy` class and making `LXMLWebScrapingStrategy` the sole implementation. This simplifies the codebase by eliminating duplicate functionality while...

ntohidi

fix cleanup warning when no process listening on debug port

1

## Summary The following warning is raised on Linux when using `use_persistent_context=True` without any existing process listening to the debugging port: > [BROWSER]. ℹ pre-launch cleanup failed: Command '[['lsof', '-t',...

lbeziaud
