crawl4ai issues

Language Support

2

Hi, thanks for the great repository. I am new to this repository and was curious to know if there is any support to change the language before I crawl a...

oaishi

question

A docs page is broken

The code section of https://crawl4ai.com/mkdocs/examples/summarization/ is broken, please fix it.

vanetreg

Screenshot must be taken after wait_for condition is met

2

screenshot=True takes the screenshot before wait_for finishes, so on webpages where data loads later it takes a screenshot of an empty page.

studio-anurag

enhancement

Add proxy functionality to AsyncWebCrawler and AsyncPlaywrightCrawler

7

This PR adds proxy functionality to the AsyncWebCrawler and AsyncPlaywrightCrawlerStrategy classes. Linked Issue: #116
- Modified AsyncWebCrawler to accept a `proxy` parameter.
- Updated AsyncPlaywrightCrawlerStrategy to handle proxy settings when...

neelthepatel8

Please respect TDM Reservation Protocol

I know libertarians will not be happy but ... In Europe, scraping websites for the purpose of Text and Data Mining and LLM training is **legal** (this is the good...

llemeurfr

Timeout setting

2

Timeout setting

openainext

question

cannot import name 'WebCrawler' from 'crawl4ai'

5

Hi, when I try to run crawl4ai with Microsoft Edge on Windows, I get the error below (the same code works on Ubuntu with Chrome). Traceback (most recent call last):...

gulnihalk

❓ Question

Improve discoverability in chatGPT and other coding assistants

1

Hi, Since this is a recent repository, if someone wants to generate code that uses this library in either chatGPT or any other coding assistant it doesn't work. Would it...

codelion

enhancement

spelling change in prompt and support to gpt-4o-mini

I was going through the prompt and encountered a spelling mistake, so I am helping fix it :)

Darshan2104

Fix crawling error in AsyncWebCrawler

Related to #105
Fix the 'NoneType' object has no attribute 'get' error in `AsyncWebCrawler`.
* **crawl4ai/async_webcrawler.py** - Add a check in the `arun` method to ensure `html` is not `None`...

theguy000
